chuckcode's comments | Hacker News

I'd like to see the latency and cost of parsing the entire 10M-token context before throwing out the RAG stack, which is relatively cheap and fast.


Thanks for the details! A few follow-up questions:

- I've seen neural nets use int8 for matrix multiplication to reduce memory use [1]. Do you think something similar could be useful in the ANN space?

- Do you know of any studies using Faiss that look at speed/cost tradeoffs of RAM vs. flash vs. disk for storage?

- Are there recommended ways to update a Faiss index with streaming data, e.g. updating the vectors continuously?

Seems like there are more and more use cases for Faiss as neural nets become core to more workflows. I'd like to figure out the configurations that minimize carbon usage in addition to optimizing latency and recall.

[1] https://arxiv.org/abs/2208.07339



Regarding reduced precision: depending on what you are trying to do, I think it doesn't work quite as well in similarity search as it does for, say, neural networks.

If you are concerned about recall of the true nearest neighbor (k=1), reduced precision can hurt. In many datasets I've seen (especially large ones), the float32 distances from a query vector to its candidate nearest neighbors may differ by only some thousands of ULPs under brute-force search. Computed in float16, the true nearest neighbor would then be indistinguishable from (or, due to rounding error, even ranked behind) other proximate vectors. If you are performing approximate lookup and have the luxury of reranking (you store the compressed/approximate index for lookup, but return a larger candidate set like k=100 or k=1000 and refine the results using true distances computed brute-force from the uncompressed vectors, which you therefore have to keep around), then this problem can go away.
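To make the precision point concrete, here is a tiny numpy illustration with hypothetical distance values: two brute-force distances that are clearly distinct (and correctly ordered) in float32 collapse to the same float16 value, so the true nearest neighbor can no longer be separated from a runner-up:

```python
import numpy as np

# Hypothetical distances from a query to two candidates. They are
# roughly 800 float32 ULPs apart, so float32 separates them easily.
d_true = np.float32(1.0001)  # true nearest neighbor
d_next = np.float32(1.0002)  # proximate runner-up

print(d_true < d_next)                           # True in float32
print(np.float16(d_true) == np.float16(d_next))  # True: both round to 1.0
```

float16 only has about 3 decimal digits of relative precision near 1.0, so any distances closer together than that become indistinguishable.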

If however you are looking at recall@100 (is the true nearest neighbor reported within the top k=100) or set intersection (of the k=100 approximate nearest neighbors, how much overlap is there with the true set of the 100 nearest neighbors), then this doesn't matter as much.

Certainly a lot of the options in the Faiss library are geared towards compression and quantization anyway (e.g., storing a billion-vector high-dimensional dataset in 8/16 GB of memory), so this is a tradeoff as with everything else.

In Faiss there are the scalar quantizer indexes, which do store vectors in int4/int8 etc., and for which int8 GEMM (accumulating in int32) would be great, but using it would require that the query vector itself be quantized to int4/int8 as well. This is the difference between asymmetric distance computation (ADC), where you compare a float32 query vector against encoded database vectors (int4 encoded, product quantized, etc.) by reconstructing the database vectors in floating point and comparing in floating point, and symmetric distance computation (SDC), where you must quantize the query vector to int4 as well and compare in the quantized regime. ADC tends to work a lot better than SDC, which is why we don't use pure int8 GEMM, but in many applications (NN inference, say, instead of image database search) the non-ADC comparison might be fine.
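A minimal numpy sketch of the ADC vs. SDC distinction, using a toy per-dimension uniform int8 quantizer (this illustrates the idea only and is not Faiss's actual scalar quantizer implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 8, 100
xb = rng.standard_normal((n, d)).astype(np.float32)  # database vectors
xq = rng.standard_normal(d).astype(np.float32)       # query vector

# Fit a per-dimension uniform int8 quantizer on the database.
lo, hi = xb.min(axis=0), xb.max(axis=0)
scale = (hi - lo) / 255.0

def encode(x):
    return np.clip(np.round((x - lo) / scale), 0, 255).astype(np.uint8)

def decode(codes):
    return codes.astype(np.float32) * scale + lo

codes = encode(xb)  # the compressed form is all you store

# ADC: float32 query vs. reconstructed (decoded) database vectors.
adc = np.linalg.norm(decode(codes) - xq, axis=1)

# SDC: the query is quantized too; both sides carry quantization error.
sdc = np.linalg.norm(decode(codes) - decode(encode(xq)), axis=1)

exact = np.linalg.norm(xb - xq, axis=1)
print("ADC mean abs error:", np.abs(adc - exact).mean())
print("SDC mean abs error:", np.abs(sdc - exact).mean())
```

ADC's error comes only from reconstructing the database vectors, while SDC additionally quantizes the query, which is why ADC is typically the more accurate of the two.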


Thanks for the very helpful and detailed reply!


Definitely not unusual. I think it is pretty common for the executive team in addition to founders. My feeling is that VCs and founders need to find a way to partially cash out rank-and-file employees along the way if they want the startup model to succeed long term. Many senior engineers are reluctant to join startups at this point, since even if the startup is successful it can be a long time before they have the money in their pocket. Employees at Reddit, Stripe, Instacart, Databricks, and many others have been waiting over a decade for company success to hit their wallet.

Sometimes the executive team gets stock options rather than RSUs, so they own the stock and can sell to secondary parties. VCs and founders would prefer they sell to known parties rather than on the private market (Facebook crossing the 500-investor threshold was one important reason for its IPO timing [1]).

[1] https://web.archive.org/web/20120517045249/http://blogs.reut...


> My feeling is that VCs and founders need to find a way to partially cash out rank and file employees along the way if they want start up model to succeed long term.

They also need to adjust comp structure. I've had a few offers to be engineer #1 at this hot startup or that, some of which I thought had a good chance of success. Assuming the best-case scenario, I always found the following to be true:

1. Founders would outearn me 100:1

2. Given any reasonable assumptions about our likelihood of failure, the E(V) of the offer was less than what I make now.

3. Founders could, at any time before IPO, completely screw me out of my paper gains e.g. via an acquisition.

I've always pointed out it's possible to make me an offer I'll take, the resources are there, but no one bites. This is absolutely killing a lot of otherwise potentially great startups. I've seen some fantastic business ideas fall down in execution because neither founders nor investors could bear to let key employees share more of the spoils and be equally protected from the downsides.
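For illustration, here is the back-of-the-envelope expected-value comparison described above, with all numbers hypothetical:

```python
# All figures are hypothetical, for illustration only.
p_success = 0.10               # assumed chance the startup succeeds
equity_if_success = 3_000_000  # early-employee payout on a good exit
years_to_exit = 6
startup_salary = 160_000
bigco_salary = 300_000

ev_startup = years_to_exit * startup_salary + p_success * equity_if_success
ev_bigco = years_to_exit * bigco_salary
print(ev_startup, ev_bigco)  # 1260000.0 1800000
```

Even with a fairly generous 10% success assumption, the salary gap dominates the equity upside, which is the comparison the comment is making.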


Everyone feels this way. It feels like such a market inefficiency (one a company could arbitrage to get top talent). But the fact that this inefficiency never goes away makes me think I'm missing something. I usually just chalk it up to the majority of engineers not understanding the basics of startup equity (or expected value and failure risk… idk). Regardless, the end result seems to be that there's an adequate supply of labor, so much so that the pressure that would correct this inefficiency is sufficiently mitigated. It makes no sense from a risk/reward perspective that a "founding" engineer makes 1/100th of what a founder makes. They both have the same level of risk.


I think it's worth remembering that these kinds of inefficiencies can last for years, maybe decades, before correcting. This post has some good examples: https://danluu.com/nothing-works/

I think there are a few things that explain the phenomenon:

1. You're asking rich and powerful people to give up control. Everyone hates giving up control, but my experience is that people accustomed to control hate it more.

2. It's not familiar, which means it feels risky.

3. It objectively lowers the total payout for those in power in both the best-case and worst-case scenarios, and it's not actually clear the E(V) for the capital class goes up. It might, but no one knows a priori, and it's expensive to test.

This mix of low information, a small number of potential actors, and feelings of anxiety around uncertainty and loss of control is a pretty potent mix for inefficiencies like this to persist IMO.


Alternatively (which is probably a variant of 3), you’re asking someone to give you 10x or more the shares of what they have reason to think their next-best alternative is.

If an employee #1 share grant isn’t for you (as it isn’t for me), there are plenty of other capable engineers out there who are willing to take a lower stake for a variety of reasons.


> you’re asking someone to give you 10x or more the shares of what they have reason to think their next-best alternative is.

I don't think this is an accurate representation of my views. I'm happy to take more in the 2-3x range so long as I'm protected from the founders cashing out while screwing me out of my paper gains.

> there are plenty of other capable engineers out there who are willing to take a lower stake for a variety of reasons.

Maybe, but they can't always find them! Businesses rarely fail for just one reason, but maybe 1/3 of the startups that approached me and later failed were bogged down by poor technical decisions made early on (what's that? Your engineering lead with 3 years of experience set up a totally custom Kubernetes cluster and it's slowing down your execution? Man, that's rough!)


That link is an interesting read. Thanks!


Investors and founders hate down rounds. You want a pricing to be as beneficial as possible to all involved. Startups are extremely risky and getting a deal done at any price is an accomplishment.

Now you want to give a lot of small shareholders liquidity, so pricing happens far more often. The whims of the world, plus poorly negotiated deals, can push a small sale far below the last investment price. Psychologically this is bad even for wealthy investors, and investing is heavily based on psychology.

No one wants to let that happen except the employees. The founders, the board, and the earlier investors don’t want anything that can jeopardize their argument for what their shares are worth.


I don't think there is really a correctable inefficiency here for a number of reasons.

As regards the very first 1-2 employees, from what I have seen they are either a) fairly inexperienced (and therefore willing to take lower compensation), b) very excited about the technology and willing to work for less, or c) very experienced and compensated well and/or given a ton of shares, to the point where they may even be considered a founder.

So if you are employee #1 and you are very experienced and can negotiate for a nice chunk of shares, you almost always will get pulled into the founding team. It is not uncommon for startups to have minority founders that have 5-10% of the shares.

And that means when you hear about an early employee complaining about their compensation, they are most likely going to be from category a or b. You aren't going to be hearing from category c, because they were treated well and might even consider themselves founders.


"Employees at Reddit, Stripe, Instacart, Databricks, and many others have been waiting over a decade for company success to hit their wallet."

And to add to that: should they decide to leave and have to exercise their options, they'll be on the hook for a huge tax bill if they weren't allowed to early-exercise and file 83(b) elections.


Do you think part of this is that Netflix has assumed zero effort from user model? My experience has been that Netflix does an ok job of recommendations, but fails at overall discovery experience. There is no way for me to drive or view content from different angles easily. I end up googling for expert opinions or hitting up rotten tomatoes to get better reviews. Netflix knows a ton about me and their content, but seems to do a poor job of making their content browseable/discoverable overall. I do like their "more like this" feature where I can see similar titles.


Perhaps it's because it's a niche that isn't worth investing the resources into? Sometimes narrow problem spaces are harder for a company to justify because of the cost-to-reward ratio. I agree with what you want and would like it myself for music (Spotify, Amazon Music, etc.), but it's a complex problem (recommenders, custom UI, and the glue between them) that is hard to justify compared to incremental small improvements to existing general-purpose recommendations.


It just seems like there's a paucity of signal from which Netflix could come up with anything intelligent. Movies are many hours long, and there are many reasons I could be watching something. What does it mean that I allowed a movie to play to completion? Was I even paying attention? Did I decide I hated it 3/4s of the way through, but finished it just because I cared about the plot?

TikTok, on the other hand, has way more data. Things like time-to-swipe, shares, comments presumably form the basis of some sentiment metric.


Google TV has the best content discovery I've come across so far. Recommendations across most streaming services based on overall similar movies, different slices of the genre, and movies with similar directors/cast members. Plus as soon as you select another movie, you can see all the same "similar" recommendations for that movie.


>Do you think part of this is that Netflix has assumed zero effort from user model?

Talking w/a friend who works at Netflix, it sounds like this is a warranted assumption. The way he told it, they were tearing their hair out at one point b/c users wouldn't put much into it.


What I don't understand about their response is: why not make it configurable? Admittedly this is my philosophy for almost every product I work on - "make it maximally configurable, but make the defaults maximally sane" – but I'm baffled every time I hear someone talking about this 'dilemma'.

You just keep your simple interface, but allow the power users to, say, click through to a particular menu and change their setting – the setting in this case being ~"let me provide feedback / configure how recommendations work". For that kind of user, finding a 'cheat code' is actually a gratifying product experience anyway.


I think it's because the complexity of allowing configurability isn't always worth it. Verifying that it works for all configurations becomes exponentially harder.

I believe it can also have performance implications, especially for things like recommender systems where you depend a lot on caching, precomputation, and training.


I agree, but as aleksiy123 suggests there is an additional complexity burden, and it is a long journey to teach users to make use of a new technology. I think a lot of "advanced" features get de-prioritized because not many people use them and it seems like resources could be better spent helping the masses. But the importance of "advanced" features is often underrated by traditional engagement models. Wikipedia is a great example, where less than 1% of users click the edit button, but that 1% adds all the value for the other 99%.


I don't disagree!


I'm a little surprised that the author is missing the critical innovation of crypto, which is digital trust and observability. Sure, it is easy to argue that one cryptocurrency or another is a bubble, but don't underestimate the importance of being able to distribute work and verify trust at scale.

Just look at how git has transformed software development by mapping code to a hash. Or how DNS + SSL has transformed how people trust and transact online. Is the scalable future one where people and organizations trust their data to the cloud or other 3rd parties with no way to verify integrity?
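The git example can be made concrete: git's content addressing is just a hash over a typed, length-prefixed byte string, which anyone can recompute to verify integrity. A minimal sketch mirroring what `git hash-object` does for a blob:

```python
import hashlib

def git_blob_hash(data: bytes) -> str:
    # git hashes "blob <length>\0" followed by the raw content
    # (SHA-1 in git's default object format).
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()

print(git_blob_hash(b"hello\n"))
# ce013625030ba8dba906f756967f9e9ca394464a, same as `git hash-object`
```

Because the hash is a pure function of the content, two parties can compare repositories (or branches) without trusting each other or any intermediary.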

Do people think that the future of human agreements is signatures on little pieces of paper managed by courts and lawyers? Personally I think it will be digital. Given it is digital, do you think there will be some "centralized database" run by government or a commercial entity that can be trusted as single point of failure? Personally I sure hope that there is some way to distribute and verify data integrity even if it isn't full blown proof of work. I'd sure prefer something more like git where I can see if two branches are the same even from different sources.


You're exactly right.

Reading this article and the other HN comments, it makes me think our community is aging and becoming stuck in old ways of thinking.

It's surprising and disappointing to see it happen to this group, but I guess it's inevitable.


That sounds like regular old cryptography, which I don't think anyone would argue isn't important. It's a very different discussion than "crypto" a.k.a cryptocurrencies.


The author is making an argument against any blockchain or distributed ledger. To quote from the article: "Any application that could be done on a blockchain could be better done on a centralized database. Except crime."

I'd like to see people look past the noise of cryptocurrencies to see how important digital trust and new applications of cryptography will be as we try to scale the ability of humans to work together effectively at scale.


So, can you name one single successful application of blockchain in the real world? Apart from crime.

I believe this is exactly the point of the author: there's lots of handwavy bubblebabble about this technology, but we're now well into the second decade of blockchains/distributed ledgers and have yet to see a single non-criminal real-world application.


I'd argue that git is a distributed blockchain that has been pretty impactful; it just doesn't use proof of work for validation.

If it needs to be a company: Ripple uses a blockchain to help companies move currency safely around the globe (https://ripple.com/).


It's hard to believe $2.6T worth of crypto is all for criminal applications. Maybe you've overlooked something.


From just before the dot-com bubble burst to when it had fully deflated, the tech stocks had lost some $5T of stock market valuation - in 2002 dollars. So no, it's completely feasible that there is $2.6T of hot air speculative investment in cryptos. In fact, considering how divorced from practical and technical reality all of the proposed crypto schemes so far have been, that sounds like a low estimate.

There is some value in providing shadow banking to the global criminal underworld, but I don't think it will be the next technological revolution.


Which is dumb, because he works on a digital coin on a distributed ledger.


That's all true, but there is still the hard-to-deny fact that a federated trust-based system is more efficient. See TLS on the web: with all its flaws, it works remarkably well.


Thanks for this comment; it sums up exactly my experience with Python. Do you have a pointer to a larger write-up of these issues? I really enjoy Python, but agree with you that it isn't necessarily well suited for performance or scale. People really get attached to it, though, and I'd like a good reference to help explain why issues like interpreter performance, the GIL, and typing make a difference on large projects, even if not on their pet ones.


Great point. I'm not one to over-optimize, but it seems like parsing messages for internet-sized apps is worth a little effort to save energy and the environment.


Don't get distracted by the clickbait title. Effect size should be captured by statistical significance (larger effects are less likely to happen by chance). The author is really complaining that the original study didn't report enough data to check the analysis or run alternative analyses. A better title would be "Hard to peer review when you don't share the data".


Note the point in the essay that statistical significance is meaningless if the model doesn't correspond to reality, which, in this case as in many, it very much does not.


I see a lot of suspicion in thread below, which I very much understand.

I'd like to take a minute, though, to express my frustration with the banks that refuse to supply any sort of limited APIs. How is it 2021 and I still can't give my tax person read-only access to a specific year of transactions? The trust issues with Plaid and others would be so much easier if the banks offered any control over sharing besides "none" or "authorized to do anything".


Your bank would need to create APIs with fine-grained access to do the things you describe.

Go ahead and explain to a bank that has a STAGE COACH in its logo what an API is and why it needs one with fine-grained access.


+1 to the reality that most productivity tips come off as hopelessly naive to the reality of life with kids, sick parents/spouses, customer demands, etc.

Few things that have helped me:

- Have something like Google Calendar remind you so you don't forget anything important. Avoid turning small tasks into huge ones by missing a critical deadline.

- Change the game when it is stacked against you. Most productivity advice is about what you can do alone in your current situation (game) to improve, when changing situations could make things much better. I see too many people get stuck in a bad relationship, bad job, etc., trying to make it work. Find places where the tide is rising and lifting all the boats rather than working against you.

- Be kind and realistic to yourself. Look around: life is challenging and nobody wins everything. Coach yourself the way you would coach your kids, with kindness and empathy, setting them up for success rather than for something impossible.



They're good tips. I went on a productivity binge late last year, and all the books I read boil down to a few things.

1) Get as much information out of your head as possible. Store it somewhere you can find it later, but the specifics don't matter.

2) Segment your tasks into something you can reasonably accomplish in a given time. If you don't get it done, you either misunderstood the task or the timeslot; review and try again later.

3) Start. Just start. It doesn't even have to be something you had on your list. Just decide to do a thing, then do it.

Add analogies and anecdotes, bribe the NYT Best Seller list, and book yourself on the Today Show. Being a productivity writer is easy!

