Hacker News

> Avoid ORDER BY; sort in the app instead.

If you have unlimited memory, network bandwidth isn't a problem and the additional garbage-collection won't kill your performance.

Otherwise, yeah, totally.



> If you have unlimited memory, network bandwidth isn't a problem and the additional garbage-collection won't kill your performance.

News flash: your database server doesn't have unlimited memory either.

As for network performance: if your result set is so large that you're worried about bandwidth, then the sorting work in the database is going to be very non-trivial as well. App servers scale out inexpensively; SQL Server does not.


It doesn't take more bandwidth to transfer unsorted data than sorted data, and the memory cost will presumably be about the same for sorting on the server as on the client. Filtering should always happen in the database for the sake of bandwidth and performance, but that reasoning doesn't apply to sorting the result set, since sorting doesn't change the amount of data.

(Unless the sorting is an intermediate step before further filtering, in which case the advice obviously does not apply.)
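A minimal sketch of the split being described: filter in SQL, sort in the app. This uses sqlite3 as a stand-in for a real RDBMS, and the table and column names are made up for illustration:

```python
import sqlite3

# In-memory toy database standing in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, total REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "east", 50.0), (2, "west", 120.0), (3, "east", 80.0)])

# Filtering stays in the database -- it shrinks the result set,
# so it saves bandwidth.
rows = conn.execute(
    "SELECT id, total FROM orders WHERE region = ?", ("east",)
).fetchall()

# Sorting moves to the app -- the same bytes cross the wire either way.
rows.sort(key=lambda r: r[1], reverse=True)
print(rows)  # [(3, 80.0), (1, 50.0)]
```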


Does this same problem occur with PostgreSQL?


No - the metric used in the article is licensing dollars, not a technical limitation.

> "SQL Server Enterprise Edition goes for about $7,000 per CPU core – not per processor, but per core. A two-socket, 6-core-each server rings up at around $84k – and that’s just the licensing costs, not the hardware costs. You can buy a heck of a lot of application servers (even ones with 256GB or more of memory) for $84k.

> Consume all of the query results as fast as possible into memory in your app, and then sort."

BUT - if you are using Postgres, you might be able to use BRIN indexes as a cheap index to speed up this kind of operation, without needing a full B-tree type index.
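BRIN indexes are Postgres-specific, so the DDL below is only a sketch (the `logs` table and `created_at` column are hypothetical), shown as strings rather than executed:

```python
# A BRIN index stores one summary entry (min/max) per block range instead
# of one entry per row, so it stays tiny on large, naturally-ordered tables
# such as append-only timestamp columns.
brin = "CREATE INDEX logs_created_brin ON logs USING brin (created_at);"

# The B-tree equivalent, for comparison: much larger on disk, but it can
# drive ORDER BY directly; BRIN only narrows down which blocks to scan.
btree = "CREATE INDEX logs_created_btree ON logs (created_at);"

for ddl in (brin, btree):
    print(ddl)
```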


Licensing cost is just a proxy for CPU time on the database, so in that regard the advice is still relevant to other RDBMSes. You could scale the database processors (vertically or horizontally), but usually it's easier to scale the app servers.

Anyway, the article's point is being taken a little too absolutely in this comment thread. He wasn't suggesting pulling in an entire multimillion-row table and sorting it to extract 50 rows - just that avoiding sorts in the database where it makes sense, and deferring to the app, can be a better solution. What those situations are could be an article in itself.


The page for ORDER BY on the Postgres website seems to indicate that it is efficient by design, at least for the common cases of ORDER BY ASC and DESC - but you have to use a B-tree index for this.

Further reading seems to indicate that if you have WHERE, ORDER BY or GROUP BY clauses, they all have to be covered by (the same?) B-tree index.

Exhaustive testing from a DBA on StackExchange seems to take a query from over 1200ms down to ~7ms for the case where you can do this.

see: http://dba.stackexchange.com/questions/105316/how-to-speed-u...
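The covering-index idea can be demonstrated with sqlite3 as a stand-in for Postgres (table, column, and index names are made up): one composite B-tree index serves both the WHERE and the ORDER BY, so the query plan shows an index search rather than a full scan plus a separate sort.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (user_id INTEGER, created_at TEXT, payload TEXT)")
# One composite B-tree index covering both the WHERE and the ORDER BY:
conn.execute(
    "CREATE INDEX idx_events_user_time ON events (user_id, created_at)")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM events WHERE user_id = ? ORDER BY created_at DESC",
    (42,),
).fetchall()
# The plan typically names the index and has no temp-sort step, since the
# engine can walk the index (in reverse, for DESC) to produce sorted output.
print(plan)
```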


That's true for single-table queries, but less so for multi-table joins. The RDBMS may need to sort your tables (even multiple times) in order to join them together, or the WHERE/GROUP/ORDER can be on combinations of fields and conditions.


I don't understand this. Isn't sorting an in-place procedure that does not need to allocate on the heap?


Depends on the sort. A naively implemented mergesort needs a temporary array of the same size as the array being sorted. But whilst that takes up a chunk of memory for a while, it shouldn't be the sort of thing that makes GC choke.

And anyway, not many people will be using a naively implemented mergesort in this situation!
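For concreteness, here is the textbook mergesort being referred to, with the O(n) temporary buffer that a naive implementation allocates at every merge step:

```python
def merge_sort(a):
    """Naive top-down mergesort: not in-place -- each call slices the
    input and builds a temporary `merged` list, which is the extra
    allocation (and GC pressure) discussed above."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged = []  # temporary buffer, same order of size as the input
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 9, 1]))  # [1, 2, 5, 9]
```

(Real library sorts, like Python's Timsort, are far more careful about how much temporary storage they use.)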


Sure, as long as you're okay with mutable data structures and the headaches they cause with threading.



