Recent blog posts
Here's a noob-to-noob optimization trick. Suppose you have a database table with, say, 200,000 records, and you regularly select on multiple criteria. The rule for selection is to put the most specific WHERE clause first, and the least specific last. The goal is to cut down the search set to something small, and then search through the smaller set. Get all the queries using this order, then create a composite index over the keys to speed up the search even more.
Here are some before and after shots, based on real queries (from sf-active):
select * from tb where display='t' and parent_id=0 and id > 198000 limit 0,30
select * from tb where id > 198000 and parent_id=0 and display='t' limit 0,30
This revision will now cause the first clause to eliminate most of the rows from the table, leaving only around 2,000 rows to scan. The second clause, parent_id, eliminates 50% of the remainder. Display='t' is the least selective clause.
Also, it wasn't noted, but there are already indexes for `display` and `parent_id`. So we aren't starting with absolutely nothing.
select * from tb where display='t' and parent_id=0 limit 0,30
select * from tb where parent_id=0 and display='t' limit 0,30
Also do this:
alter table tb add index (parent_id, display)
That looks virtually identical. Again, this is a real-world situation, where the query was built-up dynamically. The optimization here is that I created an index that will speed up the select. The index matches the order of the query, so the query optimizer will be able to find the optimization easily.
Additionally, it would be a good thing to put all the clauses in all the queries into this order, from most specific to least specific, to gain the maximum optimization. I suspect the query optimizer already does this automatically, but, being meticulous about this seems like good mental discipline.
The real-world effect of this simple optimization, which took around two hours to complete, was dramatic. The slow query had been bogging down the server, with queries taking thousands of seconds to execute (or in our situation, to time-out, and require the admin to go in and kill the thread). Now, the query barely shows up in the process list, and the real-world speed feels like it takes less than five seconds to execute through the web (meaning, it includes dns lookup, tcp connection, and page rendering). Typically, it takes one second, and feels pretty fast.