I was running an SQL query today and it was sooooo slow. So slow, in fact, that it never returned. I asked the DBA, Reed, who built the table what might be up, and he informed me that it was not indexed. And proceeded to show me some cool stuff I could do to actually get my query to return. In the end, it was a comparison between the “not in” operator and a “left join”.
My original query was thus, names changed to protect the innocent:
1 2 3 4 5 6 7
I was trying to query the temp_legacy_attachments to get all rows that didn’t have a record in the new_attachments table. It never returned, and so Reed told me to give this one a try:
1 2 3 4 5 6 7 8
So, instead of using “not in” a set, I select all the legacy rows, then outer join to the new_attachment rows and filter where a column on the new attachment set is null (it’s the smaller/less-available set).
I thought it was pretty sweet. No magic bullet, though, as Reed tells me that there is a fair amount of debate over the performance difference between the two methods. You just have to try it and find out. For me, in this case, the outer join was more awesome.
Another savvy DBA, Bill, has graced us with another method yet:
1 2 3 4 5 6 7 8 9 10 11 12