Poorly written SQL statements can cause significant performance problems in your database environment. According to some experts, poorly written SQL can cause up to 70% of performance problems overall. Adding resources can mask many issues surrounding poorly written SQL, but comes with a cost. Is writing good, quality SQL (including block code—stored procedures, packages, functions, etc.) a dying art? And if it’s so important, why is that?
Why Focus on Tuning SQL Statements?
Most major RDBMS engines are licensed according to the number of processors (cores) they run on or have run on (in a virtualized environment), so the cost of throwing additional hardware at a performance issue is not just confined to the cost of the additional hardware. This hidden cost enhances the importance of tuning SQL so that your systems can do more with less.
Reasons We Don’t Focus on Tuning SQL Statements
Adding hardware is a common solution I’ve seen implemented by many companies. However, it’s not the only practice driving the lack of focus on tuning SQL. Others include these facts:
- Developers generally spend their time using algebraic logic. When asked to write SQL, it requires them to switch to a set-based logic (think Venn diagrams). This context change means having to think differently than they do during the vast majority of their coding time.
- When still in development, it’s most important to make sure the correct results are achieved (either from a query or a transaction) when running SQL against a database. Development environments are not usually scaled to the size of production systems, so tests don’t tend to reveal performance issues due to poorly written SQL early on. As a result, there’s seemingly no need to worry about it.
- The use of object relational mapping (ORM) tools has gained popularity. These tools don’t always produce optimally tuned SQL for our unique database environments.
- Developers (I’m guilty as well) like to reuse code. When done blindly, this can cause SQL statements to be tweaked just enough to achieve the necessary results without exploring implications to scalability or performance.
- It’s tough to write good SQL against poor data models. This can mean huge tables since inception, incorrect use of indexing, and suboptimal normalization for the application purpose, to name just a few issues.
- There is a lack of understanding about driving tables. This lack of understanding when doing joins across multiple tables as to which ones should be used as driving tables (joined earlier in the optimizer plan so that fewer rows are required to be evaluated downstream) is not uncommon.
- There is a lack of understanding about how to read execution plans. How can SQL be tuned if you don’t understand how it’s gone sideways? I’d include in this item the questionable assumption that the optimizer cost associated with a step in the plan is accurate.
- With a focus on productivity (which, in development, usually takes the form of lines of code), doing it right versus doing more is rarely rewarded.
Now It’s in Production!
As noted, there are many reasons for a lack of focus on tuning SQL in your database environments. However, you may now be experiencing a significant hit to performance in production. But how can that be? This could be an entire column on its own, but I’ll note just a few common causes for issues not being discovered until code hits production:
- Smaller datasets are used in pre-production environments.
- It’s very difficult to mimic production load in pre-production environments.
- There is a lack of understanding of true use cases in production.
- It’s expensive to work around the first two issues in this list.
The Benefits of Good SQL
Writing good SQL is at risk of dying in development and database environments. Many reasons contribute to the evaporation of this skill. Even database engine vendors aren’t incentivized to help correct poorly written SQL as it can actually help put money in their pockets through additional licensing costs when hardware is used to mitigate the impact. Often, SQL statement problems aren’t discovered until released into the wild in production. This can cause contention between those who write the code and those tasked with supporting/responding to production concerns. I’d like to see a revival in focusing on SQL tuning, which could lead to benefits such as decreased TCO for IT organizations, decreased tension between IT disciplines, and the joy of creating something to be proud of instead of just checking another item off of the to-do list.