Years ago, when I first read one of these benchmarks, I was surprised to see that SQL Server was running with undocumented startup switches and trace flags. Nowadays, Microsoft has made amazing progress improving its documentation for SQL Server 2008 R2. If you haven’t ever used trace flags and startup switches, you can learn a lot from these benchmark disclosures. For example, in a recent TPC-H benchmark published by HP running on SQL Server 2008 R2, you can see all of the parameter settings used in the course of the benchmark.
Here’s what that most recent benchmark used to start SQL Server:
SQLSERVR -c -x -E -T834 -T836 -T2301
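(As an aside: if you want to see which trace flags are active on a running instance, DBCC TRACESTATUS will list them. A minimal example:

-- List all trace flags currently enabled globally on the instance.
-- The -1 argument asks for global flags, not just session-level ones.
DBCC TRACESTATUS(-1);

That’s a handy sanity check before and after you change anything.)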
Here’s what the case-sensitive switches and trace flags mean:
-c
Shortens startup time when starting SQL Server from the command prompt, by telling SQL Server to skip the step used to start SQL Server as a Microsoft Windows service.
-x
Disables most of the important built-in monitoring features, such as PerfMon counters, CPU and cache-hit ratio statistics, and a wide variety of internal system metadata. This option forces you to lose a boatload of troubleshooting and performance tuning information.
-E
Increases the number of extents that are allocated for each file in a filegroup. Good for data warehouses with relatively few users running data scans, but probably bad for OLTP applications.
-T834
Tells SQL Server to use Microsoft Windows large-page allocations for memory allocated into the buffer pool (there’s a quick way to verify it took effect; see the error-log check after this list).
-T836
Forces SQL Server to size the buffer pool at startup based on the value of the max server memory option instead of based on the total physical memory.
-T2301
Enables advanced optimizations specific to data warehouse queries for processing very large data sets.
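Back to trace flag 834 for a moment: it only does anything if the prerequisites are in place. The SQL Server service account needs the Lock Pages in Memory privilege, and SQL Server reports at startup whether it actually obtained large pages. Here’s a minimal sketch for checking, using the undocumented (but widely used) sp_readerrorlog procedure:

-- Search the current error log (log 0, type 1 = SQL Server)
-- for the large-page messages written at startup under -T834.
EXEC sp_readerrorlog 0, 1, 'large page';

If -T834 took effect, you should see startup messages about large-page allocations; if not, the buffer pool fell back to conventional pages.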
These configuration settings are highly specialized for the benchmark workload. DBAs running general-purpose workloads, OLTP workloads, or instances of SQL Server supporting a variety of databases with different kinds of workloads should NOT use these settings.
In a sense, the benchmark disclosures imply one of the fundamental lessons that most enterprise DBAs know by heart: you should know the needs of your database workload intimately. When you do, you can immediately tell whether a given configuration tweak, such as -T834, can help your application or not.
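And if you want to experiment, many trace flags can be enabled for just your own session rather than for the whole instance, which gives you a lower-risk way to measure a flag against your actual workload. A minimal sketch (note that startup-only flags like -T834 and -T836 can’t be tested this way, but -T2301 can):

-- Enable trace flag 2301 for the current session only.
DBCC TRACEON (2301);

-- ... run a representative query or two and capture timings ...

-- Turn the flag back off for this session when done.
DBCC TRACEOFF (2301);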
If, on the other hand, you’re the sort of DBA who sees a performance tip and applies it without a full testing process, you’re doing it wrong! I can’t tell you how many times I’ve spoken at conferences and gotten a question that goes like this: “I read a blog post/magazine article/conference talk that recommended I enable XYZ. I gave it a try on one of my slow production servers. It didn’t make performance any better, or it made things worse. Why is that?” There’s so much wrong with this sort of question, but it’s indicative of a very prevalent mindset among enterprise DBAs:
The DBA didn’t have quantitative metrics of performance before and after the change. How can s/he ever really know whether performance is better or not? (See the baseline sketch after this list.)
The DBA saw a tip for a specific workload and didn’t first verify or test whether it would help with their own specific workload.
The DBA didn’t have a rollback or undo process to recover in case of a very bad result from the change.
The DBA is violating the first rule of DBA-ness: “You must protect the data as the main asset of the organization.” Changes should be managed and scheduled, not rolled out haphazardly.
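On that first point, capturing a quantitative baseline doesn’t have to be elaborate. Here’s a minimal sketch that snapshots the cumulative wait statistics before and after a change so you can compare the deltas; the table name dbo.WaitBaseline is my own placeholder:

-- Create a holding table for wait-stats snapshots (placeholder name).
IF OBJECT_ID('dbo.WaitBaseline') IS NULL
    CREATE TABLE dbo.WaitBaseline (
        capture_time        DATETIME     NOT NULL,
        wait_type           NVARCHAR(60) NOT NULL,
        wait_time_ms        BIGINT       NOT NULL,
        waiting_tasks_count BIGINT       NOT NULL
    );

-- Run once before the change and once after, then compare
-- the deltas between the two capture times.
INSERT dbo.WaitBaseline (capture_time, wait_type, wait_time_ms, waiting_tasks_count)
SELECT GETDATE(), wait_type, wait_time_ms, waiting_tasks_count
FROM sys.dm_os_wait_stats;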
Having said all of that, if you are the sort of DBA who rigorously monitors and controls change within your databases, you can take a look at the TPC benchmark disclosures to see whether there’s a tip or trick you haven’t tested yet that might help performance.
So, what about you? Have you ever enabled a trace flag or switch without enough testing? How did you resolve that situation? On the other hand, have you rolled out a trace flag or switch that really improved performance? How did you find out about the tweak? What was your process for rolling it out into production?
I’d love to hear back from you. Enjoy!