When you decide to undertake your own benchmarking project, it’s a best practice to write up a benchmarking plan first. For example, a development architect might propose that the main OLTP application could support much faster reporting, without putting further stress on the base tables, by using indexed views (SQL Server’s form of materialized views) – but doesn’t know for sure whether it will actually help. That sounds like a great use case for a benchmark!
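To make that hypothesis concrete, here’s a minimal sketch of the kind of indexed view the architect might propose. The Sales.OrderLines table and its columns are hypothetical stand-ins for the real OLTP schema.

```sql
-- Minimal sketch of the proposed indexed view; Sales.OrderLines and its columns
-- are hypothetical placeholders for the OLTP base tables.
CREATE VIEW Sales.vw_DailyOrderTotals
WITH SCHEMABINDING    -- required for an indexed view
AS
SELECT
    OrderDate,
    COUNT_BIG(*)   AS OrderLineCount,   -- COUNT_BIG(*) is required when the view uses GROUP BY
    SUM(LineTotal) AS DailySalesTotal   -- assumes LineTotal is declared NOT NULL (required for SUM here)
FROM Sales.OrderLines
GROUP BY OrderDate;
GO

-- The unique clustered index is what materializes the view and makes it "indexed."
CREATE UNIQUE CLUSTERED INDEX IX_vw_DailyOrderTotals
    ON Sales.vw_DailyOrderTotals (OrderDate);
GO
```

The benchmark’s job, then, is to measure whether reporting queries against the view are faster, and what the maintenance of that clustered index costs the write workload on the base table.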
But I can tell you from hard-won experience that many weeks of hard work have been ruined because the person or team performing the benchmark failed to prepare properly. Many of these ruinous mistakes were caused by forgetting the cardinal rule of benchmarks – a benchmark must produce results that are both reliable and repeatable, so that we can draw conclusions that are predictable and actionable. Keeping the “reliable and repeatable” mantra in mind necessitates a few extra steps.
Step 1: Isolate the Benchmark Hardware and OS
It seems like an obvious requirement to me – that the benchmark hardware and OS should be isolated from other production, development, or test environments. Sure, it sounds easy enough. But in today’s world of consolidated disk storage and consolidated servers, not to mention server virtual machines, this can be both expensive and genuinely hard to do. If possible, you want your benchmarking environment to share literally no components with other active systems. As you can guess, this need for isolation is one of the most expensive elements of a good benchmark. (The other big expense is the man-hours invested in the project.)
Imagine, for example, if you were benchmarking on a computer that was also a user’s workstation. How could you be certain that a specific performance anomaly wasn’t actually the user’s Outlook application syncing with the Exchange server? That sort of shared environment can completely invalidate a benchmark.
Other elements of isolation that are usually necessary are 1) separate servers to drive load against the benchmarked environment, and 2) separate servers to collect and store performance metrics. These may not be needed for very casual benchmarks. But for any serious benchmark, these two components of the test cause enough resource consumption to skew the results of the benchmark (sometimes very badly so) if they are not isolated.
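As a sketch of what that second component might look like on SQL Server, the following poll is assumed to run on the dedicated metrics-collection server and reach the benchmarked instance through a hypothetical linked server named [BENCH01]; the snapshot table is equally hypothetical.

```sql
-- Minimal sketch of a metrics-collection poll, assumed to run on the dedicated
-- monitoring server. Scheduling it there (for example, from a SQL Agent job)
-- keeps the collection overhead off the machine under test.
IF OBJECT_ID(N'dbo.CounterSnapshot') IS NULL
    CREATE TABLE dbo.CounterSnapshot
    (
        capture_time  datetime2     NOT NULL DEFAULT SYSDATETIME(),
        object_name   nvarchar(128) NOT NULL,
        counter_name  nvarchar(128) NOT NULL,
        instance_name nvarchar(128) NOT NULL,
        cntr_value    bigint        NOT NULL
    );

-- Pull a handful of counters from the benchmarked instance via the linked server.
INSERT INTO dbo.CounterSnapshot (object_name, counter_name, instance_name, cntr_value)
SELECT RTRIM(object_name), RTRIM(counter_name), RTRIM(instance_name), cntr_value
FROM [BENCH01].master.sys.dm_os_performance_counters
WHERE counter_name IN (N'Batch Requests/sec', N'Page life expectancy',
                       N'Transactions/sec', N'Number of Deadlocks/sec');
```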
Step 2: Capture All Documentation for the Environment
Again, it seems more than obvious to me that you’d want a full set of documentation for a given environment. But I’ve seen many teams attempt to analyze their benchmark results, only to discover that they have no idea what the Windows registry settings were on the server they were testing. Did the server have Lock Pages in Memory enabled? Was the NIC configured for Jumbo Frames?
In the same way, they may not be certain about the SQL Server instance-wide configuration settings, the tempdb configuration, the user database settings, or the exact transactions that were running during the benchmark. Be sure to capture all of the following information in a document:
- Windows configuration details, including the Windows Registry.
- Hardware configuration details, including memory, storage subsystem, CPU (including power settings and hyperthreading), NIC settings, and HBA settings.
- SQL Server configuration settings at the instance, database, and connection levels. Also be sure to catalog the sp_configure settings, as well as the database options (sp_dboption or sys.databases) for the user databases and the system databases. (See the configuration snapshot sketch just after this list.)
- The database schema, including index statistics and metadata.
- The SQL transactional workload, including an analysis to ensure that the SQL transactional workload properly stresses the right components of the system. This is an important part to validate ahead of time. I’ve encountered many teams who, upon analysis of their benchmarks, realized that they were running boatloads of SELECT statements for a benchmark of write speed. Doh!
- Full descriptions of all of the metrics and instrumentation used to monitor the benchmark.
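Much of the SQL Server-side information above can be captured with a short script. Here’s a minimal sketch that snapshots the instance-wide settings, the per-database options, and the tempdb file layout; where you store the output (a document, a table, source control) is up to you.

```sql
-- Minimal sketch: snapshot instance- and database-level settings before a benchmark run.

-- Instance-wide settings (the same information sp_configure reports, including advanced options)
SELECT name, value, value_in_use, description
FROM sys.configurations
ORDER BY name;

-- Per-database options (recovery model, auto-stats, snapshot isolation, and so on)
SELECT name, compatibility_level, recovery_model_desc,
       is_auto_create_stats_on, is_auto_update_stats_on,
       snapshot_isolation_state_desc, is_read_committed_snapshot_on
FROM sys.databases;

-- tempdb file layout, a frequent source of run-to-run drift
SELECT name, type_desc, size * 8 / 1024 AS size_mb, growth, is_percent_growth
FROM tempdb.sys.database_files;
```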
One other element that is often overlooked in a benchmarking document is a clearly defined goal statement. The goal statement tells why the benchmark is needed and the hypothesis that it is designed to test. (Note that you should only be testing one hypothesis at a time!) The benchmark goal statement, like a corporate mission statement, helps the team focus priorities and punt on activities that are useful, but don’t help accomplish the goal.
Again, the documentation should be evaluated against the benchmark goal statement to ensure that the two mesh properly. For example, your description of the metrics collected is worth double-checking. If you’re performing a benchmark to ensure that 1,000 users can happily work simultaneously, you should collect not only performance metrics but also information about locking, blocking, and deadlocking.
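As a minimal sketch of what that extra collection might look like, the queries below snapshot lock waits and the cumulative deadlock counter. Take one snapshot before the run and one after; the difference between the two is what the benchmark itself generated.

```sql
-- Lock-related waits accumulated since the last restart (or since the stats were cleared)
SELECT wait_type, waiting_tasks_count, wait_time_ms, max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE N'LCK_%'
ORDER BY wait_time_ms DESC;

-- Cumulative deadlock count across all databases
SELECT cntr_value AS deadlocks_since_restart
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Number of Deadlocks/sec'
  AND instance_name = N'_Total';
```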
Step 3: Create a Test Plan in Step-by-Step Detail
The test plan should essentially be a paper representation of everything that the physical test will encompass. It should walk through the test in step-by-step detail, describing:
- When and how the workload is invoked and scaled.
- When and how the performance metrics are collected and stored.
- How long the test should run and when the test is considered complete.
- How to respond to failures, and how to reset the environment for a new test run.
- Who the test lead is and who makes the call to abort a long-running test.
That last point is especially important. For example, let’s say a specific benchmark has been running for two hours when it appears to be having some problems. Who’s going to give the order to stop the test, recalibrate, and then restart it? And once that order has been given, how will you go about recalibrating everything in the test to ensure a consistent result set? Will you restore a VM as a “blank slate” for the test server, or perhaps just restore a specific database? Will you reboot all servers between tests, or cycle SQL Server, or perhaps just flush the caches, reset the performance counters, and then restart the test?
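Whichever approach you pick, script it so that every run starts from an identical state. Here’s a minimal sketch of the “restore and flush” variant; the database name and backup path are hypothetical placeholders.

```sql
-- Minimal sketch of a between-run reset: restore the test database to a known
-- baseline, flush the caches, and clear the cumulative wait statistics.
-- BenchmarkDB and the backup path are hypothetical.
USE master;
GO

ALTER DATABASE BenchmarkDB SET SINGLE_USER WITH ROLLBACK IMMEDIATE;  -- kick out any stragglers
RESTORE DATABASE BenchmarkDB
    FROM DISK = N'X:\Backups\BenchmarkDB_baseline.bak'
    WITH REPLACE;
ALTER DATABASE BenchmarkDB SET MULTI_USER;
GO

CHECKPOINT;                                       -- flush dirty pages so the next command clears the whole buffer pool
DBCC DROPCLEANBUFFERS;                            -- empty the buffer cache (cold-cache start)
DBCC FREEPROCCACHE;                               -- empty the plan cache
DBCC SQLPERF (N'sys.dm_os_wait_stats', CLEAR);    -- reset the cumulative wait statistics
GO
```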
In summary, you’re ready for a real benchmark after you’ve isolated your environment, documented the parameters of the benchmark, and written the test (and retest) plan. In the next couple of columns, I’ll tell you about free benchmarking tools that can help you generate a scalable workload on your database servers.