Optimizing Microsoft SQL Server for Better Performance

To ensure user satisfaction, it’s crucial for applications and websites to operate seamlessly. In high-stakes environments, even a few milliseconds of delay can lead to significant issues. With databases constantly expanding, swift data retrieval and storage are paramount. Optimizing our database server for peak performance is non-negotiable.

This article provides a practical, step-by-step guide to the fundamentals of SQL Server performance tuning.

#1 Identifying Bottlenecks

It’s important to remember that Microsoft SQL Server (SQL Server or MSSQL) is a complex software system. If we encounter performance issues, we need to diagnose the root cause. Let’s delve into how to enhance SQL Server’s performance.


Our goal is to ensure rapid and accurate data exchange with SQL Server. Should any issues arise, there are two primary areas to investigate:

  • The hardware and installation configuration, as SQL Server has specific requirements that need to be met.
  • The correctness and efficiency of the provided T-SQL code.

Although SQL Server is proprietary software, Microsoft offers various tools and resources to help us understand and optimize it.

Assuming the hardware and installation are sound but performance remains subpar, we need to investigate potential software-related issues. Examining the behavior of different threads through wait statistics is crucial. SQL Server uses a thread for every user request; these threads are essentially mini-programs within the larger SQL Server process. Note that they are not operating system threads: they are SQLOS threads, managed by SQLOS, SQL Server’s internal pseudo-operating system.

We can calculate wait statistics using the sys.dm_os_wait_stats Dynamic Management View (DMV). This view offers insights into the current state of SQL Server. Numerous scripts are available online for querying this view, but a particularly user-friendly and comprehensive one is Paul Randal’s script:

WITH [Waits] AS
(SELECT
[wait_type],
[wait_time_ms] / 1000.0 AS [WaitS],
([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 AS [ResourceS],
[signal_wait_time_ms] / 1000.0 AS [SignalS],
[waiting_tasks_count] AS [WaitCount],
100.0 * [wait_time_ms] / SUM ([wait_time_ms]) OVER() AS [Percentage],
ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum]
FROM sys.dm_os_wait_stats
WHERE [wait_type] NOT IN (
N'BROKER_EVENTHANDLER', N'BROKER_RECEIVE_WAITFOR',
N'BROKER_TASK_STOP', N'BROKER_TO_FLUSH',
N'BROKER_TRANSMITTER', N'CHECKPOINT_QUEUE',
N'CHKPT', N'CLR_AUTO_EVENT',
N'CLR_MANUAL_EVENT', N'CLR_SEMAPHORE',
N'DBMIRROR_DBM_EVENT', N'DBMIRROR_EVENTS_QUEUE',
N'DBMIRROR_WORKER_QUEUE', N'DBMIRRORING_CMD',
N'DIRTY_PAGE_POLL', N'DISPATCHER_QUEUE_SEMAPHORE',
N'EXECSYNC', N'FSAGENT',
N'FT_IFTS_SCHEDULER_IDLE_WAIT', N'FT_IFTSHC_MUTEX',
N'HADR_CLUSAPI_CALL', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION',
N'HADR_LOGCAPTURE_WAIT', N'HADR_NOTIFICATION_DEQUEUE',
N'HADR_TIMER_TASK', N'HADR_WORK_QUEUE',
N'KSOURCE_WAKEUP', N'LAZYWRITER_SLEEP',
N'LOGMGR_QUEUE', N'ONDEMAND_TASK_QUEUE',
N'PWAIT_ALL_COMPONENTS_INITIALIZED',
N'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP',
N'QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP',
N'REQUEST_FOR_DEADLOCK_SEARCH', N'RESOURCE_QUEUE',
N'SERVER_IDLE_CHECK', N'SLEEP_BPOOL_FLUSH',
N'SLEEP_DBSTARTUP', N'SLEEP_DCOMSTARTUP',
N'SLEEP_MASTERDBREADY', N'SLEEP_MASTERMDREADY',
N'SLEEP_MASTERUPGRADED', N'SLEEP_MSDBSTARTUP',
N'SLEEP_SYSTEMTASK', N'SLEEP_TASK',
N'SLEEP_TEMPDBSTARTUP', N'SNI_HTTP_ACCEPT',
N'SP_SERVER_DIAGNOSTICS_SLEEP', N'SQLTRACE_BUFFER_FLUSH',
N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
N'SQLTRACE_WAIT_ENTRIES', N'WAIT_FOR_RESULTS',
N'WAITFOR', N'WAITFOR_TASKSHUTDOWN',
N'WAIT_XTP_HOST_WAIT', N'WAIT_XTP_OFFLINE_CKPT_NEW_LOG',
N'WAIT_XTP_CKPT_CLOSE', N'XE_DISPATCHER_JOIN',
N'XE_DISPATCHER_WAIT', N'XE_TIMER_EVENT')
AND [waiting_tasks_count] > 0
)
SELECT
MAX ([W1].[wait_type]) AS [WaitType],
CAST (MAX ([W1].[WaitS]) AS DECIMAL (16,2)) AS [Wait_S],
CAST (MAX ([W1].[ResourceS]) AS DECIMAL (16,2)) AS [Resource_S],
CAST (MAX ([W1].[SignalS]) AS DECIMAL (16,2)) AS [Signal_S],
MAX ([W1].[WaitCount]) AS [WaitCount],
CAST (MAX ([W1].[Percentage]) AS DECIMAL (5,2)) AS [Percentage],
CAST ((MAX ([W1].[WaitS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgWait_S],
CAST ((MAX ([W1].[ResourceS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgRes_S],
CAST ((MAX ([W1].[SignalS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgSig_S]
FROM [Waits] AS [W1]
INNER JOIN [Waits] AS [W2]
ON [W2].[RowNum] <= [W1].[RowNum]
GROUP BY [W1].[RowNum]
HAVING SUM ([W2].[Percentage]) - MAX ([W1].[Percentage]) < 95; -- percentage threshold
GO

When running this script, prioritize the top rows in the results as they represent the most prevalent wait types.

Understanding wait types is key to effective decision-making. A valuable resource for learning about the different wait types is the Microsoft documentation.

Consider a scenario with excessive PAGEIOLATCH_XX waits. This indicates that threads are stalled waiting for data pages to be read from disk into the buffer pool (SQL Server’s in-memory cache of data pages). It’s crucial to investigate the underlying cause thoroughly. This doesn’t automatically signify a deficient I/O subsystem or insufficient memory. While increasing either might provide a temporary fix, a long-term solution requires understanding why so much data is being read from disk. Ask yourself: Which SQL commands are causing this? Are we retrieving excessive data? Could we reduce data reads using filters such as WHERE clauses? Are table scans or index scans causing excessive reads? Could they be converted into more efficient index seeks by implementing or refining indexes? Are our SQL queries being misinterpreted by the query optimizer (another internal SQL Server component)?
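
For instance, if heavy PAGEIOLATCH_XX waits trace back to a statement that scans a large table, a suitable index is often the durable fix rather than more memory or faster disks. The sketch below is purely illustrative; the dbo.Orders table, its columns, and the index name are hypothetical:

-- Hypothetical example: without an index on CustomerID, this query scans the whole table,
-- pulling far more pages from disk into the buffer pool than necessary.
SELECT OrderID, OrderDate, TotalDue
FROM dbo.Orders
WHERE CustomerID = 42;

-- A covering nonclustered index lets the optimizer use an index seek instead,
-- so only the matching rows' pages need to be read.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalDue);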

Approach the problem from various angles and employ different test cases to devise appropriate solutions. Each wait type demands a specific remedy. Thorough research by the database administrator is vital before implementing any changes. However, in many cases, identifying and optimizing problematic T-SQL queries can resolve 60 to 70 percent of performance bottlenecks.

#2 Pinpointing Problematic Queries

As previously mentioned, start by identifying problematic queries. This T-SQL code snippet will pinpoint the 20 most poorly performing queries:

SELECT TOP 20
    total_worker_time / execution_count AS Avg_CPU_Time
    ,execution_count
    ,total_elapsed_time / execution_count AS Avg_Run_Time
    ,total_elapsed_time
    ,(SELECT SUBSTRING(text,
            statement_start_offset / 2 + 1,
            -- the offsets are byte offsets into the Unicode batch text; convert them to a character length
            (CASE statement_end_offset
                 WHEN -1 THEN DATALENGTH(text)
                 ELSE statement_end_offset
             END - statement_start_offset) / 2 + 1)
      FROM sys.dm_exec_sql_text(sql_handle)
     ) AS Query_Text
FROM sys.dm_exec_query_stats
ORDER BY Avg_CPU_Time DESC;

It’s important to analyze the results carefully. A query with a very high average run time that executes only once has far less overall impact on the server than a query with a moderate average run time that runs frequently throughout the day, so weigh average cost against execution count.
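
To rank queries by cumulative cost rather than per-execution cost, sort the same DMV by its total columns instead. This is a simple variation, not the only reasonable ordering:

SELECT TOP 20
    total_worker_time,            -- cumulative CPU time across all executions
    total_elapsed_time,
    execution_count,
    total_worker_time / execution_count AS avg_cpu_time
FROM sys.dm_exec_query_stats
ORDER BY total_worker_time DESC;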

#3 Query Optimization

Fine-tuning T-SQL queries is crucial. The essence lies in writing efficient T-SQL code and strategically implementing indexes to assist the SQL optimizer in generating the most efficient execution plan. With each SQL Server release, the optimizer becomes more advanced, compensating for suboptimal queries and addressing previous bugs. However, regardless of the optimizer’s intelligence, if our T-SQL code doesn’t clearly convey our intent, it can’t perform optimally.

SQL Server employs sophisticated search and sorting algorithms. Understanding these algorithms can provide insights into SQL Server’s behavior. For a deeper dive into these algorithms, refer to The Art of Computer Programming by Donald Knuth.

When examining queries for optimization, utilize the execution plan. It provides a visual representation of how SQL Server interprets and executes the query.

A comprehensive explanation of execution plans is beyond the scope of this article. However, here are some key points to consider:

  • Identify operators with the highest query cost.
  • Determine why a particular operator is consuming a significant cost. In most cases, scans are more expensive than seeks. Investigate why a scan (table scan or index scan) is occurring instead of an index seek. Implementing appropriate indexes on table columns often resolves this, but like any complex system, there are exceptions. For instance, scans can outperform seeks on smaller tables.
  • Familiarize yourself with the approximately 78 operators that represent actions and decisions within the SQL Server execution plan. Refer to the Microsoft documentation for a deeper understanding, enabling you to take appropriate optimization measures.
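
To capture a plan for the query you are tuning, SQL Server exposes T-SQL switches in addition to the graphical plan options in Management Studio. The sketch below reuses the hypothetical dbo.Orders table from the earlier example:

-- Estimated plan only: the statement is compiled but not executed.
-- SET SHOWPLAN_XML must be the only statement in its batch, hence the GO separators.
SET SHOWPLAN_XML ON;
GO
SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = 42;
GO
SET SHOWPLAN_XML OFF;
GO

-- Actual plan plus run-time statistics: the statement is executed.
SET STATISTICS XML ON;
GO
SELECT OrderID, OrderDate FROM dbo.Orders WHERE CustomerID = 42;
GO
SET STATISTICS XML OFF;
GO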

#4 Execution Plan Reuse

Even with well-defined indexes and efficient T-SQL code, performance can suffer if execution plans aren’t reused. After optimizing queries, ensure the execution plan is reused whenever possible. Reusing plans saves valuable CPU cycles that would otherwise be spent recalculating them.

This query helps determine the reuse frequency of execution plans, with usecounts indicating the number of reuses:

SELECT [ecp].[refcounts]
, [ecp].[usecounts]
, [ecp].[objtype]
, DB_NAME([est].[dbid]) AS [db_name]
, [est].[objectid]
, [est].[text] AS [query_text]
, [eqp].[query_plan]
FROM sys.dm_exec_cached_plans ecp
CROSS APPLY sys.dm_exec_sql_text ( ecp.plan_handle ) est
CROSS APPLY sys.dm_exec_query_plan ( ecp.plan_handle ) eqp

Parameterized stored procedures are the most effective way to encourage execution plan reuse. If stored procedures aren’t feasible, consider sp_executesql, which executes T-SQL statements with parameterized values, increasing the likelihood of plan reuse.
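
A minimal sketch of sp_executesql with a parameterized filter (the dbo.Orders table is again hypothetical); because the statement text stays constant and only the parameter value changes between calls, repeated executions can reuse the same cached plan:

DECLARE @sql nvarchar(max) =
    N'SELECT OrderID, OrderDate
      FROM dbo.Orders
      WHERE CustomerID = @CustomerID;';

EXEC sys.sp_executesql
     @sql,
     N'@CustomerID int',       -- parameter definition
     @CustomerID = 42;         -- parameter value; only this changes between calls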

Remember, there are instances where recompiling the plan is preferable. SQL Server, like any complex program, doesn’t have one-size-fits-all solutions.

Consider these two queries:

  • select name from table where name = 'sri';
  • select name from table where name = 'pal';

Assuming a non-clustered index on the name column, with half the table containing ‘sri’ and only a handful of rows containing ‘pal’, the optimal approach varies. For the first query, a table scan is more efficient because half the table shares the same value. For the second query, an index seek is preferable because only a few rows contain ‘pal’.

This demonstrates that even similar queries might benefit from different execution plans. Careful analysis is crucial before making decisions. Use the RECOMPILE option (WITH RECOMPILE on the procedure, or OPTION (RECOMPILE) on an individual statement) when you want to prevent plan reuse, as sketched below.
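
Both forms are shown here on a hypothetical dbo.People table matching the example above:

-- Recompile the entire procedure on every call:
CREATE OR ALTER PROCEDURE dbo.GetPeopleByName
    @name varchar(50)
WITH RECOMPILE
AS
BEGIN
    SELECT name FROM dbo.People WHERE name = @name;
END;
GO

-- Or recompile only the statement whose best plan depends on the parameter value:
CREATE OR ALTER PROCEDURE dbo.GetPeopleByName
    @name varchar(50)
AS
BEGIN
    SELECT name FROM dbo.People WHERE name = @name
    OPTION (RECOMPILE);
END;
GO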

Even when using stored procedures or sp_executesql, plan reuse isn’t guaranteed. Situations where plans aren’t reused include:

  • Modifications to indexes used by the query (changes or deletions)
  • Changes to the statistics, structure, or schema of tables accessed by the query
  • Explicit use of the “recompile” option
  • A high volume of data modifications (insertions, updates, or deletions)
  • Mixing DDL (Data Definition Language) and DML (Data Manipulation Language) statements within a single query

#5 Eliminating Redundant Indexes

Once queries are optimized, review index usage. Index maintenance demands significant CPU and I/O resources: every insert, update, and delete must also maintain the indexes on the affected table, so removing indexes that are never used for reads is beneficial.


SQL Server provides the dm_db_index_usage_stats DMV for analyzing index statistics. This T-SQL code snippet retrieves index usage statistics:

SELECT
    OBJECT_NAME(ius.[object_id]) AS [OBJECT NAME],
    DB_NAME(ius.database_id) AS [DATABASE NAME],
    i.[name] AS [INDEX NAME],
    ius.user_seeks,
    ius.user_scans,
    ius.user_lookups,
    ius.user_updates
FROM sys.dm_db_index_usage_stats AS ius
INNER JOIN sys.indexes AS i
    ON i.[object_id] = ius.[object_id]
    AND i.index_id = ius.index_id

Identify indexes with minimal or no usage and consider removing them to enhance performance.
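
When an index shows no seeks, scans, or lookups but keeps accruing user_updates over a representative period (the counters reset when SQL Server restarts), dropping it is usually safe. A hypothetical example:

-- Hypothetical: IX_Orders_LegacyStatus shows zero reads but many updates in the DMV output.
DROP INDEX IX_Orders_LegacyStatus ON dbo.Orders;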

#6 SQL Server Installation and Database Configuration

During database setup, store data and log files on separate drives. Data file access is largely random, while log file access is sequential; placing both on the same drive means the storage cannot be optimized for either pattern.
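
As a sketch, file placement is set when the database is created (or later with ALTER DATABASE ... MODIFY FILE). The database name, drive letters, and paths below are illustrative only:

CREATE DATABASE SalesDB
ON PRIMARY
(
    NAME = N'SalesDB_data',
    FILENAME = N'D:\SQLData\SalesDB.mdf'       -- data file on its own drive
)
LOG ON
(
    NAME = N'SalesDB_log',
    FILENAME = N'L:\SQLLogs\SalesDB_log.ldf'   -- log file on a separate drive
);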

Consult your hardware and networking teams regarding Storage Area Network (SAN) setup. Vendor recommendations are helpful but often insufficient. Collaborate to determine the optimal separate storage strategy for data and log files.

#7 Avoid SQL Server Overload

Ensuring a smooth-running production server that effectively serves customers is paramount. Maintaining separate databases (ideally on separate machines) for different environments is crucial:

  • Production
  • Development
  • Testing
  • Analytical

A production database requires the full recovery model, while the simple recovery model usually suffices for the other environments.
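
Switching the recovery model is a one-line change per database; the database names below are illustrative:

ALTER DATABASE SalesDB     SET RECOVERY FULL;    -- production
ALTER DATABASE SalesDB_Dev SET RECOVERY SIMPLE;  -- development / testing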

Testing on a production database strains the transaction log, indexes, CPU, and I/O. Separate databases for each environment mitigate this. Separate machines further reduce CPU and I/O load.

#8 Transaction Log, tempdb, and Memory

Allocate sufficient free space for the log file to prevent autogrow operations, which can be time-consuming and block other operations. Use DBCC SQLPERF(logspace) to monitor log file size and usage for each database.
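
For example, to check log usage and then pre-size a log file so it does not autogrow during busy periods (the database and logical file names are hypothetical):

DBCC SQLPERF(LOGSPACE);   -- log size and percent used for every database

ALTER DATABASE SalesDB
MODIFY FILE (NAME = N'SalesDB_log', SIZE = 8GB, FILEGROWTH = 512MB);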

For optimal tempdb performance, place it on a separate disk. Give it a sufficiently large initial size to minimize autogrow events, which can hinder performance.
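
The same pre-sizing idea applies to tempdb; tempdev and templog are the default logical file names, and the sizes below are placeholders to adjust for your workload:

ALTER DATABASE tempdb
MODIFY FILE (NAME = N'tempdev', SIZE = 4GB, FILEGROWTH = 512MB);  -- data file

ALTER DATABASE tempdb
MODIFY FILE (NAME = N'templog', SIZE = 1GB, FILEGROWTH = 256MB);  -- log file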

As mentioned earlier, dedicate a separate machine to SQL Server, ideally without other applications competing for resources. Leave sufficient memory for the operating system, with an additional allowance in clustered environments (around 2 GB is often enough).
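
Leaving memory to the operating system is usually done by capping SQL Server’s max server memory setting; the figure below assumes a machine with 32 GB of RAM and is only an example:

EXEC sys.sp_configure N'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure N'max server memory (MB)', 26624;  -- example: leave roughly 6 GB for the OS
RECONFIGURE;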


SQL Server Performance Tuning: A Wise Investment

This article provides a high-level overview of basic SQL Server performance tuning. Following these steps typically yields a 40 to 50 percent performance improvement. Advanced tuning requires a more in-depth exploration of each concept discussed.

Licensed under CC BY-NC-SA 4.0