Today I got a request to resolve the slow performance of a SQL Server. As per the application team the performance was drastically down for the .Net application which was running smoothly one week back but now everything they were running was way too slow.
On inquiring more about any changes that they might have made at their end got to know that they had an Oracle server residing in East Asia location, which was queried using the Linked server, and about a week back they had migrated the server to US location. (there were some more changes made to the server but nothing that could have hammered down the performance by that scale)
Used the below query to find the wait stats information of the instance, GItHub Link:
WITH [Waits] AS (SELECT wait_type, wait_time_ms/ 1000.0 AS [WaitS], (wait_time_ms - signal_wait_time_ms) / 1000.0 AS [ResourceS], signal_wait_time_ms / 1000.0 AS [SignalS], waiting_tasks_count AS [WaitCount], 100.0 * wait_time_ms / SUM (wait_time_ms) OVER() AS [Percentage], ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS [RowNum] FROM sys.dm_os_wait_stats WITH (NOLOCK) WHERE [wait_type] NOT IN ( N'BROKER_EVENTHANDLER', N'BROKER_RECEIVE_WAITFOR', N'BROKER_TASK_STOP', N'BROKER_TO_FLUSH', N'BROKER_TRANSMITTER', N'CHECKPOINT_QUEUE', N'CHKPT', N'CLR_AUTO_EVENT', N'CLR_MANUAL_EVENT', N'CLR_SEMAPHORE', N'CXCONSUMER', N'DBMIRROR_DBM_EVENT', N'DBMIRROR_EVENTS_QUEUE', N'DBMIRROR_WORKER_QUEUE', N'DBMIRRORING_CMD', N'DIRTY_PAGE_POLL', N'DISPATCHER_QUEUE_SEMAPHORE', N'EXECSYNC', N'FSAGENT', N'FT_IFTS_SCHEDULER_IDLE_WAIT', N'FT_IFTSHC_MUTEX', N'HADR_CLUSAPI_CALL', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION', N'HADR_LOGCAPTURE_WAIT', N'HADR_NOTIFICATION_DEQUEUE', N'HADR_TIMER_TASK', N'HADR_WORK_QUEUE', N'KSOURCE_WAKEUP', N'LAZYWRITER_SLEEP', N'LOGMGR_QUEUE', N'MEMORY_ALLOCATION_EXT', N'ONDEMAND_TASK_QUEUE', N'PARALLEL_REDO_DRAIN_WORKER', N'PARALLEL_REDO_LOG_CACHE', N'PARALLEL_REDO_TRAN_LIST', N'PARALLEL_REDO_WORKER_SYNC', N'PARALLEL_REDO_WORKER_WAIT_WORK', N'PREEMPTIVE_HADR_LEASE_MECHANISM', N'PREEMPTIVE_SP_SERVER_DIAGNOSTICS', N'PREEMPTIVE_OS_LIBRARYOPS', N'PREEMPTIVE_OS_COMOPS', N'PREEMPTIVE_OS_CRYPTOPS', N'PREEMPTIVE_OS_PIPEOPS', N'PREEMPTIVE_OS_AUTHENTICATIONOPS', N'PREEMPTIVE_OS_GENERICOPS', N'PREEMPTIVE_OS_VERIFYTRUST', N'PREEMPTIVE_OS_FILEOPS', N'PREEMPTIVE_OS_DEVICEOPS', N'PREEMPTIVE_OS_QUERYREGISTRY', N'PREEMPTIVE_OS_WRITEFILE', N'PREEMPTIVE_XE_CALLBACKEXECUTE', N'PREEMPTIVE_XE_DISPATCHER', N'PREEMPTIVE_XE_GETTARGETSTATE', N'PREEMPTIVE_XE_SESSIONCOMMIT', N'PREEMPTIVE_XE_TARGETINIT', N'PREEMPTIVE_XE_TARGETFINALIZE', N'PWAIT_ALL_COMPONENTS_INITIALIZED', N'PWAIT_DIRECTLOGCONSUMER_GETNEXT', N'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP', N'QDS_ASYNC_QUEUE', N'QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP', N'REQUEST_FOR_DEADLOCK_SEARCH', N'RESOURCE_QUEUE', N'SERVER_IDLE_CHECK', N'SLEEP_BPOOL_FLUSH', N'SLEEP_DBSTARTUP', N'SLEEP_DCOMSTARTUP', N'SLEEP_MASTERDBREADY', N'SLEEP_MASTERMDREADY', N'SLEEP_MASTERUPGRADED', N'SLEEP_MSDBSTARTUP', N'SLEEP_SYSTEMTASK', N'SLEEP_TASK', N'SLEEP_TEMPDBSTARTUP', N'SNI_HTTP_ACCEPT', N'SP_SERVER_DIAGNOSTICS_SLEEP', N'SQLTRACE_BUFFER_FLUSH', N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP', N'SQLTRACE_WAIT_ENTRIES', N'WAIT_FOR_RESULTS', N'WAITFOR', N'WAITFOR_TASKSHUTDOWN', N'WAIT_XTP_HOST_WAIT', N'WAIT_XTP_OFFLINE_CKPT_NEW_LOG', N'WAIT_XTP_CKPT_CLOSE', N'WAIT_XTP_RECOVERY', N'XE_BUFFERMGR_ALLPROCESSED_EVENT', N'XE_DISPATCHER_JOIN', N'XE_DISPATCHER_WAIT', N'XE_LIVE_TARGET_TVF', N'XE_TIMER_EVENT') AND waiting_tasks_count > 0) SELECT MAX (W1.wait_type) AS [WaitType], CAST (MAX (W1.Percentage) AS DECIMAL (5,2)) AS [Wait Percentage], CAST ((MAX (W1.WaitS) / MAX (W1.WaitCount)) AS DECIMAL (16,4)) AS [AvgWait_Sec], CAST ((MAX (W1.ResourceS) / MAX (W1.WaitCount)) AS DECIMAL (16,4)) AS [AvgRes_Sec], CAST ((MAX (W1.SignalS) / MAX (W1.WaitCount)) AS DECIMAL (16,4)) AS [AvgSig_Sec], CAST (MAX (W1.WaitS) AS DECIMAL (16,2)) AS [Wait_Sec], CAST (MAX (W1.ResourceS) AS DECIMAL (16,2)) AS [Resource_Sec], CAST (MAX (W1.SignalS) AS DECIMAL (16,2)) AS [Signal_Sec], MAX (W1.WaitCount) AS [Wait Count], CAST (N'https://www.sqlskills.com/help/waits/' + W1.wait_type AS XML) AS [Help/Info URL] FROM Waits AS W1 INNER JOIN Waits AS W2 ON W2.RowNum <= W1.RowNum GROUP BY W1.RowNum, W1.wait_type HAVING SUM (W2.Percentage) - MAX (W1.Percentage) < 99 -- percentage threshold OPTION (RECOMPILE);
The server was recently rebooted and the main waits for the server were PREEMPTIVE_OLEDBOPS and OLEDB.
The design of the application hitting the databases was such that it was querying on Linked server every minute. A Query like:
SELECT col1, col2, col3 FROM LinkedServer.Database.Schema.Table WHERE col1 > 100
There could be many reasons for the wait type and one is query running a Select on a very big table from a Linked Server and that too with a WHERE clause. When executing such query SQL Server first fetch all the data from the target and then do the filtration, and as you can imagine when are querying big tables having a million of records of which the need is for 10 rows, SQL Server don’t have any other option but to first get all the million records locally and then apply the filter.
To resolve it we changed the query to use OPENQUERY function and the performance improved drastically from taking 1-2 hours to 1 minute.