Tuesday, October 21, 2014

Consolidate your Reports with Dynamic Data Sources

I love writing SSRS reports that assist me in consuming data across my instances.  In some cases I set up subscriptions to my reports so I don't forget to review the data periodically.  But how many times have you created a really nice report and made a copy of it so you could have it query data from another instance?  I was modifying a report today and since I had three copies (one for three different instances), I set out to find a better solution.

If you write reports, you know that when you create a data source, it is tied to an instance of SQL Server.  If you want your report to be able to query more than one data source, you will want to create what is called a "dynamic data source".  Here is one way you can go about doing that.  Note that this solution is tailored around the fact that I have databases in each of my instances with the same schema to harvest performance related data, for example, but you might have a report that queries a DMV - that would be another use case.

Create Shared Dataset
The first thing I'm going to start with is a blank report.  I'm then going to create a Shared Dataset in my Visual Studio Solution called CMSInstances.  Creating a shared dataset allows you to use it in all the reports in your solution.  I prefixed it with CMS because I'm going to get a list of instances from my Central Management Server (CMS).  If you are not using a CMS, you should be!  But if that were the case you would simply specify the parameter values individually, keeping in mind that if you add/remove instances - you will have to edit every report!

Configure the shared dataset to connect to your CMS instance and paste the following query in the query window:

SELECT  g.[name] + ' - ' + s.[name] AS [SQLAlias]
       ,[server_name] AS [InstanceName]
FROM    [msdb].[dbo].[sysmanagement_shared_server_groups_internal] g
        INNER JOIN [msdb].[dbo].[sysmanagement_shared_registered_servers_internal] s ON g.[server_group_id] = s.[server_group_id]
WHERE   g.[server_type] = 0
        AND g.[is_system_object] = 0

Create your Data Source
Now that we have our shared dataset created, we can go to our report and add it as a data source.  In the Report Data window, right-click Data Sources and add a new one.  Call it "DynamicDataSource". Next, in the embedded connections section click the expression button.

Enter the following expression:
="Data Source=" & Parameters!InstanceName.Value

Create your Report's Dataset
Because we created the shared dataset prior, we can use that dataset over and over in any report we author.  To add that shared dataset to this report, simply right-click Datasets, give it a Name and select the shared dataset.

Create your Parameter
Now that you've created the reports dataset to populate your instance pull-down, we can create the Parameter.  Right-click on Parameters and Add Parameter.  For Name type in InstanceName and Prompt just type in Instance.  Next, click on Available Values and configure it as follows:

You're all done!
You should now be able to Preview the report and your Instance pull-down will be populated with all the instances configured in your CMS.  If you group your servers by environment, you can further refine the query as needed.  Any queries using this data source would obviously have to be drawing their data from similar databases and table names.

Thursday, July 17, 2014

Using Replication to keep your Admin Scripts Current

We all have handy scripts that we want present in our SQL Server database instances.  Examples would include sp_WhoIsActive, sp_Blitz, sp_BlitzCache, Ola's maintenance scripts or custom scripts like Estimated Time to Complete.  But how do you keep all your instances updated when new versions are released?  You certainly don't want to run the updated scripts on each instance one-by-one, especially if you manage quite a few instances.  One way to keep them in sync is by using a Central Management Server (CMS) and running the updated script on multiple instances at once.  However, what if there are some instances that you don't really want or need to deploy the scripts to?  That was my situation, so I opted to use SQL Server Replication to keep my scripts in sync.

I personally have an instance that I call my "UTILITY" instance.  I use this instance as my CMS.  It's also where I store all my monitoring databases like Spotlight, utility databases like ClearTrace and most importantly - my central DBADB.  I have a DBADB in every instance I administer.  It's a place to keep all your DBA related scripts and tables (maintenance logging tables, sp_BlitzCache output, harvested wait stat data, etc.) and you can have peace of mind that nobody else is going to go in there and mess with them.  I then created a publication on my DBADB, selected only the stored procedures that I want to deploy and set up subscriptions to all the target instances.  With just a few clicks of the mouse, I was able to deploy out the latest and greatest version of all my administrative scripts.

These are the ones I personally replicate out - if you have other useful scripts, feel free to share!

Wednesday, June 18, 2014

Reporting on sp_BlitzCache Results

Regularly examining the plan cache in your database instances is a must, as there is a wealth of useful information that helps you determine which statements and/or stored procedures are your biggest performance problems.  Doing such analysis has become much easier with the release of sp_BlitzCache by Brent Ozar Unlimited.  One of the great features of sp_BlitzCache is the fact that you can log the results to a table, allowing for more effective analysis of the results.  I've decided to take it one step further and put together an SSRS report to help dive into the data.  Get the Report RDL.

To harvest the data, I created a scheduled job in SQL Server which is set to run nightly at 7pm.  I keep all of my utility scripts and data in a database called DBADB.  Each instance has its own such database.

EXEC DBADB.dbo.sp_BlitzCache
@results = 'expert',
@output_database_name = 'DBADB',
        @output_schema_name = 'dbo',
@output_table_name = 'BlitzCache'

The report I wrote is based off expert mode results.  There is a plethora of data in the report so it does get fairly wide.  I don't display the values in QueryText or QueryPlan on the report, but in Figure 3 you'll note I provided the syntax to obtain the query plan and/or SQL text in the header.  Copy in the plan_handle or sql_handle, respectively.  Another option is to directly query your BlitzCache table for it.

There are four parameters on the report: Database Name, Sample Time, Query Type and Warnings.  Most column headers allow you to sort so you can zero in on a particular performance metric of concern.  

Figure 1 - Parameters

Figure 2 - Report Data - these are the first set of columns in the report:

Tuesday, May 13, 2014

SQL Server Spinlock Reference

I've been troubleshooting some performance issues of late, and it was brought to my attention that there is yet another DMV that may shed some light on the subject: sys.dm_os_spinlock_stats.

First, what is a spinlock?
Per Microsoft's whitepaper Diagnosing and Resolving Spinlock Contention on SQL Server: "Spinlocks are lightweight synchronization primitives which are used to protect access to data structures.  They are generally used when it is expected that access to a given data structure will need to be held for a very short period of time."

How does a spinlock differ than a latch?
Per the whitepaper: "The main difference between a latch and a spinlock is the fact that spinlocks will spin (execute a loop) for a period of time checking for availability of a data structure while a thread attempting to acquire access to a structure protected by a latch will immediately yield if the resource is not available. Yielding requires context switching of a thread off the CPU so that another thread can execute. This is a relatively expensive operation and for resources that are held for a very short duration it is more efficient overall to allow a thread to execute in a loop periodically checking for availability of the resource."

What condition might indicate spinlock contention?
Per the whitepaper:
1. A high number of spins and backoffs are observed for a particular spinlock type.


2. The system is experiencing heavy CPU utilization or spikes in CPU consumption. In heavy CPU scenarios one may also observe high signal waits on SOS_SCHEDULER_YEILD (reported by the DMV sys.dm_os_wait_stats).


3. The system is experiencing very high concurrency.


4. The CPU usage and spins are increased disproportionate to throughput.
Options and workarounds for handling spinlock contention:
Per the whitepaper:
  1. Fully Qualified Names: Fully qualifying names of all objects will result in removing the need for SQL Server to execute code paths that are required to resolve names. We have observed contention points also on the SOS_CACHESTORE spinlock type encountered when not utilizing fully qualified names in calls to stored procedures. Failure to fully qualify these the names results in the need for SQL Server to lookup the default schema for the user which results in a longer code path required to execute the SQL.
  2. Parameterized Queries: Another example is utilizing parameterized queries and stored procedure calls to reduce the work needed to generate execution plans. This again results in a shorter code path for execution.
  3. LOCK_HASH Contention: Contention on certain lock structure or hash bucket collisions is unavoidable in some cases. Even though the SQL Server engine partitions the majority of lock structures, there are still times when acquiring a lock results in access the same hash bucket. For example, an application the accesses the same row by many threads concurrently (i.e. reference data). This type of problems can be approached by techniques which either scale out this reference data within the database schema or leverage NOLOCK hints when possible.
The diagram below provides a conceptual view of the spinlock algorithm:

Here are some additional resources I found on the topic:

Paul Randal
SOS_SCHEDULER_YIELD waits and the LOCK_HASH spinlock, April 2011
Advanced SQL Server performance tuning, April 2011

Thomas Kejser
Thread Synchronization In SQL Server, November 2011
Diagnosing And Fixing SOS_OBJECT_STORE Spins for Singleton INSERTS, May 2011
Secret Scale Sauce: Heavy Singleton INSERT Spinlocks, May 2011

Arvind Shyamsundar
Exotic spinlocks: XTS_MGR, January 2014

Fabricio Catae
Spinlock (Part I), July 2010
Spinlock (Part II), July 2010
Spinlock (Part III), July 2010
Spinlock Contention, December 2009

Wednesday, May 7, 2014

SQL Server Wait Stats Harvesting and Reporting

One the the keys - if not THE key - to identifying SQL Server performance issues is by investigating wait statistics.  As is the case with various SQL Server performance metrics, however, is the fact that these stats are cumulative.  That means you can't just query sys.dm_os_wait_stats and determine the root cause of an issue at a particular point in time unless your issue is happening right now.  If that were the case, you run one of the various scripts to calculate the differential of multiple sys.dm_os_wait_stats executions (example from Brent Ozar).  But what if your issue occurred off-hours, during the night, while you were traveling or in a meeting?

While I realize there are various ways to go about analyzing wait stats, I decided to put together a solution that was very easy to manage, deploy and write custom reports from.  You will need to configure a Central Management Server (CMS) and register your target servers - the servers you want to deploy this solution to.

Once you have your CMS configured, you will create a SQL Server Agent job ("Collect and Cycle Wait Stats", for example).  The job is going to have four steps:
  1. Check for DBADB database, which is the database I keep all DBA related scripts and data in
    • If not present, create it
    • Get script here
  2. Check for table that will house the wait stats
    • If not present, create it
    • Get script here
  3. Collect wait stats
    • I personally harvest the wait stats hourly but you may adjust as you see fit
    • This is a pretty typical query you will find in most blogs
    • Get script here
  4. Clear wait stats
    • By clearing the wait stats, I know each hourly harvest are the wait stats just for that hour and I don't have to do any differential calculations
    • If I want to know wait stats to a point in time, I can add them up accordingly
    • Get script here
  5. Report on it!
    • Preview of report below
    • Get the RDL here
By checking for and creating the database and table if they aren't present, you can deploy this by going to your CMS checking a box to target another instance.

Below is a preview of a report I just put together.  I'll likely update it in the coming days/weeks to better highlight anomalies, but this works for now.  You choose a date and the top 95% of waits are summarized by hour, which you can further drill into by clicking the plus sign by the hour on the left.  I added a few helpful reminders/notes from Paul Randal and Joe Sack's blogs.  As you can see, I had a CPU issue on this particular day at or around 10am as indicated by the elevated/irregular SOS_SCHEDULER_YIELD signal waits.