Tuesday, January 14, 2014

Using SQL Server to Report on your IIS Log Data

I shared this post on a SQL Tuesday but am going to modify it a bit so folks better understand what it's all about.  In this post I'm going to share how you can set up a very cost effective process to report on your IIS log data.

Where are your IIS logs?
The first thing you need to do is locate the correct log directory for the website you are interested in.  If you open IIS and click on the Sites, you will see associated ID for that site.  Remember that, then in Windows Explorer browse to C:\inetpub\logs\LogFiles\W3SVC<ID>.  So if the ID of your website is 2, you would find your logs in the W3SVC2 folder (I will use this as the example going forward).

Copy the logs
The next thing I did was to create a batch file and place it in the same folder as the logs.  You don't have to do it this way, but I chose to.  There will be a single line in this batch file:

robocopy C:\inetpub\logs\LogFiles\W3SVC2\ \\DB_SERVER\SHARE\IISLogs\ /maxage:1

I used Windows Scheduler to kick off this batch file every 10 minutes, and since we roll our logs over nightly this will get the current and previous days (maxage:1 means one day).  The reason for the frequency may vary depending on how accurate you want your data.  If our websites have an issue, I prefer to have pretty current data to dive into versus having to open the log itself.

Create the database
Now you'll want to set up your IISLogs database - the schema for the table is as follows:
CREATE TABLE [dbo].[LogData](
[LogFilename] [varchar](255) NOT NULL,
[RowNumber] [int] NOT NULL,
[date] [datetime] NULL,
[time] [datetime] NULL,
[sIp] [varchar](255) NULL,
[csMethod] [varchar](255) NULL,
[csUriStem] [varchar](255) NULL,
[csUriQuery] [varchar](255) NULL,
[sPort] [int] NULL,
[csUsername] [varchar](255) NULL,
[cIp] [varchar](255) NULL,
[csUserAgent] [varchar](255) NULL,
[scStatus] [int] NULL,
[scSubstatus] [int] NULL,
[scWin32Status] [int] NULL,
[timeTaken] [int] NULL,
 CONSTRAINT [PK_LogData] PRIMARY KEY CLUSTERED 
(
[LogFilename] ASC,
[RowNumber] ASC
)

CREATE TABLE [dbo].[IISDashboard](
[LogFilename] [varchar](255) NOT NULL,
[LogDate] [date] NOT NULL,
[Server] [varchar](10) NOT NULL,
[Records] [int] NOT NULL,
[CallsAbandoned] [int] NOT NULL,
[CallsAnswered] [int] NOT NULL,
[ID30Plus] [int] NOT NULL,
[ID10Plus] [int] NOT NULL,
[ID2Minus] [int] NOT NULL,
[ID200Errors] [int] NOT NULL,
[ID500Errors] [int] NOT NULL,
[GoogleBotHits] [int] NOT NULL,
[GoogleBotLoadTime] [int] NOT NULL,
[BingBotHits] [int] NOT NULL,
[BingBotLoadTime] [int] NOT NULL,
[MSNBotHits] [int] NOT NULL,
[MSNBotLoadTime] [int] NOT NULL,
[BotHits] [int] NOT NULL,
[BotLoadTime] [int] NOT NULL,
 CONSTRAINT [PK_IISDashboard] PRIMARY KEY CLUSTERED 
(
[LogFilename] ASC,
[LogDate] ASC,
[Server] ASC
)

Import the logs
Now that the logs are over on the database server, it's time to import them.  I use logparser from Microsoft to do this.  You'll want to create a SQL Server Agent job and run it every 10 minutes or so, making sure you don't bump heads with the copy job from the IIS server.  Stagger them accordingly.  You'll note the script dynamically creates the file name for me - the rest is pretty straight forward.  Yes, I have xp_cmdshell enabled - if you don't allow the use of xp_cmdshell, you may have to cook up an alternative way to import the data.  You'll also note I clear out the current day and reload it all - that's much easier than trying to figure out what's already been imported from a text file and what has not.  Shortly after midnight I reload the entire previous day (the reason I copy two logs each time) - that ensures I have the entire day of data loaded for that log file.

DECLARE @TodaysDate VARCHAR(6)
   ,@LogFileName VARCHAR(100)
   ,@LogParserCommand VARCHAR(1000)
   ,@FileCheckCommand VARCHAR(1000)
   ,@FileCheck TINYINT

SELECT  @TodaysDate = CONVERT(VARCHAR(6),GETDATE(),12)
SET @LogFileName = 'I:\PathTo\IISLogs\u_ex' + @TodaysDate + '.log'

DELETE  FROM IISLogs.dbo.LogData
WHERE   LogFilename = @LogFileName
            
SET @FileCheckCommand = 'DIR "' + @LogFileName + '" /B'
EXEC @FileCheck = master.dbo.xp_cmdshell @FileCheckCommand,NO_OUTPUT
IF @FileCheck = 1 
    PRINT 'File does not exist.'
ELSE 
    BEGIN
        PRINT 'Importing ' + @LogFileName
        SET @LogParserCommand = 'C:\"Program Files (x86)"\"Log Parser 2.2"\LogParser "SELECT * INTO LogData FROM '
            + @LogFileName
            + '" -i:W3C -o:SQL -server:DB_SERVER\INSTANCE -database:IISLogs -driver:"SQL Server" -username:<username> -password:<password>'
        EXEC master..xp_cmdshell @LogParserCommand
    END

-- CLEANUP
EXEC master.dbo.xp_cmdshell 'del I:\PathTo\IISLogs\*.* /F /Q'

Report on the data
Well, the hard part is over.  The data is in SQL Server, and now the fun begins - doing something with it.  Personally I set up an SSRS report that I receive and review each morning.  It takes me a few seconds to identify if there are any performance issues that require investigation.  The primary report is below, as is the query.  I actually roll up the data each iteration of the scheduled job so the dashboard is quick (versus plowing through tons of data each refresh).  You'll note I have two web servers (whited out under Page Hits) load balanced, but this process is very repeatable regardless of how many servers you have.  I also make some pieces of data appear to be hyperlinks, but it's really just blue font and underlined so people know they can further dive into the data.  The call data in the report comes from our I3 phone system - I have removed those from the query to avoid confusion.  If you wish to have a copy of the RDL, feel free to contact me.









DECLARE @Today DATE = GETDATE()
   
DELETE  FROM IISLogs.dbo.IISDashboard
WHERE   LogDate = @Today

INSERT  INTO IISLogs.dbo.IISDashboard
        ( 
         LogFilename
        ,LogDate
        ,Server
        ,Records
        ,CallsAbandoned
        ,CallsAnswered
        ,ID30Plus
        ,ID10Plus
        ,ID2Minus
        ,ID200Errors
        ,ID500Errors
        ,GoogleBotHits
        ,GoogleBotLoadTime
        ,BingBotHits
        ,BingBotLoadTime
        ,MSNBotHits
        ,MSNBotLoadTime
        ,BotHits
        ,BotLoadTime
        )
        SELECT  LEFT(LogFilename,16) AS LogFileName
               ,CONVERT(DATE,DATEADD(hh,DATEDIFF(hh,GETUTCDATE(),GETDATE()),
                                     [Date] + CAST(CONVERT(TIME,[Time]) AS DATETIME))) AS 'LogDate'
               ,SUBSTRING(LogFilename,CHARINDEX('SERVER_PREFIX',LogFilename),9) AS 'Server'
               ,COUNT(*) AS 'Records'
               ,SUM(CASE WHEN timeTaken > 30000 THEN 1
                         ELSE 0
                    END) AS '30Plus'
               ,SUM(CASE WHEN timeTaken > 10000 THEN 1
                         ELSE 0
                    ENDAS '10Plus'
               ,SUM(CASE WHEN timeTaken < 2000 THEN 1
                         ELSE 0
                    ENDAS '2Minus'
               ,SUM(CASE scStatus
                      WHEN 200 THEN 1
                      ELSE 0
                    ENDAS '200Errors'
               ,SUM(CASE scStatus
                      WHEN 500 THEN 1
                      ELSE 0
                    ENDAS '500Errors'
               ,SUM(CASE WHEN csUserAgent LIKE '%googlebot%' THEN 1
                         ELSE 0
                    ENDAS 'GoogleBotHits'
               ,AVG(CASE WHEN csUserAgent LIKE '%googlebot%' THEN timeTaken
                         ELSE 0
                    ENDAS 'GoogleBotLoadTime'
               ,SUM(CASE WHEN csUserAgent LIKE '%bingbot%' THEN 1
                         ELSE 0
                    ENDAS 'BingBotHits'
               ,AVG(CASE WHEN csUserAgent LIKE '%bingbot%' THEN timeTaken
                         ELSE 0
                    ENDAS 'BingBotLoadTime'
               ,SUM(CASE WHEN csUserAgent LIKE '%msnbot%' THEN 1
                         ELSE 0
                    ENDAS 'MSNBotHits'
               ,AVG(CASE WHEN csUserAgent LIKE '%msnbot%' THEN timeTaken
                         ELSE 0
                    ENDAS 'MSNBotLoadTime'
               ,SUM(CASE WHEN csUserAgent LIKE '%bot%' THEN 1
                         ELSE 0
                    ENDAS 'BotHits'
               ,AVG(CASE WHEN csUserAgent LIKE '%bot%' THEN timeTaken
                         ELSE 0
                    ENDAS 'BotLoadTime'
        FROM    IISLogs.dbo.LogData
        WHERE   CONVERT(DATE,DATEADD(hh,DATEDIFF(hh,GETUTCDATE(),GETDATE()),
                                     [Date] + CAST(CONVERT(TIME,[Time]) AS DATETIME))) = @Today
                AND csUriStem LIKE '%.aspx%'
                AND DATEPART(dw, CONVERT(DATE,DATEADD(hh,DATEDIFF(hh,GETUTCDATE(),GETDATE()),
                                                  [Date] + CAST(CONVERT(TIME,[Time]) AS DATETIME)))) BETWEEN 2 AND 6
        GROUP BY LEFT(LogFilename,16)
               ,CONVERT(DATE,DATEADD(hh,DATEDIFF(hh,GETUTCDATE(),GETDATE()),
                                     [Date] + CAST(CONVERT(TIME,[Time]) AS DATETIME)))
               ,SUBSTRING(LogFilename,CHARINDEX('SERVER_PREFIX',LogFilename),9)

Detailed Statistics for SSRS Reports

I've had this report for years but until today never really had a conversation about the statistical value you can find in the Reporting Services database.  When a user contacts  you and claims that a report is "slow", where do you start?  Do you fire up a profiler trace and ask them to run it again?  Do you open the report and examine the query's execution plan for optimization opportunities?  Perhaps you weren't aware, but there is some really good statistical information in the Reporting Services database.  I put together the following report to assist me with my SSRS tuning investigation and efforts.  You can get the RDL for the report here.

(Note #1: I pulled the margins in to hide some account and report path information - better to be safe than sorry!)


The query/dataset to populate the report:

SELECT TOP 1000
        UserName
       ,Name
       ,Path
       ,[Status]
       ,Format
       ,TimeStart
       ,TimeEnd
       ,ByteCount
       ,ByteCount / 1024 AS KB
       ,ByteCount / 1024 / 1024 AS MB
       ,[RowCount]
       ,TimeDataRetrieval
       ,TimeProcessing
       ,TimeRendering
       ,[Parameters]
FROM    [ReportServer$SQL01].dbo.ExecutionLog AS ex
       ,[ReportServer$SQL01].dbo.Catalog AS cat
WHERE   ex.ReportID = cat.ItemID
        AND Name NOT LIKE '_My Reports%'
        AND ( UserName = @UserName
              OR @UserName = '--- ALL USERS ---' )
        AND [Status] IN (@ReportStatus)
        AND ( [Name] = @ReportName
              OR @ReportName = '--- ALL REPORTS ---' )
ORDER BY TimeStart DESC

SSRS Subscriptions and their SQL Server Agent job names

For those that run SQL Server Reporting Services and have a high volume of report subscriptions, you probably know the pain of trying to figure out which jobs in SQL Server Agent kick off which subscriptions in Reporting Services.  This is a query that you can put into a report of its own so you can pull it up and reference in the event you want to kick off a job (and subscription) manually:

SELECT  Schedule.ScheduleID AS JobName
       ,[Catalog].Name AS ReportName
       ,Subscriptions.Description AS Recipients
       ,[Catalog].Path AS ReportPath
       ,StartDate
       ,Schedule.LastRunTime
FROM    [ReportServer$SQL01].dbo.ReportSchedule
        INNER JOIN [ReportServer$SQL01].dbo.Schedule ON ReportSchedule.ScheduleID = Schedule.ScheduleID
        INNER JOIN [ReportServer$SQL01].dbo.Subscriptions ON ReportSchedule.SubscriptionID = Subscriptions.SubscriptionID
        INNER JOIN [ReportServer$SQL01].dbo.[Catalog] ON ReportSchedule.ReportID = [Catalog].ItemID
                                                         AND Subscriptions.Report_OID = [Catalog].ItemID

Friday, January 10, 2014

Importing SQLIO Data

In a previous post I shared with you how you might use an SSRS report to analyze your SQLIO results.  What I didn't share was how you might get that data into SQL Server to begin with, so I will share how I go about that.

[Update: You can get the RDL here]

The first thing you'll want to do is set up your SQLIO database, along with the related objects that you will need to import the results from the text file(s).  I have provided the necessary scripts below.  Upon creating those objects, there are a series of simple scripts you have to run to import the data.  Note: I did not write these scripts myself, but I did alter them a bit to include some information I felt necessary to do proper analysis:

USE SQLIO
GO

--STEP 1: import the data by changing to the appropriate txt file below.
TRUNCATE TABLE dbo.SQLIO_Import_Step1 ;

BULK INSERT dbo.SQLIO_Import_Step1
  FROM 'C:\Program Files (x86)\SQLIO\SQLIO_Results\results-timestamp.txt' ;

TRUNCATE TABLE dbo.SQLIO_Import ;

INSERT  INTO dbo.SQLIO_Import
        ( ResultText )
        SELECT  ISNULL(ResultText, '')
        FROM    dbo.SQLIO_Import_Step1 ;
     
--STEP 2: ETL the data into the live table for reporting purposes.    
EXECUTE [dbo].[USP_Import_SQLIO_TestPass] '[DATABASE INSTANCE]', 'D:', 2, 15000, 'RAID 10', '2014/1/10', 'EMC CX3 Model 20', '', 64, 'NTFS', '65536'

Tables needed:
CREATE TABLE [dbo].[SQLIO_Import]
    (
     [RowID] [int] IDENTITY(1,1) NOT NULL
    ,[ParameterRowID] [int] NULL
    ,[ResultText] [varchar](MAXNULL
    ,CONSTRAINT [PK_SQLIO_Import] PRIMARY KEY CLUSTERED ( [RowID] ASC )
    )

CREATE TABLE [dbo].[SQLIO_Import_Step1](
[ResultText] [varchar](MAXNULL
)

CREATE TABLE [dbo].[SQLIO_TestPass]
    (
     [TestPassID] [int] IDENTITY(1,1) NOT NULL
    ,[ServerName] [nvarchar](50) NOT NULL
    ,[LogicalDrive] [varchar](10) NOT NULL
    ,[DriveQty] [int] NOT NULL
    ,[DriveRPM] [int] NOT NULL
    ,[DriveRaidLevel] [nvarchar](10) NOT NULL
    ,[TestDate] [datetime] NOT NULL
    ,[SANmodel] [nvarchar](50) NOT NULL
    ,[SANfirmware] [nvarchar](50) NULL
    ,[PartitionOffset] [int] NULL
    ,[Filesystem] [nvarchar](50) NULL
    ,[FSClusterSizeBytes] [int] NULL
    ,[SQLIO_Version] [nvarchar](20) NULL
    ,[Threads] [int] NULL
    ,[ReadOrWrite] [nchar](1) NULL
    ,[DurationSeconds] [int] NULL
    ,[SectorSizeKB] [int] NULL
    ,[IOpattern] [nvarchar](50) NULL
    ,[IOsOutstanding] [int] NULL
    ,[Buffering] [nvarchar](50) NULL
    ,[FileSizeMB] [int] NULL
    ,[IOs_Sec] [decimal](18,0) NULL
    ,[MBs_Sec] [decimal](18,0) NULL
    ,[LatencyMS_Min] [int] NULL
    ,[LatencyMS_Avg] [int] NULL
    ,[LatencyMS_Max] [int] NULL
    ,CONSTRAINT [PK_SQLIO_TestPass] PRIMARY KEY CLUSTERED ( [TestPassID] ASC )
    )

Stored Procedure:
CREATE PROCEDURE [dbo].[USP_Import_SQLIO_TestPass]
    @ServerName NVARCHAR(50)
   ,@LogicalDrive VARCHAR(10)
   ,@DriveQty INT
   ,@DriveRPM INT
   ,@DriveRaidLevel NVARCHAR(10)
   ,@TestDate DATETIME
   ,@SANmodel NVARCHAR(50)
   ,@SANfirmware NVARCHAR(50)
   ,@PartitionOffset INT
   ,@Filesystem NVARCHAR(50)
   ,@FSClusterSizeBytes INT
AS
    SET nocount OFF

    IF @TestDate IS NULL
        SET @TestDate = GETDATE()

  /* Add a blank record to the end so the last test result is captured */
    INSERT  INTO dbo.SQLIO_Import
            ( ParameterRowID,ResultText )
    VALUES  ( 0,'' );
                             
  /* Update the ParameterRowID field for easier querying */
    UPDATE  dbo.sqlio_import
    SET     parameterrowid = ( SELECT TOP 1
                                        rowid
                               FROM     dbo.sqlio_import parm
                               WHERE    parm.resulttext LIKE '%\%'
                                        AND parm.rowid <= upd.rowid
                               ORDER BY rowid DESC
                             )
    FROM    dbo.sqlio_import upd
       
  /* Add new SQLIO_TestPass records from SQLIO_Import */
    INSERT  INTO dbo.sqlio_testpass
            (servername
            ,logicaldrive
            ,driveqty
            ,driverpm
            ,driveraidlevel
            ,testdate
            ,sanmodel
            ,sanfirmware
            ,partitionoffset
            ,filesystem
            ,fsclustersizebytes
            ,sqlio_version
            ,threads
            ,readorwrite
            ,durationseconds
            ,sectorsizekb
            ,iopattern
            ,iosoutstanding
            ,buffering
            ,filesizemb
            ,ios_sec
            ,mbs_sec
            ,latencyms_min
            ,latencyms_avg
            ,latencyms_max
            )
            SELECT  @ServerName
                   ,@LogicalDrive
                   ,@DriveQty
                   ,@DriveRPM
                   ,@DriveRaidLevel
                   ,@TestDate
                   ,@SANmodel
                   ,@SANfirmware
                   ,@PartitionOffset
                   ,@Filesystem
                   ,@FSClusterSizeBytes
                   ,( SELECT    REPLACE(resulttext,'sqlio ','')
                      FROM      dbo.sqlio_import impsqlio_version
                      WHERE     imp.rowid + 1 = impsqlio_version.rowid
                    ) AS sqlio_version
                   ,( SELECT    LEFT(resulttext,( CHARINDEX(' threads',resulttext) ))
                      FROM      dbo.sqlio_import impthreads
                      WHERE     imp.rowid + 3 = impthreads.rowid
                    ) AS threads
                   ,( SELECT    UPPER(SUBSTRING(resulttext,( CHARINDEX('threads ',resulttext) ) + 8,1))
                      FROM      dbo.sqlio_import impreadorwrite
                      WHERE     imp.rowid + 3 = impreadorwrite.rowid
                    ) AS readorwrite
                   ,( SELECT    SUBSTRING(resulttext,( CHARINDEX(' for',resulttext) ) + 4,
                                          ( CHARINDEX(' secs ',resulttext) ) - ( CHARINDEX(' for',resulttext) ) - 4)
                      FROM      dbo.sqlio_import impdurationseconds
                      WHERE     imp.rowid + 3 = impdurationseconds.rowid
                    ) AS durationseconds
                   ,( SELECT    SUBSTRING(resulttext,7,( CHARINDEX('KB',resulttext) ) - 7)
                      FROM      dbo.sqlio_import impsectorsizekb
                      WHERE     imp.rowid + 4 = impsectorsizekb.rowid
                    ) AS sectorsizekb
                   ,( SELECT    SUBSTRING(resulttext,( CHARINDEX('KB ',resulttext) ) + 3,
                                          ( CHARINDEX(' IOs',resulttext) ) - ( CHARINDEX('KB ',resulttext) ) - 3)
                      FROM      dbo.sqlio_import impiopattern
                      WHERE     imp.rowid + 4 = impiopattern.rowid
                    ) AS iopattern
                   ,( SELECT        SUBSTRING(resulttext,( CHARINDEX('with ',resulttext) ) + 5,
                                          ( CHARINDEX(' outstanding',resulttext) ) - ( CHARINDEX('with ',resulttext) )
                                          - 5)
                      FROM      dbo.sqlio_import impiosoutstanding
                      WHERE     imp.rowid + 5 = impiosoutstanding.rowid
                    ) AS iosoutstanding
                   ,( SELECT    REPLACE(CAST(resulttext AS NVARCHAR(50)),'buffering set to ','')
                      FROM      dbo.sqlio_import impbuffering
                      WHERE     imp.rowid + 6 = impbuffering.rowid
                    ) AS buffering
                   ,( SELECT    SUBSTRING(resulttext,( CHARINDEX('size: ',resulttext) ) + 6,
                                          ( CHARINDEX(' for ',resulttext) ) - ( CHARINDEX('size: ',resulttext) ) - 9)
                      FROM      dbo.sqlio_import impfilesizemb
                      WHERE     imp.rowid + 7 = impfilesizemb.rowid
                    ) AS filesizemb
                   ,( SELECT    RIGHT(resulttext,( LEN(resulttext) - 10 ))
                      FROM      dbo.sqlio_import impios_sec
                      WHERE     imp.rowid + 11 = impios_sec.rowid
                    ) AS ios_sec
                   ,( SELECT    RIGHT(resulttext,( LEN(resulttext) - 10 ))
                      FROM      dbo.sqlio_import impmbs_sec
                      WHERE     imp.rowid + 12 = impmbs_sec.rowid
                    ) AS mbs_sec
                   ,( SELECT    RIGHT(resulttext,( LEN(resulttext) - 17 ))
                      FROM      dbo.sqlio_import implatencyms_min
                      WHERE     imp.rowid + 14 = implatencyms_min.rowid
                    ) AS latencyms_min
                   ,( SELECT    RIGHT(resulttext,( LEN(resulttext) - 17 ))
                      FROM            dbo.sqlio_import implatencyms_avg
                      WHERE     imp.rowid + 15 = implatencyms_avg.rowid
                    ) AS latencyms_avg
                   ,( SELECT    RIGHT(resulttext,( LEN(resulttext) - 17 ))
                      FROM      dbo.sqlio_import implatencyms_max
                      WHERE     imp.rowid + 16 = implatencyms_max.rowid
                    ) AS latencyms_max
            FROM    dbo.sqlio_import imp
                    INNER JOIN dbo.sqlio_import impfulltest ON imp.rowid + 20 = impfulltest.rowid
                                                               AND impfulltest.resulttext = ''
            WHERE   imp.rowid = imp.parameterrowid
            ORDER BY imp.parameterrowid
         
  /* Empty out the ETL table */
    DELETE  dbo.sqlio_import