Monitoring SAN Performance with SCOM 2012


Monitoring SAN performance is on everyone’s list of essential monitoring requirements. When it comes to monitoring the SAN, the most important question is: which perspective will provide the most accurate data? For example, monitoring a website directly from the server hosting it is unlikely to give a good indication of what end users are experiencing.

The same can be said for SANs. You can monitor the performance of the SAN from the perspective of the SAN itself, but will that accurately represent the experience of the end users? In the case of a SAN, the end users are the servers that use it.

I always monitor storage from the perspective of the servers. We deal with this subject on a daily basis and employ a simple yet effective solution for monitoring our customers’ storage technology, regardless of vendor. It is highly customizable and flexible, and it gives you visibility into the performance of your SAN.

This blog has two purposes: first, to clearly demonstrate that I have no desktop publishing skills at all, and second, to walk through how to monitor the performance of typical SAN storage, specifically storage that is not presented to the server as logical disks (c:\, d:\, etc.). We will implement monitoring of two separate LUNs residing on a small SAN connected to a 3-node Hyper-V failover cluster and build a corresponding dashboard.

Physical Disks

So how do you monitor disks that appear to be invisible because they don’t have a logical drive letter? It’s actually very easy: you simply need to configure SCOM 2012 to discover the underlying physical disks. Once you can “see” them, monitoring is a snap.

Scenario

The environment consists of a 3-node Hyper-V failover cluster running Windows Server 2008 R2 SP1. It is comprised of:

Components

1- 3 x Dell PE R610 Servers
2- Dell MD3200 PowerVault
3- 6 x SAS 7K RPM 500 GB – RAID 5
4- 6 x SAS 15K RPM 300 GB – RAID 10
5- The RAID 5 LUN hosts a CSV for non-production virtual servers.
6- The RAID 10 LUN hosts a CSV for production virtual servers.

Monitoring Objectives

We want to monitor the performance of both CSVs and collect the following performance metrics:

Metrics

1- Write Bytes Per Second
2- Read Bytes Per Second
3- Disk Reads Per Second
4- Disk Writes Per Second
5- Average Disk Seconds Per Transfer
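For reference, these five metrics correspond to counters of the Windows `PhysicalDisk` performance object. The sketch below maps each metric to its counter name as shown in Performance Monitor and builds a counter path for one disk instance. The instance name `"1"` is an assumption for illustration only; check Performance Monitor on your own nodes for the actual instance names.

```python
# Metric names from the list above, mapped to Windows PhysicalDisk counter names.
METRIC_TO_COUNTER = {
    "Write Bytes Per Second": "Disk Write Bytes/sec",
    "Read Bytes Per Second": "Disk Read Bytes/sec",
    "Disk Reads Per Second": "Disk Reads/sec",
    "Disk Writes Per Second": "Disk Writes/sec",
    "Average Disk Seconds Per Transfer": "Avg. Disk sec/Transfer",
}


def counter_path(instance, metric):
    """Build a Performance Monitor style counter path for one disk instance."""
    return "\\PhysicalDisk({})\\{}".format(instance, METRIC_TO_COUNTER[metric])


# "1" is a hypothetical instance name used only for this example.
for metric in METRIC_TO_COUNTER:
    print(counter_path("1", metric))
```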

The list below shows the physical disk performance counters available in SCOM 2012. An * denotes that the performance collection rule is enabled by default; all others must be explicitly enabled via an override.

Performance Counters

1- % Physical Disk Idle Time 2008
2- Average Physical Disk Read Queue Length 2008
3- Average Physical Disk Write Queue Length 2008
4- Physical Disk Average Disk Queue Length 2008
5- Physical Disk Average Disk Seconds per Transfer 2008*
6- Physical Disk Average Disk Seconds per Write 2008
7- Physical Disk Current Disk Queue Length 2008*
8- Physical Disk Bytes per Second 2008
9- Physical Disk Read Bytes Per Second 2008
10- Physical Disk Reads per Second 2008
11- Physical Disk Split I/O Per Second 2008
12- Physical Disk Write Bytes Per Second 2008
13- Physical Disk Writes per Second 2008
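As a quick cross-check, here is a minimal sketch that works out which of the rules we care about are not enabled by default, taking the * markers in the list at face value. The `WANTED` list is my assumed mapping from the five metrics in Monitoring Objectives to rule names in the list; it is not something SCOM produces.

```python
# Rule names copied from the list above; a trailing "*" marks a rule that is
# enabled by default.
LISTED_RULES = [
    "% Physical Disk Idle Time 2008",
    "Average Physical Disk Read Queue Length 2008",
    "Average Physical Disk Write Queue Length 2008",
    "Physical Disk Average Disk Queue Length 2008",
    "Physical Disk Average Disk Seconds per Transfer 2008*",
    "Physical Disk Average Disk Seconds per Write 2008",
    "Physical Disk Current Disk Queue Length 2008*",
    "Physical Disk Bytes per Second 2008",
    "Physical Disk Read Bytes Per Second 2008",
    "Physical Disk Reads per Second 2008",
    "Physical Disk Split I/O Per Second 2008",
    "Physical Disk Write Bytes Per Second 2008",
    "Physical Disk Writes per Second 2008",
]

# Assumed mapping: the rules that match the five metrics we want to collect.
WANTED = [
    "Physical Disk Write Bytes Per Second 2008",
    "Physical Disk Read Bytes Per Second 2008",
    "Physical Disk Reads per Second 2008",
    "Physical Disk Writes per Second 2008",
    "Physical Disk Average Disk Seconds per Transfer 2008",
]


def needs_override(wanted, listed):
    """Return the wanted rules that are not marked as enabled by default."""
    default_on = {name.rstrip("*") for name in listed if name.endswith("*")}
    return [name for name in wanted if name not in default_on]


for rule in needs_override(WANTED, LISTED_RULES):
    print(rule)
```

Under these assumptions, four of the five rules are disabled by default, and only the seconds-per-transfer rule is already on.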

1. Create a New Management Pack

Create a new management pack for this effort. Let’s call it ‘CSV Performance’.

2. Configure SCOM 2012 to Discover the Physical Disks

This will ‘expose’ the disks, making them available to be monitored by SCOM 2012. You could enable the discovery for all servers, but I recommend explicitly enabling discovery only for the servers whose disks you want to monitor. The next two steps do exactly that.

3. Create a New Group

This group will contain the servers that make up the cluster. These are the servers that have access to the SAN disks, and physical disk discovery will be enabled for this group. Let’s call it ‘Cluster Nodes’. Be sure to save it to the ‘CSV Performance’ management pack. Also, when adding the servers to the group, be sure to search for type ‘Windows Computer’. See Figures A & B.

Figure A:

Figure B:

4. Turn on Physical Disk Discovery

This is done in ‘Authoring / Management Pack Objects / Object Discoveries’. Change scope to ‘Windows Server 2008 Physical Disk’. The name of the discovery rule is ‘Discover Windows Physical Disks’. Override the discovery rule for the group ‘Cluster Nodes’. Discovery may take up to 24 hours but usually will take much less time. Save the override to the CSV Performance management pack. See Figure C:

Figure C:


5. Verify the Disks were Discovered

Create a new ‘State’ view in ‘My Workspace’. When creating the view, select ‘Windows Server 2008 Physical Disk’ for the ‘Show data related to:’ field and ‘Cluster Nodes’ for ‘Show data contained in a specific group:’. Lastly, be sure to personalize the view and choose ‘Model’. See Figure D:

Figure D:


If discovery completed successfully, you will be presented with a view similar to:


In our environment, each node has 3 internal disks (RAID 5), which accounts for 9 of the disks. The 3rd node of the cluster is in reserve and owns no resources, so only 2 nodes have access to each of the 2 CSVs. That accounts for 4 more disks, for a total of 13.
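The arithmetic is worth writing down, because it tells you how many rows to expect in the State view for your own cluster. A minimal sketch, assuming each SAN LUN shows up once per node that currently has access to it (the function and parameter names are made up for this example):

```python
def expected_disk_count(nodes, internal_disks_per_node, active_nodes, shared_luns):
    """Estimate how many physical disk objects the State view should list."""
    internal = nodes * internal_disks_per_node  # local disks on every node
    shared = active_nodes * shared_luns         # each LUN is seen by each active node
    return internal + shared


# Values from this environment: 3 nodes with 3 internal disks each,
# 2 active nodes, and 2 SAN LUNs.
print(expected_disk_count(nodes=3, internal_disks_per_node=3,
                          active_nodes=2, shared_luns=2))
```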

So now we need to identify which disks are our SAN disks. In this case, the SAN disks are those with a Model name of ‘Dell MD32xx Multi-Path’. We have now isolated the two disks (LUNs) we want to monitor: Disk 1 and Disk 2.
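If you export the State view or have a long list of disks, the same filtering can be sketched in a few lines. This is only an illustration: the node names and the internal disk model string are hypothetical placeholders, and only the ‘Dell MD32xx Multi-Path’ model value comes from this environment.

```python
from dataclasses import dataclass

SAN_MODEL = "Dell MD32xx Multi-Path"  # Model value shown in the State view


@dataclass(frozen=True)
class DiscoveredDisk:
    node: str   # server the disk was discovered on (the Path column)
    name: str   # display name, for example "Disk 1"
    model: str  # value of the Model column


def san_disks(disks, model=SAN_MODEL):
    """Keep only the disks whose Model matches the SAN's model string."""
    return [d for d in disks if d.model == model]


# Hypothetical rows; node names and the internal model are placeholders.
discovered = [
    DiscoveredDisk("NODE1", "Disk 0", "Internal RAID (placeholder)"),
    DiscoveredDisk("NODE1", "Disk 1", SAN_MODEL),
    DiscoveredDisk("NODE1", "Disk 2", SAN_MODEL),
    DiscoveredDisk("NODE3", "Disk 0", "Internal RAID (placeholder)"),
]

for disk in san_disks(discovered):
    print(disk.node, disk.name)
```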

You will also want to know which CSV is hosted on which disk. You can use the Failover Cluster Manager utility to do this by mapping capacity to volume number.
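To make the capacity matching concrete, here is a rough sketch that computes the nominal usable size of each LUN from the component list and picks the closest match for a reported capacity. It assumes all six disks of each type sit in a single RAID set with no hot spare, and it ignores formatting overhead and GB versus GiB differences, so treat the numbers as approximate. The reported value of 2300 is a made-up example.

```python
def raid5_usable_gb(disks, size_gb):
    """RAID 5 gives up one disk's worth of capacity to parity."""
    return (disks - 1) * size_gb


def raid10_usable_gb(disks, size_gb):
    """RAID 10 mirrors every disk, so half the raw capacity is usable."""
    return (disks // 2) * size_gb


# Nominal LUN sizes derived from the component list above.
LUNS = {
    "RAID 5 (non-production CSV)": raid5_usable_gb(6, 500),
    "RAID 10 (production CSV)": raid10_usable_gb(6, 300),
}


def closest_lun(reported_gb, luns=LUNS):
    """Return the LUN whose nominal size is nearest to the reported capacity."""
    return min(luns, key=lambda name: abs(luns[name] - reported_gb))


print(LUNS)
print(closest_lun(2300))  # 2300 is a hypothetical capacity reading
```

With two LUNs this far apart in size, the match is unambiguous; if your LUNs are close in size, capacity alone will not be enough to tell them apart.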

6. Create another group called ‘CSV Disks’

Populate it with Disk 1 and Disk 2. Again, save it to the CSV Performance management pack created earlier.

A. Go to ‘Authoring / Groups’ and create a new group called ‘CSV Disks’

B. Navigate to ‘Explicit Members’ / ‘Add/Remove Objects…’ and select ‘Windows Server 2008 Physical Disk’ in the ‘Search for:’ field.

C. In the ‘Filter by part of the name (optional):’ enter Disk 1 (or whatever disk number corresponds to your environment) and add the items returned. The path should be the name of the server whose disks you want to monitor. Repeat for Disk 2. See Figures E & F:

Figure E:


Figure F:


D. Add Disk 1 and Disk 2 for all applicable nodes:


Now we have exposed the correct drives and neatly collected them in a group called ‘CSV Disks’. See Figure G:

Figure G:


7. Enable the Performance Counters

Go to ‘Authoring / Management Pack Objects / Rules’ and change the scope to ‘Windows Server 2008 Physical Disk’. See Figure F:

Figure F:

The counter list earlier in this blog shows all physical disk performance counters available in SCOM 2012. They must be explicitly enabled via an override that targets the ‘CSV Disks’ group, and the override should be stored in the CSV Performance management pack. The rules we will be using are the ones that correspond to the five metrics listed under Monitoring Objectives. Figure F also shows that we changed the sampling frequency to 60 seconds.
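Before lowering the sampling interval, it helps to estimate how many samples the change will generate. The sketch below is an upper bound: it assumes every sample is collected and stored, and it uses this environment’s numbers (5 counters, and 4 disk instances from 2 LUNs seen by 2 active nodes). The function name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_day(interval_seconds, counters, disk_instances):
    """Upper bound on performance samples collected per day."""
    per_counter = SECONDS_PER_DAY // interval_seconds
    return per_counter * counters * disk_instances


# 60-second sampling, 5 counters, 4 monitored disk instances.
print(samples_per_day(interval_seconds=60, counters=5, disk_instances=4))
```

For this small cluster that works out to 28,800 samples per day, which is modest, but the number grows quickly as you add nodes and LUNs.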

8. Verify Performance Data is Being Collected

You can create a ‘Performance’ view in ‘My Workspace’ using the same values used for the ‘State’ view created earlier. See Figures G & H:

Figure G:

Figure H:

9. Create New Dashboard

Choose ‘Grid Layout’ and give it a name. Let’s call it ‘Cluster Disk Performance’. Choose ‘5 Cells’ and a layout template of your choosing then click ‘Create’.

A. For each cell, add a Performance widget

B. Use a name similar to the performance metric, so for the first cell, let’s use ‘Disk Writes/sec’.

C. For ‘group or object’, select the ‘CSV Disks’ group created earlier.

D. For the performance counter, ‘PhysicalDisk’ will be the only object available in the dropdown. Select it, then select the ‘Disk Writes/sec’ counter.

E. Choose a desired time range and legend values.

F. Repeat for the other 4 performance counters.
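The steps above amount to filling in the same small form five times. As a planning aid, the sketch below lays out the five widget definitions as plain Python dictionaries. This is not a SCOM API, just a checklist in code form; the order of the cells after the first is my own choice.

```python
GROUP = "CSV Disks"      # group created in step 6
OBJECT = "PhysicalDisk"  # the only object offered in the dropdown

# One counter per dashboard cell; the first matches the example in step B.
COUNTERS = [
    "Disk Writes/sec",
    "Disk Reads/sec",
    "Disk Write Bytes/sec",
    "Disk Read Bytes/sec",
    "Avg. Disk sec/Transfer",
]


def build_widgets(group, obj, counters):
    """Return one widget definition per cell, named after its counter."""
    return [
        {"cell": cell, "name": counter, "group": group,
         "object": obj, "counter": counter}
        for cell, counter in enumerate(counters, start=1)
    ]


for widget in build_widgets(GROUP, OBJECT, COUNTERS):
    print(widget)
```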

Sample Dashboard for SAN Performance

I have been struggling with formatting (as you have seen), so click HERE for a better image.

Please note that the columns in the legends of your dashboard are sortable. Just click on a field name, say Average Value, and it will sort ascending or descending. IMHO this one simple feature is what makes the dashboards in SCOM 2012 super useful.

This blog focused on disk performance, but I want to point out that when working with CSVs, capacity is also a crucial metric. Fortunately, the Windows Server management pack provides capacity metrics for Cluster Shared Volume Disk Capacity. You can easily modify your dashboard to incorporate this metric.

  1. September 27, 2013 at 3:24 pm

    This works great! Except for one issue that I think everyone will run into eventually. Since you have to identify your CSV’s by their disk number, that number has to stay the same for the monitors to be accurate. But that number can change when the system is booted. So sometimes my monitor will end up monitoring a quorum disk or some other physical disk. Has anyone found a way around this issue?

  2. Trig
    April 18, 2014 at 11:00 am

    That’s excellent work! Thank you very much! I have learned about roles, counters, groups, custom views, etc. You gave every step in detail, which I never found on other blogs!
