Archive for the ‘System Center Family’ Category

Service Manager 2012 Self-Service Portal Empty “doesn’t show the content”

March 13, 2013

One of the big new features in SCSM 2012 is the new Self-Service Portal with the Service Catalog.

If you have installed it successfully but can’t see anything when you click on the menus, you may be in the same scenario that I was. This happens (or can happen) when you install the SSP with an SSL certificate.

So if you have the same problem, it can look like this:

[Screenshot 1]

To solve it, open IIS Manager on the server where you installed the SSP.
Open Sites\Service Manager Portal and double-click Application Settings:

[Screenshot 2]

Double-click the application setting called SMPortal_WebContentServer_URL:

[Screenshot 3]

Under Value, change the server name to its fully qualified domain name (FQDN).

[Screenshot 4]

Go to your SSP again and refresh the site. You should now see the content:

[Screenshot 5]
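To make the change to the setting concrete, here is a minimal sketch of swapping a short server name for its FQDN in a URL while leaving the rest of the value alone. The URL, port, path and host names are hypothetical placeholders, not values from my environment, so check what your own SMPortal_WebContentServer_URL setting contains first.

```powershell
# Hypothetical current value of the setting (placeholder, not from the post).
$currentValue = 'https://scsm01:8443/content'

# Hypothetical FQDN of the portal server.
$fqdn = 'scsm01.contoso.com'

# UriBuilder replaces only the host part; scheme, port and path stay as they were.
$builder = New-Object System.UriBuilder $currentValue
$builder.Host = $fqdn
$newValue = $builder.Uri.AbsoluteUri

$newValue
```

The resulting string is what you would paste into the Value field in IIS Manager.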

Demo OpsMgr 2012 network monitoring with Network device simulator

February 25, 2013

There has been discussion lately around running SCOM 2012 and other System Center products in the cloud for demo and POC purposes. One problem with running SCOM in a cloud solution is not having access to network devices. There is a solution to this: you can run a network device emulator. This is available as software and simulates a full network device that SCOM can then discover and monitor. A quick internet search will turn up several network device emulation packages. Here is a good tool that is free. It is called Xian SNMP Device Simulator and can be downloaded here:

http://www.jalasoft.com/xian/snmpsimulator

SNMP Simulator Screenshot

The devices it can simulate are:

  • Cisco Switches
  • Cisco Routers
  • Cisco Firewalls
  • Cisco VPN Concentrators
  • Cisco Wireless devices
  • 3Com Switches
  • HP ProCurve Switches
  • F5 BIG-IP
  • Nortel
  • APC UPS

You can simulate up to 15 devices with Xian SNMP Device Simulator. The same company has another tool that simulates network devices and traffic. It is called Xian NetFlow Simulator and can be downloaded here:

http://www.jalasoft.com/xian/xiannetflowsimulator

The Xian NetFlow Simulator is not free but you can obtain a trial. The Xian NetFlow Simulator sends packets between a given source and a destination. You could use SCOM to monitor actual network traffic using this second tool.

The first tool has been around for some time, but with all the talk of running SCOM in the cloud for demos I thought I would post about it again. Here is a reference to an older blog post on setting up Xian SNMP Device Simulator and monitoring it with SCOM:

Test/Demo OpsMgr 2012 network monitoring with Jalasoft’s network device simulator

Monitoring the Hybrid Microsoft Cloud

February 25, 2013

The Microsoft hybrid cloud, as it currently stands, is a mixture of a Hyper-V private cloud and an Azure public cloud, managed by System Center App Controller (formerly Concero). One of the key pieces of the Microsoft solution is monitoring the health of the application (which is what the business really cares about) using System Center Operations Manager (OpsMgr).

Management packs make monitoring of Hyper-V, Windows, SQL, Exchange, CRM, hardware, storage, etc, easy.  You can put together end user perspective monitoring from the basic ping test to the advanced synthetic transaction, build service-centric distributed application models, and provide SLA monitoring of the LOB applications.  That’s got the private cloud covered.

There is also a management pack for Azure.  This allows you to monitor the availability, health, and performance of your public cloud services.  Let’s face it – even if Microsoft does/did provide a monitoring solution within Azure – can you really use a monitoring solution that is a part of the thing you are monitoring, i.e. the Microsoft public cloud?  I say no – and that’s the first reason why you should use OpsMgr and this management pack.  The second reason is that it allows you to integrate your monitoring of public and private clouds, giving you that mythical single pane of glass for monitoring.

The features of this management pack are:

  • Discovers Windows Azure applications.
  • Provides status of each role instance.
  • Collects and monitors performance information.
  • Collects and monitors Windows events.
  • Collects and monitors the .NET Framework trace messages from each role instance.
  • Grooms performance, event, and the .NET Framework trace data from Windows Azure storage account.
  • Changes the number of role instances via a task.

Its prerequisites are:

  • The management group must be running Operations Manager 2007 R2 Cumulative Update 3.
  • The Windows Azure role must be published with full trust level. For more information about Windows Azure trust levels, see Windows Azure Partial Trust Policy Reference.
  • Windows Azure Diagnostics must be enabled. For more information about Windows Azure Diagnostics, see Implementing Windows Azure Diagnostics.
  • Windows Azure Diagnostics must be configured to forward diagnostic data to a Windows Azure storage account. For more information about configuring Windows Azure Diagnostics, see Transferring Diagnostic Data to Windows Azure Storage.
  • Microsoft .NET Framework version 2.0 or newer must be installed on the computer that you designate as the proxy agent when you configure the Monitoring Pack for Windows Azure Applications.

SCOM – Enable Agent Proxy Setting for all Installed Agents

February 16, 2013

This is a quick post and mostly for my own reference but some people may find it interesting or useful.

When deploying SCOM agents in an environment, there is an ‘Agent Proxy’ setting that is disabled by default on all newly installed agents titled:

Allow this agent to act as a proxy and discover managed objects on other computers

If you install an agent onto, for example, an Active Directory, SQL or Exchange server and leave this setting disabled, SCOM will detect the agent as being of the ‘Windows Server’ class only and will not allow discovery of the Active Directory, Exchange or SQL roles and attributes.

This setting is disabled by default because there is a potential risk associated with allowing an agent to discover external managed objects.

When installing a new SCOM solution, I tend to first deploy agents to all of the servers that I know will need this setting switched on (Exchange, AD, SQL, Hyper-V etc.). I then run a PowerShell command that turns this setting on for all of these agents in one quick swoop!

Once the setting is enabled on all of the agents that need it, I install the remaining Windows agents and leave the setting at its default of ‘disabled’.

Here’s how to do it:

In the SCOM Administration pane, open the newly installed agent, go to its ‘Security’ tab and check whether the setting is disabled, as below:

[Screenshot: setagentproxy0]

Open the ‘Operations Manager’ shell with administrative permissions on a SCOM management server, as below:

[Screenshot: setagentproxy1]

[Screenshot: setagentproxy2]

When you have the Operations Manager Shell window open as above, copy the script below into it and hit ‘Enter’:

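## Enable Agent Proxy for all agents where it is disabled (SCOM 2007)
# Collect every agent whose proxy setting is currently off
$NoProxy = Get-Agent | Where-Object {$_.ProxyingEnabled -match "false"}
# Switch the setting on for each of them, then commit the change
$NoProxy | ForEach-Object {$_.ProxyingEnabled = $true}
$NoProxy | ForEach-Object {$_.ApplyChanges()}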

Updated 5th May 2012: The script above will only work on SCOM 2007 R1/R2 and not SCOM 2012. See below for the SCOM 2012 equivalent:

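## Enable Agent Proxy for all agents where it is disabled (SCOM 2012)
# Collect every agent whose proxy setting is currently off
$NoProxy = Get-SCOMAgent | Where-Object {$_.ProxyingEnabled -match "false"}
# Switch the setting on for each of them, then commit the change
$NoProxy | ForEach-Object {$_.ProxyingEnabled = $true}
$NoProxy | ForEach-Object {$_.ApplyChanges()}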

Updated (again!) 24th August 2012 – My good buddy Bob Cornelissen (fellow co-author of Mastering System Center 2012 Operations Manager and SCOM/OpsMgr ninja warrior) has just posted an even easier one-liner PowerShell command to enable agent proxy for all of your machines. Check out his post here and see his script below:

Get-SCOMAgent | where {$_.ProxyingEnabled.Value -eq $False} | Enable-SCOMAgentProxy

Once you have run the script above in the Operations Manager Shell window, go back to the ‘Agents’ window and open an agent’s ‘Security’ tab again. You should now see that all agents present when you ran the PowerShell command have their ‘Agent Proxy’ setting enabled!

[Screenshot: setagentproxy3]

Easy!!

Keep in mind that this is just a simple PowerShell script that enables the setting for all agents, so if you want to enable it for just a small number of them and not the whole lot, then this isn’t the script for you!
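If you do want to target only a handful of agents, a narrower variant might look like the sketch below. It assumes the SCOM 2012 Operations Manager Shell, and the host names are hypothetical placeholders, so treat it as a starting point rather than a tested recipe.

```powershell
# Hypothetical host names; replace with the agents that really need the setting.
$targets = 'dc01.contoso.com', 'sql01.contoso.com', 'exch01.contoso.com'

# Fetch only the named agents, skip any that already have the proxy enabled,
# and enable the setting on the rest. -PassThru returns the changed agents.
Get-SCOMAgent -DNSHostName $targets |
    Where-Object { $_.ProxyingEnabled.Value -eq $false } |
    Enable-SCOMAgentProxy -PassThru |
    Select-Object DisplayName, ProxyingEnabled
```

Adding -WhatIf to Enable-SCOMAgentProxy first is a cheap way to see which agents would be changed.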

Operations Manager (SCOM) 2012 Upgrade Planning

January 11, 2013

Since many organisations with SCOM 2007 will already be thinking about the upgrade to 2012 when it’s released, now is as good a time as any to start planning the migration.

To help, Microsoft has recently released some process flow diagrams.

These can be found here. They are very comprehensive and do a really good job of laying out the processes that need to be thought about, which should help with a smooth migration.

OpsMgr MP Update: Lync Server 2010 MP version 4.0.7577.203

December 18, 2012

Download: http://www.microsoft.com/en-us/download/details.aspx?id=12375

Changes in This Update

Version 4.0.7577.203 of the Lync Server 2010 Management Pack includes the following changes:

  • Added functionality to support co-existence of management packs during migration from Lync Server 2010 to Lync Server 2013. For more information, see Coexistence with Lync Server 2013 Management Packs in the guide.
  • Fixed an issue that caused alerts from non-Windows computers that were not used by Lync.

This update is mostly about co-existence during a migration to Lync 2013. However, the second bullet affects a lot of people, especially those with Unix/Linux machines, which generated an alert about a Run As account.

How to Fine Tune the Monitoring of ConfigMgr SCCM 2012 with SCOM OpsMgr Management Pack

December 13, 2012

The SCOM Management Pack for Configuration Manager 2012 is available. This post will help you learn more about the critical classes that need to be monitored via the SCCM 2012 Management Pack. It may also help you understand the registry keys and event IDs involved in the monitoring process. The details of the registry keys and event IDs will be very helpful when troubleshooting CM 2012 issues. Note that I have not included performance monitoring and threshold settings in this post.

In my experience, we wasted loads of time implementing and fine-tuning the SCCM 2007 MP. Implementing a management pack directly into a production environment is not a good approach. The best method is to implement the MP in a lab environment and configure and fine-tune it there. Once you are satisfied with the alerts, move it to the production environment. Reading the installation guide of the management pack should be the first step you take before implementing the MP.

SCCM 2007 Management Pack won’t work with ConfigMgr 2012. CM 2012 MP can be used with SCOM 2007 R2 or later and System Center Configuration Manager 2012.

Before going into the details of the classes, I want to share an excellent blog post from Kevin Holman on the CM 2012 MP improvements. According to his analysis, there are loads of improvements in the management pack for CM 2012. The biggest problem with the ConfigMgr 2007 MP was that it was simply converted from MOM 2005, so it came with lots of bugs. The following improvements are highlighted for the SCCM 2012 MP:

No scripts in the monitoring, a decrease in lines of code, a decrease in the number of workflows, disabled workflows out of the box, and a well-documented guide.

The details of the critical classes in the ConfigMgr 2012 Management Pack:

Fallback status point is monitored via the registry key “HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_FALLBACK_STATUS_POINT\Availability State”.

Management point is monitored through HTTP responses, IIS and the SMS Agent Host service. Along with this, SCOM monitors the threshold settings on all the threads of the management point.

a) Management Point HTTP Response Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_MP_CONTROL_MANAGER\65AC53A5-8C79-4DF9-AE79-A53F689C2222\Severity
b) IIS Service Availability Monitor on Management Point NT Service: W3SVC
c) Management Point Availability Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\{Role Name}\Availability State
d) SMS Agent Host Service Availability Monitor NT Service: CcmExec
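When you are troubleshooting, it can help to look at the same services and registry state by hand. The sketch below is a rough manual spot check to run locally on the management point, not how the management pack itself does it. I am assuming that ‘Availability State’ is a value under the role’s key, and ‘<Role Name>’ is left as a placeholder that you need to replace with the real sub-key name on your server.

```powershell
# Service names are taken from the list above.
Get-Service -Name W3SVC, CcmExec -ErrorAction SilentlyContinue |
    Select-Object Name, Status

# Assumption: 'Availability State' is a value under the role's key.
# '<Role Name>' is a placeholder for the real sub-key name.
$roleKey = 'HKLM:\SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\<Role Name>'
if (Test-Path -LiteralPath $roleKey) {
    (Get-ItemProperty -LiteralPath $roleKey).'Availability State'
} else {
    Write-Warning "Registry key not found: $roleKey"
}
```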

PXE service point is monitored through WDS availability, which is accomplished by monitoring NT Service: WDSServer.

Site database server availability is monitored via SQL Writer Service Availability Monitor NT Service: SQLWriter

Software update point availability is monitored via registry key and two NT services mentioned below.

a) Software Update Point Availability Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\{Role Name}\Availability State
b) IIS Service Availability Monitor on Software Update Point NT Service: W3SVC
c) WSUS Windows Service Availability Monitor NT Service: WSUSService

Reporting services point availability is monitored through:

a) Reporting Service Point Availability Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\{Role Name}\Availability State.

b) SQL Reporting Service Availability Monitor NT Service: ReportServer

Application Catalog web service point availability is monitored via following registry and service.

a) IIS Service Availability Monitor on Application Catalog Web Service Point NT Service: W3SVC
b) Application Catalog Web Service Point Availability Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\{Role Name}\Availability State
c) Application Catalog Web Service Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_AWEBSVC_CONTROL_MANAGER\F0128B76-DD22-481D-A65B-270201AED381\Severity
d) Application Catalog Web Service IIS Configuration Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_AWEBSVC_CONTROL_MANAGER\0B543BAC-54C7-463D-BDA5-ADD9F71AEA09\Severity

Application Catalog website point availability is monitored via the following registry keys and service.

a) IIS Service Availability Monitor on Application Catalog Web Site Point NT Service: W3SVC
b) Application Catalog Web Site Point Availability Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\{Role Name}\Availability State
c) Application Catalog Web Server Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_PORTALWEB_CONTROL_MANAGER\0B12B4BA-B838-4927-ADC1-2E9602B076E3\Severity
d) Application Catalog IIS Configuration Monitor Registry: HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_PORTALWEB_CONTROL_MANAGER\4A06F831-B577-4C10-8643-8C577C2C22B3\Severity

Database Notification Monitor availability is monitored via Windows Event ID 2420 (Site server fails to execute a maintenance task)

Distribution Manager availability is monitored via Windows Event ID 2323 (Distribution Manager fails to access the network).

Primary to central site replication monitoring is achieved through the following WMI queries: Primary Site To Central Site “Global Data Receiving Status Monitor”, “Global Data Sending Status Monitor” and “Site Data Sending Status Monitor”. The default time interval is 6 minutes.

Central to primary site replication monitoring is achieved through the following WMI queries: Central Site To Primary Site “Global Data Receiving Status Monitor”, “Global Data Sending Status Monitor” and “Site Data Receiving Status Monitor”. The default time interval is 6 minutes.

Primary or standalone site server availability is monitored through the Active Directory Configuration Monitor for Device Management, which checks the registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_EN_ADSERVICE_MONITOR\CAFD8C35-08B6-4772-9101-B1B220CBA044\Severity. A lot of performance threshold monitoring can also be achieved through SCOM.

Site Component Manager availability is monitored via following event IDs, NT service and registry Keys.

a) Windows Event ID 4909 (Site component manager fails to read Active Directory objects)
b) Windows Event ID 4912 (Site component manager fails to update Active Directory objects)
c) Windows Event ID 1037 (Component manager fails to access site system)
d) Site Server Component Service Availability Monitor via NT Service: SMS_SITE_COMPONENT_MANAGER
e) Site Component Manager Availability Monitor via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_SITE_COMPONENT_MANAGER\Availability State
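To see whether any of the Site Component Manager events listed above have fired recently, a quick manual query on the site server could look like the sketch below. I am assuming the events land in the Application log, which you should confirm on your own server; the one-day window is an arbitrary choice.

```powershell
# Assumption: ConfigMgr writes these events to the Application log.
$filter = @{
    LogName   = 'Application'
    Id        = 4909, 4912, 1037
    StartTime = (Get-Date).AddDays(-1)   # arbitrary look-back window
}

# SilentlyContinue suppresses the error Get-WinEvent raises when nothing matches.
Get-WinEvent -FilterHashtable $filter -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, Id, ProviderName, Message

# The NT service from item d) in the list above.
Get-Service -Name SMS_SITE_COMPONENT_MANAGER -ErrorAction SilentlyContinue |
    Select-Object Name, Status
```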

Site Server Role availability is monitored via the following registry key: Site Server Connectivity To SQL Database Server via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\SMS Server Role\{Role Name}\Availability State

Site Server availability is ensured via following registry keys and WMI Query.

a) Database Certificate Validity Monitor via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_HIERARCHY_MANAGER\FBCA00DB-7C9D-4d6d-9F84-07C605B31191\Severity
b) WSUS Synchronization Failed WMI Query
c) SQL Server Disk Space Monitor via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_HIERARCHY_MANAGER\6FD0B53A-35DA-4da1-84C9-A9E1B6C12828\Severity
d) SQL Server Firewall Port Monitor via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_HIERARCHY_MANAGER\8D5E5CC1-CCF5-4c66-BC8A-527C9066161B\Severity
e) SQL Server Port Monitor via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_HIERARCHY_MANAGER\B1B669B9-6C11-4b8e-A09A-4E515D20F4F6\Severity
f) SQL Server Service Broker Certificate Validity Monitor via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_HIERARCHY_MANAGER\812A1E5F-B31C-45a5-89EE-695460882F38\Severity
g) SQL Server Service Broker Port Monitor via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_HIERARCHY_MANAGER\D362CF53-926B-4f7d-A4A2-0691D3F177F5\Severity

WSUS Control Manager availability is monitored via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_WSUS_CONTROL_MANAGER\Availability State

WSUS Synchronization Manager availability is monitored via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_WSUS_SYNC_MANAGER\Availability State

WSUS Configuration Manager availability is being monitored by following event ids and registry key.

a) WSUS Configuration Manager Availability Monitor via Registry key HKLM\SOFTWARE\Microsoft\SMS\Operations Management\Components\SMS_WSUS_CONFIGURATION_MANAGER\Availability State.
b) Fail to configure proxy setting on WSUS server via Windows Event ID 7000.
c) WSUS Configuration Manager fails to publish a client to the WSUS server via Windows Event ID 6613.
d) Fail to subscribe to or get update categories and classification via Windows Event ID 6603.
e) WSUS version mismatch via Windows Event ID 7004.

Note: The core information shared in this post is taken from the following document: ConfigMgr_MPGuide_Appendix.docx. Kevin’s blog post was also inspired by the document OpsMgr_MP_ConfigMgr.docx.

Monitoring Citrix with Operations Manager 2012

November 5, 2012

In the earlier days, if you had Operations Manager 2007, MPs were available for most of the Citrix products. The installation media for XenApp 6.5, for instance, included a management pack that you could use in OpsMgr 2007.
Now, with 2012, Citrix has said that it will no longer continue development of these management packs and has handed development over to a partner called ComTrade.

ComTrade has developed a bunch of management packs for most of Citrix’s products, including:

* XenApp
* XenDesktop
* XenServer

NetScaler, for instance, is primarily a network device, so you get “free” monitoring capabilities via SNMP, but for extended monitoring and PRO capabilities Citrix actually has a new MP, which was released in September.
As for the ComTrade MPs, you can sign up for a free trial on ComTrade’s website here.
I’m going to do a quick walkthrough of how XenApp monitoring is set up and how it works.

After you have received the user information, you can start downloading the MPs.
The installation process is pretty straightforward (Next, Next, Finish), and the setup will automatically import the management packs.

[Screenshot 1]

[Screenshot 2]

[Screenshot 3]

If you open the console and check under Administration > Management Packs, you can now see the ComTrade management packs appear.

[Screenshot 4]

If you go back to the monitoring pane, you will see that there are a bunch of new options under ComTrade XenApp

[Screenshot 5]

There are also a bunch of new reports available for XenApp under Reports.

[Screenshot 6]

This will give you good insight into your Citrix environment, including which applications users actually use and what kind of performance issues they might be having.
We will take a further look at this later, when we have finished setting up the connection to XenApp.

When the installation process is finished, you get a new Start menu shortcut called “XenApp connector”, which lets you complete the monitoring setup.
Here you have to enter information about the XenApp farm: a farm administrator and password.

[Screenshot 7]

Remember that the account has to be a farm administrator for the setup to work correctly, and you need a valid license from ComTrade in order to use it. After that, you have to set the SCOM agent as a proxy. You can do this under managed agents in the Administration pane in SCOM.

After this, go to the Monitoring pane and find the XenApp servers under ComTrade. From there, choose the XenApp server you wish to monitor. On the right side you have the option to install a XenApp MP agent, so run this command:

[Screenshot 8]

When the installation is done (you can see this in the Event Viewer), data will start being populated into SCOM after a while.
So yay! Now we have a good and solid XenApp monitoring solution alongside the rest of the infrastructure.
Now we can start monitoring SLAs on our infrastructure (XenApp, NetScaler, SQL Server, Web Interface).

It’s as simple as that. I have no real licenses on my XenApp server, so I get an error message about the licenses each time I log on to the server, and that also appeared in Operations Manager:

[Screenshot 9]

Saving Money by Increasing CPU Efficiency with SCOM

November 5, 2012

The advances in processor technology continue to ensure access to substantial processing power for pretty much anyone. Dell, HP, and IBM entry-level servers all come standard with substantial processing power, so it’s no longer uncommon to have surplus processing power. The next logical question is about efficiency: ‘Are you effectively using all the available processing power?’ Unless a server is running north of 55% utilization, IMHO, its owners are just wasting money because they are wasting electricity.

With respect to processor efficiency, I am referring to the amount of work a processor is performing relative to its potential power. There isn’t really a universally accepted model for calculating processor efficiency so I adopted my own. The inspiration for this effort was from one customer who specifically asked if OpsMgr On-Line™ could measure overall server efficiency. I thought it was a very valuable metric so I set out to build a comprehensive model to accomplish just that. Network, Disk and Memory were easy to model but CPU proved a little more of a challenge.

Performance Centric Resources

Computers draw from four core resource pools. This blog will only cover the processor because, as I mentioned earlier, it’s not as easy as you’d think. As for the other three resource pools:

  • Physical Memory (RAM) – You either have enough memory or you don’t.
  • Disk – The disk subsystem has two core metrics:
      • Capacity – It has enough capacity or it does not.
      • Performance – Applications can either read from disk or write to disk fast enough to sustain acceptable performance.
  • Network – Applications can either place data onto or read data from the network fast enough to sustain performance.

Defining ‘Work’

A computer’s ‘power’ is generated from its CPU. The CPU’s performance is measured in Gigahertz (GHz) and refers to the clock speed. As a general rule, the faster the clock can tick the faster data can be processed.

When it comes to measuring how much work a processor is performing, I am a fan of the Context Switches per Second performance counter. Context switching activity is important for three reasons:

1. A program that monopolizes the processor lowers the rate of context switches because it does not allow much processor time for the other processes’ threads.
2. A high rate of context switching means that many threads of equal priority are sharing the processor repeatedly.
3. A high context-switch rate often indicates that there are too many threads competing for the processors on the system.

Because context switches/sec is a very accurate measurement of ‘what the processor is doing’, it is the ideal indicator for measuring how much work a computer is performing. Naturally, the system must be healthy: no hardware malfunctions, properly configured, and all other resource pools sufficiently sized.

The CPU Performance Index

The CPU performance index (CPI) is a number that describes the efficiency of a processor. The CPI identifies the workload of a computer’s processors at different points in time. The subsequent deltas indicate the efficiency momentum of the computer and will answer the following questions:

1. What is the maximum workload a processor can sustain?
2. Is a processor over-utilized or under-utilized?
3. How long before a processor can no longer sustain its current workload?

Deriving CPU Performance Index

The CPU performance index, represented as CPI, is a numeric ratio used to identify the efficiency of a processor measured as a function of work performed. Depending on the value, you can ascertain if a processor is over utilized (resource deficient), performing optimally or has additional potential processing capacity (under utilized).

Since work is work, I just borrowed the existing physics formula for work: W = F x D.

  • Work is already defined: it is represented by context switches/sec.
  • Force is also defined: it is represented by processor utilization.

Solving for distance (D) gives W/F = D, where ‘D’ is substituted with CPI. We have already established that context switches/sec is relative to the role of the computer and that the value is relatively steady under standard operating conditions, so the more ‘force applied to each thread’, the more work accomplished.

The higher the processor utilization, the more work is being accomplished. It is also critical to note that the processor can’t be backlogged under normal production workloads. The relative processor queue length must reflect no stacking. The CPI value will be meaningless if the processors are backlogged at any time. For example, you may have low utilization and a high queue length due to an insufficiently sized FSB.

So putting this all together we get:

CPU Performance Index = (Context Switches/sec) / (Percent Processor Time)

which can be simply represented as:

CPI=Css/PPt
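As a worked example of the ratio, here is a minimal PowerShell sketch. The function name and the sample numbers are made up for illustration; only the formula itself comes from this post.

```powershell
# Hypothetical helper: CPI = (Context Switches/sec) / (Percent Processor Time)
function Get-CpuPerformanceIndex {
    param(
        [Parameter(Mandatory = $true)][double]$ContextSwitchesPerSec,
        [Parameter(Mandatory = $true)][double]$PercentProcessorTime
    )
    # Guard against dividing by zero on an idle sample.
    if ($PercentProcessorTime -le 0) {
        throw 'Percent Processor Time must be greater than zero.'
    }
    $ContextSwitchesPerSec / $PercentProcessorTime
}

# Made-up sample: 12,000 context switches/sec at 40% processor utilization.
$cpi = Get-CpuPerformanceIndex -ContextSwitchesPerSec 12000 -PercentProcessorTime 40
$cpi   # 300
```

On a live Windows server, the two inputs could be sampled with Get-Counter from the ‘\System\Context Switches/sec’ and ‘\Processor(_Total)\% Processor Time’ counters.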

We are simply calculating a ratio of work to force. Assuming predefined performance variables remain within established operational limits, the greater the force, the more work accomplished. Given that we are operating under the assumption that processing power is constant (for the purposes of this blog I am not addressing processors whose individual cores can be powered off), the only way to maximize processor efficiency is to ensure the maximum amount of work is being performed. Otherwise, the surplus processing power is simply wasted in the form of heat and power consumption. In other words, if we can’t reduce the energy being consumed, let’s increase the workload so that the processor is accomplishing more work.

Interpreting the CPU Performance Index

A single CPI value is meaningless. Regularly measured values are needed to determine if CPI is increasing or decreasing. The objective is to make CPI as small as possible and then ensure it remains relatively constant throughout the server’s remaining lifecycle. A computer’s processors have reached their maximum utilization when the smallest possible CPI has been reached and remains steady. There is no single accepted value for CPI. The objective is to have the CPI trend down and remain down over time, which indicates that the computer’s processors are operating at maximum efficiency.
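Since it is the trend that matters, one simple way to look at it is to take the difference between consecutive CPI samples. The sketch below does that with made-up values; the function name is hypothetical and this is only one possible way to express the deltas mentioned earlier.

```powershell
# Hypothetical helper: difference between each CPI sample and the one before it.
function Get-CpiDelta {
    param([Parameter(Mandatory = $true)][double[]]$Cpi)
    for ($i = 1; $i -lt $Cpi.Count; $i++) {
        $Cpi[$i] - $Cpi[$i - 1]
    }
}

# Made-up CPI samples taken at regular intervals.
$samples = 300, 260, 240, 235, 236
$deltas  = @(Get-CpiDelta -Cpi $samples)
$deltas -join ', '   # -40, -20, -5, 1
```

Negative deltas that shrink towards zero suggest the CPI is trending down and then levelling out, which is the pattern described above.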

Monitoring SAN Performance with SCOM 2012

October 22, 2012

Monitoring SAN performance is on everyone’s list of essential monitoring requirements. When it comes to monitoring the SAN, the most important question is what perspective will provide the most accurate data? For example, monitoring a website directly from the server hosting the website will not likely provide a good indication of what end users may be experiencing.

The same can be said for SANs. You can monitor the performance of the SAN from the perspective of the SAN, but will that accurately represent the experience of the end users? In the case of SANs, the end users are the servers using the SAN.

I always monitor storage from the perspective of the servers. We deal with this subject on a daily basis and employ a simple, yet effective, solution for monitoring our customers’ storage technology, regardless of vendor. It is highly customizable and flexible and ensures transparency into the performance of your SAN.

This blog has two purposes: first, to clearly demonstrate that I have no desktop publishing skills at all, and second, to walk through how to monitor the performance of typical SAN storage, specifically storage not presented as logical disks (c:\, d:\ etc.) to the server. We will implement monitoring of two separate LUNs residing on a small SAN connected to a 3-node Hyper-V failover cluster and build a corresponding dashboard.

Physical Disks

So how do you monitor disks that appear to be invisible because they don’t have a logical drive letter? It’s actually very easy: you simply need to configure SCOM 2012 to discover the underlying physical disks. Once you can “see” them, monitoring is a snap.

Scenario

The environment consists of a 3-node Hyper-V failover cluster running Windows Server 2008 R2 SP1. It is comprised of:

Components

1- 3 x Dell PE R610 Servers
2- Dell MD3200 PowerVault
3- 6 x SAS 7K RPM 500 GB – RAID 5
4- 6 x SAS 15K RPM 300 GB – RAID 10
5- The RAID 5 LUN hosts a CSV for non-production virtual servers.
6- The RAID 10 LUN hosts a CSV for production virtual servers.

Monitoring Objectives

We want to monitor the performance of both CSVs and collect the following performance metrics:

Metrics

1- Write Bytes Per Second
2- Read Bytes Per Second
3- Disk Reads Per Second
4- Disk Writes Per Second
5- Average Disk Seconds Per Transfer
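Before setting anything up in SCOM, you can spot-check the same five metrics directly on a cluster node with Get-Counter. This is just a quick manual check using the standard Windows PhysicalDisk counters, not part of the SCOM configuration that follows; the sample interval and count are arbitrary.

```powershell
# Counter paths for the five metrics listed above; (*) returns every physical disk instance.
$counters = @(
    '\PhysicalDisk(*)\Disk Write Bytes/sec',
    '\PhysicalDisk(*)\Disk Read Bytes/sec',
    '\PhysicalDisk(*)\Disk Reads/sec',
    '\PhysicalDisk(*)\Disk Writes/sec',
    '\PhysicalDisk(*)\Avg. Disk sec/Transfer'
)

# Take three samples five seconds apart and flatten them into one row per disk and counter.
Get-Counter -Counter $counters -SampleInterval 5 -MaxSamples 3 |
    ForEach-Object { $_.CounterSamples } |
    Select-Object Timestamp, InstanceName, Path, CookedValue
```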

physical disk performance counters available in SCOM 2012. An * denotes the performance collection rule is enabled by default; all others must be explicitly enabled via an override.

Performance Counters

1- % Physical Disk Idle Time 2008
2- Average Physical Disk Read Queue Length 2008
3- Average Physical Disk Write Queue Length 2008
4- Physical Disk Average Disk Queue Length 2008
5- Physical Disk Average Disk Seconds per Transfer 2008*
6- Physical Disk Average Disk Seconds per Write 2008
7- Physical Disk Current Disk Queue Length 2008*
8- Physical Disk Bytes per Second 2008
9- Physical Disk Read Bytes Per Second 2008
10- Physical Disk Reads per Second 2008
11- Physical Disk Split I/O Per Second 2008
12- Physical Disk Write Bytes Per Second 2008
13- Physical Disk Writes per Second 2008
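
To see at a glance which collection rules need an override, the list above can be captured as data. This is only a sketch: the rule names and asterisk flags are copied from the list, while the variable names and the pairing of our five metrics with specific rule names are my own assumptions for illustration.

```python
# Collection rules from the list above. True marks the rules flagged with
# an asterisk, i.e. enabled by default.
RULES = {
    "% Physical Disk Idle Time 2008": False,
    "Average Physical Disk Read Queue Length 2008": False,
    "Average Physical Disk Write Queue Length 2008": False,
    "Physical Disk Average Disk Queue Length 2008": False,
    "Physical Disk Average Disk Seconds per Transfer 2008": True,
    "Physical Disk Average Disk Seconds per Write 2008": False,
    "Physical Disk Current Disk Queue Length 2008": True,
    "Physical Disk Bytes per Second 2008": False,
    "Physical Disk Read Bytes Per Second 2008": False,
    "Physical Disk Reads per Second 2008": False,
    "Physical Disk Split I/O Per Second 2008": False,
    "Physical Disk Write Bytes Per Second 2008": False,
    "Physical Disk Writes per Second 2008": False,
}

# The five metrics from "Monitoring Objectives", matched to rule names
# (the matching is an assumption based on the wording of each name).
WANTED = [
    "Physical Disk Write Bytes Per Second 2008",
    "Physical Disk Read Bytes Per Second 2008",
    "Physical Disk Reads per Second 2008",
    "Physical Disk Writes per Second 2008",
    "Physical Disk Average Disk Seconds per Transfer 2008",
]

# Rules that are disabled by default and therefore need an enabling override.
needs_override = [name for name in WANTED if not RULES[name]]

for name in needs_override:
    print(name)
```

A rule that is already enabled by default may still get an override if you want to change its sampling frequency.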

1. Create a New Management Pack

Create a new management pack for this effort. Let’s call it ‘CSV Performance’.

2. Configure SCOM 2012 to Discover the Physical Disks

This will ‘expose’ the disks, making them available to be monitored by SCOM 2012. You could enable the discovery for all servers, but I recommend explicitly enabling discovery only for the servers whose disks you want to monitor. Steps 3 and 4 below do exactly that.

3. Create a New Group

This group will contain the servers that make up the cluster. It is these servers that have access to the SAN disks. Physical disk discovery will be enabled for this group. Let’s call it ‘Cluster Nodes’. Be sure to save it to the ‘CSV Performance’ management pack. Also, when adding the servers to the group, be sure to search for type ‘Windows Computer’. See Figures A & B.

Figure A:

Figure B:

4. Turn on Physical Disk Discovery

This is done in ‘Authoring / Management Pack Objects / Object Discoveries’. Change scope to ‘Windows Server 2008 Physical Disk’. The name of the discovery rule is ‘Discover Windows Physical Disks’. Override the discovery rule for the group ‘Cluster Nodes’. Discovery may take up to 24 hours but usually will take much less time. Save the override to the CSV Performance management pack. See Figure C:

Figure C:


5. Verify the Disks were Discovered

Create a new ‘State’ view in ‘My Workspace’. When creating the view, select ‘Windows Server 2008 Physical Disk’ for the ‘Show data related to:’ field and ‘Cluster Nodes’ for ‘Show data contained in a specific group:’. Lastly, be sure to personalize the view and choose ‘Model’. See Figure D:

Figure D:


If discovery completed successfully, you will be presented with a view similar to:


In our environment, each node has 3 internal disks, RAID 5. This accounts for 9 of the disks. The 3rd node of the cluster is in reserve; it owns no resources so only 2 nodes have access to each CSV accounting for 4 more disks for a total of 13.
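
The disk count can be checked with a few lines of arithmetic. The figures come from the scenario above; the variable names are only illustrative.

```python
NODES = 3                    # servers in the cluster
INTERNAL_DISKS_PER_NODE = 3  # local disks in each server
SAN_LUNS = 2                 # the RAID 5 and RAID 10 LUNs
NODES_WITH_CSV_ACCESS = 2    # the third node is in reserve

internal_disks = NODES * INTERNAL_DISKS_PER_NODE  # 3 * 3 = 9
san_disks = SAN_LUNS * NODES_WITH_CSV_ACCESS      # 2 * 2 = 4
total_disks = internal_disks + san_disks

print(internal_disks, san_disks, total_disks)  # 9 4 13
```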

So now we need to identify which disks are our SAN disks. In this case, the SAN disks are those with a Model name of ‘Dell MD32xx Multi-Path’. We have now isolated the two disks (LUNs) we want to monitor, Disk 1 and Disk 2.

You will also want to know which CSV is hosted on which disk. You can use the Failover Cluster Manager utility to do this by mapping capacity to volume number.
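
Mapping by capacity works here because the two LUNs differ clearly in size. Below is a rough sketch of the capacities to expect, assuming the standard RAID formulas (RAID 5 gives up one disk to parity, RAID 10 mirrors half the disks) and assuming each disk set is presented as a single LUN. The function names are hypothetical, and formatting overhead and GB versus GiB differences mean Failover Cluster Manager will report somewhat lower numbers.

```python
def raid5_usable_gb(disks: int, size_gb: int) -> int:
    """Usable capacity of a RAID 5 set: one disk's worth goes to parity."""
    return (disks - 1) * size_gb


def raid10_usable_gb(disks: int, size_gb: int) -> int:
    """Usable capacity of a RAID 10 set: half the disks hold mirror copies."""
    return (disks // 2) * size_gb


non_production_gb = raid5_usable_gb(disks=6, size_gb=500)  # 7K RPM set
production_gb = raid10_usable_gb(disks=6, size_gb=300)     # 15K RPM set

print("RAID 5 (non-production CSV):", non_production_gb, "GB")
print("RAID 10 (production CSV):", production_gb, "GB")
```

The larger disk should therefore be the one hosting the non-production CSV, and the smaller one the production CSV.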

6. Create another group called ‘CSV Disks’

Populate it with Disk 1 and Disk 2. Again, save it to the CSV Performance management pack created earlier.

A. Go to ‘Authoring / Groups’ and create a new group called ‘CSV Disks’

B. Navigate to ‘Explicit Members’ / ‘Add/Remove Objects…’ and select ‘Windows Server 2008 Physical Disk’ in the ‘Search for:’ field.

C. In the ‘Filter by part of the name (optional):’ enter Disk 1 (or whatever disk number corresponds to your environment) and add the items returned. The path should be the name of the server whose disks you want to monitor. Repeat for Disk 2. See Figures E & F:

Figure E:


Figure F:


D. Add Disk 1 and Disk 2 for all applicable nodes:


Now we have exposed the correct drives and neatly grouped them in a group called ‘CSV Disks’. See Figure G:

Figure G:


7. Enable the Performance Counters

Go to ‘Authoring / Management Pack Objects / Rules’ and change the scope to ‘Windows Server 2008 Physical Disk’. See Figure F:

Figure F:

The counter list earlier in this blog shows all physical disk performance counters available in SCOM 2012. The ones we need must be explicitly enabled via an override that targets the group called CSV Disks; store the override in the CSV Performance management pack. We will be using the five counters listed under Metrics. Figure F also shows we changed the sampling frequency to 60 seconds.
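
A 60-second sampling frequency is worth a quick sanity check on data volume. The estimate below is back-of-the-envelope only: it assumes five counters collected for Disk 1 and Disk 2 on the two nodes that have access to the CSVs, and the variable names are illustrative.

```python
SAMPLE_INTERVAL_SECONDS = 60
COUNTERS = 5           # the five metrics we set out to collect
DISKS_PER_NODE = 2     # Disk 1 and Disk 2
NODES_WITH_ACCESS = 2  # assumption: the reserve node contributes no SAN disks

# One sample per interval, per counter, per disk instance.
samples_per_instance_per_day = 24 * 60 * 60 // SAMPLE_INTERVAL_SECONDS
instances = COUNTERS * DISKS_PER_NODE * NODES_WITH_ACCESS
samples_per_day = samples_per_instance_per_day * instances

print(samples_per_instance_per_day, instances, samples_per_day)  # 1440 20 28800
```

Changing the interval only changes the first number, so it is easy to see how much a longer interval would reduce the load.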

8. Verify Performance Data is Being Collected

You can create a ‘Performance’ view in ‘My Workspace’ using the same values used for the ‘State’ view created earlier. See Figures G & H:

Figure G:

Figure H:

9. Create New Dashboard

Choose ‘Grid Layout’ and give it a name. Let’s call it ‘Cluster Disk Performance’. Choose ‘5 Cells’ and a layout template of your choosing then click ‘Create’.

A. For each cell, add a Performance widget

B. Use a name similar to the performance metric, so for the first cell, let’s use ‘Disk Writes/sec’.

C. For ‘group or object’, select the ‘CSV Disks’ group created earlier.

D. For the performance counter, ‘PhysicalDisk’ will be the only object available in the dropdown. Select it, then select the ‘Disk Writes/sec’ counter.

E. Choose a desired time range and legend values.

F. Repeat for the other 4 performance counters.
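
Steps A through F amount to the same widget settings repeated five times, which can be written down as a small plan before clicking through the wizard. This sketch does not talk to SCOM at all. The `Widget` class and its field names are hypothetical, and the counter names are the standard Windows PhysicalDisk counter names, which I am assuming match what the dropdown shows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Widget:
    cell: int          # position in the 5-cell grid
    title: str         # widget name, kept similar to the metric
    group: str         # the 'group or object' selection
    perf_object: str   # performance object in the dropdown
    counter: str       # performance counter to chart


# One counter per cell, matching the five metrics we set out to collect.
COUNTERS = [
    "Disk Writes/sec",
    "Disk Reads/sec",
    "Disk Write Bytes/sec",
    "Disk Read Bytes/sec",
    "Avg. Disk sec/Transfer",
]

widgets = [
    Widget(cell=i, title=name, group="CSV Disks",
           perf_object="PhysicalDisk", counter=name)
    for i, name in enumerate(COUNTERS, start=1)
]

for w in widgets:
    print(f"Cell {w.cell}: {w.title} ({w.perf_object}, group '{w.group}')")
```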

Sample Dashboard for SAN Performance

I have been having a lot of struggles (as you have seen) with formatting so click HERE for a better image:

Please note the columns in the legends of your dashboard are sortable. Just click on the field name, say Average Value, and it will sort ascending or descending. IMHO this one simple feature is what makes the dashboards in SCOM 2012 super useful.

This blog focused on disk performance, but I want to point out that when working with CSVs, capacity is also a crucial metric. Fortunately, the Windows Server management pack provides capacity metrics for Cluster Shared Volume Disk Capacity. You can easily modify your dashboard to incorporate this metric.
