Monday, November 26, 2012

SCOM 2012 - Creating Overrides

Overrides are the bread and butter of SCOM 2012. When you install your management packs, some of the metrics are enabled by default, but a lot of them are not. After reviewing the metrics available, you will want to enable some that are off and change the settings on others. Overrides are fairly simple to create but have a broad range of uses. You can create overrides for both rules and monitors.

I have been asked a few times what the difference is between rules and monitors, and the simplest explanation I can give is that monitors are stateful and rules are not. A monitor tracks the health state of an object, so monitors are used for up/down type objects such as services. A rule collects data or raises an alert without changing any health state, so rules are for things like collecting CPU utilization.
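To make the distinction concrete, here is a toy sketch in Python. It is not SCOM code: the names `FreeSpaceMonitor` and `collect_free_space_rule` and the 10% threshold are made up for illustration. The monitor remembers a health state between checks, while the rule handles each sample and then forgets it.

```python
from dataclasses import dataclass


@dataclass
class FreeSpaceMonitor:
    """Toy stateful monitor: remembers its health between samples."""
    threshold_pct: float = 10.0  # hypothetical threshold, not a SCOM default
    state: str = "Healthy"

    def check(self, free_pct: float) -> bool:
        """Update the state; return True only when the state changes."""
        new_state = "Critical" if free_pct < self.threshold_pct else "Healthy"
        changed = new_state != self.state
        self.state = new_state
        return changed


def collect_free_space_rule(free_pct: float) -> dict:
    """Toy stateless rule: records a sample and keeps no memory of it."""
    return {"counter": "% Free Space", "value": free_pct}


monitor = FreeSpaceMonitor()
samples = [42.0, 8.0, 7.5, 30.0]
changes = [monitor.check(s) for s in samples]
records = [collect_free_space_rule(s) for s in samples]
print(changes)       # True only where the health state flipped
print(monitor.state)
print(len(records))  # one record per sample, no state kept
```

The monitor only reports something when its health flips, which is why it suits up/down checks. The rule produces one record per sample regardless of what came before.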

Creating Overrides:
To create an override, go into the Authoring workspace and click on Monitors. For this example we will do an override for Logical Disk Free Space % for Windows 2003 servers, so in the find window look for Windows Server 2003 Logical Disk Free Space. You may need to expand the Windows Server 2003 Logical Disk grouping to find what you are looking for. When you get to it, right-click on it and select Overrides, then Override this Monitor, then For all objects of class: Windows Server 2003 Logical Disk.


The Override Properties window will open up and you can get a good look at all of the available override parameters for this particular metric. As you do this more often you will see that what you can override will vary for the different types of monitors and rules you are working with. 
You can see that there are quite a few available. I will cover the highlights of the more typical ones.
  • Enabled - You will see this on pretty much every monitor in every management pack. This determines if the metric is on or off. For this example you can see the default value is False so it is off.
  • IntervalSeconds - This is how often (in seconds) SCOM goes out and checks the object. By default we are set at 900 seconds, or 15 minutes, so SCOM checks this monitor's health every 15 minutes.
  • Number of Samples - This is the number of consecutive failed checks required before an alert is generated. For this example it is 4 samples. Combine this with IntervalSeconds and we can see that an alert would be generated after one hour of continuous failure (4 x 15 minutes).
  • Auto-Resolve Alert - This one is fairly self-explanatory. If the condition returns to normal after an alert is generated, the alert closes on its own.
So let's go ahead and turn this one on. Check the box next to Enabled, which makes that parameter active, and change the Override Value to True. Next, let's change the check frequency: check the box next to IntervalSeconds and change the Override Value to 600.
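Here is a quick sanity check on what those two changes do, sketched in Python. This is plain arithmetic, not a SCOM API: the dictionary keys are my own shorthand for the parameters in the Override Properties window, and `minutes_to_alert` is a hypothetical helper. It assumes, as described above, that an alert needs the full number of samples to fail in a row, one IntervalSeconds apart.

```python
# Default values from the Override Properties window in this example.
defaults = {"Enabled": False, "IntervalSeconds": 900, "NumberOfSamples": 4}

# The two overrides set in this walkthrough.
overrides = {"Enabled": True, "IntervalSeconds": 600}

# Overridden parameters replace the defaults; everything else is unchanged.
effective = {**defaults, **overrides}


def minutes_to_alert(settings: dict) -> float:
    """Minutes of continuous failure before an alert (hypothetical helper)."""
    return settings["IntervalSeconds"] * settings["NumberOfSamples"] / 60


print(effective)
print(minutes_to_alert(defaults))   # 60.0
print(minutes_to_alert(effective))  # 40.0
```

So with the override in place, an alert would fire after 40 minutes of continuous failure instead of one hour.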

To the right of the list you will see a column labeled Enforced. Checking it means that if several overrides apply to the same parameter, this override takes priority over the ones that are not enforced.
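As a rough mental model only, here is a Python sketch of that idea. The `Override` class and the `effective_value` function are hypothetical, and the sketch only models the Enforced flag. SCOM also looks at how specific each override's target is, which is left out here.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Override:
    """Hypothetical stand-in for one override on a single parameter."""
    source: str
    value: Any
    enforced: bool = False


def effective_value(default: Any, overrides: List[Override]) -> Any:
    """Pick the value that wins under this simplified model.

    Enforced overrides beat non-enforced ones. Ties are broken by list
    order (last wins), which is an assumption of this sketch, not SCOM's rule.
    """
    if not overrides:
        return default
    enforced = [o for o in overrides if o.enforced]
    candidates = enforced if enforced else overrides
    return candidates[-1].value


interval_overrides = [
    Override("this walkthrough", 600, enforced=True),
    Override("another override", 300),
]
print(effective_value(900, interval_overrides))  # 600
print(effective_value(900, []))                  # 900
```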


Now we need to assign this override to a management pack. In 2007 it was not best practice to save overrides to the Default Management Pack, and Microsoft's guidance still recommends against it in 2012, since doing so makes the Default Management Pack depend on the pack you are overriding. Pick or create a separate unsealed management pack for your overrides instead, then click OK.

You will be returned to the Monitors window. If you right-click on Windows Server 2003 Logical Disk Free Space(%) and select Override Summary, you should be presented with a window listing the overrides in effect.
You can see that both of our override parameters have saved correctly. You are now monitoring disk space for all of your Windows Server 2003 machines.


More to come!


2 comments:

  1. Hi Jim, thank you for this simple explanation of sample and interval. Would I be correct in assuming that in the example given, the alert would now fire after 40 minutes of continuous failure rather than one hour? Thanks S

    1. Yes, 4 consecutive failures, 10 minutes apart would generate an alert at 40 minutes.
