Friday, January 25, 2013

SCOM 2012 - Maximum Number of Asynchronous Responses (5) has Been Reached

Came across an interesting issue yesterday. We have been working on a PowerShell script that automatically changes the resolution state of incoming alerts to a predefined state so we can send out notifications based on that state (e.g. if a server is in group "A" and an alert comes in for that server, change the resolution state to High Priority and send an email to the A, B and C distribution lists). I plan to do a follow-up segment providing more details on this at a later time.

We have the script in place and it is working as expected: alert comes in > PowerShell script changes the resolution state > notification goes out. During our testing we were only working on one alert at a time, flipping the state back and forth between New and High, and everything worked just fine. We expanded testing to the live environment and let it cook for a while. After about a week we noticed that not all of the resolution states were changing from New to High automatically. On further investigation we found that the alerts that were not changing had all come in within about a second of each other. For instance, if a server kicked off fifteen alerts within one second, only some of them would change to the new state; or if a network issue cost us the connection to twenty servers, we were only notified about a few of them.

After some investigation we found the following error in Management Group Health: "The process could not be created because the maximum number of asynchronous responses (5) has been reached, and it will be dropped."

As it turns out, SCOM cannot execute more than 5 command notifications asynchronously by default. It is set up this way to protect the RMS box from being overwhelmed in the event of a flood of alerts. That can be counterproductive if you are relying on the command function to execute as part of notifications: in a real disaster you might not realize the scale of it if you only see one out of potentially hundreds of notifications.
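To get a feel for how often commands are being dropped, you can search for the error text. The following is only a rough sketch: it assumes the same message is also written to the "Operations Manager" event log on the management server, and the function name `Find-AsyncLimitEvent` and its parameters are made up for this example.

```powershell
# Sketch: search the 'Operations Manager' event log for the dropped-command error.
# Assumption: the error text shown in the console is also logged in this event log.
function Find-AsyncLimitEvent {
    param(
        [string]$ComputerName = $env:COMPUTERNAME,
        [int]$MaxEvents = 5000   # how many of the most recent events to scan
    )
    Get-WinEvent -LogName 'Operations Manager' -ComputerName $ComputerName `
                 -MaxEvents $MaxEvents -ErrorAction SilentlyContinue |
        Where-Object { $_.Message -like '*maximum number of asynchronous responses*' } |
        Select-Object TimeCreated, Id, MachineName
}

# Example: group hits by minute to see how bursty the drops are.
# Find-AsyncLimitEvent | Group-Object { $_.TimeCreated.ToString('yyyy-MM-dd HH:mm') }
```

If nothing comes back, that only means the text was not found in the events scanned, not that no commands were dropped.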

Fortunately this setting can be modified with a registry key that overrides the default. I urge caution, however: raising the limit can overload your management server if the script fires a lot or takes a long time to run, especially on a slower machine. I would increase it gradually over time to make sure you don't get undesired results.

On the RMS box open up RegEdit. Navigate to HKEY_LOCAL_MACHINE\Software\Microsoft\Microsoft Operations Manager\3.0\Modules.
  1. Create a new subkey called Global (if it does not already exist)
  2. In Global create another new subkey called Command Executer
  3. In Command Executer create a new DWORD called AsyncProcessLimit
  4. Set the decimal value between 1 and 100. Again, I would start small, say 20, and move up from there.
  5. Restart the Health Service to allow the new settings to take effect.
5/18/16 - Update
So to make this a bit easier I wrote a PowerShell script to create this key for you. This will create the key and set it to 20.
# Open HKEY_LOCAL_MACHINE on the local machine ('.').
$regKey = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey([Microsoft.Win32.RegistryHive]::LocalMachine, '.')
$keyName = 'SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Modules\Global\Command Executer'
# CreateSubKey creates any missing keys in the path and returns the key opened for writing.
$subKey = $regKey.CreateSubKey($keyName)
$subKey.SetValue('AsyncProcessLimit', 20, [Microsoft.Win32.RegistryValueKind]::DWord)
$subKey.Close()
$regKey.Close()
# Restart the Health Service so the new limit takes effect.
Restart-Service HealthService
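
Since notifications can come from any management server, it can be handy to read the value back from each of them. Here is a minimal sketch using the same .NET registry classes as the script above. The function name `Get-AsyncProcessLimit` is made up for this example, and it assumes the Remote Registry service is reachable on the servers you query.

```powershell
# Sketch: read AsyncProcessLimit back from one or more management servers.
# Assumption: remote registry access to each server is allowed.
function Get-AsyncProcessLimit {
    param(
        [string[]]$ComputerName = @($env:COMPUTERNAME)
    )
    $keyName = 'SOFTWARE\Microsoft\Microsoft Operations Manager\3.0\Modules\Global\Command Executer'
    foreach ($computer in $ComputerName) {
        $baseKey = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey(
            [Microsoft.Win32.RegistryHive]::LocalMachine, $computer)
        # OpenSubKey returns $null when the key does not exist yet.
        $subKey = $baseKey.OpenSubKey($keyName)
        $value = $null
        if ($subKey) {
            $value = $subKey.GetValue('AsyncProcessLimit')
            $subKey.Close()
        }
        $baseKey.Close()
        # A $null AsyncProcessLimit means no override is set on that server.
        New-Object PSObject -Property @{
            ComputerName      = $computer
            AsyncProcessLimit = $value
        }
    }
}

# Example with hypothetical server names:
# Get-AsyncProcessLimit -ComputerName 'MS01', 'MS02'
```

An empty AsyncProcessLimit in the output means the key has not been created on that server, so the default limit still applies there.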

More to come!




Contributing Documentation: Clive Eastwood

6 comments:

  1. Is this setting recommended for a larger SCOM environment, like 3 management servers and 2 gateways with 1700 agents, with multiple technologies being monitored such as Exchange 2007 and 2013, SharePoint 2010, Lync 2010, SQL 2005, 2008 and 2012, plus Windows and IIS monitoring?

    1. Yes, you would be fine in an environment of that size, as long as you start out small and move up incrementally. You will keep getting the error if you have not increased it enough. And remember, you need to do it on all of your management servers, as the notifications can come from any of them.

  2. Hi Jim,

    Thank you for the suggestions. I started with 30, and it's been 4 months and it looks stable without any issues...

    1. That's great news. Glad it worked out for you!

  3. What is the default value given by Microsoft?
