Tuesday, November 29, 2011

“After uninstalling the SCOM agent from managed servers, SCOM does not discover those servers again to reinstall the agent”

Hi,

Today I found that a few of our servers had the wrong version of the SCOM agent installed; for example, 32-bit agents on 64-bit servers. I decided to uninstall them and reinstall the proper agent version.

I uninstalled the agents using “Add or Remove Programs” on all the affected servers and then tried to reinstall them from the “Operations Manager Console”.

But when I ran the “Discovery Wizard”, it did not list the servers from which I had uninstalled the agent. I tried 3-4 times with the same result, then searched the internet and tried a few troubleshooting tips, but nothing helped.

Then I thought: let's check “Agent Managed”. Maybe the server is still in the SCOM database, and that is why it is not being discovered.

I went to the “Administration” pane, clicked “Agent Managed”, and searched for the server on which I was trying to reinstall the agent. As expected, it was there.

OK, now it's time to delete it: select the server, right-click it, and click “Delete”.

Now click “Yes”.

All done. Wait 5 minutes, run the Discovery Wizard again, and hopefully this time you will be able to discover the server :)

Solution: In my case, the fix was to delete the stale server entry from “Agent Managed”.

thanks

aman dhally

Monday, November 28, 2011

“Remote SMTP queue length is outside the configured threshold” error in SCOM

 

Hi,

As the error clearly shows, the problem is with the length of one or more “SMTP” queues. Let's investigate a little more.

Error:

In the alert, the source is “SMTP” and the path is one of our Exchange servers. The alert shows that the default message threshold is “200” and the interval is “3600 sec” (1 hour), which means this “SMTP Queue Monitor” checks the Exchange server's queues every hour and generates an alert if more than 200 messages are stuck in a queue.

Let's check our Exchange server to resolve the error.

Open Exchange “System Manager”, navigate to your Exchange server, and click “Queues”. Here we find our problematic queue: there are “301” messages stuck in it, so the SMTP monitor is right.
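In other words, with 301 messages against a threshold of 200, the monitor did exactly what it is configured to do. The rule can be sketched in a few lines of PowerShell. This is only an illustration of the logic, not the management pack's actual implementation; the function name `Test-QueueThreshold` is made up, and only the 200-message threshold and the 301-message queue come from the alert and the queue view.

```powershell
# Hypothetical helper: returns $true when a sampled queue length should raise an alert.
# The default of 200 matches the threshold shown in the alert; the rest is illustrative.
function Test-QueueThreshold {
    param(
        [int]$QueueLength,
        [int]$Threshold = 200
    )
    # The alert fires only when the queue holds MORE than the threshold.
    return ($QueueLength -gt $Threshold)
}

# Sample values: the 301 stuck messages, and an empty queue after cleanup.
$stuck   = Test-QueueThreshold -QueueLength 301
$cleared = Test-QueueThreshold -QueueLength 0

"301 messages -> alert: $stuck"
"0 messages   -> alert: $cleared"
```

Because the check only happens once per interval, a queue that briefly goes over 200 and drains again between two samples would not be noticed by a rule like this.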

Solution:

  1. Try to find out why your remote queue got stuck; maybe a firewall rule is blocking the SMTP connection.
  2. Make sure your SMTP connector is working.
  3. Right-click the problematic queue, click “Force Connection”, and see whether email starts being delivered.
  4. Try restarting the “SMTP” service.
  5. If nothing helps, contact an Exchange Server engineer or administrator.

In my case the solution was different.

I selected the problematic queue, right-clicked it, and chose “Find Messages”.

All the messages stuck in the queue were junk, so I decided to remove them.

I selected all the messages, right-clicked them, and chose “Delete (no NDR)”.

My message queue is now back to zero and all is well. You can close this alert manually or wait up to another hour for it to close automatically.

 

Thanks

Aman

Tuesday, November 22, 2011

Yippy !!!! Ten Thousand Page Views


Our blog has 10,000 page views ….. {I know these numbers don't matter, but stillll…}
Yippy!!!!!!!

“A SQL job failed to complete successfully” error in SCOM

Hi,

The error is self-explanatory: one of our SQL backup jobs has failed. How do you fix it? That depends on how much you know about SQL. But first let's find out which job failed, and let me show you where to look for a failed “SQL Job”. I know it's quite simple, but new SCOM users sometimes don't know where to find these failed jobs.

Error:

The error shows that a SQL job failed on computer “2K”, and the job name is “Copy Tables From”.

Let me RDP to my SQL Server “2K” and open “SQL Server Enterprise Manager”.

Click on your SQL Server, expand “Management”, beneath it expand “SQL Server Agent”, and then click “Jobs”. Here is our failed SQL job: you can see a red cross in front of the job.

To see more details about the job, right-click it and choose “View Job History”.

Now click “Show Step Details” if you want more information about the job.

Here is the complete step information for the failed job. It should give you a useful hint about why the job failed.

Otherwise, go to your SQL administrator and tell him, “Damn, fix it.”

Thanks

Aman Dhally

Monday, November 21, 2011

SCOM : How to resolve “The SMTP Local Retry Queue Total is outside calculated baseline” error in SCOM.

Hi
The notification clearly says that there is a problem with our Exchange server's local retry queue. But what exactly is happening?
Let's explore it a little.
Error:
Right-click the alert, click “Open”, and choose “Health Explorer”.

Now click “SMTP Local Retry Queue - Queue [Exchange Queue]” and click “Monitor Properties”.

The “Performance Counter” tab shows that this monitor is based on Perfmon and watches the “Local Retry Queue Length” counter on the “SMTP Server” object of the Exchange server.

The next tab is “Baselining”, which indicates that this is a Self-Tuning Threshold (“STT”) monitor.


Now click the “Overrides” tab and click “Overrides”.

Choose “For all objects of class: Exchange Queue”.


In the overrides, the monitor's “Inner Sensitivity” is 3.11. My first guess was that an alert fires when more than about 3 messages sit in the local retry queue, although for a self-tuning threshold the sensitivity controls how far the counter may drift from its learned baseline rather than setting a fixed message count.


How can we check?
Why not check the “Local Retry Queue” performance counter ourselves? Isn't that a good idea?
Let's do it. Open “Performance Monitor”, choose “SMTP Server” as the object and “Local Retry Queue Length” as the counter.
The counter explanation says “The number of messages in the local retry queue”.


OK, so how many local messages are stuck? Let's check: we have 4 local messages stuck in the retry queue.
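The same counter can also be read from PowerShell instead of the Performance Monitor GUI. This is a rough sketch, assuming PowerShell 2.0 or later (for Get-Counter) and that it runs on the Exchange server itself. The `_Total` instance name is an assumption, so adjust it to whichever instance Performance Monitor lists on your server.

```powershell
# Counter path built from the object and counter names seen in Performance Monitor.
# ASSUMPTION: the "_Total" instance exists; replace it with your own instance if needed.
$counterPath = '\SMTP Server(_Total)\Local Retry Queue Length'

try {
    # Take a single sample of the counter on the local machine.
    $sample = Get-Counter -Counter $counterPath -ErrorAction Stop
    $queueLength = $sample.CounterSamples[0].CookedValue
    Write-Output "Local Retry Queue Length: $queueLength"
}
catch {
    # Reached when the counter or the cmdlet is not available on this machine.
    Write-Output "Could not read '$counterPath': $($_.Exception.Message)"
}
```

To query the Exchange server from another machine, Get-Counter also accepts a `-ComputerName` parameter, provided remote performance counter access is allowed.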


Now go to your Exchange server's queues and you will find the 4 stuck messages.


Let's find them and delete them if they are not important emails.


Once the messages are deleted, our local retry queue is back to normal, I mean at zero.


After 5-10 minutes the alert should be gone; otherwise you can select the monitor and click the “Recalculate Health” option.


YippY!!! Exchange Server is Happy Again.


Thanks
Aman Dhally

Friday, November 18, 2011

Get-Overrides created on a specific day

hi,

Yesterday one of our SCOM admins created a few overrides in SCOM. Today he is on leave and not picking up his phone, and I need to know which overrides he created.

So I thought, let's write a little basic PowerShell script that lists the overrides created within a specific number of days.

You can download the script from here: http://dl.dropbox.com/u/17858935/Get_SCOM_Overrides_by_Day_Created.zip

Make sure you run this script in the “Operations Manager Shell”.

### Set $olddate to the date two days ago

$olddate = (Get-Date).AddDays(-2)

## Get every management pack and pipe it to Get-Override

$Mp = Get-Managementpack | Get-Override

### Show only the overrides that were added after $olddate

$Mp| Where-Object { $_.TimeAdded -gt $olddate} | select ManagementGroupId,Name,TimeAdded | fl *

######## E N D of S C R I P T #############

In $olddate I subtract 2 days, so if today is 18 November, $olddate holds 16 November (at the current time of day).


In the variable $Mp I store the overrides returned by piping all management packs to Get-Override.


Let's run $Mp and see what we get.


It shows the list of all overrides in all management packs. Now we need to filter them.


$Mp| Where-Object { $_.TimeAdded -gt $olddate} | select ManagementGroupId,Name,TimeAdded | fl *


In the command above, we pipe $Mp to the Where-Object cmdlet and compare each override's TimeAdded property with our $olddate variable. If TimeAdded is later than $olddate (16 November in this example), the override is shown, so we get the overrides created on 17 and 18 November.
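To see how the date comparison behaves without a SCOM connection, here is a small sketch that runs the same Where-Object filter against made-up objects. The override names and timestamps are invented for illustration; only the `TimeAdded` property name and the filter itself come from the script.

```powershell
# Fixed reference time so the example is repeatable (stands in for Get-Date).
$now     = Get-Date -Year 2011 -Month 11 -Day 18 -Hour 10 -Minute 0 -Second 0
$olddate = $now.AddDays(-2)   # 16 November 2011, 10:00

# Hypothetical stand-ins for override objects; only Name and TimeAdded are modelled.
$overrides = @(
    New-Object PSObject -Property @{ Name = 'Override.A'; TimeAdded = $now.AddDays(-5) }
    New-Object PSObject -Property @{ Name = 'Override.B'; TimeAdded = $now.AddDays(-1) }
    New-Object PSObject -Property @{ Name = 'Override.C'; TimeAdded = $now.AddHours(-3) }
)

# Same filter as in the script: keep only items added after $olddate.
$recent = @($overrides | Where-Object { $_.TimeAdded -gt $olddate })
$recent | Select-Object Name, TimeAdded | Format-List
```

Because $olddate keeps the time of day, an override added on 16 November after 10:00 would also pass the filter. If you want the cut-off at midnight instead, `(Get-Date).Date.AddDays(-2)` is one way to get it.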


Seems to be working … :)


Download Link: http://dl.dropbox.com/u/17858935/Get_SCOM_Overrides_by_Day_Created.zip


Hope someone likes it :)


Thanks


Aman Dhally