What are some ways to troubleshoot a greyed out agent in the console? (NETIQKB72805)

  • 7772805
  • 11-Feb-2011
  • 09-Jan-2019

Environment

NetIQ AppManager 9.2.x
NetIQ AppManager 9.1.x
NetIQ AppManager 8.2.x

Situation

What are some ways to troubleshoot a grayed out agent in Operator Console?

Resolution

The following steps address some of the issues that can cause an agent machine to gray out in either the AppManager Operator Console or the AppManager Control Center Console.  If none of these steps resolves the communication issue, contact NetIQ Technical Support for additional assistance.

  • Right-click on the agent in the AppManager Operator Console, and select Ping Computers. Wait a couple of minutes and see if the agent comes back online.

NOTE: No information is displayed on the screen; the ping is performed in the background only.

  • Make sure the NetIQ AppManager Client Communication Manager (NetIQccm) and the NetIQ AppManager Client Resource Monitor (NetIQmc) [Agent] services are running on the agent machine. If they are in a stopped state, restart them.
  • Make sure the NetIQ AppManager Client Communication Manager and the NetIQ AppManager Client Resource Monitor services are configured with the same Log On As accounts.
  • If one of the services fails to start or stops immediately after starting, start the affected service in console mode from an elevated command prompt, using the command that corresponds to that service:
    • For the NetIQ AppManager Client Communication Manager (NetiqCCM) service run:

netiqccm -c

    • For the NetIQ AppManager Client Resource Monitor (NetiqMC) service run:

netiqmc -c

These commands display the actions the service takes as it starts, which helps you determine what is causing the service to stop. If the agent runs in console mode without problems but will not run as a service, the cause is most likely the account the service runs as. Change the account to a different one (for testing, use the account you are currently logged in with), or, if the service runs as a domain account, change it to Local System and see whether the service starts.  Most agents run without issue as Local System unless a module installed on the agent requires special permissions (e.g., Exchange).
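
The service checks above (both services running, both using the same Log On As account) can be scripted. The following is a minimal sketch, assuming Python is available on the agent machine and that the output of the Windows `sc query` and `sc qc` commands has its usual `STATE` and `SERVICE_START_NAME` lines. The function names are illustrative and are not part of AppManager.

```python
import re
import subprocess

# Service names as they appear in this article.
SERVICES = ("netiqccm", "netiqmc")


def parse_state(sc_query_output):
    """Return the state word (e.g. 'RUNNING') from `sc query` output, or None."""
    match = re.search(r"^\s*STATE\s*:\s*\d+\s+(\w+)", sc_query_output, re.MULTILINE)
    return match.group(1) if match else None


def parse_logon_account(sc_qc_output):
    """Return the SERVICE_START_NAME value from `sc qc` output, or None."""
    match = re.search(r"^\s*SERVICE_START_NAME\s*:\s*(.+?)\s*$", sc_qc_output, re.MULTILINE)
    return match.group(1) if match else None


def run_sc(*args):
    """Run sc.exe with the given arguments and return its text output (Windows only)."""
    return subprocess.run(["sc", *args], capture_output=True, text=True).stdout


def check_services(services=SERVICES, runner=run_sc):
    """Collect the state and Log On As account for each service."""
    report = {}
    for name in services:
        report[name] = {
            "state": parse_state(runner("query", name)),
            "account": parse_logon_account(runner("qc", name)),
        }
    return report


def same_account(report):
    """True only if every service reported an account and all accounts match."""
    accounts = {info["account"] for info in report.values()}
    if None in accounts:
        return False
    return len({account.casefold() for account in accounts}) == 1
```

On the agent machine, calling `check_services()` returns a dictionary keyed by service name; a state other than `RUNNING`, or `same_account(report)` returning `False`, points back to the service steps above.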

  • From the AppManager Operator Console, right-click the machine that is grayed out and select Troubleshooter > Client Resource Monitor Info > Connectivity. This verifies that the agent is online and that you can communicate with the machine.  If the test succeeds and the machine is still gray, close and re-open the AppManager Operator Console to refresh the console's cache and see whether that resolves the issue.
  • Check the name resolution by verifying the names for the NetIQ Management Server and the agent in question both resolve properly in DNS.  You can use the NSLOOKUP command from a command line to determine this. Even though you can work around this issue by adding the agent to the hosts file on each management server, we prefer that any DNS problems be corrected.
  • Use the NetIQCTRL command line utility to determine whether the agent can make a successful "trip" to the NetIQ Management Server and back.  Open a command line on the agent server, type NetIQCTRL, and press Enter.  This brings up a new command prompt from which you can perform many useful troubleshooting steps. Below is a sample of the syntax that should be used:

trip mc_hostname NetIQMC ms_hostname

Expected results would be similar to:

1253820145  - ctrl
1253820145  - mc < Agent Host Name >
1253820145  - ms < Management Server Host Name >
1253820145  - mc < Agent Host Name >
1253820145  - ctrl

This command is useful for determining if any RPC issues are being generated.
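
If the trip output is captured to text, the hop sequence can be checked automatically. This is a minimal sketch that assumes the output follows the sample shown above (a timestamp, a dash, then a hop label of `ctrl`, `mc`, or `ms`). Real output may differ, and the function names are illustrative only.

```python
import re

# Hop order taken from the sample output in this article.
EXPECTED_HOPS = ["ctrl", "mc", "ms", "mc", "ctrl"]

# Matches lines such as "1253820145  - mc < Agent Host Name >".
HOP_LINE = re.compile(r"^\s*\d+\s+-\s+(\w+)")


def hops(trip_output):
    """Return the hop labels found in the trip output, in order."""
    found = []
    for line in trip_output.splitlines():
        match = HOP_LINE.match(line)
        if match:
            found.append(match.group(1).lower())
    return found


def trip_completed(trip_output):
    """True if the output contains the full round trip in the expected order."""
    return hops(trip_output) == EXPECTED_HOPS
```

A result of `False` together with a shorter list from `hops()` shows how far the round trip got before it stopped.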

  • On the machine that is grayed out, verify that the registry settings designating the primary and secondary management servers are configured correctly. Multiple names must be comma-separated. An asterisk indicates ANY available Management Server(s). These settings are found in the registry at the following locations:

32-bit machines:
HKLM\Software\NetIQ\AppManager\4.0\NetIQmc\Security > Allow MS
HKLM\Software\NetIQ\AppManager\4.0\NetIQMC\ > MS Primary and MS Backup

64-bit machines:
HKLM\Software\Wow6432Node\NetIQ\AppManager\4.0\NetIQmc\Security > Allow MS
HKLM\Software\Wow6432Node\NetIQ\AppManager\4.0\NetIQmc\ > MS Primary and MS Backup

 

Ensure that these settings are correctly configured and that the names contain no typographical errors.  While in this section of the registry, also note the ports assigned to the NetIQMS Port (default 9999) and Port (default 9998) settings. You will need them for the next troubleshooting step if this one does not resolve the issue.
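
The registry check can also be scripted. The sketch below reads the values named above with Python's standard `winreg` module and tests whether a given management server name is allowed. It assumes the values are stored as strings, and it compares names exactly (ignoring case), so a short name will not match a fully qualified name. The function names are illustrative.

```python
def split_servers(value):
    """Split a comma-separated registry value into a list of trimmed names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def allows(allow_ms_value, ms_hostname):
    """True if the Allow MS value is an asterisk or lists the given server."""
    names = [name.casefold() for name in split_servers(allow_ms_value)]
    return "*" in names or ms_hostname.casefold() in names


def read_agent_settings(wow6432=True):
    """Read Allow MS, MS Primary and MS Backup from the registry (Windows only).

    Assumption: the values are strings. Pass wow6432=False on 32-bit machines.
    """
    import winreg  # imported here so the helpers above work on any platform

    if wow6432:
        root = r"Software\Wow6432Node\NetIQ\AppManager\4.0\NetIQmc"
    else:
        root = r"Software\NetIQ\AppManager\4.0\NetIQmc"

    settings = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, root) as key:
        for name in ("MS Primary", "MS Backup"):
            settings[name] = winreg.QueryValueEx(key, name)[0]
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, root + r"\Security") as key:
        settings["Allow MS"] = winreg.QueryValueEx(key, "Allow MS")[0]
    return settings
```

For example, `allows(read_agent_settings()["Allow MS"], "ms_hostname")` returns `False` when the management server is missing from the Allow MS list or its name is misspelled there.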

  • From a PowerShell command line, use the following commands to verify communication between the agent and the management server. If the connection succeeds, the returned object shows Connected as True; a failed connection raises an error.
    • From the management server to the agent (agent Port, default 9998):

New-Object System.Net.Sockets.TCPClient("mc_hostname",9998)

    • From the agent to the management server (NetIQMS Port, default 9999):

New-Object System.Net.Sockets.TCPClient("ms_hostname",9999)
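
Where PowerShell is not convenient, the same name-resolution and port checks can be sketched with Python's standard `socket` module. This is an illustrative alternative, not a NetIQ tool. The host names are the placeholders used in this article, and the port numbers should be the values noted from the registry.

```python
import socket


def resolve(hostname):
    """Return the sorted IP addresses a name resolves to, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def port_open(hostname, port, timeout=5.0):
    """True if a TCP connection to hostname:port succeeds within the timeout."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run `port_open("mc_hostname", 9998)` from the management server and `port_open("ms_hostname", 9999)` from the agent, substituting the real names and the ports recorded in the previous step. An empty list from `resolve()` points to the DNS check described earlier, while a `False` from `port_open()` with a resolvable name suggests a firewall or a stopped service.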

  • The agent services can also be instructed to perform a cold start. With a cold startup, ALL data that is being stored locally due to the communication failure will be lost and cannot be recovered.

To perform a cold start open an elevated command prompt and type the following commands:

sc stop netiqccm & sc stop netiqmc

sc start netiqmc -oa & sc start netiqccm -oa

As with any service restart, the agent will not return from the gray status immediately and should be given several minutes to begin responding.  To speed this process somewhat, right-click the server in the AppManager Operator Console and select Ping Machine.

  • DCOM permissions can also prevent an agent service from starting.  Typically this occurs only when the services run under a domain account rather than Local System.  Verify that the account running the service is not being denied any rights on the machine, as follows:
    • Click Start > Run and enter DCOMCNFG.
    • Expand Component Services > Computers.
    • Right-click My Computer and select Properties.
    • Click the COM Security tab.
    • Under both Access Permissions and Launch and Activation Permissions, click both Edit buttons in each category and verify that the account does not have any Deny permissions associated with it.

  • Check the MSStatus and MSDesignation tables in the AppManager Repository (QDB).  The following query shows which agents are designated to talk to which Management Servers, and reveals any agents for which no Primary or Secondary Management Server has been set.

    NOTE: Before making any modifications to the NetIQ QDB Repository take a full backup of the database.

SELECT     B.Name AS [AGENT NAME], B.RootMachineObjID AS [Agent Object ID], D.Name AS [PRIMARY MS], C.MSID AS [Primary MSID],
                F.Name AS [SECONDARY MS], E.MSID AS [Secondary MSID]
FROM        MSDesignation AS A INNER JOIN
                Object AS B ON A.MachineObjID = B.ObjID LEFT OUTER JOIN
                MSStatus AS C ON A.PrimaryMSID = C.MSID LEFT OUTER JOIN
                Object AS D ON C.MachineObjID = D.ObjID LEFT OUTER JOIN
                MSStatus AS E ON A.SecondaryMSID = E.MSID LEFT OUTER JOIN
                Object AS F ON E.MachineObjID = F.ObjID
ORDER BY [AGENT NAME]

This query also provides information needed to modify the repository with new information. To update an agent or direct all agents that currently show a null value, contact NetIQ Technical Support.

Please contact NetIQ Technical Support if you have any questions regarding any portion of this document, or if you require additional assistance.


Additional Information

Formerly known as NETIQKB72805