Archive for the ‘Active Directory’ Category

Monitoring SAN Performance with SCOM 2012

October 22, 2012 2 comments

Monitoring SAN performance is on everyone’s list of essential monitoring requirements. When it comes to monitoring the SAN, the most important question is what perspective will provide the most accurate data? For example, monitoring a website directly from the server hosting the website will not likely provide a good indication of what end users may be experiencing.

The same can be said for SANs. You can monitor the performance of the SAN from the perspective of the SAN itself, but will that accurately represent the experience of the end users? In the case of a SAN, the end users are the servers using it.

I always monitor storage from the perspective of the servers. We deal with this subject on a daily basis and employ a simple, yet effective, solution for monitoring our customers’ storage technology, regardless of vendor. It is highly customizable and flexible and ensures transparency into the performance of your SAN.

This blog has two purposes: first, to clearly demonstrate that I have no desktop publishing skills at all, and second, to walk through how to monitor the performance of typical SAN storage, specifically storage not presented to the server as logical disks (c:\, d:\, etc.). We will implement monitoring of two separate LUNs residing on a small SAN connected to a 3-node Hyper-V failover cluster and build a corresponding dashboard.

Physical Disks

So how do you monitor disks that appear to be invisible because they don’t have a logical drive letter? It’s actually very easy: you simply need to configure SCOM 2012 to discover the underlying physical disks. Once you can “see” them, monitoring is a snap.

Scenario

The environment consists of a 3-node Hyper-V failover cluster running Windows Server 2008 R2 SP1. It is comprised of:

Components

1- 3 x Dell PE R610 Servers
2- Dell MD3200 PowerVault
3- 6 x SAS 7K RPM 500 GB – RAID 5
4- 6 x SAS 15K RPM 300 GB – RAID 10
5- The RAID 5 LUN hosts a CSV for non-production virtual servers.
6- The RAID 10 LUN hosts a CSV for production virtual servers.

Monitoring Objectives

We want to monitor the performance of both CSVs and collect the following performance metrics:

Metrics

1- Write Bytes Per Second
2- Read Bytes Per Second
3- Disk Reads Per Second
4- Disk Writes Per Second
5- Average Disk Seconds Per Transfer

The following table lists the physical disk performance counters available in SCOM 2012. An * denotes that the performance collection rule is enabled by default; all others must be explicitly enabled via an override.

Performance Counters

1- % Physical Disk Idle Time 2008
2- Average Physical Disk Read Queue Length 2008
3- Average Physical Disk Write Queue Length 2008
4- Physical Disk Average Disk Queue Length 2008
5- Physical Disk Average Disk Seconds per Transfer 2008*
6- Physical Disk Average Disk Seconds per Write 2008
7- Physical Disk Current Disk Queue Length 2008*
8- Physical Disk Bytes per Second 2008
9- Physical Disk Read Bytes Per Second 2008
10- Physical Disk Reads per Second 2008
11- Physical Disk Split I/O Per Second 2008
12- Physical Disk Write Bytes Per Second 2008
13- Physical Disk Writes per Second 2008

1. Create a New Management Pack

Create a new management pack for this effort. Let’s call it ‘CSV Performance’.

2. Configure SCOM 2012 to Discover the Physical Disks

This will ‘expose’ the disks making them available to be monitored by SCOM 2012. You could enable the discovery for all servers but I recommend explicitly enabling discovery only for the servers whose disks you want to monitor:

3. Create a New Group

This group will contain the servers that make up the cluster. It is these servers that have access to the SAN disks. Physical disk discovery will be enabled for this group. Let’s call it ‘Cluster Nodes’. Be sure to save it to the ‘CSV Performance’ management pack. Also, when adding the servers to the group, be sure to search for type ‘Windows Computer’. See Figures A & B.

Figure A:

Figure B:

4. Turn on Physical Disk Discovery

This is done in ‘Authoring / Management Pack Objects / Object Discoveries’. Change scope to ‘Windows Server 2008 Physical Disk’. The name of the discovery rule is ‘Discover Windows Physical Disks’. Override the discovery rule for the group ‘Cluster Nodes’. Discovery may take up to 24 hours but usually will take much less time. Save the override to the CSV Performance management pack. See Figure C:

Figure C:


5. Verify the Disks were Discovered

Create a new ‘State’ view in ‘My Workspace’. When creating the view, select ‘Windows Server 2008 Physical Disk’ for the ‘Show data related to:’ field and ‘Cluster Nodes’ for ‘Show data contained in a specific group:’. Lastly, be sure to personalize the view and choose ‘Model’. See Figure D:

Figure D:


If discovery completed successfully, you will be presented with a view similar to:


In our environment, each node has 3 internal disks, RAID 5. This accounts for 9 of the disks. The 3rd node of the cluster is in reserve; it owns no resources so only 2 nodes have access to each CSV accounting for 4 more disks for a total of 13.
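The disk count above is easy to sanity-check before comparing it with the State view. The following is only a small sketch of that arithmetic; the variable names are made up for illustration, and the figures are the ones quoted in this post.

```python
# Disk instances SCOM should discover in this scenario (figures from the post).
nodes = 3
internal_disks_per_node = 3   # internal RAID 5 disks in each server
csv_luns = 2                  # the RAID 5 LUN and the RAID 10 LUN on the MD3200
nodes_seeing_luns = 2         # the third node is in reserve and owns no resources

internal = nodes * internal_disks_per_node
san = nodes_seeing_luns * csv_luns
total = internal + san

print(f"internal={internal} san={san} total={total}")
```

If the State view shows a different number, the likely suspects are a node where discovery has not run yet or a change in which nodes currently own cluster resources.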

So now we need to identify which disks are our SAN disks. In this case, the SAN disks are those with a Model name of ‘Dell MD32xx Multi-Path’. We have now isolated the two disks (LUNs) we want to monitor, Disk 1 and Disk 2.

You will also want to know which CSV is hosted on which disk. You can use the Failover Cluster Manager utility to do this by mapping capacity to volume number.
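When mapping capacity to volume number, it helps to know roughly what capacity to expect for each LUN. The sketch below computes nominal usable capacity for the two disk sets listed under Components. It assumes each disk set is presented as a single LUN and ignores formatting overhead, hot spares, and the difference between vendor GB and the units Windows displays, so treat the results as approximate. The function and dictionary names are hypothetical.

```python
def usable_gb(level: str, disks: int, size_gb: int) -> int:
    """Nominal usable capacity, ignoring formatting overhead and hot spares."""
    if level == "RAID 5":
        return (disks - 1) * size_gb    # one disk's worth of capacity goes to parity
    if level == "RAID 10":
        return (disks // 2) * size_gb   # mirrored pairs: half the raw capacity
    raise ValueError(f"unsupported RAID level: {level}")


# Disk sets from the Components list in this post.
luns = {
    "non-production CSV": ("RAID 5", 6, 500),
    "production CSV": ("RAID 10", 6, 300),
}

for name, (level, disks, size_gb) in luns.items():
    print(f"{name}: {level}, about {usable_gb(level, disks, size_gb)} GB usable")
```

The two results differ enough (roughly 2500 GB versus 900 GB) that the larger disk in the State view can be matched to the non-production CSV and the smaller one to the production CSV.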

6. Create another group called ‘CSV Disks’

Populate it with Disk 1 and Disk 2. Again, save it to the CSV Performance management pack created earlier.

A. Go to ‘Authoring / Groups’ and create a new group called ‘CSV Disks’

B. Navigate to ‘Explicit Members’ / ‘Add/Remove Objects…’ and select ‘Windows Server 2008 Physical Disk’ in the ‘Search for:’ field.

C. In the ‘Filter by part of the name (optional):’ enter Disk 1 (or whatever disk number corresponds to your environment) and add the items returned. The path should be the name of the server whose disks you want to monitor. Repeat for Disk 2. See Figures E & F:

Figure E:


Figure F:


D. Add Disk 1 and Disk 2 for all applicable nodes:


Now we have exposed the correct drives and neatly grouped them in a group called ‘CSV Disks’. See Figure G:

Figure G:


7. Enable the Performance Counters

Go to ‘Authoring / Management Pack Objects / Rules’ and change the scope to ‘Windows Server 2008 Physical Disk’. See Figure F:

Figure F:

The table earlier in this blog lists all physical disk performance counters available in SCOM 2012. They must be explicitly enabled via an override that targets the ‘CSV Disks’ group, and the override should be stored in the CSV Performance management pack. The ones we will be using are the five listed under Metrics above. Figure F also shows that we changed the sampling frequency to 60 seconds.

8. Verify Performance Data is Being Collected

You can create a ‘Performance’ view in ‘My Workspace’ using the same values used for the ‘State’ view created earlier. See Figures G & H:

Figure G:

Figure H:

9. Create New Dashboard

Choose ‘Grid Layout’ and give it a name. Let’s call it ‘Cluster Disk Performance’. Choose ‘5 Cells’ and a layout template of your choosing then click ‘Create’.

A. For each cell, add a Performance widget

B. Use a name similar to the performance metric, so for the first cell, let’s use ‘Disk Writes/sec’.

C. For ‘group or object’, select the ‘CSV Disks’ group created earlier.

D. For the performance counter, ‘PhysicalDisk’ will be the only object available in the dropdown; select it and then select the ‘Disk Writes/sec’ counter.

E. Choose a desired time range and legend values.

F. Repeat for the other 4 performance counters.
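Before clicking through the wizard five times, it can help to write down which counter goes in which cell. The sketch below is a hypothetical planning aid, not a SCOM API: the dashboard is still built in the console. The counter names on the right are the standard Windows PhysicalDisk counter names that correspond to the five metrics listed earlier; the cell order is arbitrary.

```python
# Hypothetical planning structure; SCOM itself is configured through the console.
GROUP = "CSV Disks"
OBJECT = "PhysicalDisk"

# (metric as named in this post, counter name as it appears in the widget dropdown)
CELLS = [
    ("Disk Writes Per Second", "Disk Writes/sec"),
    ("Disk Reads Per Second", "Disk Reads/sec"),
    ("Write Bytes Per Second", "Disk Write Bytes/sec"),
    ("Read Bytes Per Second", "Disk Read Bytes/sec"),
    ("Average Disk Seconds Per Transfer", "Avg. Disk sec/Transfer"),
]


def widget_plan(cells):
    """One entry per dashboard cell, titled after the counter it shows."""
    return [
        {"cell": i, "title": counter, "group": GROUP, "object": OBJECT, "counter": counter}
        for i, (_metric, counter) in enumerate(cells, start=1)
    ]


plan = widget_plan(CELLS)
for widget in plan:
    print(widget)
```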

Sample Dashboard for SAN Performance

I have been having a lot of struggles (as you have seen) with formatting so click HERE for a better image:

Please note that the columns in the legends of your dashboard are sortable. Just click on the field name, say Average Value, and it will sort ascending or descending. IMHO this one simple feature is what makes the dashboards in SCOM 2012 super useful.

This blog focused on disk performance, but I want to point out that when working with CSVs, capacity is also a crucial metric. Fortunately, the Windows Server management pack provides capacity metrics for Cluster Shared Volume Disk Capacity. You can easily modify your dashboard to incorporate this metric.


Top five Group Policy improvements in Windows Server 2012

April 26, 2012 2 comments

Within the Windows Server 2012 beta (formerly known as the Windows Server 8 beta), there are over 4,560 group policies to play with, some old, some new. Additionally, there are usability improvements to the Group Policy infrastructure and the venerable Group Policy Management Console.

Here are the top five areas to focus your research on as you test for compatibility and understand how the Windows 8 client and server partners work together:

The Group Policy Update option within the Group Policy Management Console. Instead of issuing clunky command-line refresh commands, like gpupdate /force, on individual machines, you can graphically select organizational units on which to refresh Group Policy. This effectively means that because you can kick things off right from within the console, you don’t have to wait the hour and a half that it sometimes took for those refreshes to take place across a network. You can only target computers in organizational units, but the refresh itself will kick off a re-download of both the user and the computer portions of the group policy objects (GPOs) that apply to the given target. Behind the scenes, this option creates two scheduled tasks on each computer in the targeted organizational unit. For this to work, the domain controllers need to have access to create scheduled tasks on the computers, so firewalls on each system will need to be configured appropriately.

An easy-to-monitor status report about the Group Policy infrastructure on your Active Directory network. Within the Group Policy Management Console, there’s a new tab called “Infra Status.” (As a mechanical perfectionist, I’m hoping Microsoft will expand that unfortunate abbreviation, but I digress.) The information on this tab shows the status of Active Directory and Sysvol (using distributed file system replication services) replication for this domain as it relates to Group Policy. Previously, you had to look at the Sysvol status on each individual server, and issues wouldn’t always bubble themselves up to the surface in an easy-to-digest way. Because AD replication is key to getting Group Policy to apply correctly within your domain, this will end up being a very handy troubleshooting tool.

Group Policy-based management of the Setting Sync feature. New to the Windows 8 family is the ability for users to enable one Windows Live ID to tie together all of their documents, settings and so on via a cloud-based synchronization service a la Apple’s iCloud service. When users roam from one device to another, by entering their ID, preferences and files are available to them just like on other devices; picture this as a giant roaming profiles service that works across security boundaries. Of course, corporate administrators will be wary of allowing many personal preferences to enable themselves on company machines, and there are seven new GPOs in Windows Server 2012 to control this feature. The Group Policy settings for the Setting Sync options are located in Computer Configuration > Administrative Templates > Windows Components > Settings Sync.

New Internet Explorer policies. You can now manage policy preferences for Internet Explorer 9 directly from the Windows Server 2012 Group Policy Management Console. Other new IE capabilities include disabling the password reveal (new to Windows 8 and IE 10), requiring that Enhanced Protected Mode be used (this forces Internet Explorer to run in 64-bit mode), preventing ActiveX controls from running in lesser security contexts in Enhanced Protected Mode and disabling the Windows 8 “Delete Browsing History on Settings” charm, among others.

Windows 8 and Metro-specific GPOs. You can customize the behavior of some of the new features in Windows 8, like disabling the lock screen, turning off PIN logon, turning off picture password logon, customizing how the default Metro app packages are deployed and enabled, using certain colors for the Start screen background, turning off tracking of app usage, disabling access to the Windows 8 App Store and customizing how Windows to Go behaves.

Microsoft has released a full spreadsheet of all the Group Policy settings for Windows 8 and Windows Server 2012 here.

Categories: Active Directory, Cloud

iPad Active Directory management options leave much to be desired

December 24, 2011 Leave a comment

Active Directory has limited value when it comes to iPad management, because the two just aren’t equipped to work closely with each other.

Apple designed the iPad as a consumer device, not for corporate settings. But as the popularity of the iPad grows among business users, IT professionals need to find a way to perform large-scale iPad management tasks in a corporate environment, despite the fact that there aren’t really any good options yet.

Most companies integrate their network PCs with Active Directory, which lets administrators apply policies and access-control levels to these endpoints. This kind of control is limited when it comes to iPad Active Directory management, however, because you cannot fully integrate the two using today’s available technology.

iPad Active Directory management options

When setting up the email client on the iPad, a user can choose to connect to Microsoft Exchange Server. The Exchange server gets its user information from Active Directory, but the iPad/Active Directory relationship is almost nothing like what IT professionals are used to seeing between Active Directory and PCs.

To unlock a few more iPad management capabilities, admins can implement Exchange ActiveSync, which offers more than just email account synchronization. With ActiveSync enabled, admins can use the Exchange System Manager (in the Exchange Management Console) to enforce the use of passwords on iPads and set password length and character requirements. They can also set a limit on failed password attempts and, once that threshold is reached, perform a local wipe. There is also the option to execute a remote wipe, which can be useful when a device is lost or stolen.

Newer versions of Exchange have added some iPad management features, but most of them still relate to passwords. With Exchange Server 2007, for example, admins can allow or prohibit simple passwords, set password expiration rules and determine the number of complex characters that users must have in a password.

The limitations of iPad Active Directory management

These features do improve password security, but they don’t help at all when it comes to managing the device itself and its properties. For instance, there are no options that allow admins to import iPad information to and from Active Directory, or to create policies that specify which apps can and can’t be installed on the iPad.

As of right now, Microsoft, Apple and third-party vendors all lack the capabilities to manage the iPad with the same level of Active Directory control as you’d manage PCs. The products that do exist are mainly based on ActiveSync, so their options are comparable to what’s already available through Exchange.

Categories: Active Directory

Best Practices for Active Directory Forest Trusts

November 8, 2011 Leave a comment

When your Active Directory forest just contains a couple of domains, life is pretty good for you as the administrator—there’s not a lot to go wrong, clients receive fast responses, and in general, things work as they should.

But as more and more domains come online and, in particular, as you expand into different forests to further delineate security boundaries, the situation requires more management, especially as you come to expect trusts to hold everything together seamlessly. Here are some best practices for managing trusts that keep authentication available and make management of your AD infrastructure much easier.

Use shortcut trusts to eliminate delays. Delays creep in when your Active Directory forest has lots of trees in it containing multiple child domains. When you find that clients are taking a long time to authenticate, especially between those child domains, a best practice is to create shortcut trusts to mid-level domains within each tree hierarchy where possible. These shortcut trusts are essentially bidirectional transitive trusts that shorten the path traveled for authentications to take place between domains located in two separate trees.
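To see why a shortcut trust shortens authentication, it can help to model the trusts as a graph and count hops. The following is a simplified sketch, not a description of how Kerberos referrals are actually implemented: it treats every trust as two-way and finds the shortest chain of domains with a breadth-first search. The forest layout and the domain names (contoso.msft, fabrikam.msft, and their children) are hypothetical.

```python
from collections import deque


def trust_path(trusts, src, dst):
    """Shortest chain of domains between src and dst over two-way trusts (BFS)."""
    graph = {}
    for a, b in trusts:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    queue, seen = deque([[src]]), {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no trust path exists


# Hypothetical forest: two trees joined at their roots, each two levels deep.
FOREST = [
    ("contoso.msft", "fabrikam.msft"),         # tree-root trust
    ("contoso.msft", "a.contoso.msft"),        # parent-child trusts
    ("a.contoso.msft", "x.a.contoso.msft"),
    ("fabrikam.msft", "b.fabrikam.msft"),
    ("b.fabrikam.msft", "y.b.fabrikam.msft"),
]
SHORTCUT = [("a.contoso.msft", "b.fabrikam.msft")]  # shortcut between mid-level domains

default_path = trust_path(FOREST, "x.a.contoso.msft", "y.b.fabrikam.msft")
shortcut_path = trust_path(FOREST + SHORTCUT, "x.a.contoso.msft", "y.b.fabrikam.msft")

print(len(default_path) - 1, "hops:", " -> ".join(default_path))
print(len(shortcut_path) - 1, "hops:", " -> ".join(shortcut_path))
```

In this toy forest the shortcut between the two mid-level domains cuts the path from five hops to three, because authentication no longer has to climb to both tree roots.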

To create these shortcut trusts:

1. Open Active Directory Domains and Trusts, and in the left pane, right-click the domain node for the domain you want to establish a shortcut trust with, and then click Properties.
2. On the Trusts tab, click New Trust, and then click Next.
3. On the Trust Name page, type the DNS name (or NetBIOS name) of the domain, and then click Next.
4. On the Direction of Trust page, choose to create either a two-way, shortcut trust (click Two-way) or choose one of the various one-way options if for some reason you need to limit reciprocity.
5. Continue on with the wizard to completion.

Keep a current list of all trust relationships in your forest. This way, during administrative tasks, you don’t have to puzzle out why some authentications are working and others aren’t, or what domain trusts another domain one way but not the other, and so on. This is a common problem in large forests, or organizations with multiple forests, with many administrators that may be creating trusts without adequately documenting their actions. There’s a tool from Microsoft called NLTest that, among other useful things, queries the trust status for all domains and shows the other domains that a given domain trusts.

For example, to view the established trust relationships for your domain, use nltest /domain_trusts. You’ll get a result that looks like this:

List of domain trusts:
0: testdomain.com testdomain.com (NT 5) (Forest Tree Root) (Primary Domain)
The command completed successfully

Perform a good backup and always test to ensure you have restore capability as well. Trusts are complicated to architect correctly and difficult to recreate exactly as they were in the event they’re lost. To protect yourself, ensure that all domain controllers in every domain in all of your forests have a current and tested system state backup. The system state backup contains the Active Directory trust data stored at any given point of time in the system. During a restore, the domain controller is put into a special mode that allows it to return to replication (including replicating the appropriate trust information) among all of the other online domain controllers without generating or encountering integrity errors. The built-in Windows Server Backup product contains the appropriate tooling to conduct these system state backups, and third-party products that may already be protecting your data centers have this capability as well.
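To keep the trust list current, the nltest /domain_trusts output shown earlier can be captured and turned into records for your documentation. The sketch below parses text in the format of that sample; it assumes each trust line has an index, a first name, an optional second name, and a series of parenthesized flags. The function name and field names are hypothetical, and output from other nltest versions may differ.

```python
import re

# index, first name, optional second name, then the parenthesized flags
LINE = re.compile(r"^\s*(\d+):\s+(\S+)(?:\s+([^\s(]\S*))?\s*(.*)$")


def parse_domain_trusts(text):
    """Turn `nltest /domain_trusts` output into a list of dictionaries."""
    trusts = []
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # skips the header and the "command completed" line
        index, netbios, dns, rest = match.groups()
        trusts.append({
            "index": int(index),
            "netbios_name": netbios,
            "dns_name": dns,  # None when the line carries only one name
            "flags": re.findall(r"\(([^)]*)\)", rest),
        })
    return trusts


SAMPLE = """List of domain trusts:
0: testdomain.com testdomain.com (NT 5) (Forest Tree Root) (Primary Domain)
The command completed successfully"""

for trust in parse_domain_trusts(SAMPLE):
    print(trust)
```

Running this on a schedule and comparing the result with the previous run is one simple way to notice trusts that were added or removed without being documented.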

Issue with SCOM Agent authentication against the SCOM Management Server in a multi-domain environment

September 13, 2011 8 comments

You have successfully installed the SCOM Agent, either manually or with the Discovery Wizard, on a managed computer. However, the managed computer doesn’t appear in the Agent Managed or Pending Management list in the Operations console.

The following event is logged in the Operations Manager event log on the agent-managed computer:

Event Type: Error

Event Source: OpsMgr Connector

Event Category: None

Event ID: 20057

Description: Failed to initialize security context for target MSOMHSvc/ The error returned is 0x80090311 (No authority could be contacted for authentication.). This error can apply to either the Kerberos or the SChannel package.

The following event is logged in the Operations Manager event log on the SCOM Management Server:

Event Type: Information

Event Source: Health Service Modules

Event Category: None

Event ID: 10616
Description:
The Operations Manager Server successfully completed the operation Agent Install on remote computer doc.contoso.msft.
Install account: CONTOSO\administrator
Error Code: 0
Error Description: The operation completed successfully.
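The error code in event 20057 is an HRESULT, and splitting it into its fields shows where the failure comes from. The sketch below uses the standard HRESULT layout (severity in bit 31, an 11-bit facility in bits 16 to 26, a 16-bit code in the low word). Facility 9 is the SSPI/security facility, and 0x80090311 corresponds to the symbolic name SEC_E_NO_AUTHENTICATING_AUTHORITY. The function name and the small lookup table are illustrative only.

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into severity, facility, and code."""
    return {
        "failure": bool((value >> 31) & 1),   # bit 31: 1 means failure
        "facility": (value >> 16) & 0x7FF,    # bits 16-26
        "code": value & 0xFFFF,               # bits 0-15
    }


# Only the code from this post; extend as needed.
KNOWN_ERRORS = {0x80090311: "SEC_E_NO_AUTHENTICATING_AUTHORITY"}
FACILITIES = {9: "SSPI / security"}

error = 0x80090311
parts = decode_hresult(error)
print(hex(error), KNOWN_ERRORS.get(error, "unknown"))
print("failure:", parts["failure"])
print("facility:", parts["facility"], FACILITIES.get(parts["facility"], "unknown"))
print("code:", hex(parts["code"]))
```

The security facility confirms that the agent failed during authentication rather than during agent installation, which is consistent with the successful install event logged on the management server.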

How to confirm the problem?

To troubleshoot the issue, Microsoft Network Monitor can be used:
■Stop HealthService on the managed computer to stop the SCOM Agent (open a Command Prompt and type net stop HealthService).
■Start Microsoft Network Monitor.
■Click on the New capture tab.
■In the Capture Filter, enter the following filter:

KerberosV5
OR KerberosV5_Struct
OR NLMP
OR NLMP_Struct
OR GssAPI
OR SpnegoNegotiationToken
OR GssapiKrb5
OR LDAP

■Click on the Apply button to apply the Capture Filter.
■Click on the Start button to start the new capture.
■Now, quickly start HealthService to start the SCOM Agent (net start HealthService).
■Wait (usually 10-15 seconds) until event 20057 appears in the Operations Manager event log on the affected computer.
■In Network Monitor, click on the Stop button to stop the capture.
■Now carefully review the captured frames in the Frame Summary window. You should see KerberosV5 and LDAP protocol traffic to the Active Directory Domain Controllers.

NOTE: The above applies only if you are not using certificate-based authentication.

To resolve this issue, make sure that TCP/UDP port 88 (Kerberos) and TCP/UDP port 389 (LDAP) are open to the Domain Controllers in your Active Directory environment.
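A quick way to check the TCP side of this from the agent computer is to attempt a connection to each domain controller on both ports. The sketch below is a minimal example using Python's standard socket module; the function names and the domain controller names in the usage comment are hypothetical. It covers TCP only: a plain connect test cannot confirm that UDP 88 or UDP 389 is open.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


def check_dc_ports(domain_controllers, ports=(88, 389)):
    """Map each (domain controller, port) pair to True/False for TCP reachability."""
    return {
        (dc, port): tcp_port_open(dc, port)
        for dc in domain_controllers
        for port in ports
    }


# Example usage (hypothetical DC names; run from the agent-managed computer):
# for (dc, port), ok in check_dc_ports(["dc1.a.contoso.msft", "dc1.contoso.msft"]).items():
#     print(f"{dc}:{port} {'open' if ok else 'BLOCKED or unreachable'}")
```

Remember to include the domain controllers of every domain between the agent's domain and the management server's domain, not just the local ones.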

These ports are not documented in the TechNet article Using a Firewall with Operations Manager 2007.

What happens under the hood?


When communication between the SCOM Agent and the SCOM Management Server starts, authentication takes place (Kerberos). If you have a multi-domain environment, things are a bit more complicated. Before the authentication protocols can follow the forest/domain trust path, the service principal name (SPN) of the SCOM Management Server must be resolved (LDAP).

When a managed computer (SCOM Agent) in one domain attempts to access a resource computer (SCOM Management Server) in another domain, it contacts the domain controller for a service ticket to the SPN of the resource computer. Once the domain controller queries the global catalog and identifies that the SPN is not in the same domain as the domain controller, the domain controller sends a referral for its parent domain back to the workstation. At that point, the workstation queries the parent domain for the service ticket and follows the referral chain until it gets to the domain where the resource is located.

If you have the SCOM Management Server in child domain A of the Active Directory forest and the SCOM Agent in child domain B, make sure that the SCOM Agent is able to access all DCs in the referral chain that are required to get to the domain where the SCOM Management Server is located.
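For planning firewall rules, it can be useful to list the domains in that referral chain. The sketch below derives the chain from DNS names alone, under strong assumptions: both domains are in the same domain tree, the domain hierarchy mirrors the DNS hierarchy, and there are no shortcut trusts. The function name and the example domain names are hypothetical.

```python
def referral_chain(agent_domain: str, server_domain: str) -> list:
    """Domains visited going up from the agent's domain to the closest common
    ancestor and then down to the management server's domain."""
    src = agent_domain.lower().split(".")
    dst = server_domain.lower().split(".")

    # Count how many trailing DNS labels the two names share.
    common = 0
    while common < min(len(src), len(dst)) and src[-1 - common] == dst[-1 - common]:
        common += 1
    if common < 2:
        raise ValueError("domains do not appear to share a parent domain in one tree")

    up = [".".join(src[i:]) for i in range(len(src) - common + 1)]
    down = [".".join(dst[i:]) for i in range(len(dst) - common - 1, -1, -1)]
    return up + down


chain = referral_chain("b.contoso.msft", "a.contoso.msft")
print(" -> ".join(chain))
```

For the child domain A and child domain B case described above, the chain passes through the parent domain, so the agent needs Kerberos and LDAP access to domain controllers in its own domain, the parent domain, and the management server's domain.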

For more information about the ports required for the System Center Operations Manager, and the authentication in Operations Manager, refer to the following TechNet articles:

Authentication and Data Encryption for Windows Computers in Operations Manager 2007, available at http://technet.microsoft.com/en-us/library/bb735408.aspx

Using a Firewall with Operations Manager 2007, available at http://technet.microsoft.com/en-us/library/cc540431.aspx

Active Directory Management Pack best practice

June 26, 2011 8 comments

OpsMgr’s functionality is provided by management packs. While Microsoft
provides documentation in the form of management pack guides for each management
pack (MP), we believe there is benefit in providing a high-level overview of how
to implement and tune those management packs. A while back, the OpsMgr blog ran
a series of “OpsMgr by Example” postings covering some of the more well-known
management packs, which were later published on systemcentercentral.com. This
posting denotes the return of an updated series covering a number of management
packs available during the OpsMgr 2007 R2 timeframe.

Each of these postings will discuss steps taken to implement a particular MP,
and examples of alerts and tuning steps. In doing so, our goal is to provide a
5000’ perspective plus show the details while tuning during a deployment. Those
MPs which have tuning steps particular to OpsMgr 2007 R2 will be noted. We also
provide thoughts for how many of the management packs could evolve. First to be
covered – the Active Directory MP (ADMP).

The ADMP is available as a single download containing different libraries to
monitor Active Directory 2000, 2003, and 2008 domain controllers.

Installing the ADMP

  1. Download the Active Directory Management Pack from the Management Pack
    Catalog (http://technet.microsoft.com/en-us/opsmgr/cc539535.aspx).
    The Active Directory Management Pack Guide is included in the download and
    labeled “OM2007_MP_AD2008.doc.” Beginning with the R2 release of Operations
    Manager, you can download management packs directly using the OpsMgr user
    interface (UI). It is suggested you actually download from the website and
    install it yourself so that you can have a copy of the management pack guide
    available during installation. If you already have a copy of the management pack
    guide, use the OpsMgr UI functionality to download and install the management
    pack.
  2. Read the Management Pack guide – cover to cover. This document spells out in
    detail some important pieces of information you will need to know.
  3. Import the AD Management Pack (using either the Operations console or
    PowerShell).
  4. Deploy the OpsMgr agent to all domain controllers (DCs). The agent must be
    deployed to all DCs. Agentless configurations will NOT work for the AD
    Management Pack.
  5. Get a list of all domain controllers from the Operations console. In the
    Authoring space, navigate to Authoring -> Groups -> AD Domain Controller
    Group (Windows 2008 Server). Right-click on the group(s) and select View Group
    Members.
  6. Enable Agent Proxy configuration on all Domain Controllers identified from
    the groups. This is in the Administration node, under Administration ->
    Device Management -> Agent Managed. Right-click each domain controller,
    select Properties, click the Security tab, and then check the box labeled Allow
    this agent to act as a proxy and discover managed objects on other computers.
    Perform this action for every domain controller, even if you add the DC after
    your initial configuration of OpsMgr. For a simple method to bulk-add the proxy
    setting, see http://www.systemcenterforum.org/news/opsmgr-enabling-agent-proxy-for-all-computers-hosting-an-instance-of-a-specific-object-class/
    for details (thanks to Ziemek Boroski on the http://ops-mgr.spaces.live.com site
    for this).
  7. Configure the Replication account in the Operations console, under
    Administration -> Security (full details for this are in the AD MP Guide). Do
    this for every domain controller, even if you add the DC after your initial
    OpsMgr configuration.
  8. Validate the existence of the OpsMgrLatencyMonitors container (this was
    previously named the MOMLatencyMonitors container). Within this container, there
    should be sub-folders for each DC, using the name of each domain controller. If
    the container does not exist, it is often due to insufficient permissions. (See
    the information on configuring the Replication account in the AD MP Guide for
    details.)
  9. Open the Operations console. Go to the Monitoring node and navigate to
    Monitoring -> Microsoft Windows Active Directory -> Topology Views and
    validate functionality. (You may have to set the scope to the AD Domain
    Controllers Group to get these views to populate).
  10. Check to make sure Active Directory shows up under Monitoring ->
    Distributed Applications as a distributed application that is in the Healthy,
    Warning or Critical state. If it is in the “Not Monitored” state, check for
    domain controllers that are not installed or are in a “gray” state.
  11. Create a MicrosoftWindowsActiveDirectory_Overrides management pack to
    contain any overrides required for the MP (hey, if it’s not created now you’ll
    never remember to create it and you will end up using the default MP and that’s
    not good – see http://cameronfuller.spaces.live.com/blog/cns!A231E4EB0417CB76!1152.entry
    or System Center Operations Manager 2007 Unleashed [Sams, 2008] for
    details).
  12. The Active Directory Helper Object (known as oomads) needs to be installed
    on each domain controller that OpsMgr will monitor. This file (OOMADs.msi) is
    available on the OpsMgr R2 installation media in the HelperObjects folder, under
    the subfolder for the appropriate version of the operating system (amd64, i386,
    or ia64).
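Step 8 in the list above (one sub-folder per domain controller under OpsMgrLatencyMonitors) is easy to get wrong in larger environments, so a simple comparison can help. The sketch below assumes you have already exported two lists by whatever means you prefer: the domain controllers from the group members view, and the sub-folder names found in the container. It compares them by short host name, which is an assumption about how the sub-folders are named. The function name and sample data are hypothetical.

```python
def missing_latency_containers(domain_controllers, container_children):
    """Return the DCs that have no matching sub-folder, compared by short host name."""
    have = {name.split(".")[0].lower() for name in container_children}
    return sorted(
        dc for dc in domain_controllers
        if dc.split(".")[0].lower() not in have
    )


# Hypothetical sample data.
dcs = ["DC01.contoso.msft", "DC02.contoso.msft", "DC03.contoso.msft"]
subfolders = ["DC01", "dc03"]

missing = missing_latency_containers(dcs, subfolders)
print("Missing sub-folders for:", ", ".join(missing) or "none")
```

Any domain controller reported as missing is a candidate for the permissions check described in step 8.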

Deploying the Active Directory 2008 management pack was relatively painless.
After importing the management pack, there was no significant impact on
processors seen on the domain controllers. The Active Directory Topology Root
appeared as a distributed application and showed a health state of green. The
Active Directory diagram view also worked as expected.

Changes to the Run As Account in R2

The Run As accounts in OpsMgr R2 for the Active Directory Management Pack
have changed: you can now define where a Run As account is targeted. The
simplest (and least secure) approach is to use the All targeted
options, but this causes the Run As accounts to be deployed everywhere
(including to remote forests where you should not attempt to use the account).
The recommended approach is to create a Run As account for the AD MP Account Run
As profile that specifies the domain controllers’ computer objects as its
target.

Tuning / Alerts to look for in the Active Directory MP

The following alerts were encountered and resolved while tuning the various
Active Directory management packs (these are listed in alphabetical order by
Alert name):

Alert: (none)

Issue: The SysVol for Windows 2008 portion of the Management Pack for Active
Directory Server 2008 (Monitoring) identified an alert as part of the DFS
Service Health alert monitor for one of the domain controllers in our
environment. No additional knowledge was available.

Resolution: It was determined that a technician had uninstalled the Exchange
2007 tools from the domain controller at the time these alerts activated.
The alerts have not recurred since that time. The alerts were closed, and we
are monitoring to see whether they recur.

Alert: A problem has been detected with the trust
relationship between two domains.

Issue: A server in a location (site 1) lost communication with domain
controllers that existed in a second location (site 2). This critical alert did
NOT auto-resolve. This was detected by the alert rule “A problem has been
detected with the trust relationship between the two domains.” As part of
troubleshooting, verified that the Last Modified date occurred during the outage
(add this column to the display by personalizing the view on the Active Alerts
to include the field) and the Repeat Count was not incrementing.

Resolution: Use the Active Directory Domain Controller Server 2008 Computer
Role Task of Enumerate Trusts to validate all trusts were working after site
connectivity was re-established. Then log into the domain controller reporting
the error and use the Active Directory Domains and Trusts UI to validate each of
the trusts. Close the alert manually.

Alert: A problem has been detected with the trust
relationship between two domains

Issue: This alert is occurring from domain controllers that cannot communicate
with the domain controller in the trusted domain to validate this trust.

Resolution: These domain controllers do not need to validate the trust from
these remote locations, and routing restrictions prevent them from reaching
the domain controllers in the trusted domain. Disable these alerts for the
domain controllers that do not need to validate the trust.

Alert: A problem was detected with the trust relationship
between two domains

Issue: The domain controllers could not connect to the domain controller in
the other domain. This was due to a routing issue between the specific domain
controllers and the domain controller in the remote domain. Remote sites were
connected via VPN and could not route to that subnet.

Resolution: Provided routing from the domain controllers to the domain
controller in the other domain.

Alert: A problem has been detected with the trust
relationship between two domains

Additional Alert: A problem with the inter-domain trusts has been
detected.

Issue: This alert is occurring from domain controllers that cannot communicate
with the domain controller in the trusted domain to validate this trust.

Resolution: Tested first with the NETDOM command (override with parameters to
do a dsquery /domain: /verify dc), first for the local domain (success) and then
for the remote domain reporting the failure (failed with cannot contact the
remote domain). Then ran nltest, first for the local domain (success) and then
for the remote domain reporting the failure (ERROR_NO_LOGON_SERVERS). Next, ran
DCDIAG and NETDIAG on the server. Both the NETDOM and NLTEST queries failed on
the server. Ran the Enumerate Trusts task on the system
(AD_Enumerate_Trusts.vbs); it also failed on the remote domain. DNS was
inconsistent in the environment (used nslookup with different servers to
confirm that lookups of the remote domain name did not return consistent
results). Made DNS consistent and flushed DNS on the server experiencing the
alerts. The critical-level alert resolved itself; the other alert was closed.
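
The nslookup comparison above boils down to asking several DNS servers the same
question and checking whether they all give the same answer. A minimal Python
sketch of that comparison follows. It assumes the answers have already been
collected (for example, by running nslookup against each server by hand); the
function name and the server names in the example are hypothetical.

```python
def inconsistent_answers(answers):
    """Group DNS servers by the set of addresses each returned.

    `answers` maps a DNS server name to the list of addresses it returned
    for one query name. Returns {} when every server agrees; otherwise
    returns a mapping of each distinct (sorted) answer to the servers
    that gave it.
    """
    groups = {}
    for server, addresses in answers.items():
        key = tuple(sorted(set(addresses)))
        groups.setdefault(key, []).append(server)
    return groups if len(groups) > 1 else {}

# Example: dns2 returns nothing for the remote domain name.
print(inconsistent_answers({"dns1": ["10.0.0.5"], "dns2": []}))
```

Any non-empty result points at the servers whose zone data needs to be brought
in line before flushing DNS on the affected domain controller.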

Alert: A problem has been detected with the trust
relationship between two domains

Additional Alert: A problem with the inter-domain trusts has
been detected

Issue: Specific domain controllers were reporting this alert when attempting
to verify the trust between two forests in the environment.

Resolution: The domain controller in question did not have a zone to
provide name resolution to the other forest. Added the zone to the domain
controller’s DNS.

Alert: A problem has been detected with the trust
relationship between two domains

Issue: This occurs when a domain controller has been removed from the
environment, and does not represent an issue if the Alert Description contains
the information that it cleaned up the naming context.

Resolution: Alerts of this type can be closed, as they will occur on
each domain controller in the environment that sees the piece of the replication
that is no longer relevant.

Alert: A problem with the inter-domain trusts has been
detected.

Issue: A server in a location (site 1) lost communication with domain
controllers that existed in a second location (site 2). This critical alert did
NOT auto-resolve. This was detected by the AD Trust Monitoring monitor, which
runs every 5 minutes using the AD Monitor Trusts script. It was verified that
the Last Modified date occurred during the outage (add this column to the
display by personalizing the view on the Active Alerts to include the field) and
the Repeat Count was not incrementing.

Resolution: Use the Active Directory Domain Controller Server 2008 Computer
Role Task of Enumerate Trusts to validate all trusts were working after site
connectivity was re-established. Next, log into the domain controller reporting
the error and use the Active Directory Domains and Trusts UI to validate each of
the trusts. This alert should auto-resolve when the trust relationships are
working, but that functionality does not appear to work. The alert was closed
manually.

Alert: A replication island has been detected. Replication
will not occur across the enterprise.

Issue: In AD Sites and Services, DC1 replicated with DC2 but DC2 did not
replicate with DC1.

Resolution: DC1 was only referencing itself for DNS as 127.0.0.1, with no DNS
entry for the remote DC in the TCP/IP properties. Rebooted DC2 after the change
since DNS could not connect to itself on DC2.

The root cause of this alert was an issue with RPC between the two domain
controllers. RPC in the environment is coded to a specific port and this port
change had not been made to the second domain controller.

Alert: Account Changes Report Available.

Issue: Informational alert, which can be accessed in the AD SAM Account
Changes report (available on the right side under Active Directory Domain
reports).

Resolution: No resolution required. Checked the AD SAM Account Changes report
(available on the right side under Active Directory Domain reports) to see the
changes that were available.

Alert: Active Directory cannot perform an authenticated RPC
call to another DC because the SPN for the destination DC is not registered on
the KDC

Issue: One domain controller was offline during the time period, a second
domain controller was promoted, the FSMO roles were moved, and then the process
was rolled back due to technical issues. The alert was caused by replication
issues in the environment. The domain controller had been demoted and
re-promoted with dcpromo, which left old, invalid records visible in ADSIEdit.
These were part of the ForestDNSZones,DC=_msdcs.abcco.com records.

Resolution: Added SPN information manually to the server to work around the
errors.

Alert: AD cannot allocate memory

Issue: The domain controller has 4GB of memory, but on logging in, more than
6GB of memory was in use. Attempted to stop the programs that appeared to be
causing this (large numbers of cmd and nslookup tasks that had failed), but this
did not free the memory.

Resolution: Per the product knowledge, rebooted the server and verified that
memory usage had returned to more reasonable numbers (less than 1GB in use).
Closed the alert, but tracking this to see if it recurs on this server.

Alert: AD Client Side – Script Based Test Failed to
Complete.

Issue: AD Replication Partner Op Master Consistency: The script ‘AD
Replication Partner Op Master Consistency’ could not create object
‘McActiveDir.ActiveDirectory.’ This is an unexpected error. The error returned
was ‘ActiveX component can’t create object’ (0x1AD)

Resolution: In MOM 2005, this was resolved by changing the Action account. In
OpsMgr 2007, this alert occurred in a different domain than the one with the
OpsMgr root management server (RMS). To resolve this, create a Run As Account
for the domain (DMZ) and assign the Run As Account to the AD domain controllers
in the DMZ domain.

Alert: AD Client Side – Script Based Test Failed to
Complete.

Issue: This alert is generated by the AD Replication Partner Op Master
Consistency monitor. The system reporting the error was generating an error of
event id 45 in the Operations Manager Log from the source of Health Service
Script.

This event is occurring on an hourly basis (12:57, 1:58, and so on):

AD Replication Partner Op Master Consistency: The script ‘AD Replication
Partner Op Master Consistency’ failed to execute the following LDAP query:
‘<LDAP://servername.odyssey.com/CN=Configuration,DC=ODYSSEY,DC=COM>;(&(objectClass=crossRefContainer)(fSMORoleOwner=*));fSMORoleOwner;Subtree’.

The error returned was ‘Table does not exist.’ (0x80040E37)

This alert is linked to “Could not determine the FSMO role holder.” alerts
that are occurring.

Resolution: We believe this was related to misconfigured anti-virus settings
on the domain controllers in the environment.
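
When reading an event like the one above, it helps to break the failing query
into its parts. The query string uses the familiar
<base>;filter;attributes;scope layout, so a small Python helper can split it.
This is a reading aid only, not the parsing the management pack script itself
performs; the function name is hypothetical.

```python
def split_ldap_query(query):
    """Split '<base>;filter;attributes;scope' into its four parts."""
    if not query.startswith("<") or ">;" not in query:
        raise ValueError("expected a query of the form <base>;filter;attrs;scope")
    base, rest = query[1:].split(">;", 1)
    # Scope and attribute list are the last two fields; the filter is whatever
    # remains, so a ';' inside the filter does not break the split.
    search_filter, attributes, scope = rest.rsplit(";", 2)
    return {
        "base": base,
        "filter": search_filter,
        "attributes": attributes.split(","),
        "scope": scope,
    }

query = ("<LDAP://servername.odyssey.com/CN=Configuration,DC=ODYSSEY,DC=COM>;"
         "(&(objectClass=crossRefContainer)(fSMORoleOwner=*));fSMORoleOwner;Subtree")
for name, value in split_ldap_query(query).items():
    print(name, "=", value)
```

Seen this way, the script is searching the Configuration naming context for the
crossRefContainer object and asking only for its fSMORoleOwner attribute, which
ties this event to the “Could not determine the FSMO role holder.” alerts.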

Alert: AD Domain Performance Health Degraded.

Issue: More than 60% of the DCs contained in this AD Domain report a
Performance Health problem

Resolution: This alert indicates that there are alerts that are occurring in
more than 60% of the domain controllers in a domain. This alert does not require
an action for itself but does require analysis to determine what is causing the
domain controllers to be in a degraded state.
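
The rollup logic behind this alert is simple percentage arithmetic. The sketch
below shows the idea in Python, assuming a strict “more than 60%” comparison as
worded in the alert; it is an illustration, not the management pack’s actual
implementation, and the function name is hypothetical.

```python
def rollup_degraded(dc_states, threshold_pct=60):
    """dc_states maps a DC name to True if it reports a Performance Health
    problem. Returns True when MORE than threshold_pct percent are unhealthy."""
    if not dc_states:
        return False
    unhealthy = sum(1 for has_problem in dc_states.values() if has_problem)
    # Integer comparison avoids floating-point rounding at the boundary.
    return unhealthy * 100 > threshold_pct * len(dc_states)

# Three of five DCs unhealthy is exactly 60%, which does not trip the rollup.
states = {"dc1": True, "dc2": True, "dc3": True, "dc4": False, "dc5": False}
print(rollup_degraded(states))
```

In a two-DC domain, for example, a single unhealthy DC is only 50% and would
not trigger the rollup under this reading, while both being unhealthy would.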

Alert: AD Op Master is inconsistent.

Issue: Tested using the AD Replication Partner Op Master Consistency alert
monitor, which runs every minute, to verify the incoming replication partners
for the domain controller show the same operations masters. Also used the
REPADMIN Replsum task in the Active Directory MP.

Resolution: The REPADMIN Replsum command validated that replication was
functioning correctly (had to override the “Support Tools Install Dir” on
Windows 2008 to %windir%\system32 to make the task work correctly). The override
was done when the task was actually run. It’s not created as an override in the
OpsMgr console or in the Authoring View but rather when the task is executed.
The link between the domain controllers has been running close to fully
saturated. The alert auto-resolved once the network utilization slowed down.
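
To see what the “Support Tools Install Dir” override changes, here is a small
Python sketch that builds the REPADMIN /replsum command line for a given
install directory. The helper names are hypothetical, and the sketch only
illustrates the path substitution; it does not reproduce how the OpsMgr task
launches the tool.

```python
import ntpath
import re

def expand_percent_vars(text, env):
    """Expand %NAME% references from `env` (case-insensitive).
    Unknown names are left untouched."""
    lookup = {name.lower(): value for name, value in env.items()}
    return re.sub(r"%([^%]+)%",
                  lambda m: lookup.get(m.group(1).lower(), m.group(0)),
                  text)

def replsum_command(tools_dir, env):
    """Return the argument list for REPADMIN /replsum under tools_dir."""
    exe = ntpath.join(expand_percent_vars(tools_dir, env), "repadmin.exe")
    return [exe, "/replsum"]

# Assumed example environment values, for illustration only.
env = {"windir": r"C:\Windows", "ProgramFiles": r"C:\Program Files"}
print(replsum_command(r"%windir%\system32", env))
```

With the override in place, the task looks for repadmin.exe under
%windir%\system32 rather than under the Support Tools folder.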

Alert: AD Op Master is inconsistent

Issue: Active Directory Operations Master role is found to be in a
transitional state.

Resolution: This message is generated when an AD Operations Master role is
moved from one server to another and can be safely ignored.

Alert: AD Op Master is inconsistent

Issue: Tested using the AD Replication Partner Op Master Consistency alert
monitor, which runs every minute, to verify the incoming replication partners
for the domain controller show the same operations masters. Also used the
REPADMIN Replsum task in the Active Directory MP.

Resolution: Additional information on this alert is available at
Marcus Oh’s blog at http://marcusoh.blogspot.com/2009/07/understanding-ad-op-master-is.html.

Alert: AD Replication is occurring slowly

Issue: Same as identified in alert AD Replication is slower than the
configured threshold. This rule does not provide the ability to override the
default configuration of 15 minutes. The AD environment is not configured with
the default of 15 minutes so these rules do not apply as they are still
replicating within a successful timeframe.

Resolution: Disabled this rule (AD Replication is occurring slowly) for group
AD Domain Controller Group (Windows 2003 Server). You could also do this for
individual servers if there were a limited number of these where the AD
replication was not configured with default replication times of 15 minutes.
Closed the alerts.

Alert: AD Replication is occurring slowly

Issue: Occurred on a domain controller that had been having issues
replicating for a period of time.

Resolution: Rebooted the domain controller; this alert was generated after
the reboot. The script is scheduled to run every 900 seconds (every 15 minutes).
Used the REPADMIN Replsum command to validate that replication was functioning
correctly (had to override the “Support Tools Install Dir” on Windows 2008 to
%windir%\system32). No errors were found on the REPADMIN Replsum
command. Waited the 15 minutes to verify the domain controller was not
continuing to experience the issue, and closed the alert.

Alert: AD Replication is slower than the configured
threshold

Issue: Intersite Expected Max Latency (min) default 15;
Intrasite Expected Max Latency (min) default 5.

Issue: This alert will also occur if connectivity is lost between sites for a
long enough period of time.

Resolution: If the alert is not current and not repeating, replication is
occurring, and the Repadmin Replsum task comes up clean, this alert can be
noted (to see if there is a consistent day of week or time at which it occurs)
and closed. Added a diagnostic to the AD Replication Monitoring monitor, for the
critical state, taking the information from the REPADMIN Replsum task, which
provided the following (the admin utilities must be installed on the DC for
this to work):

REPADMIN.EXE
%ProgramFiles%\Support Tools\ /replsum 1200

Created the diagnostic to run automatically using:

Program: REPADMIN.EXE
Working Directory: %ProgramFiles%\Support Tools
Parameters: /replsum

Options available included changing the replication topology to replicate every
15 minutes, or configuring overrides. To resolve, tried creating a custom group
for the servers in the location (see the Creating Computer Groups based on AD
Site in OpsMgr blog entry on http://Cameronfuller.spaces.live.com for additional
information) and created an override for the new group changing the Intersite
Expected Max Latency to 120 (so it would be double the configuration in AD Sites
and Services). Performed this configuration for each remote location that did
not have a 15 minute replication interval. You could also do this for all domain
controllers, using the domain controller computer group(s). This did not
function as expected but is used as an example for how overrides can be
creatively configured, in this case based upon sites!
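
The arithmetic behind the override value is worth spelling out: the override was
set to double the replication interval configured in AD Sites and Services. The
Python sketch below shows that calculation and the resulting comparison. The
function names and the factor of 2 as a parameter are illustrative; this is not
how the management pack evaluates latency internally.

```python
def expected_max_latency(replication_interval_min, factor=2):
    """Override value: the site link replication interval times a safety factor."""
    return replication_interval_min * factor

def is_slow(observed_latency_min, expected_max_min):
    """True when observed replication latency exceeds the expected maximum."""
    return observed_latency_min > expected_max_min

# A 60 minute site link interval gives the override value of 120 used above.
override = expected_max_latency(60)
print(override, is_slow(45, 15), is_slow(45, override))
```

With the default of 15, a site that replicates hourly will regularly look slow;
with the doubled value it has room for one missed cycle before alerting.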

Alert: AD Replication Monitoring – Access denied

Issue: This occurred on one domain controller and there also was an
alert stating that it failed to create the MOMLatencyMonitors container.
Validated the container by logging into the domain controller, opening up AD
Users and Computers, View -> Advanced Features, and verifying the container
(and the two existing domain controllers as sub-containers) exists.

Resolution: Already resolved, as the MSAA had the permissions required
to create this container. Validated the MOMLatencyMonitors container existed and
that container included sub-folders matching the name of each domain controller.
(If the container does not exist, it is often due to insufficient permissions;
see configuring the replication account within the AD MP Guide for configuration
information.)
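
The validation step above (one sub-container per domain controller) can be
expressed as a simple set comparison. In this Python sketch, both lists are
assumed to have been gathered by hand from AD Users and Computers, using plain
computer names; the function name is hypothetical.

```python
def missing_latency_objects(container_children, domain_controllers):
    """Return the DCs that have no matching child object in the
    latency monitoring container (case-insensitive name match)."""
    present = {name.lower() for name in container_children}
    return sorted(dc for dc in domain_controllers if dc.lower() not in present)

# Example: DC02 has not yet created its object in the container.
print(missing_latency_objects(["DC01"], ["dc01", "DC02"]))
```

An empty result means every domain controller has its object; any names
returned are the ones to check for permission problems.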

Alert: AD Replication Monitoring – Access denied

Issue: This occurred on several domain controllers when the
OpsMgrLatencyMonitors container was removed. Validated the container by logging
into the domain controller, opening up AD Users and Computers, View ->
Advanced Features, and verifying the container (and the two existing domain
controllers as sub-containers) exists.

Resolution: Already resolved as the MSAA had the permissions required
to create this container. Validated the OpsMgrLatencyMonitors container existed
and that container included sub-folders matching the name of each domain
controller. (If the container does not exist, it is often due to insufficient
permissions; see configuring the replication account within the AD MP Guide for
configuration information.)

Alert: AD Replication Monitoring – Time skew detected

Issue: Caused by domain controllers running on Virtual Servers that were
synchronizing with the host operating system while the host operating system was
not time synchronized.

Resolution: Fixed the actual time on the domain controllers and
configured the Guest operating system in Virtual Server to not synchronize with
the Host operating system. This was accomplished by shutting down the Guest
operating system, configuring the Virtual Machine Addition Properties, under
additional features uncheck Host Time Synchronization, and restarting the Guest
operating system.
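
For reference, a time skew check is just the absolute difference between two
clocks compared against a tolerance. The Python sketch below uses a 5 minute
tolerance, which is the default Kerberos maximum clock skew in Active Directory;
the threshold the AD MP itself uses is not stated here, so treat that value and
the function names as assumptions.

```python
from datetime import datetime, timedelta

def clock_skew(local_time, reference_time):
    """Absolute difference between a DC's clock and a reference clock."""
    return abs(local_time - reference_time)

def skew_exceeded(local_time, reference_time, tolerance=timedelta(minutes=5)):
    """True when the skew is larger than the tolerance (assumed 5 minutes)."""
    return clock_skew(local_time, reference_time) > tolerance

# A guest that inherited a host clock running 7 minutes fast.
reference = datetime(2012, 10, 22, 12, 0, 0)
print(skew_exceeded(datetime(2012, 10, 22, 12, 7, 0), reference))
```

A guest that keeps picking up the time of an unsynchronized host will drift past
this kind of tolerance again even after its clock is corrected, which is why
host time synchronization was turned off.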

Alert: AD Site Availability Health Degraded

Issue: Caused by another alert that is affecting the DCs availability. Check
the status of AD as a distributed application to determine what alert is
affecting AD availability.

Resolution: Investigated the alert causing the DC availability issue, which
in this case was the Logical Disk Free Space is Low alert. Another example of
this was a domain controller with a second power supply that was not plugged in
and was alerting via the HP management pack.

Alert: AD Site Performance Health Degraded.

Issue: More than 60% of the DCs contained in this AD Site report a
Performance Health problem

Resolution: This alert indicates that there are alerts that are occurring in
more than 60% of the domain controllers in a site. This alert does not require
an action for itself but does require analysis to determine what is causing the
domain controllers to be in a degraded state.

Alert: Could not determine the FSMO role holder.

Issue: Each domain controller in the environment reported the error when
trying to determine the Schema Op Master on the various domain controllers. The
rule generating this was “Could not determine the FSMO role holder.”

Resolution: We used the NETDOM Query FSMO task (changing the Support Tools
Install Dir to %windir%\system32) to validate the FSMO role holders on
each domain controller.

Alert: Could not determine the FSMO role holder.

Additional Alert: AD Client Side – Script Based Test Failed
to Complete

Additional Alert: AD Op Master is inconsistent

Issue: These three alerts are DNS related. In one situation, there was
a bad DNS record on one of the top-level DNS servers. One could ping the NetBIOS
name, but could not ping the FQDN (it was a DC in another domain within the
forest). In the second instance, there was a bad IP address in the HOSTS file.
Once all DNS resolution issues were resolved, the alerts auto-cleared.

The alerts have also come in and then auto-resolved on their own. This
happened when someone rebooted a DC in another domain and that server was the
only DC for that domain.

A good link to investigate DNS issues is http://www.windowsnetworking.com/articles_tutorials/Using-NSLOOKUP-DNS-Server-diagnosis.html.

Resolution: Resolving DNS issues in the environment.

Submitted By: CK on the Ops-Mgr.spaces.live.com website

Alert: DC has failed to synchronize its naming context with
replication partners.

Issue: One of the domain controllers in the environment went to a grayed out
status.

The server having the issues reported the “DC has failed to synchronize its
naming context with replication partners” issue and “A problem has been detected
with the trust relationship between two domains” and “AD Replication is
occurring slowly” and “Script Based Test Failed to Complete” (for multiple AD
related scripts).

Other domain controllers reported “Could not determine the FSMO role holder”
and “AD Client Side – Script Based Test Failed to Complete.”

Events also occurred on the client system (21006 OpsMgr Connector, 20057
OpsMgr Connector, 21001 OpsMgr Connector).

Resolution: Installed the Telnet client feature to test connectivity to the
management server. Telnet connectivity failed from this system but not from
others. Restarted the OpsMgr Health service but it had no effect on the gray
status. After rebooting the system, the status went back to non-gray.

Alert: DC has failed to synchronize its naming context with
replication partners.

Issue: A server in a location (site 1) lost communication with domain
controllers that existed in a second location (site 2). The rule generating this
alert is “DC has failed to synchronize naming context with its replication
partner.”

Resolution: The alerts occurred when connectivity was lost between the sites.
These alerts had a Repeat Count of 0. Used the REPADMIN Replsum command to
validate that replication was functioning correctly (had to override the
“Support Tools Install Dir” on Windows 2008 to %windir%\system32 to
make the task work correctly). Closed the alerts manually.

Alert: DC is both a Global Catalog and the Infrastructure
Update master

Issue: The domain controller was both a Global Catalog and the Infrastructure
master. This configuration is acceptable as long as all domain controllers are
GCs, but this does result in additional replication traffic for the additional
domain.

Resolution: One option is to create an override that disables this alert for
the server, but in most situations this does not resolve the underlying issue.
For this environment, with all DCs being GCs and the additional domain being a
small child domain with only minor amounts of information, the recommended
approach is to create the override.

If this is not the case, the preferred approach to take is to deploy an
additional domain controller that is NOT a Global Catalog server and run the
Infrastructure Update master on that server. This new domain controller can be
deployed on either physical or virtual configurations depending upon the client
requirements. Further detail on this condition is available in the Microsoft
article at http://support.microsoft.com/default.aspx/kb/251095.
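
The decision rule described above can be written down as a short check: the
Infrastructure master on a Global Catalog is only a concern when some domain
controllers in the domain are not GCs. A minimal Python sketch follows; the
function name and the input format are hypothetical.

```python
def infrastructure_master_at_risk(infra_master, gc_by_dc):
    """gc_by_dc maps each DC in the domain to True if it is a Global Catalog.

    Returns True when the Infrastructure master is a GC while at least one
    other DC is not, which is the case that calls for moving the role."""
    if not gc_by_dc.get(infra_master, False):
        return False  # role holder is not a GC, nothing to do
    # A GC role holder is acceptable only if every DC is a GC.
    return not all(gc_by_dc.values())

# All DCs are GCs, so an override is the reasonable answer.
print(infrastructure_master_at_risk("dc1", {"dc1": True, "dc2": True}))
```

When the check returns True, the preferred fix is the one above: run the role on
a domain controller that is not a Global Catalog.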

Alert: KCC cannot compute a replication path

Issue: KCC detected problems on multiple domain controllers

Resolution: Connectivity was lost from the central site to a remote site for
a period of several hours. The remote site was down due to a power outage.
Errors were logged every 15 minutes from when it was down until when the site
was back online. This also occurred when a domain controller had been shut off
but still existed from the perspective of Active Directory. This can also occur
in environments where the site topology is set to automatically generate the
site links but the network is configured so that some sites cannot see other
sites. (As an example, in a configuration with a hub in Dallas and sites in
Frisco and Plano, where both sites can see Dallas but cannot see each
other.)
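
The Dallas, Frisco, and Plano example can be checked mechanically: list every
pair of sites and see which pairs have no direct network route. The Python
sketch below does exactly that, with the routes supplied by hand. It only
illustrates the hub-and-spoke problem and is not what the KCC itself computes;
the function name is hypothetical.

```python
from itertools import combinations

def unroutable_site_pairs(sites, routes):
    """Return site pairs with no direct network connectivity.

    `routes` is an iterable of (site_a, site_b) pairs that can reach each
    other; direction does not matter."""
    connected = {frozenset(pair) for pair in routes}
    return [pair for pair in combinations(sorted(sites), 2)
            if frozenset(pair) not in connected]

sites = ["Dallas", "Frisco", "Plano"]
routes = [("Dallas", "Frisco"), ("Dallas", "Plano")]
print(unroutable_site_pairs(sites, routes))  # [('Frisco', 'Plano')]
```

Any pair returned here is a pair of sites that automatically generated site
links may try to connect even though the network cannot carry the traffic.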

Alert: One or more domain controllers may not be
replicating.

Issue: The AD MP will report replication issues across all DCs if only one
was down (and thus not able to replicate its monitor objects).

Resolution: Get all domain controllers monitored by OpsMgr. Validate
replication in the environment.

Alert: Overall Essential Services state

Issue: The Overall Essential Services state monitor portion of the Active
Directory Domain Controller Server 2008 Computer role identified an alert. No
additional knowledge was available.

Resolution: Speaking with the technician, it was determined he had performed
an uninstallation of the Exchange 2007 tools from the domain controller at the
time these alerts activated. The alerts have not recurred since that time.
Closed the alerts and continued monitoring to see whether they recur.

Alert: Performance Module could not find a performance
counter.

Issue: In PerfDataSource, could not resolve counter DirectoryServices, KDC AS
Requests, Module will be unloaded.

Resolution: Created a Run As Account and configured the AD MP Account
(Administration -> Security -> Run As Profiles) for each of the two
servers in the domain that were reporting errors.

Alert: Replication is not occurring – All replication
partners have failed to synchronize

Issue: The Alert Description is the key on this alert. All replication
partners are now replicating successfully.

Resolution: Alert description of “AD Replication Monitoring: All
replication partners are now replicating successfully” is a success condition
and does not require any intervention other than closing the alert.

Alert: Script Based Test Failed to Complete.

Issue: AD Lost And Found Object Count: The script ‘AD Lost And Found Object
Count’ failed to create object ‘McActiveDir.ActiveDirectory’. This is an
unexpected error. The error returned was ‘ActiveX component can’t create object’
(0x1AD)

Resolution: Configured the AD MP Account (Administration -> Security ->
Run As Profiles) for each of the two servers in the domain that were reporting
errors.

Alert: Script Based Test Failed to Complete.

Issue: AD Database and Log: The script ‘AD Database and Log’ failed to create
object ‘McActiveDir.ActiveDirectory’. The error returned was ‘ActiveX component
can’t create object’ (0x1AD).

Resolution: Configured the AD MP Account (Administration -> Security ->
Run As Profiles) for each of the two servers in the domain that were reporting
errors.

Alert: Script Based Test Failed to Complete.

Issue: AD Database and Log: The script ‘AD Database and Log’ failed to create
object ‘McActiveDir.ActiveDirectory.’ The error returned was ‘ActiveX component
can’t create object’ (0x1AD)

Resolution: Installed OOMADS from OpsMgr 2007 R2 installation media. The
OOMADs.msi file is included within the HelperObjects folder on the media within
the appropriate version of the operating system (amd64, i386, ia64).

Alert: Script Based Test Failed to Complete

Additional Alert: A problem has been detected with the trust relationship
between two domains

Issue: The server was a domain controller that was exhibiting a variety of
different errors including the following:

AD Monitor Trusts: The trusts between this domain (ABC.COM) and the following
domain(s) are in an error state: xyz.com (inbound), the error is ‘There are
currently no logon servers available to service the logon request’ (0x51F)

AD Replication Partner Count: The script ‘AD Replication Partner Count’
failed to bind to
‘LDAP://DC01.ABC.COM/CN=DC01,CN=Servers,CN=Plano,CN=Sites,CN=Configuration,DC=ABC,DC=COM.’
The error returned was ‘Object variable not set’ (0x5B)

1153 of these in 4 days + 1 hour (1:17:28 pm) – failing every 5 minutes.

AD Lost And Found Object Count: Script ‘AD Lost And Found Object Count’ was
unable to bind to the lost and found container.

1152 of these in 4 days + 1 hour (1:17:28 pm) – every 5 minutes failing.

AD Database and Log: The script ‘AD Database and Log’ encountered an error
while trying to get the object ‘LDAP://DC01.ABC.COM/RootDSE.’ The error returned
was: ‘The server is not operational.’ (0x8007203A)

388 of these in 4 days + 2 hours (1:17:44 pm).

AD Replication Monitoring: encountered a runtime error. Failed to bind to
‘LDAP://DC01.ABC.COM/RootDSE.’ The error returned was ‘The server is not
operational.’ (0x8007203A)

799 of these in 4 days + 2 hours (1:17:44 pm).

Resolution: Logged into the server, attempted to open Active Directory
Domains and Trusts, and received the message: “The configuration information
describing this enterprise is not available. The server is not operational.”
As part of debugging, rebooted the server. After the reboot, the issue opening
Active Directory Domains and Trusts no longer occurred. Closed the alerts
generated to see if they would recur.
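
The event counts listed in the Issue can be sanity-checked with a little
arithmetic: divide the time window by the script interval to get the number of
expected runs, or divide the window by the event count to get the implied
interval. A short Python sketch, with hypothetical function names:

```python
from datetime import timedelta

def expected_runs(window, interval):
    """Number of whole script runs that fit in the window."""
    return window // interval

def implied_interval(window, event_count):
    """Average spacing between events over the window."""
    return window / event_count

# 4 days + 1 hour at a 5 minute interval.
print(expected_runs(timedelta(days=4, hours=1), timedelta(minutes=5)))  # 1164
# 388 events in 4 days + 2 hours.
print(implied_interval(timedelta(days=4, hours=2), 388))
```

Roughly 1,164 runs fit in 4 days + 1 hour at a 5 minute interval, so 1,153 and
1,152 failures suggest that nearly every run failed. The 388 events over 4 days
+ 2 hours work out to about one every 15 minutes.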

Alert: Script Based Test Failed to Complete

Issue: AD Database and Log: The script AD Database and Log failed to create
object McActiveDir.ActiveDirectory. The error returned was: ActiveX component
cannot create object (0x1AD)

Resolution: Uninstalled OOMADS using Add/Remove Programs, Active Directory
Management Pack Helper Object (the original version was .05MB in size) and
re-installed the 64-bit equivalent, which was AMD64 in this case. To do this,
the MSI had to be copied locally to the system before installing it. After
installation it was .07MB in size within Add/Remove Programs.

Alert: Session setup failed because no trust account exists:
Script ‘AD Validate Server Trust Event’

Issue: Specific computer accounts were identified multiple times as
not containing a trust account

Resolution: This is caused either by systems that believe that they
are part of the domain but no longer are, or often by systems that are being
imaged. Resolution of this is either to drop and rejoin the system to the domain
or to close the alert if the system is no longer online. These alerts are not
actionable. Decreased the severity of these alerts from critical to
informational via an override.

Alert: Some replication partners have failed to
synchronize

Issue: A domain controller was offline and unable to be synchronized
with.

Resolution: Bring the domain controller back online.

Alert: The AD Last Bind latency is above the configured
threshold.

Issue: One domain controller had consistently high AD Last Bind Latency.
Logon to the system showed it as extremely unresponsive.

Used the suggested tasks from product knowledge to validate the bind was not
going slowly and no high CPU processes were identified on the system. The view
available in product knowledge pointed to a large spike in the time required for
the LDAP query (checking the Active Directory Last Bind counter). The spike
occurred while there was a very heavy processor utilization occurring on one of
the domain controllers. This monitor checks every 5 minutes. Alert auto-resolved
itself after the LDAP query was responding in an acceptable timeframe.

Resolution: Attempts to debug the issue were inconclusive and extremely
difficult due to the performance issue with the system. Rebooted the domain
controller, it came back online, and the AD Last Bind Latency returned to normal
values.

Alert: The AD Machine Account Authentication Failures Report
has data available.

Issue: The alert was raised on both domain controllers in the same physical
location. The alert description contains the name of the computer account that
is failing to authenticate. Multiple examples of this alert have been seen where
sometimes it is an actionable alert and sometimes it is not.

In one case, there was a server where the computer account had been removed
from the domain. This was a fully actionable situation where the computer had to
be re-added to the domain to resolve the issue. Then the alert was closed
because this alert is generated by a rule so it will not auto-resolve.

In another situation, the computer account was for a workstation that was
consistently not able to communicate with the domain controllers as it was
connected remotely to another network via VPN.

Resolution: Disjoin the system that is having the issue from the domain
(answer no to the reboot prompt), rejoin the domain, and reboot. These alerts
are not actionable. Decreased the severity of these alerts from critical to
informational via an override.

Alert: The Domain Changes report has data available.

Issue: No issue, this was an informational message. This was generated when
the PDC emulator role was moved between domain controllers in the
environment.

Resolution: No actions required, this message is provided for situations
where the PDC emulator role was moved unexpectedly.

Alert: The Domain Controller has been started

Issue: Notification that a domain controller was started, sent as an
information message which is generated by an Alert Rule (since it is a rule, not
a monitor, it will not auto-resolve). This is a good alert to keep as it provides
a simple way to see when a domain controller is rebooted and when it is back
online. Prior to this message, there should be an information message that
appears when the domain controller has been stopped.

Resolution: Manually close the alert as the domain controller reboot was
expected.

Alert: The Domain Controller has been stopped

Issue: Notification that a domain controller was stopped, sent as an
information message which is generated by an Alert Rule (since it is a rule not
a monitor it will not auto-resolve). This is a good alert to keep as it provides
a simple way to see what domain controllers have been rebooted to identify
situations where domain controllers are unexpectedly rebooted. A follow-up
information message appears when the domain controller has been restarted
successfully.

Resolution: Manually closed the alert as the domain controller reboot was
expected.

Alert: The Op Master Domain Naming Master Last Bind latency
is above the configured threshold.

Issue: A large number of alerts are generated at > 5 seconds for warning
and > 15 seconds for error.

Resolution: Per http://technet.microsoft.com/en-us/library/cc749936.aspx,
the effective thresholds should be changed to warning at > 15 seconds and
error at > 30 seconds. Created an override for all types of Active Directory
Domain Controller Server 2008 Computer role to change Threshold Error Sec to 30
and Threshold Warning (sec) to 15 and stored it in the
ActiveDirectory2008_Overrides management pack.
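
To see the effect of the override, here is a small Python sketch that classifies
a bind latency against the default thresholds (warning above 5 seconds, error
above 15) and against the overridden ones (15 and 30). The function name is
hypothetical, and the sketch is only a model of the two-threshold comparison,
not the monitor’s own code.

```python
def bind_latency_state(latency_sec, warning_sec=5, error_sec=15):
    """Classify a bind latency using 'greater than' thresholds."""
    if latency_sec > error_sec:
        return "error"
    if latency_sec > warning_sec:
        return "warning"
    return "healthy"

# A 10 second bind: warning with the defaults, healthy with the override.
print(bind_latency_state(10))
print(bind_latency_state(10, warning_sec=15, error_sec=30))
```

Binds that land between 5 and 15 seconds stop alerting altogether, and only
binds slower than 30 seconds are raised as errors.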

Alert: The Op Master PDC Last Bind latency is above the
configured threshold

Issue: Bind from the domain controller identified in the alert to the PDC
emulator is slower than 5 seconds for a warning and slower than 15 seconds for
an error. This occurred in a remote site connecting to a central site with the
PDC emulator role.

Resolution: The alert appears to be caused either by a slow link between the
two locations or by one of the two servers identified being overloaded. In
this particular case it was caused by a domain controller that was overloaded
due to insufficient hardware, which had to be decommissioned.

Alert: The logical drive holding the AD Database is low on
free space.

Issue: Low disk space on the drive with the Active Directory
database.

Resolution: The domain controller was a Windows Server 2008 virtual machine
with a 20 GB C drive assigned to it. This drive was increased to 30 GB.

Alert: The logical drive holding the AD Logfile is low on
free space

Issue: Low disk space on the drive with the Active Directory
logfiles.

Resolution: The domain controller was a Windows Server 2008 virtual machine
with a 20 GB C drive assigned to it. This drive was increased to 30 GB.

Alert: The Op Master Schema Master Last Bind latency is
above the configured threshold.

Issue: A large number of alerts are generated at the default thresholds
(warning at > 5 seconds, error at > 15 seconds).

Resolution: Per http://technet.microsoft.com/en-us/library/cc749936.aspx,
change the effective thresholds to warning at > 15 seconds and error at >
30 seconds. To resolve this alert, create an override for all types of Active
Directory Domain Controller Server 2008 Computer role to change Threshold Error
Sec to 30 and Threshold Warning (sec) to 15 and store it in the
ActiveDirectory2008_Overrides management pack.

Alert: This domain controller has been promoted to PDC.

Issue: No issue; this was an informational message. The message was generated
when the PDC emulator role was moved between domain controllers.

Resolution: No action required; this message is provided for situations
where the PDC emulator role was moved unexpectedly.

During testing, network connectivity was lost for a period of time to a site
that hosted one of the domain controllers. The result was the flurry of alerts
listed below:

Critical Alerts:

A problem with the inter-domain trusts has been detected

DNS 2008 Server External Addresses Resolution Alert

OleDB: Results Error

Warnings:

A problem has been detected with the trust relationship between two
domains

AD Client Side – Script Based Test Failed to Complete (multiple)

Could not determine the FSMO role holder. (multiple)

DC has failed to synchronize its naming context with replication partners
(multiple)

Issue: Loss of network connectivity between one site and another, both
of which had domain controllers.

Resolution: Once network connectivity was re-established, we resolved
all issues identified above.

Active Directory Management Pack Evolution

Three items in the Active Directory management pack would be logical to
improve in future versions:

Alert: A problem with the inter-domain trusts has been
detected

This alert does not auto-resolve when the issue is resolved. A warning event
with ID 83 from the “Health Service Script” source raises the critical state,
but no alert appears to indicate that a later trust test succeeded, so this
alert always stays in a critical state.

Alert: AD Op Master is inconsistent

This alert is too sensitive. It is only relevant if it recurs two or three
times in a row; alternatively, the test should run every 5 minutes instead of
every minute.
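
As a rough illustration of the "recurs two or three times" idea, the Python
sketch below raises an alert only after several consecutive failed checks.
It does not reflect how the management pack is implemented; the should_alert
function, its required_consecutive parameter, and the sample results are all
hypothetical.

```python
from typing import Iterable


def should_alert(results: Iterable[bool], required_consecutive: int = 3) -> bool:
    """Return True once `required_consecutive` failed checks occur in a row.

    Each item in `results` is True for a passed check and False for a failed
    one. A single passed check resets the failure streak.
    """
    streak = 0
    for passed in results:
        streak = 0 if passed else streak + 1
        if streak >= required_consecutive:
            return True
    return False


if __name__ == "__main__":
    # Isolated failures never reach the streak, so no alert is raised.
    print(should_alert([True, False, True, False, True]))
    # Three failures in a row do raise an alert.
    print(should_alert([True, False, False, False]))
```

The trade-off is detection delay: at a 1-minute interval, requiring three
consecutive failures postpones the alert by about two minutes.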

The Repadmin, Repadmin Replsum, and Repadmin Snap-shot tasks should have
the correct default path for Windows Server 2008 systems

The path should be %windir%\system32.

Exchange 2010 Hosting

November 29, 2010

This topic is intended to address a specific issue called out by the Exchange Server Analyzer Tool. You should apply it only to systems that have had the Exchange Server Analyzer Tool run against them and are experiencing that specific issue. The Exchange Server Analyzer Tool, available as a free download, remotely collects configuration data from each server in the topology and automatically analyzes the data. The resulting report details important configuration issues, potential problems, and non-default product settings. By following these recommendations, you can achieve better performance, scalability, reliability, and uptime.

When Microsoft® Exchange 2010 Setup is started by using the /Hosting option, Microsoft Exchange Best Practices Analyzer examines the Active Directory directory service to determine whether Active Directory has been prepared for a hosting Exchange environment.

Specifically, the Analyzer tool performs the following examinations:

  • It determines whether the Microsoft Exchange container is present in the Configuration container in Active Directory. If the Microsoft Exchange container exists, Active Directory is prepared for Exchange.
  • It determines whether a ConfigurationUnits container is present in the Microsoft Exchange container. The ConfigurationUnits container appears when Active Directory is prepared for hosting.

If the ConfigurationUnits container is not present in the Microsoft Exchange container, the hosting installation is unsuccessful.

To prepare Active Directory for Exchange 2010 hosting, you must run the following command:

Setup.com /PrepareAD /Hosting

If Active Directory has been prepared for Exchange but not for hosting, you must perform the following actions:

  1. Remove any objects in the following containers:
    • CN=Microsoft Exchange/CN=Services
    • CN=Microsoft Exchange/CN=ConfigurationUnits
  2. Run the Setup.com /PrepareAD /Hosting command.
  3. Restart Setup by using the /Hosting option.
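
The two container checks and the resulting actions can be summarized as a
small decision function. This is only a Python sketch of the logic described
above, not the Analyzer's actual implementation; the hosting_prep_state
function and its return strings are hypothetical, and the two booleans stand
in for whatever directory lookup you use to test for each container.

```python
def hosting_prep_state(exchange_container_present: bool,
                       configuration_units_present: bool) -> str:
    """Map the two container checks to the state described in the text."""
    if not exchange_container_present:
        # Active Directory has not been prepared for Exchange at all.
        return "not prepared: run Setup.com /PrepareAD /Hosting"
    if not configuration_units_present:
        # Prepared for Exchange, but the hosting container is missing.
        return ("prepared for Exchange only: clean up the listed containers, "
                "then run Setup.com /PrepareAD /Hosting")
    return "prepared for hosting"


if __name__ == "__main__":
    for exchange, units in ((False, False), (True, False), (True, True)):
        print(exchange, units, "->", hosting_prep_state(exchange, units))
```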

View the Microsoft Exchange container in Active Directory

  1. On a domain controller, click Start, click Run, type adsiedit.msc, and then click OK.
  2. Expand the Configuration container.
  3. Expand CN=Configuration,DC=Contoso,DC=com.
  4. Expand CN=Services.
  5. Expand CN=Microsoft Exchange.

For more information about how to prepare Active Directory, see Prepare Active Directory and Domains.