ServiceNow CMDB, Discovery, and Service Mapping

Scope

This file covers the ServiceNow data layer that nearly every other Now Platform process depends on: the Configuration Management Database (CMDB) itself, the Discovery product that populates most of it, and the Service Mapping product that builds application-level relationship maps on top of it. Topics: CI class strategy and the Common Service Data Model (CSDM), the Identification and Reconciliation Engine (IRE) and data-source priority, CI relationships (cmdb_rel_ci) including depends-on / runs-on / hosted-on semantics, domain separation for multi-tenant CMDBs, the CMDB Health dashboard (Completeness, Compliance, Correctness), MID Server architecture and placement, Discovery patterns (schedules, IP ranges, credentials, exploration vs classification phases), Cloud Discovery vs on-prem Discovery, Service Mapping approaches (top-down, machine learning, traffic-based), and multi-source data quality when Discovery, IntegrationHub, manual entry, and file imports all write to the same tables. For platform-level architecture decisions (instance topology, licensing, Now Assist), see providers/servicenow/itsm.md. For operational depth (SLA engine, Performance Analytics, automation placement), see providers/servicenow/itsm-operations.md. For an end-to-end pattern that exercises all of this, see patterns/vmware-servicenow-chargeback.md.

Checklist

  • [Critical] Is the CMDB designed against the Common Service Data Model (CSDM) with the four-layer model (Foundation, Design, Build, Manage / Sell) understood and the target CSDM maturity level (1 Foundation, 2 Foundation+Design, 3 Service Mapping, 4 Service Offerings) chosen explicitly -- rather than ad-hoc CI classes accreted over time? (CSDM is ServiceNow's prescriptive data model for the relationships between business applications, technical services, service offerings, and underlying infrastructure; deviating from CSDM does not break the platform but breaks compatibility with Service Mapping, ITOM Visibility, APM, and the OOB Performance Analytics indicators that assume CSDM)
  • [Critical] Are CI classes selected from the OOB cmdb_ci_* hierarchy rather than created as custom classes wherever possible, with custom classes added only when the OOB hierarchy genuinely cannot express the CI type? (Custom CI classes break OOB Discovery patterns, OOB Service Mapping patterns, and OOB IRE identifiers; they also create upgrade conflicts; the OOB hierarchy is deep -- cmdb_ci_server -> cmdb_ci_linux_server, cmdb_ci_app_server -> cmdb_ci_app_server_tomcat, cmdb_ci_db_instance -> cmdb_ci_db_mssql_instance -- and most edge cases are already modeled)
  • [Critical] Are CI Identifier rules in the Identification and Reconciliation Engine (IRE) configured per CI class with the correct identifier-entry priority order and the correct lookup attributes -- typically serial_number + manufacturer for hardware, vCenter object_id / instance UUID for virtual machines, cloud-provider account-id + resource-id for cloud resources -- so that re-discovery of the same CI updates the existing record rather than creating a duplicate? (IRE failure is the single most common CMDB problem in production; symptoms are duplicate CIs, "ghost" CIs that no Discovery source can claim, and incident routing that picks an arbitrary one of three records for the same host)
  • [Critical] Are Reconciliation rules defined for every CI attribute that has more than one possible data source, specifying which source wins -- e.g., Discovery wins on cpu_count, IntegrationHub HR spoke wins on owned_by, Vendor Management wins on support_group -- so that a write from a lower-priority source does not overwrite an authoritative value? (Without reconciliation rules, the last write wins; the practical consequence is that Discovery, which runs nightly, silently undoes manual ownership entries from the service catalog every morning)
  • [Critical] Are CI relationships modeled with the correct relationship types from cmdb_rel_type -- Runs on::Runs, Hosted on::Hosts, Depends on::Used by, Members::Member of, Contains::Contained by, Cluster of::Cluster -- and is the directionality used consistently so that impact analysis, dependency views, and Service Mapping behave as expected? (A reversed Runs on relationship makes impact analysis traverse from the application down into the host instead of from the host up into the application and its business service; the symptom is "why does this database outage not show as impacting any business service")
  • [Critical] Is the MID Server architecture defined with one MID Server (or HA pair) per network security zone, sufficient sizing for the Discovery / Orchestration / IntegrationHub load (typically 4 vCPU / 8 GB RAM per 5,000 CIs as a starting point), TLS connectivity to the instance via outbound 443, and a service-account-based credential model rather than per-user credentials? (MID Servers placed at the wrong network boundary force every Discovery probe across a firewall, multiplying latency and creating credential-sprawl issues; under-sized MID Servers throttle Discovery and Orchestration silently; a single MID Server is a single point of failure for everything that needs to reach on-prem)
  • [Critical] Is the CMDB Health dashboard configured and actively monitored across the three dimensions -- Completeness (required attributes populated, e.g., serial_number, owned_by, support_group), Compliance (CI matches the expected schema and required relationships), Correctness (no duplicates, stale CIs identified) -- with target thresholds per CI class and a remediation owner assigned for each dimension? (CMDB Health is the only mechanism that surfaces data-quality drift before it shows up as incident-routing failures; without active monitoring, CMDB quality degrades silently and the platform team only learns about it via process failures)
  • [Recommended] For multi-tenant managed-services scenarios, is domain separation configured with the correct domain hierarchy and CI visibility rules, or has the design instead chosen company-level segregation via the company field plus ACLs? (Domain separation is the OOB ServiceNow mechanism for true multi-tenant data isolation -- domain-aware tables, domain-scoped Business Rules, domain-aware reports -- but it is invasive, hard to undo, and complicates upgrades; company-field-plus-ACL is lighter-weight but is not true data isolation; choose deliberately because reversal is expensive)
  • [Recommended] Are Discovery schedules sized and segmented by purpose -- a fast cadence (e.g., every 4 hours) for change-prone segments like cloud and DMZ, a slow cadence (e.g., weekly) for slow-change segments like network gear and printers, classification-only schedules separate from full exploration schedules -- rather than one nightly all-IPs schedule that overruns its window? (A single nightly schedule that does not finish before business hours produces stale data and saturates MID Servers during peak hours; segmented schedules let each segment complete within its window and keep the high-change segments fresh)
  • [Recommended] Is the Discovery credential model designed with discovery-specific service accounts (not shared admin accounts), credentials stored in the ServiceNow Credentials table with per-credential affinity to specific MID Servers, and credential rotation procedures defined -- rather than long-lived passwords entered once and forgotten? (Shared admin credentials in Discovery create audit-trail confusion because every discovery action appears to come from the admin account; credentials without rotation are a standing security finding; affinity prevents a MID Server from attempting every credential in turn against every target, which is slow and triggers account-lockout policies)
  • [Recommended] For cloud resources, is Cloud Discovery (via the IntegrationHub cloud spokes for AWS, Azure, GCP) used as the primary mechanism rather than attempting to run network-based Discovery against cloud IP ranges? (Cloud Discovery uses cloud-provider APIs with read-only IAM roles to enumerate resources -- EC2 instances, S3 buckets, RDS databases, VPCs, security groups, load balancers -- which is faster, more complete, and does not require network reachability or VM-level credentials; network-based Discovery against cloud IP ranges misses managed services entirely and triggers cloud-provider security alerts)
  • [Recommended] Is Service Mapping configured for the business services that matter -- typically a small list (10-30) of tier-1 services rather than every application -- with an explicit choice between top-down patterns (entry-point-driven traversal of network connections), machine-learning-driven mapping (Service Mapping with ML), and traffic-based discovery via the optional Service Graph Connector for tools like Dynatrace or AppDynamics? (Service Mapping requires significant ongoing maintenance to keep maps accurate; attempting to map every application produces low-quality maps that nobody trusts; tier-1-only mapping produces high-quality maps for the services that drive incident management priority)
  • [Recommended] Are CI relationships from Service Mapping clearly separated from CI relationships from manual entry, Discovery pattern relationships, and IntegrationHub-spoke relationships -- via the discovery_source field on cmdb_rel_ci -- so that a stale Service Mapping relationship can be re-mapped without losing manually curated dependency data? (Relationship-source provenance is the only way to safely re-run Service Mapping; without it, a refresh either preserves stale Service Mapping data or wipes manually curated relationships)
  • [Recommended] Is the multi-source data quality model documented: which sources write which attributes for which CI classes, with the IRE priority list and reconciliation rules backing it up, so that the platform team can answer "where did this value come from" for any field on any CI? (When Discovery, IntegrationHub HR spoke, IntegrationHub Vendor spoke, manual entry, and file imports all write to the same CI, the only way to debug a wrong value is provenance; documenting the source-of-truth matrix up front is much cheaper than reconstructing it during an incident)
  • [Optional] Is the CMDB Query Builder or cmdb_query performance evaluated for the queries that drive impact analysis, Service Mapping, and Performance Analytics dashboards -- with indexes on the high-volume relationship tables and archival of stale CIs (e.g., decommissioned for > 90 days) to keep query performance acceptable? (Production CMDBs accumulate millions of CIs and tens of millions of relationships; unindexed queries on cmdb_rel_ci are a common cause of platform-wide performance degradation)
  • [Optional] Are CI lifecycle states (install_status, operational_status, lifecycle stage) used consistently to mark CIs as Installed / In Maintenance / Pending Install / Retired / Absent, with retired CIs excluded from active reporting and impact analysis -- rather than carrying retired CIs in the active inventory indefinitely? (Stale active CIs inflate impact analysis, allocate cost incorrectly in chargeback, and dilute CMDB Health metrics; lifecycle-state hygiene is a low-cost ongoing maintenance task that prevents this)
  • [Optional] For network-discovery-heavy environments, is the Discovery probe and pattern customization boundary respected -- using OOB patterns wherever possible, extending patterns via the Pattern Designer rather than scripting, and reserving custom probes for cases where no pattern exists? (Custom Discovery scripts written against the legacy probe-and-sensor framework become upgrade liabilities; Pattern Designer patterns are the supported customization path and upgrade more cleanly)
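
The CMDB Health and lifecycle-hygiene items above can be illustrated with a small scoring sketch. This is not how the CMDB Health dashboard is implemented; it only shows the kind of per-CI checks the three dimensions imply. The `REQUIRED` attribute map, the 60-day staleness threshold, the `CI` record shape, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical required attributes per CI class (Completeness).
REQUIRED = {"cmdb_ci_linux_server": ("serial_number", "owned_by", "support_group")}
STALE_AFTER_DAYS = 60  # assumed staleness threshold (Correctness)


@dataclass
class CI:
    name: str
    sys_class_name: str
    attrs: dict
    last_discovered: date


def completeness(ci):
    """Fraction of required attributes that are populated for this CI."""
    required = REQUIRED.get(ci.sys_class_name, ())
    if not required:
        return 1.0
    filled = sum(1 for attr in required if ci.attrs.get(attr))
    return filled / len(required)


def is_stale(ci, today):
    """True when no source has refreshed the CI within the threshold."""
    return (today - ci.last_discovered).days > STALE_AFTER_DAYS


def duplicates(cis):
    """Group CI names by (class, serial_number); keep groups larger than one."""
    groups = {}
    for ci in cis:
        serial = ci.attrs.get("serial_number")
        if serial:
            groups.setdefault((ci.sys_class_name, serial), []).append(ci.name)
    return {key: names for key, names in groups.items() if len(names) > 1}


# Illustrative data: the second record is a stale, incomplete duplicate.
today = date(2025, 1, 31)
cis = [
    CI("web01", "cmdb_ci_linux_server",
       {"serial_number": "SN1", "owned_by": "alice", "support_group": "linux"},
       date(2025, 1, 30)),
    CI("web01-dup", "cmdb_ci_linux_server",
       {"serial_number": "SN1", "owned_by": ""},
       date(2024, 10, 1)),
]

for ci in cis:
    print(ci.name, round(completeness(ci), 2), is_stale(ci, today))
print(duplicates(cis))
```

Aggregating these per-CI results by class is what makes a per-class target threshold and a named remediation owner meaningful.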

Why This Matters

CMDB quality is the foundation of every other ITSM process. Incident routing depends on the affected CI having an accurate support_group. Change impact analysis depends on the CI's relationships traversing correctly upward into business services and downward into infrastructure. Cost allocation depends on the CI's owned_by, cost_center, and business-unit attributes being current and authoritative. Vulnerability management depends on the CI's install_status so that retired hosts do not appear as open findings. When any of these processes produces wrong results, the platform team's first question is whether the CMDB record is correct -- and the answer is almost always no, because nobody has set up Completeness / Compliance / Correctness monitoring and nobody has documented who writes which attribute. The "CMDB is the foundation" cliche is true precisely because every downstream failure traces back to a Discovery rule, an IRE identifier, a reconciliation priority, or a relationship type that was never thought through.

How the Identification and Reconciliation Engine is configured is the single most consequential design choice in the data layer. Identification decides whether a re-discovered CI updates the existing record or creates a new one, and reconciliation decides which data source wins when two sources disagree. Both defaults are too permissive for production: the OOB IRE identifiers are sensible starting points but rarely cover the exact attribute mix the customer's environment provides, and the OOB reconciliation behavior is essentially last-write-wins, which means a nightly Discovery run can silently overwrite a manually entered ownership value. Organizations that skip the IRE-and-reconciliation-design step accumulate duplicate CIs steadily, never quite enough to be visible in any single week but cumulatively enough that after a year the CMDB has 30 percent more rows than it has actual configuration items.
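
The two decisions described above, identification and reconciliation, can be sketched as a toy model. This is not the IRE API and makes no claim about its internals; `IDENTIFIER_ENTRIES`, `RECONCILIATION`, the source names, and the `upsert` helper are hypothetical names chosen to mirror the examples in the checklist.

```python
from dataclasses import dataclass, field

# Identifier entries, tried in priority order (hypothetical).
IDENTIFIER_ENTRIES = [("serial_number", "manufacturer"), ("name",)]

# Per-attribute source priority; earlier in the list wins (hypothetical).
RECONCILIATION = {
    "cpu_count": ["Discovery", "Manual"],
    "owned_by": ["HR spoke", "Manual", "Discovery"],
}


@dataclass
class CI:
    attrs: dict
    sources: dict = field(default_factory=dict)  # attribute -> source that set it


def identify(cmdb, payload):
    """Return the existing CI matched by the first usable identifier entry."""
    for entry in IDENTIFIER_ENTRIES:
        if not all(payload.get(attr) for attr in entry):
            continue  # payload lacks the attributes this entry needs
        for ci in cmdb:
            if all(ci.attrs.get(attr) == payload[attr] for attr in entry):
                return ci
    return None


def may_write(ci, attr, source):
    """Apply the per-attribute priority; with no rule, the last write wins."""
    order = RECONCILIATION.get(attr)
    current = ci.sources.get(attr)
    if order is None or current is None:
        return True
    if source not in order:
        return False
    if current not in order:
        return True
    return order.index(source) <= order.index(current)


def upsert(cmdb, payload, source):
    """Update the identified CI, or insert a new one when nothing matches."""
    ci = identify(cmdb, payload)
    if ci is None:
        ci = CI(attrs={})
        cmdb.append(ci)
    for attr, value in payload.items():
        if may_write(ci, attr, source):
            ci.attrs[attr] = value
            ci.sources[attr] = source
    return ci


cmdb = []
upsert(cmdb, {"serial_number": "SN1", "manufacturer": "Dell",
              "name": "web01", "owned_by": "alice"}, "HR spoke")
# Nightly Discovery re-finds the same host under a different name.
host = upsert(cmdb, {"serial_number": "SN1", "manufacturer": "Dell",
                     "name": "web01.corp", "cpu_count": 8,
                     "owned_by": "unknown"}, "Discovery")
print(len(cmdb), host.attrs["owned_by"], host.attrs["name"])
```

In this sketch the second write matches on serial_number + manufacturer, so no duplicate is created, and owned_by keeps the higher-priority value. The name attribute has no rule, so it shows the last-write-wins behavior the paragraph warns about.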

The MID Server is the architectural component most often under-designed. MID Servers are the only way ServiceNow reaches on-premises infrastructure -- for Discovery, for IntegrationHub spokes that talk to internal systems, for Orchestration runbooks. They run as services on Windows or Linux, poll the ServiceNow instance over outbound HTTPS for work, and execute it locally. Placement matters because every probe runs from the MID Server: a MID Server in the wrong network zone forces every probe across a firewall, multiplies latency, and requires credential propagation across security boundaries. Sizing matters because Discovery, Orchestration, and IntegrationHub all queue work on the same MID Server, and a saturated MID Server silently drops or delays work. HA matters because a single MID Server is a single point of failure for everything that needs to reach on-prem, which becomes obvious only during the first MID Server outage.
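
As a rough planning aid, the checklist's starting-point ratio (4 vCPU / 8 GB RAM per 5,000 CIs) can be turned into a per-zone estimate. The sketch below treats that ratio as one MID Server per 5,000 CIs and models HA as one spare node per zone (N+1); both are assumptions, and the zone names and CI counts are made up. Real sizing should be validated against observed MID Server load.

```python
import math
from dataclasses import dataclass

CIS_PER_MID = 5_000   # starting-point ratio from the checklist
VCPU_PER_MID = 4
RAM_GB_PER_MID = 8


@dataclass
class ZonePlan:
    zone: str
    mid_servers: int
    vcpu: int
    ram_gb: int


def plan_zone(zone, ci_count, ha=True):
    """Estimate MID Servers for one network security zone."""
    needed = max(1, math.ceil(ci_count / CIS_PER_MID))
    if ha:
        needed += 1  # assumed N+1: one node can fail without losing capacity
    return ZonePlan(zone, needed, needed * VCPU_PER_MID, needed * RAM_GB_PER_MID)


# Hypothetical zones and CI counts.
for zone, count in [("dmz", 3_000), ("datacenter", 12_000)]:
    print(plan_zone(zone, count))
```

Planning per zone rather than for the estate as a whole follows the placement argument above: capacity in one zone does not help probes that would otherwise have to cross a firewall.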

Service Mapping is the product that builds application-level CI relationship maps that Discovery alone cannot. Discovery produces an inventory of hosts, processes, and network connections; Service Mapping turns that inventory into "the Online Banking service runs on these load balancers, which front these web servers, which call these app servers, which read from these databases." The mapping is what makes incident impact analysis say "this database outage affects Online Banking" rather than "this database outage affects an unidentified host." Service Mapping is expensive to maintain at scale because applications change, deployment topologies change, and traffic patterns change; the right strategy is to map only the tier-1 services that drive incident-management priority, with explicit ownership of the maps. Attempting to map everything produces stale maps that nobody trusts.
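
The Online Banking example can be sketched as a graph traversal: start at the service entry point, follow observed connections downward, and record every CI reached as part of the map. This only illustrates the top-down idea and is not Service Mapping's pattern engine; the `CONNECTIONS` data and the helper names are hypothetical.

```python
from collections import deque

# Hypothetical observed connections: each CI maps to the CIs it calls.
CONNECTIONS = {
    "Online Banking": ["lb01"],
    "lb01": ["web01", "web02"],
    "web01": ["app01"],
    "web02": ["app01"],
    "app01": ["db01"],
    "batch01": ["db02"],  # never reached from the entry point
}


def map_service(entry_point, connections):
    """Breadth-first traversal from the entry point; returns CIs in visit order."""
    seen = [entry_point]
    queue = deque([entry_point])
    while queue:
        node = queue.popleft()
        for child in connections.get(node, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


def impacted_services(ci, service_maps):
    """Names of mapped services whose map contains the given CI."""
    return sorted(name for name, members in service_maps.items() if ci in members)


service_maps = {"Online Banking": map_service("Online Banking", CONNECTIONS)}
print(service_maps["Online Banking"])
print(impacted_services("db01", service_maps))
print(impacted_services("db02", service_maps))
```

An outage on db01 resolves to Online Banking, while db02 resolves to nothing because no mapped entry point reaches it. That is the "unidentified host" outcome for CIs that sit outside every maintained map.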

Cloud Discovery is the area where the CMDB design most often falls behind the actual environment. Cloud resources are created and destroyed faster than nightly network-based Discovery can keep up with, and most cloud services (S3 buckets, SQS queues, IAM roles, Lambda functions) are not reachable by network probes at all. The IntegrationHub cloud spokes use cloud-provider APIs with read-only IAM roles to enumerate resources, which is both faster and more complete. Organizations that attempt to discover cloud with the same nightly network probes used for on-prem typically end up with a CMDB that is 60-70 percent complete for cloud resources, with managed services entirely absent, which makes cloud cost allocation and cloud security findings unreliable.

Common Decisions (ADR Triggers)

  • CSDM maturity target -- Level 1 (Foundation) gets identity, ownership, and basic infrastructure CIs in place; Level 2 (Foundation + Design) adds business applications, technical services, and the application-to-infrastructure linkage that incident-impact analysis depends on; Level 3 (Service Mapping) adds dynamic application-relationship discovery; Level 4 (Service Offerings) adds the customer-facing service catalog linkage that chargeback and service-level reporting depend on. Choose the target up front; trying to skip levels produces a CMDB that does not support the processes the customer expects.
  • CI class customization: OOB hierarchy vs custom classes -- The OOB cmdb_ci_* hierarchy is deep and covers most enterprise CI types. Custom CI classes break OOB Discovery patterns, Service Mapping patterns, IRE identifiers, and Performance Analytics indicators; they also create upgrade conflicts. Choose custom only when the OOB hierarchy genuinely cannot express the CI type; document the rationale and the upgrade-impact assessment.
  • IRE identifier strategy: OOB identifiers vs custom identifiers -- OOB identifiers are sensible defaults but rarely match the exact attribute mix the customer's data sources provide. Custom identifiers can be precise for the customer's environment but require deliberate priority-order design and become a CMDB-team-owned artifact. Most production CMDBs end up with a mix; document each custom identifier with its rationale and its expected data sources.
  • Reconciliation priority: per-attribute vs blanket per-source -- Per-attribute reconciliation is precise (Discovery wins on cpu_count, HR spoke wins on owned_by, Vendor spoke wins on support_group) but requires per-attribute configuration; blanket per-source reconciliation (Discovery always wins, or manual entry always wins) is simpler to configure but coarser, and it becomes harder to keep correct as new data sources are added. Per-attribute is recommended for any CMDB with three or more active data sources.
  • Domain separation vs company-based segregation -- Domain separation is the OOB mechanism for true multi-tenant data isolation and is supported by domain-aware Business Rules, reports, and ACLs; it is also invasive, hard to undo, and complicates upgrades. Company-based segregation (the company field plus ACLs and reference qualifiers) is lighter-weight but is not actually data isolation -- a privileged user can see across companies. Choose domain separation only when regulatory or contractual isolation is required; choose company-based segregation when "soft" multi-tenancy is sufficient.
  • MID Server topology: per-zone vs per-purpose vs shared -- Per-zone MID Servers (one HA pair per network security zone) are the standard pattern and are required for Discovery to reach hosts without crossing firewalls. Per-purpose MID Servers (a Discovery pool, an Orchestration pool, an IntegrationHub pool) prevent a single MID Server from being saturated by one workload but multiply infrastructure cost. Shared MID Servers are simpler but expose the platform to noisy-neighbor problems. Per-zone with workload-specific clusters within zones is the typical production design.
  • Discovery cadence: single nightly vs segmented schedules -- A single nightly schedule is the OOB default and is the easiest to operate, but routinely overruns its window in environments with more than ~50,000 CIs. Segmented schedules (fast cadence for change-prone segments, slow cadence for stable segments, separate classification-only and full-exploration runs) keep high-change segments fresh and prevent the all-IPs schedule from saturating MID Servers. Segmented is recommended once the single nightly schedule begins overrunning.
  • Cloud Discovery: IntegrationHub cloud spokes vs network-based Discovery vs Service Graph Connectors -- IntegrationHub cloud spokes (AWS, Azure, GCP) use provider APIs with read-only IAM roles and are the recommended path for IaaS / PaaS / managed-service inventory. Network-based Discovery against cloud IP ranges misses managed services entirely and triggers cloud-provider security alerts. Service Graph Connectors (ServiceNow-certified integrations that load data from third-party tools into the CMDB) are an alternative for organizations that already operate a multi-cloud CMDB population tool. Spokes are the default; mix in Service Graph Connectors only when the spoke does not cover the resource type.
  • Service Mapping methodology: top-down patterns vs ML-driven vs traffic-based -- Top-down patterns (entry-point-driven traversal of network connections) are precise but the most expensive to maintain. Machine-learning-driven mapping (Service Mapping with ML) reduces the upfront pattern-writing effort but produces lower-precision maps. Traffic-based discovery via Service Graph Connectors for Dynatrace / AppDynamics / Splunk uses APM data as the relationship source, which can be the highest-precision approach but is available only to organizations that already operate an APM tool. Most organizations end up with a mix; document the choice per service.
  • CMDB Health remediation ownership -- CMDB Health Completeness / Compliance / Correctness gaps can be owned by the platform team (centralized remediation), by the CI-class owners (each CI class has an accountable team that fixes its own data), or split (platform team owns Compliance, CI owners own Completeness). The split model scales best but requires explicit handoff; the centralized model works at smaller scale; the per-class model is the most accurate but requires mature CI ownership.
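
For the Discovery cadence decision above, a back-of-the-envelope estimate can show whether a segment's run fits inside its cadence window. The per-IP probe time and the parallelism figure are placeholder assumptions, not ServiceNow figures, and should be replaced with measurements from the customer's own schedules. The segment names and IP counts are also hypothetical; the 4-hour and weekly cadences come from the checklist.

```python
from dataclasses import dataclass

SECONDS_PER_IP = 6   # assumed average probe time per IP; measure locally
PARALLELISM = 20     # assumed concurrent probes per MID Server
FAST_HOURS = 4       # fast cadence from the checklist (every 4 hours)
SLOW_HOURS = 168     # slow cadence from the checklist (weekly)


@dataclass
class Segment:
    name: str
    ip_count: int
    change_prone: bool


def estimated_hours(ip_count, mid_servers=1):
    """Rough run time for one schedule, assuming even load across MID Servers."""
    return ip_count * SECONDS_PER_IP / (PARALLELISM * mid_servers) / 3600


def plan(segments, mid_servers=1):
    """Return (name, cadence_hours, estimated_hours, fits_window) per segment."""
    rows = []
    for seg in segments:
        cadence = FAST_HOURS if seg.change_prone else SLOW_HOURS
        hours = estimated_hours(seg.ip_count, mid_servers)
        rows.append((seg.name, cadence, round(hours, 2), hours <= cadence))
    return rows


# Hypothetical segments.
segments = [
    Segment("dmz", 12_000, change_prone=True),
    Segment("network-gear", 60_000, change_prone=False),
]
for row in plan(segments):
    print(row)

# The same estate as one all-IPs nightly run, for comparison.
print(round(estimated_hours(200_000), 1), "hours for a single all-IPs schedule")
```

Under these placeholder numbers each segment completes within its own window, while a single 200,000-IP schedule would run well past a typical overnight window, which is the overrun the segmented approach is meant to avoid.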

See Also

  • providers/servicenow/itsm.md -- platform architecture, instance topology, licensing, Now Assist, IntegrationHub fundamentals
  • providers/servicenow/itsm-operations.md -- SLA engine, Performance Analytics, automation placement, incident state model
  • providers/servicenow/financial-management.md -- ServiceNow ITFM / Financial Management Pro / Cloud Cost Management, which consumes CMDB ownership and cost-center data
  • patterns/vmware-servicenow-chargeback.md -- end-to-end pattern that exercises CMDB, Discovery, IntegrationHub, and Financial Management against VMware
  • general/itsm-integration.md -- general ITSM integration patterns and the boundary decisions with adjacent systems
  • general/managed-services-scoping.md -- managed-services scope decisions including CMDB ownership and remediation responsibility

The following links point to ServiceNow public documentation. ServiceNow gates a portion of its product documentation behind a customer login; verify the current URL and bundle name against the customer's release (Yokohama, Zurich, etc.) before relying on a specific page.