New Relic

Scope

This file covers New Relic One observability platform including APM (distributed tracing, service maps, error analytics, transaction monitoring across Java, .NET, Node.js, Python, Ruby, Go, PHP), infrastructure monitoring (hosts, containers, Kubernetes, cloud integrations), log management (log forwarding, parsing, filtering, Logs in Context for trace-log correlation), browser monitoring (real user monitoring, Core Web Vitals, JavaScript error tracking, session traces), synthetic monitoring (scripted browser checks, API tests, private minions), mobile monitoring, NRQL (New Relic Query Language) for ad-hoc analytics, consumption-based pricing model (data ingest in GB/month + optional add-ons), OpenTelemetry-native integration, New Relic AI (applied intelligence for alert correlation and anomaly detection), vulnerability management, change tracking, and the free tier (100 GB/month ingest + 1 full platform user). For general observability architecture, see general/observability.md.

Checklist

  • [Critical] Is the data ingest volume estimated and budgeted -- New Relic charges primarily on GB ingested per month (all telemetry types combined: metrics, events, logs, traces), with 100 GB/month free and additional data at approximately $0.30-$0.50/GB depending on commitment level -- and are data ingest controls (drop filters, sampling rules, log pipeline filtering) configured before production rollout?
  • [Critical] Is the user pricing model understood -- New Relic charges per "full platform user" per month ($49-$99/user/month for Standard, $349-$549/user/month for Pro, $549-$899/user/month for Enterprise edition depending on commitment), with unlimited free "basic users" (limited to dashboards and basic querying) -- and is the user provisioning strategy defined to minimize full platform user count?
  • [Critical] Are APM agents deployed with appropriate configuration -- language-specific agents (Java, .NET, Node.js, Python, Ruby, Go, PHP) with distributed tracing enabled, transaction naming conventions standardized, and custom instrumentation added for critical code paths not automatically captured?
  • [Critical] Is the data retention policy understood -- default retention varies by data type (metrics 13 months, events/logs/traces 30 days for Standard, 90 days for Pro/Enterprise), extended retention available at additional cost -- and does the retention period meet compliance and operational requirements?
  • [Critical] Are drop filters configured to exclude high-volume, low-value telemetry before it counts against data ingest -- particularly verbose debug logs, health check transactions, synthetic internal traffic, and high-cardinality custom events that inflate ingest volume without adding observability value?
  • [Recommended] Is Logs in Context enabled for all APM-instrumented applications -- automatically correlating log lines with distributed traces, service entities, and host metadata -- eliminating the need to manually search logs by timestamp when investigating performance issues?
  • [Recommended] Is NRQL proficiency established within the operations team -- NRQL is required for custom dashboards, alerting conditions, SLI/SLO definitions, and ad-hoc investigation, and is significantly more powerful than the point-and-click UI for complex queries (subqueries, faceted aggregations, funnel analysis, cohort comparison)?
  • [Recommended] Are alert conditions configured using NRQL-based alerting (not legacy alert types) -- with appropriate evaluation windows, signal-loss handling, and incident preference settings (by policy, by condition, or by condition and signal) to control alert noise and routing?
  • [Recommended] Is the New Relic Kubernetes integration deployed -- using the guided install with the nri-bundle Helm chart that deploys infrastructure agent DaemonSet, kube-state-metrics, nri-kubernetes integration, Pixie (optional eBPF-based auto-instrumentation), and log forwarding -- with cluster-level dashboards and alerting configured?
  • [Recommended] Are workloads defined to group related entities (services, hosts, containers, synthetics) into business-meaningful collections -- enabling workload-level health status, SLOs, and alerting that aligns with team ownership and business service boundaries?
  • [Recommended] Is synthetic monitoring configured for critical user journeys -- scripted browser monitors for UI workflows, API tests for service endpoints, and private minions (containerized synthetic workers) for internal applications behind firewalls -- with SLA reporting enabled?
  • [Recommended] Is New Relic AI (applied intelligence) configured for alert correlation -- automatically grouping related incidents to reduce alert noise, with decisions (correlation rules) tuned based on topology, timing, and custom metadata to prevent unrelated alerts from being incorrectly correlated?
  • [Optional] Is OpenTelemetry used as the primary instrumentation strategy -- New Relic natively supports OTLP ingestion and provides first-class OpenTelemetry experiences in the UI, enabling vendor-neutral instrumentation that avoids agent lock-in while retaining full platform capabilities (entity synthesis, service maps, Logs in Context)?
  • [Optional] Is the New Relic Vulnerability Management feature evaluated for runtime security -- it identifies known vulnerabilities (CVEs) in libraries and dependencies detected by APM agents, providing security context alongside performance data without requiring a separate scanning tool?
  • [Optional] Is change tracking configured to correlate deployments with performance changes -- using the change tracking API or CI/CD integrations (GitHub Actions, Jenkins, etc.) to mark deployment events on charts and enable automated deployment impact analysis?
  • [Optional] Is the CodeStream IDE integration evaluated for development teams -- enabling developers to see production telemetry (errors, performance, logs) directly in their IDE (VS Code, JetBrains) linked to specific code locations, reducing context-switching during debugging?
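The ingest-budgeting items above can be sketched as a back-of-envelope estimator. This is a minimal sketch: the 100 GB/month free allowance and the $0.30-$0.50/GB range are the figures quoted in this checklist, not authoritative pricing, and the default rate chosen here is an illustrative midpoint.

```python
# Back-of-envelope New Relic ingest budget estimator.
# The free allowance and per-GB rates mirror the figures quoted in the
# checklist above; confirm actual rates against your contract.

FREE_GB_PER_MONTH = 100


def monthly_ingest_cost(ingest_gb: float, rate_per_gb: float = 0.40) -> float:
    """Estimated monthly cost for ingest beyond the free allowance."""
    billable_gb = max(0.0, ingest_gb - FREE_GB_PER_MONTH)
    return billable_gb * rate_per_gb


# Example: 2 TB/month of total telemetry at a committed rate of $0.30/GB.
cost = monthly_ingest_cost(2048, rate_per_gb=0.30)
print(f"${cost:,.2f}/month")  # billable: (2048 - 100) GB at $0.30
```

Running estimates like this per team, before agents are deployed, makes the drop-filter and sampling decisions above concrete rather than reactive.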

Why This Matters

New Relic has repositioned itself as the most accessible enterprise observability platform with its consumption-based pricing model. The 100 GB/month free tier and unlimited basic users remove the barrier to entry that historically kept observability tools siloed within operations teams. Any developer can have a New Relic account and build dashboards without incurring per-seat cost, which fundamentally changes observability adoption patterns -- instead of a dedicated monitoring team, every development team can own their observability. However, this accessibility creates a cost management challenge: without data ingest governance, organizations routinely exceed their data budget within weeks of production deployment.

The data ingest pricing model ($0.30-$0.50/GB) appears simple but requires careful analysis of what generates data volume. A single verbose Java application generating 10 KB of log data per request at 1,000 requests/second produces approximately 26 TB/month of log data alone -- potentially $8,000-$13,000/month from one application. Logs typically account for 60-80% of total data ingest volume. Implementing log-level filtering (only forwarding WARN and above to New Relic), drop filters for known noisy patterns, and sampling for high-volume trace data are essential cost controls that must be designed before production deployment, not after the first invoice arrives.
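The arithmetic behind that example can be verified directly (using decimal units, as billing typically does):

```python
# Reproduce the log-volume arithmetic from the paragraph above:
# 10 KB of log data per request at 1,000 requests/second, for a 30-day month.
KB = 1_000  # decimal kilobyte, as used in billing
SECONDS_PER_MONTH = 86_400 * 30

bytes_per_month = 10 * KB * 1_000 * SECONDS_PER_MONTH
gb_per_month = bytes_per_month / 1e9
tb_per_month = gb_per_month / 1_000

# Cost range at the $0.30-$0.50/GB rates quoted above.
low, high = gb_per_month * 0.30, gb_per_month * 0.50
print(f"{tb_per_month:.1f} TB/month -> ${low:,.0f}-${high:,.0f}/month")
```

This lands at roughly 25.9 TB/month and $7,800-$13,000/month, matching the figures in the paragraph.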

NRQL (New Relic Query Language) is the platform's greatest strength and biggest adoption barrier. Unlike Datadog's point-and-click dashboard builder, New Relic's most powerful capabilities require writing NRQL queries. Teams that invest in NRQL proficiency unlock ad-hoc investigation, custom SLI/SLO tracking, business-metric correlation, and funnel analysis that go far beyond standard APM dashboards. Teams that avoid NRQL end up using only the pre-built UI views and miss the platform's core value proposition.
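To make the NRQL investment concrete, here are two illustrative queries of the kind referenced above, held as Python strings (e.g. for submission via the NerdGraph API or a dashboard-as-code workflow). `Transaction`, `duration`, `appName`, and `error` are standard New Relic APM event attributes; `checkout-service` is a hypothetical application name.

```python
# Illustrative NRQL queries for a hypothetical "checkout-service" app.
# Held as strings so they can be reused in dashboards, alert conditions,
# or NerdGraph API calls.

# p95 latency per transaction, charted over the last hour.
p95_latency = (
    "SELECT percentile(duration, 95) FROM Transaction "
    "WHERE appName = 'checkout-service' "
    "FACET name SINCE 1 hour ago TIMESERIES"
)

# Error rate over the last day -- a typical SLI definition.
error_rate_sli = (
    "SELECT percentage(count(*), WHERE error IS true) AS 'error rate' "
    "FROM Transaction WHERE appName = 'checkout-service' "
    "SINCE 1 day ago"
)

print(p95_latency)
print(error_rate_sli)
```

Queries like these power the custom SLI/SLO tracking and faceted analysis that the pre-built UI views cannot express.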

Common Decisions (ADR Triggers)

  • New Relic vs Datadog vs Dynatrace -- New Relic offers the lowest barrier to entry (free tier, consumption pricing, unlimited basic users) and strong NRQL-based analytics. Datadog provides broader integration breadth and more granular pricing control per capability. Dynatrace offers superior automatic discovery and AI-driven root cause analysis. Choose New Relic when cost-effective full-platform access for all engineers is the priority and the team is willing to invest in NRQL; Datadog when per-capability cost control and integration breadth matter; Dynatrace when automatic instrumentation and causal AI are the primary requirements.
  • New Relic agents vs OpenTelemetry instrumentation -- New Relic proprietary agents provide the deepest integration (automatic transaction detection, vulnerability management, code-level visibility, Logs in Context auto-injection). OpenTelemetry provides vendor neutrality and avoids lock-in but requires more manual configuration and may lack some agent-specific features. Recommended approach: use New Relic agents for primary application services where deep visibility matters; use OpenTelemetry for polyglot environments, third-party services, or where vendor portability is a hard requirement. New Relic supports both simultaneously.
  • Data ingest commitment tier -- New Relic offers committed data ingest volumes at discounted rates (annual commitment). The free tier includes 100 GB/month. Pay-as-you-go is approximately $0.50/GB; annual commitment reduces to approximately $0.30/GB. Estimate baseline ingest during a 30-60 day proof of concept, then commit at baseline + 25% buffer. Over-commitment is wasted; under-commitment incurs higher per-GB rates on overage.
  • Full platform users vs basic users allocation -- Full platform users ($349-$899/user/month depending on edition) get access to APM, distributed tracing, synthetic scripting, NRQL alerting, and all advanced features. Basic users (free) get dashboards, basic alerting, and log querying. Minimize full platform user count by limiting it to on-call engineers, SRE teams, and team leads; give all other engineers basic user access. This can reduce licensing cost by 60-80% compared to giving everyone full access.
  • Single account vs multi-account organization -- A single account simplifies management and cross-service correlation. Multiple accounts (under one organization) provide data isolation, separate billing, and independent user management per business unit or environment. Use a single account for most organizations; multi-account when regulatory, billing, or organizational boundaries require strict data separation. Cross-account querying is available but adds complexity.
  • Log forwarding strategy -- Forward all logs to New Relic (maximum correlation, highest cost) vs forward only WARN/ERROR and above (reduced cost, may miss context) vs forward all logs but apply drop filters for known noisy patterns (balanced approach). The recommended pattern is: forward all application logs with Logs in Context enabled, apply drop filters for infrastructure logs (health checks, cron noise, access logs for non-production), and use sampling for high-volume debug logging.
  • New Relic AI adoption -- enable AI-powered alert correlation and anomaly detection (reduced alert noise, faster MTTR) vs manual correlation workflows (more predictable behavior, no AI-driven grouping surprises); evaluate AI-generated insights for root cause suggestions alongside manual investigation.
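The log-forwarding decision above ("forward everything, drop known noise") is typically implemented with NRQL drop rules created through the NerdGraph API. The sketch below builds such a request payload; the `nrqlDropRulesCreate` mutation name matches New Relic's public NerdGraph schema at the time of writing, but verify field names against current docs. The account ID and health-check URI are placeholders.

```python
# Sketch: build a NerdGraph request that creates a drop filter for
# load-balancer health check logs. POST the payload to
# https://api.newrelic.com/graphql with an API-Key header.
import json

ACCOUNT_ID = 1234567  # placeholder account ID

mutation = """
mutation($accountId: Int!, $rules: [NrqlDropRulesCreateDropRuleInput!]!) {
  nrqlDropRulesCreate(accountId: $accountId, rules: $rules) {
    successes { id }
    failures { error { reason description } }
  }
}
"""

variables = {
    "accountId": ACCOUNT_ID,
    "rules": [
        {
            "action": "DROP_DATA",
            "description": "Drop load-balancer health check logs",
            # Matched data is dropped BEFORE it counts against ingest.
            "nrql": "SELECT * FROM Log WHERE request.uri = '/healthz'",
        }
    ],
}

payload = json.dumps({"query": mutation, "variables": variables})
```

Because dropped data never counts against ingest, rules like this directly implement the "balanced approach" in the log forwarding strategy above. Note that drop rules are irreversible for the data they match, so test the NRQL predicate with a `SELECT count(*)` query first.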

AI and GenAI Capabilities

New Relic AI (Applied Intelligence) -- ML-powered alert correlation and anomaly detection. Automatically groups related incidents using topology, timing, and metadata signals. Proactive detection identifies anomalies in golden signals (throughput, latency, errors) before they trigger threshold-based alerts. Reduces alert noise by 60-80% in typical deployments by correlating symptomatic alerts into single incidents.

New Relic AI Monitoring -- Purpose-built monitoring for AI/LLM applications. Tracks AI model invocations, token usage, response times, and error rates. Compatible with OpenAI, Amazon Bedrock, LangChain, and other AI frameworks. Provides visibility into prompt/completion token costs, model performance comparison, and hallucination detection signals.

NRQL AI Assistant -- Natural language to NRQL query translation. Allows users to describe what they want to investigate in plain language and generates the corresponding NRQL query. Lowers the barrier to NRQL adoption for teams new to the query language.

See Also

  • general/observability.md -- general observability architecture patterns and pillar design
  • providers/datadog/observability.md -- Datadog observability platform for comparison
  • providers/dynatrace/observability.md -- Dynatrace observability platform for comparison
  • providers/prometheus-grafana/observability.md -- Prometheus and Grafana for open-source monitoring comparison