• /
  • Log in
  • Free account

Introduction to distributed tracing

Distributed tracing tracks and observes service requests as they flow through distributed systems. With distributed tracing data, you can quickly pinpoint failures or performance issues and fix them.

Distributed tracing systems collect data as the requests go from one service to another, recording each segment of the journey as a span. These spans contain important details about each segment of the request and are combined into one trace. The completed trace gives you a picture of the entire request.

Diagram showing how spans are generated in a distributed trace

Here is an example a web transaction where agents measure the time spent in each service. Agents then send that timing information to New Relic as spans where they are combined into one distributed trace.

Why it matters

A request might pass through various microservices to reach completion. The microservices or functions could be located in multiple containers, serverless environments, virtual machines, different cloud providers, on-premises, or any combination of these.

For example, let's say that you're in a situation where a slow-running request affects the experience of a set of customers:

  • The request is distributed across multiple microservices and serverless functions.
  • Several different teams own and monitor the various services that are involved in the request.
  • None of the teams have reported any performance issues with their microservices.

Without a way to view the performance of the entire request across the different services, it’s nearly impossible to pinpoint where and why the high latency is occurring and which team should address the issue.

Instrumentation: The key to distributed tracing

Distributed tracing starts with the instrumentation of your services to enable data collection and correlation across the entire distributed system. Instrumention means either manually adding code to services or installing agents that automatically track trace data.

Many of our New Relic solutions automatically instrument your services for a large number of programming languages and frameworks. You can also use open source tools and open instrumentation standards to instrument your environment. OpenTelemetry, part of the Cloud Native Computing Foundation (CNCF), is becoming the one standard for open source instrumentation and telemetry collection.

What you can see in the New Relic UI

After the data is collected, you can visualize it to see service dependencies, performance, and any anomalous events such as errors or unusual latency. Here are some examples of what you can do with your data:

What you can do

Description

Detect anomalous spans

Spans that are slow in comparison to typical behavior are marked as anomalous, with charts comparing them to typical performance.

See your errors and logs

Frontend and backend errors appear right in the context of your traces. Everything you need to troubleshoot is in one place.

Filter results

You can filter charts using many data points, so you can analyze trace data in different ways.

Customize queries and dashboards

You can create custom queries of your trace data and create custom data dashboards.

See data across accounts

See a global view of traces from across all your accounts and applications in New Relic One.

Query traces programmatically

Query distributed trace data by using GraphQL in our NerdGraph API explorer.

Next steps

Here are some tasks to consider:

For more help

If you need more help, check out these support and learning resources:

Create issueEdit page
Copyright © 2021 New Relic Inc.