Trace Graph

Trace Graph renders a live, interactive service-dependency map built from your trace data. Each node represents a service, and edges show the call relationships between them. Use it to understand your architecture, spot bottlenecks, and identify failing dependencies at a glance.

Getting Started

Navigate to Traces → Trace Graph in the sidebar. The graph populates automatically from trace spans observed in the selected time range.

Reading the Graph

Nodes

Each node represents a service (identified by the service.name span attribute). Nodes are color-coded by health:

Color	Meaning
Green	Healthy — low error rate
Yellow / Orange	Warning — elevated error rate
Red	Critical — high error rate
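
As a rough illustration of how an error rate could map to the three colors, here is a minimal Python sketch. The threshold values and the health_status function are hypothetical, chosen only for the example; this guide does not state the cutoffs Trace Graph actually uses.

```python
# Hypothetical thresholds for illustration only; the real cutoffs are not documented here.
WARNING_ERROR_RATE = 0.01   # assumed: 1% of requests failing
CRITICAL_ERROR_RATE = 0.05  # assumed: 5% of requests failing


def health_status(error_count: int, request_count: int) -> str:
    """Map a service's error rate to Healthy, Warning, or Critical."""
    if request_count == 0:
        # Assumption: a service with no traffic is treated as healthy.
        return "Healthy"
    error_rate = error_count / request_count
    if error_rate >= CRITICAL_ERROR_RATE:
        return "Critical"
    if error_rate >= WARNING_ERROR_RATE:
        return "Warning"
    return "Healthy"
```

With these assumed thresholds, a service with 2 errors in 100 requests would be shown as Warning, and one with 10 errors in 100 requests as Critical.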

Edges

Directed arrows between nodes indicate that one service calls another. The arrow direction follows the request flow (caller → callee).
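
To make the caller → callee idea concrete, the sketch below shows one way a dependency edge could be derived from parent/child spans. It is an assumption about how such a map might be built, not a description of Trace Graph's internals; the Span class and build_edges function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    span_id: str
    parent_id: Optional[str]  # None for a root span
    service_name: str         # value of the service.name attribute


def build_edges(spans: list[Span]) -> set[tuple[str, str]]:
    """Return (caller, callee) pairs for each parent/child span pair that crosses services."""
    by_id = {s.span_id: s for s in spans}
    edges = set()
    for span in spans:
        parent = by_id.get(span.parent_id) if span.parent_id else None
        # Spans whose parent is in the same service are internal work, not a call edge.
        if parent and parent.service_name != span.service_name:
            edges.add((parent.service_name, span.service_name))
    return edges


spans = [
    Span("a", None, "frontend"),
    Span("b", "a", "checkout"),
    Span("c", "b", "checkout"),  # internal span, produces no edge
    Span("d", "c", "payments"),
]
print(sorted(build_edges(spans)))
# [('checkout', 'payments'), ('frontend', 'checkout')]
```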

Sidebar Filters

The left-hand sidebar provides several filter controls:

Environment — Filter by deployment environment (e.g. production, staging).
Cluster — Filter by Kubernetes cluster.
Health status — Show only Healthy, Warning, or Critical services.
Service search — Type to find a specific service by name.
Service checkboxes — Toggle individual services on or off to simplify the view.
Hide services with no connections — Remove isolated nodes from the graph.
Compare with previous period — Enable comparison stats to see how metrics have changed.
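
The effect of the service checkboxes combined with "Hide services with no connections" can be sketched as a small filter over nodes and edges. The apply_filters function and its parameters are hypothetical, and the exact behavior in the product may differ.

```python
def apply_filters(services, edges, enabled, hide_isolated):
    """Filter a graph the way the sidebar controls are described.

    services: list of service names
    edges: set of (caller, callee) tuples
    enabled: set of service names whose checkbox is ticked
    hide_isolated: drop services that have no remaining edges
    """
    # An edge stays visible only if both of its endpoints are enabled.
    kept_edges = {(a, b) for a, b in edges if a in enabled and b in enabled}
    connected = {name for edge in kept_edges for name in edge}
    kept_nodes = [
        s for s in services
        if s in enabled and (not hide_isolated or s in connected)
    ]
    return kept_nodes, kept_edges
```

In this sketch, unticking a service can leave one of its neighbors isolated, and that neighbor then disappears as well when the hide option is on.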

Time Range

Use the time picker in the sidebar to change the observation window (maximum 24 hours). A shorter window shows recent call patterns; a longer window provides a broader picture.

Interacting with the Graph

Pan and zoom — Use the mouse wheel to zoom and drag to pan. The toolbar buttons (Zoom In, Zoom Out, Fit View) are also available.
Click a node — Opens a detail panel showing the service's error rate, latency percentiles, and throughput.
Hover an edge — Displays the request rate and error rate for that specific call path.
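
For readers who want to relate the detail panel to raw span data, the following sketch computes throughput, error rate, and latency percentiles from a list of span durations. It uses the nearest-rank percentile method as an assumption; the guide does not say which method or which percentiles Trace Graph reports, and the function names are hypothetical.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile of a non-empty sorted list; p is an integer in 1..100."""
    rank = math.ceil(p * len(sorted_values) / 100)
    return sorted_values[max(rank, 1) - 1]


def summarize(durations_ms, error_count, window_seconds):
    """Summarize one service (or one call path) over an observation window."""
    ordered = sorted(durations_ms)
    total = len(ordered)
    return {
        "throughput_rps": total / window_seconds,
        "error_rate": error_count / total if total else 0.0,
        "p50_ms": percentile(ordered, 50) if total else None,
        "p95_ms": percentile(ordered, 95) if total else None,
        "p99_ms": percentile(ordered, 99) if total else None,
    }
```

The same calculation applies per edge if only the spans for that caller and callee pair are passed in.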

Use Cases

Architecture discovery — Visualize how microservices interact without reading code or config files.
Incident triage — During an outage, identify which upstream or downstream dependency is the root cause.
Performance analysis — Find hot paths where multiple services converge and latency compounds.
Dependency auditing — Verify that services only communicate with expected dependencies.
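
Dependency auditing can also be done outside the UI if you have the observed edges as data. The sketch below compares observed caller → callee pairs against an allowlist. The service names, the EXPECTED mapping, and the unexpected_edges function are all hypothetical, and this guide does not describe how to obtain the edge list from Trace Graph.

```python
# Hypothetical allowlist: each caller maps to the callees it is expected to reach.
EXPECTED = {
    "frontend": {"checkout", "catalog"},
    "checkout": {"payments"},
}


def unexpected_edges(observed, expected):
    """Return observed (caller, callee) pairs that are not on the allowlist, sorted."""
    return sorted(
        (caller, callee)
        for caller, callee in observed
        if callee not in expected.get(caller, set())
    )


observed = {
    ("frontend", "checkout"),
    ("checkout", "payments"),
    ("frontend", "payments"),  # not on the allowlist
}
print(unexpected_edges(observed, EXPECTED))
# [('frontend', 'payments')]
```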

Best Practices

Use a short time range (for example, the last 15 minutes) for real-time incident investigation.
Use a longer range (1–24 hours) for architecture review and dependency mapping.
Filter by environment or cluster when you run multiple environments, so that production and staging traffic are not mixed in the same view.

Related Pages

Trace Lookup — Find a specific trace by its Trace ID.
Trace Explorer — Search and filter traces across services.

Support

If you need assistance or have any questions, please reach out to us through:

Email at [email protected]
