
PostgreSQL Tracing

For a recent project of mine, I was looking into observability around databases, focusing on open source tooling with PostgreSQL and OpenTelemetry. While there are various options for monitoring databases, most do not integrate with OpenTelemetry or otherwise combine PostgreSQL telemetry with application-specific telemetry. For anyone who prefers to see all information about their application at a single glance, however, a separate observability tool for the database is not good enough.

This post outlines how PostgreSQL telemetry can currently be integrated into the OpenTelemetry world and how that could be improved, and it showcases a prototype I built to generate traces from PostgreSQL queries.

Propagating Trace Context Along your SQL Statements

With Sqlcommenter, Google introduced a suite of middleware and plugins that enable ORMs to augment SQL statements with comments containing information about the code that triggered their execution (see Sqlcommenter). It has since been donated to OpenTelemetry and integrated into many of the official OpenTelemetry instrumentation libraries.

If you use the OpenTelemetry instrumentation for Node.js together with the Knex SQL query builder, the Knex instrumentation can add such comments to your SQL queries, as shown in the example below.

select "cartItem"."id",
       "cartItem"."product_id",
       "cartItem"."quantity",
       "product"."name",
       "product"."price"
  from "cartItem"
 inner join "product"
    on "cartItem"."product_id" = "product"."id"
 where "user_id" = $1
/*traceparent='00-866e8698eac009878483cfd029f62af9-bb14062982c63db3-01'*/
SQL statement that includes trace context information in a comment.
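
Setting this up boils down to registering the instrumentations when initializing the OpenTelemetry SDK. The sketch below shows one way to do this in Node.js; the service name is a placeholder, and addSqlCommenterCommentToQueries is the option exposed by the pg instrumentation for injecting the traceparent comment. Whether you need it in addition to the Knex instrumentation depends on your package versions, so treat this as an assumption rather than the exact setup used here.

import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { KnexInstrumentation } from '@opentelemetry/instrumentation-knex';
import { PgInstrumentation } from '@opentelemetry/instrumentation-pg';

const sdk = new NodeSDK({
  serviceName: 'shop-backend', // hypothetical service name
  traceExporter: new OTLPTraceExporter(), // defaults to the local OTLP/HTTP endpoint
  instrumentations: [
    new KnexInstrumentation(),
    // Assumption: this option makes the pg instrumentation append the
    // traceparent comment to outgoing queries.
    new PgInstrumentation({ addSqlCommenterCommentToQueries: true }),
  ],
});

sdk.start();
Node.js tracing setup (sketch) that enables SQL comment injection.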

Once this instrumentation is set up, we also need to get information about the query execution out of the database, together with the propagated traceparent. Aside from what we can query from the pg_stat_statements view, the information we are generally after is the query duration, the query plan, and cache hit statistics for a query. We can configure PostgreSQL to log this information: log_statement = 'all' creates a log entry for each SQL statement sent to the server, and log_min_duration_statement (or log_duration) adds the execution duration.
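
A minimal logging configuration along those lines could look like the following; a zero threshold logs every statement, which is fine for experiments but too verbose for most production systems.

log_statement = 'all'              # log every SQL statement sent to the server
log_min_duration_statement = 0     # log the duration of every completed statement
PostgreSQL configuration (sketch) for statement and duration logging.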

With the auto_explain module we can automatically log query plans. To reduce the number of log entries generated, we can log the query plan only for slow queries or enable sampling. The configuration below sets up auto_explain to log the plan for every query, including the actual execution statistics and buffer usage for each step of the query plan/executor.

shared_preload_libraries = 'auto_explain'

auto_explain.log_min_duration = 0 # 0 logs every query; a threshold such as 10ms logs only slower queries
auto_explain.log_format = 'json'
auto_explain.log_analyze = true
auto_explain.log_buffers = true
PostgreSQL configuration for auto_explain

Now we can look at a trace, take the trace ID, and search our logs for context about what happened on the database. That way we can better investigate why specific queries are slow. Additionally, we can infer some information about our network latency by comparing the total elapsed time for the query in the application with the duration reported by the database.
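
With Loki as the log backend, such a lookup can be a simple line filter on the trace ID; the label selector below is an assumption about how the PostgreSQL logs are labeled.

{container="postgres"} |= "866e8698eac009878483cfd029f62af9"
LogQL query (sketch) filtering PostgreSQL logs by trace ID.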

Screenshot of Grafana showing a trace panel on the left and a logs panel on the right. The active span has a button labeled "Logs for this span". On the logs panel, the logs query searches for a trace ID.

Grafana with Jaeger and Loki

If you use Grafana, you can configure your tracing data source, such as Jaeger or Tempo, to jump directly from a trace to related logs using the trace-to-logs feature. The figure above shows a “Logs for this span” button on a trace span. When clicked, this button opens the log search on the right side and automatically searches for logs with that trace ID. In the logs, we can look for the fields that contain the plan information, as shown in the figure below.
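
If you provision data sources from files, this behavior can be enabled in the data source's jsonData. The following is a minimal sketch for a Jaeger data source; the URL and the Loki data source UID are assumptions about the local setup.

apiVersion: 1
datasources:
  - name: Jaeger
    type: jaeger
    url: http://jaeger:16686        # assumption: Jaeger query endpoint
    jsonData:
      tracesToLogsV2:
        datasourceUid: loki         # assumption: UID of the Loki data source
        filterByTraceID: true       # add the trace ID to the generated log query
Grafana data source provisioning (sketch) enabling trace-to-logs.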

A screenshot of a Grafana logs panel with the start of a query plan.

Grafana logs panel showing the query plan.

Extracting Traces from PostgreSQL

Setting up Grafana to jump from traces to logs is feasible, but it’s not exactly easy. Additionally, I think it’s still quite hard to find the plan information in your logs. From a usability point of view, we could do better. For example, it would be neat to have this information directly available as trace spans in our tracing system. Proprietary offerings such as Google Cloud SQL and pganalyze can do this, but for any other PostgreSQL instance I did not find a way to achieve it.

Since most of the required information is available in our log entries, I set out to build a prototype that extracts traces from our PostgreSQL query plan logs and respects the traceparent information propagated to PostgreSQL from our instrumentation.

The prototype is built as a connector for the OpenTelemetry Collector. A connector is a newer component type in the Collector that acts as a bridge between telemetry pipelines. One use case is consuming signals from one pipeline, deriving another signal type from them, and exporting those signals into another pipeline. For example, the span metrics connector can be used to aggregate request, error, and duration (RED) metrics from trace spans.
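
In configuration terms, a connector is declared once and then referenced as an exporter in one pipeline and as a receiver in another. A minimal sketch for the span metrics connector, with placeholder backends, could look like this:

receivers:
  otlp:
    protocols:
      grpc: {}
connectors:
  spanmetrics: {}
exporters:
  otlp/jaeger:
    endpoint: jaeger:4317                       # placeholder tracing backend
    tls:
      insecure: true
  prometheusremotewrite:
    endpoint: http://mimir:9009/api/v1/push     # placeholder metrics backend
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/jaeger, spanmetrics]
    metrics:
      receivers: [spanmetrics]
      exporters: [prometheusremotewrite]
Collector configuration (sketch) deriving RED metrics from spans with the span metrics connector.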

A connector is a good fit for what we want to achieve, and I used this approach to build a rough prototype that I published on GitHub (elessar-ch/pgtraceconnector).
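
Wired following the same pattern, the prototype bridges a logs pipeline into a traces pipeline. The pipelines section of a collector configuration could look roughly like the sketch below; pgtrace is a placeholder for the connector's actual component name, and filelog stands in for however the PostgreSQL logs reach the collector.

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [pgtrace]     # the connector consumes the PostgreSQL log entries...
    traces:
      receivers: [pgtrace]     # ...and emits the extracted spans into the traces pipeline
      exporters: [otlp]
Pipelines section (sketch) bridging PostgreSQL logs into traces via the prototype connector.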

The figure below shows Grafana with traces from PostgreSQL logs using the connector. Each step of the query plan / executor is displayed as an individual span and the information from auto_explain is available in the span attributes.

A screenshot of Grafana with a trace panel that includes spans for a query plan.

Grafana’s trace view including spans from the database.

If you want to check it out yourself, a test setup using Docker Compose, with an appropriately configured PostgreSQL database, an example application, and the OpenTelemetry components, is available on GitHub (elessar-ch/sql-tracing).

Conclusion

To incorporate query-specific information into our OpenTelemetry-compliant observability systems, we can propagate traceparent information to the database using the OpenTelemetry instrumentation libraries and correlate our traces with logs. A more user-friendly way to get this information at a glance is to extract traces from the logs, which is practically achievable by implementing a connector for the OpenTelemetry Collector.

If you have any questions, are interested in hearing more, or want to share your thoughts with me, let me know on LinkedIn or Mastodon.