OpenTelemetry Traces and PostgreSQL (timescale.com)
158 points by carlotasoto on May 16, 2022 | 31 comments


I clicked on this hoping for an announcement of OpenTelemetry support within postgres :(

Would've been cool to be able to trace down from application-level SQL execution to postgres-level index scans, table scans, and maybe even the full query plan.

This turned out to be the opposite.


Maybe surprisingly, there doesn't seem to be an open source C client library for OpenTelemetry at the moment -- I'd guess that such a library would make introducing spans/traces in PostgreSQL a bit easier (although to be honest, I don't know/remember much about calling and linking to C++ libraries from a C codebase).

(note: there is a C++ client[1], although they mention that support for C is not currently[2] a goal of the project -- and I couldn't find any C implementations elsewhere)

[1] - https://github.com/open-telemetry/opentelemetry-cpp.git/

[2] - https://github.com/open-telemetry/opentelemetry-cpp/tree/db6...


Follow-up after doing some more research out of curiosity: it looks like Fluent Bit's C library[1] may support OpenTelemetry trace output in future.

A quote from their v1.9.0 release notes[1]:

> We fully support Prometheus & OpenMetrics and we are also shipping experimental OpenTelemetry metrics support (spoiler: traces will come shortly!).

(I've no affiliation with Fluent nor particular knowledge of observability, so if anyone's interested in these findings, take them with a grain of salt :))

[1] - https://docs.fluentbit.io/manual/development/library_api

[2] - https://fluentbit.io/announcements/v1.9.0/


Vector is going to add an OpenTelemetry source and sink too [0]. The PR isn't merged yet, but Envoy Proxy is about to get an OpenTelemetry exporter written in C++ [1].

0. https://vector.dev/releases/0.21.0/#whats-next

1. https://github.com/envoyproxy/envoy/pull/20281


I wrote the code for the OpenTelemetry metrics I/O plugins and the underlying decoder and encoder; I'd be happy to answer any questions related to it.


Hello :)

Would capturing and sending OpenTelemetry metrics from the PostgreSQL codebase using the Fluent Bit C library + the OT plugin require a separate process on the host at runtime?

(in other words, does the library ship those events to a collector/transmitter process, or is everything handled in-process?)


Everything is handled in process, you can find some examples in the runtime test cases.

[1] - https://github.com/fluent/fluent-bit/blob/master/tests/runti...


Thank you!


SQL queries over traces are definitely worth it. Android and Chrome have had it for a while [1]. I once wrote about quantifying the UI janky-ness using it [2].

The point is that it can give you 1. quantitative comparison at scale and 2. alternative visualizations that reveal problems which aren't obvious from the default timeline view. With it, performance investigation becomes more like exploratory data analysis than a torture of your eyes.

It's not clear if Promscale can cross-reference other types of performance metrics. If it's possible, that'd be a game changer.

[1] https://perfetto.dev/

[2] https://notes-dodgson-org.translate.goog/android/trace-proce...


(Timescale co-founder)

The Promscale team is at KubeCon right now, so I'll jump in to answer this question.

Yes, you can actually cross-analyze traces with prometheus metrics in Promscale. That in fact is one of the key reasons we built Promscale, and is something we can do because it is built on top of TimescaleDB.

    If it's possible that'd be a game changer.
I hope it is! And if not, we're always open to product feedback.


(Promscale team member)

As @akulkarni said, Promscale supports Prometheus metrics and OpenTelemetry traces natively and there are different ways to correlate both signals. I am actually delivering a talk that goes over the different ways you can correlate Prometheus and OpenTelemetry data tomorrow at Prometheus Day :)

One is adding exemplars to your Prometheus metrics that link to specific traces representative of the value of those metrics. In Promscale you can store all that information and then display it in Grafana as explained in [1]. That's the way that is most often discussed, but it typically involves deploying one backend for metrics and one backend for traces instead of just one, as is the case with Promscale.

With Promscale you can also correlate metrics and traces using SQL joins. That opens up a whole set of new possibilities. For example, imagine if you could retrieve the slowest requests happening on services running on nodes where CPU usage is high, to understand how that is impacting their performance. Or imagine you are seeing a specific OOM error often in your traces and you could run a SQL query looking at the evolution of memory usage over the last 24 hours on the nodes where those OOM errors are most frequent, to see if you spot anything strange happening. You could even go a step further and retrieve in the same query which processes are consuming the most memory on those nodes, to pinpoint the processes that could be causing the issue.

[1] https://grafana.com/docs/grafana/latest/basics/exemplars/
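The metrics/traces join described above can be sketched with a toy example. This uses SQLite from Python purely for illustration; the table names and columns (`cpu_usage`, `span`, etc.) are invented and are not Promscale's actual schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical, simplified tables; Promscale's real schema differs.
cur.executescript("""
CREATE TABLE cpu_usage (node TEXT, ts INTEGER, value REAL);
CREATE TABLE span (trace_id TEXT, node TEXT, ts INTEGER, duration_ms REAL);

INSERT INTO cpu_usage VALUES ('node-a', 100, 0.95), ('node-b', 100, 0.20);
INSERT INTO span VALUES ('t1', 'node-a', 101, 900.0),
                        ('t2', 'node-b', 101, 50.0),
                        ('t3', 'node-a', 102, 750.0);
""")

# Slowest spans that ran on nodes whose CPU usage exceeded 90%.
rows = cur.execute("""
    SELECT s.trace_id, s.node, s.duration_ms
    FROM span s
    JOIN cpu_usage c ON c.node = s.node
    WHERE c.value > 0.9
    ORDER BY s.duration_ms DESC
""").fetchall()
print(rows)  # [('t1', 'node-a', 900.0), ('t3', 'node-a', 750.0)]
```

The point is the join shape: once metrics and spans live in the same SQL database, correlating them is an ordinary JOIN rather than a cross-tool export.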


That all sounds exciting!

I'm not a server-side person, but in my experience, other types of data an app developer might want to join are various kinds of product-specific data, like feature flags or user-specific dimensions for each request. These aren't typically in the trace data itself.

These are different from what product-agnostic "performance engineers" tend to look into, so I understand if this is out of scope. Although I think product people should look into these numbers as well, instead of just throwing them onto the performance team's plate :-/


It depends on what exactly you are referring to.

I should have mentioned that correlating observability data (or sometimes product metrics collected via Prometheus) with product data (or really any other data, like business data) can be super useful and is totally possible with Promscale because PostgreSQL is under the hood. So you could copy that data into the same PostgreSQL instance used by Promscale, or maybe use Foreign Data Wrappers [1]. This would allow you to analyze, for example, API request latency by the product plan the customer is subscribed to, or based on which feature flags are enabled for their account, etc., without having to add all those attributes as labels to all your metrics, which can be technically complex and also costly.

[1] https://www.postgresql.org/docs/current/ddl-foreign-data.htm....
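To make the idea concrete, here is a minimal sketch of that kind of join, using SQLite from Python. The schema is made up: in a real Promscale setup the `account` table might live in another Postgres database and be exposed through a foreign data wrapper, but the query shape is the same:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical tables: 'account' stands in for business data that a
# foreign data wrapper could expose alongside the metrics.
cur.executescript("""
CREATE TABLE request_latency (account_id TEXT, latency_ms REAL);
CREATE TABLE account (account_id TEXT, plan TEXT);

INSERT INTO account VALUES ('a1', 'free'), ('a2', 'enterprise');
INSERT INTO request_latency VALUES
  ('a1', 120.0), ('a1', 80.0), ('a2', 40.0), ('a2', 60.0);
""")

# Average API latency per product plan, with no extra metric labels.
rows = cur.execute("""
    SELECT a.plan, AVG(r.latency_ms)
    FROM request_latency r
    JOIN account a USING (account_id)
    GROUP BY a.plan
    ORDER BY a.plan
""").fetchall()
print(rows)  # [('enterprise', 50.0), ('free', 100.0)]
```

This is exactly the "latency by plan" analysis mentioned above, done as a GROUP BY over a join instead of baking `plan` into every metric label.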


We actually do this within Timescale Cloud, and it's amazing.

It allows us to cohort performance data across data stored in other microservices' databases (e.g., by account types, projects, billing data, etc.): JOINs across foreign data wrappers using TimescaleDB + Postgres, all within the database and with no ETL or application code needed.

So you could look at Prometheus data for your trial users vs. customers, for customers running more than X services, for customers that pay more than $X per month or have been a customer for more than 6 months, etc.

It's super useful across operations, support, product, customer care, and more...


Maybe I'm missing something, but what are the major differences between Perfetto and an OpenTelemetry tracing / metrics approach? In other words, why would someone choose one tool over the other?

Naively, it seems like Perfetto was designed around tracing on-device behavior while OTel focuses more on distributed tracing. However, I'm not sure why two distinct solutions would be required for that constraint.


Looks very nice!

I built something similar by forking the Datadog tracing library to also send each span as JSON to Kinesis Firehose, and from there to S3. You can then easily query it using AWS Glue (for schema inference) + Athena (as the actual SQL engine).

It's indeed really nice for digging deep or doing performance analysis, though I'd say for 99% of cases the Datadog APM UI + query editor is much preferred. In practice we'd occasionally hit a query that we couldn't express in the DD UI and would run it through SQL instead, but that happens very rarely. It's also nice for cheaply achieving 100% sampling if you ever need to find a single specific request that happened.

I wonder what the UX would be of using something like this with a tool like Metabase for the UI (with an actually fast query engine underneath, like Timescale or ClickHouse).


Somewhat relatedly, the Uptrace[1] project is building a tool for querying observability data from a ClickHouse storage layer.

[1]: https://get.uptrace.dev/


I've recently stumbled upon Signoz.io, which also builds on top of ClickHouse, but this seems interesting as well.

Wondering why Uptrace is creating modules for their Node/web tracer when they do the same thing as the OTel instrumentations.


(Promscale team member)

That's great!

SQL definitely opens up lots of possibilities to analyze the data. And lots of visualization tools already integrate with PostgreSQL/TimescaleDB, which Promscale uses to store data, so you have a lot of options to pick from.

Would you mind sharing some of the problems you've had to use SQL for?


Hey, two main use cases:

1. Find a trace where there is a span of type A with tag `xyz: true`, that has a direct child span of type B with tag `abc: 42`, that has at least one non-direct child span of type C with tag `lmn: forty-two`. So basically complex queries where I'm looking for a trace while making assertions about tags in different spans, possibly spans that are very far from each other in the trace (non-direct child, just joining on trace_id in that case, not child.parent_id = parent.span_id).

2. Analytical performance analysis across all spans of a certain type in a specified time span (a week for example). I have spans of type A, each of these spans has multiple children of type B, now I want to make some calculations involving the duration of A, the count of B, durations of B, and then slice and dice (group by) this by specific tags in those spans.
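Use case 1 can be sketched as a self-join. Again this uses SQLite from Python for illustration only; the flattened span table (one row per span/tag pair) is an invented stand-in for a real tracing schema, which would typically store tags as a map/jsonb column, but the join shape is the same:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# Hypothetical flattened span table, one row per (span, tag).
cur.executescript("""
CREATE TABLE span (
  trace_id TEXT, span_id TEXT, parent_id TEXT,
  type TEXT, tag_key TEXT, tag_val TEXT
);
INSERT INTO span VALUES
  ('t1', 'a1', NULL, 'A', 'xyz', 'true'),
  ('t1', 'b1', 'a1',  'B', 'abc', '42'),
  ('t1', 'c1', 'b1',  'C', 'lmn', 'forty-two'),
  ('t2', 'a2', NULL, 'A', 'xyz', 'false');
""")

# Span A with xyz=true, a direct child B with abc=42, and any span C
# with lmn=forty-two in the same trace (joined on trace_id only, since
# C need not be a direct child).
rows = cur.execute("""
    SELECT DISTINCT a.trace_id
    FROM span a
    JOIN span b ON b.parent_id = a.span_id AND b.type = 'B'
               AND b.tag_key = 'abc' AND b.tag_val = '42'
    JOIN span c ON c.trace_id = a.trace_id AND c.type = 'C'
               AND c.tag_key = 'lmn' AND c.tag_val = 'forty-two'
    WHERE a.type = 'A' AND a.tag_key = 'xyz' AND a.tag_val = 'true'
""").fetchall()
print(rows)  # [('t1',)]
```

The direct-child constraint joins on `parent_id`, while the anywhere-in-the-trace constraint joins only on `trace_id`, matching the distinction drawn in point 1.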


Thanks for sharing. Makes total sense. As soon as you have to query across span attributes AND span parent/child relationships you need a more sophisticated query language.


I'm very excited about this! We already have TimescaleDB as our primary datastore, so using it to store traces is a very natural approach.


(Promscale team member)

We are also super excited to offer this capability to all our TimescaleDB users.

Feel free to ping us on our community Slack (https://slack.timescale.com, #promscale channel) or drop me an email at ramon[at]timescale.com. We would be happy to answer any questions you have, learn more about your needs, and get your feedback on the product.


Actually, try https://slack.timescale.com/ instead if you are signing up for the first time.

The link above is the right slack group, but you need to go in through the URL I provided to get the right invite tokens. Sorry, Slack isn't fully designed as a community platform, even though many use it for such.


I hope one day to see a single offering that has rich dashboarding like Grafana, good metric collection, rules, and alerts like Prometheus, easy tracing like Jaeger, and logging support like Loki.

But all through one Interface and a seamless backend.

OpenTelemetry has it figured out for the API, but there is no seamless storage and UI as far as I know.


(Promscale team member)

100% agree. OpenTelemetry solves half of the problem by standardizing and unifying instrumentation. With Promscale our aim is to solve the other half with a unified storage and query language for metrics, traces and logs. This release is a big step towards realizing that vision. It includes:

- Support for Prometheus metrics

- Support for OpenTelemetry traces

- PromQL alerts and recording rules (announcement coming shortly ;) )

- Integration with Grafana for dashboarding as well as for the tracing experience. Note that Grafana offers an experience very similar to Jaeger for traces (they reused the Jaeger code), and we extend that experience with a set of dashboards that deliver additional visibility into your services (the blog post has more details about those)

- SQL for querying and correlating traces and metrics.

So pretty much everything you said except logging, which is on our roadmap.


PMM might cover many of your requirements.

https://www.percona.com/software/database-tools/percona-moni...


Haven't used it too much, but keeping an eye in this space. So far Skywalking [0] looks promising.

[0] - https://skywalking.apache.org/


The final outcome in Grafana looks like it covers most of Microsoft Application Insights' features! Having a self-hosted alternative to a cloud service is a great boon for on-premise projects that need observability but cannot use the public cloud.


(Promscale team member)

That's definitely our goal: to offer a solution that is very easy to self-host and covers, within Grafana, most of the features APM products provide, so you don't have to go back and forth between different tools. A key advantage is that you can customize and extend the experience in any way you want, because it's made of editable Grafana dashboards populated with data from SQL queries.


The title, more than click-baity, is outright misleading...



