OpenTelemetry (OTel) tracing is opening new possibilities for developers

OpenTelemetry (OTel) is emerging as the industry standard for system observability and distributed tracing across cloud-native and distributed architectures. But where do developers fit in? With OTel’s main use case focusing on production monitoring and observability, I find that many developers are still not fully familiar with OTel. Others believe it is more of a tool for DevOps/SRE.
That’s too bad – distributed tracing should be used by developers daily, for a wide variety of use cases that include local development, troubleshooting, testing, debugging, documentation, and more.

OpenTelemetry (OTel): A Brief Reminder

OpenTelemetry (OTel) is an open-source solution that provides a collection of SDKs, APIs, and tools for collecting and correlating telemetry data (i.e., logs, traces, and metrics) from different interactions (API calls, messaging frameworks, DB queries, and more) between components in cloud-native, distributed systems. After exporting the data to different backends like Jaeger or Zipkin, R&D organizations gain observability into their systems.
Needless to say, OTel answers a very widespread pain among R&D organizations – it provides them with tools to identify errors, issues, and bottlenecks in microservices and distributed architectures.
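
To make the idea of correlated traces concrete, here is a minimal sketch of the data model behind distributed tracing. It uses only the Python standard library and is not the OpenTelemetry SDK: the `Span` class and the `start_span`, `to_traceparent`, and `continue_trace` helpers are hypothetical names made up for this illustration. The header layout follows the W3C Trace Context `traceparent` format (version, trace ID, parent span ID, flags), which OTel propagates by default.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Span:
    """One unit of work (an HTTP call, a DB query, a queue message)."""
    name: str
    trace_id: str                    # shared by every span of the same request
    span_id: str                     # unique per span
    parent_id: Optional[str] = None  # links the span to its caller
    attributes: dict = field(default_factory=dict)


def start_span(name: str, parent: Optional[Span] = None,
               attributes: Optional[dict] = None) -> Span:
    """Start a root span, or a child span inside the parent's trace."""
    trace_id = parent.trace_id if parent else secrets.token_hex(16)  # 32 hex chars
    parent_id = parent.span_id if parent else None
    return Span(name, trace_id, secrets.token_hex(8), parent_id, dict(attributes or {}))


def to_traceparent(span: Span) -> str:
    """Serialize the span context as a W3C-style traceparent header value."""
    return f"00-{span.trace_id}-{span.span_id}-01"


def continue_trace(header: str, name: str) -> Span:
    """Called by the receiving service: join the caller's trace."""
    _version, trace_id, parent_id, _flags = header.split("-")
    return Span(name, trace_id, secrets.token_hex(8), parent_id)


# Service A handles a request and calls service B, passing the header along.
root = start_span("POST /reports", attributes={"http.method": "POST"})
header = to_traceparent(root)

# Service B picks up the header and continues the same trace.
child = continue_trace(header, "analyze-apk")
print(child.trace_id == root.trace_id, child.parent_id == root.span_id)  # True True
```

Because both spans share one trace ID and the child points back to its parent, a backend can stitch them into a single end-to-end view of the request. In practice the OTel SDKs and instrumentation libraries do this bookkeeping for you.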

OTel for Developers?

OTel is slowly but surely becoming the industry standard for collecting telemetry data. Leading technology companies like Google, Microsoft, Amazon, Splunk, and Datadog are investing heavily in OpenTelemetry.
Datadog, for example, donated OTel’s Java SDK to the community, and Google is including OTel as a built-in configuration in many of its GCP SDKs. According to Gartner, by 2025, 70% of cloud application monitoring will be based on OpenTelemetry instrumentation.
However, many developers may still refer to OTel as a technology that’s mostly relevant for DevOps and SRE. This comes as no surprise, as even leading companies like Splunk refer to it as “critical for helping DevOps and IT groups”, without mentioning its relevance to developers.
Additionally, many developers are still unfamiliar with the concept of distributed tracing. Even though many of them create “request IDs” for transactions across services and monitor results in systems like Datadog and New Relic, they are not aware of the vast potential this technology has for gaining data-driven insights and reducing their development and troubleshooting overhead.

It’s no wonder many developers think it’s DevOps’s turf.
But actually, OTel has the potential to help developers directly. We’ve all experienced the pain of working in a distributed, cloud-native environment. I remember one of my first experiences with it: half a decade ago, I was working on a mobile security product that protected iOS and Android devices from several threat vectors, including malware. One of our most complex flows – creating a static analysis report on an Android APK or iOS IPA – was constantly breaking, and we never seemed to be able to stabilize it. The flow involved several services, communicating synchronously (HTTP) and asynchronously (SQS, Celery).
On each on-call duty I had, something else caused a failure – once it was an unexpected format in the Celery job payload; another time, a null-pointer; occasionally it was a DevOps issue, like an SQS misconfiguration or low container resources. Each and every time, we had to check all the potential failure points, searching through logs and SSHing to machines, looking for errors and warnings, trying to correlate everything together. I would succeed, eventually, but it took me and my teammates a lot of time and effort.
Developers today don’t have to go through such a tiring process anymore. OTel is a solution that makes distributed tracing data accessible, so they can gain visibility into the entire flow, end-to-end – and quickly!

When Should Developers Use OTel?

I’m a true believer in OpenTelemetry – we’re only scratching the surface of its potential. Making its data accessible to developers will reveal capabilities way beyond the classic observability and monitoring use cases. Here are a few quick examples of use cases I have come across, whether first-hand or through the experience of others:

  1. Production readiness – the way to production starts in every developer’s own environment and continues through the integration and staging environment. Leveraging OTel capabilities should start there, long before production.
  2. Testing – distributed tracing data can be used to validate behavior deep within the system. Unlike traditional UI or API tests, which are essentially “black box”, tests based on OTel data can make complex assertions that would otherwise be very difficult to implement.
  3. Collaboration – the complexity of building and maintaining distributed applications often manifests in the need to pinpoint and reuse specific requests, queries, and payloads and share them between developers. OTel makes this possible, and easy.
  4. Documentation – deducing application APIs and expected behavior from traffic, and converting it to standards like Swagger/OpenAPI is something that can fairly easily be done by inspecting OTel data.
  5. Security – back in 2012 I was working on a web application security product that used instrumentation capabilities to identify unsanitized payloads that propagated from browser requests all the way to the DB. We had to build everything from scratch – and now OTel makes it much easier.
  6. Onboarding – combining collaboration and documentation, OTel data can be used to create dynamic onboarding experiences, reducing the time-to-value for new developers significantly. Seeing the system visually and examining specific instances of its important flows makes much more sense to me than going over stale architecture slides and theoretical explanations.
  7. Troubleshooting – using existing traffic (HTTP requests / DB queries / messaging payloads) to reproduce application states with ease, and speed up debugging and development in general.
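
To illustrate the testing use case from the list above, here is a rough sketch of a trace-based assertion. It assumes the spans of a single request have already been exported and loaded as plain dictionaries. The `SPANS` data, the span names, the attribute values, and the `find_span`, `is_descendant`, and `check_report_flow` helpers are all hypothetical and exist only for this example; real span data would come from your own tracing backend.

```python
# Hypothetical spans exported for one "create report" request.
SPANS = [
    {"span_id": "a1", "parent_id": None, "name": "POST /reports",
     "attributes": {"http.status_code": 202}},
    {"span_id": "b2", "parent_id": "a1", "name": "sqs.send",
     "attributes": {"messaging.system": "aws_sqs"}},
    {"span_id": "c3", "parent_id": "b2", "name": "celery.run analyze_apk",
     "attributes": {}},
    {"span_id": "d4", "parent_id": "c3", "name": "INSERT reports",
     "attributes": {"db.system": "postgresql"}},
]


def find_span(spans, name):
    """Return the single span with the given name, or fail the test."""
    matches = [s for s in spans if s["name"] == name]
    if len(matches) != 1:
        raise AssertionError(f"expected exactly one span named {name!r}, found {len(matches)}")
    return matches[0]


def is_descendant(spans, span, ancestor):
    """Walk parent links upward from span and report whether ancestor is reached."""
    by_id = {s["span_id"]: s for s in spans}
    parent_id = span["parent_id"]
    while parent_id is not None:
        if parent_id == ancestor["span_id"]:
            return True
        parent = by_id.get(parent_id)
        parent_id = parent["parent_id"] if parent else None
    return False


def check_report_flow(spans):
    """Assert on behavior deep inside the flow, not just on the API response."""
    request = find_span(spans, "POST /reports")
    task = find_span(spans, "celery.run analyze_apk")
    insert = find_span(spans, "INSERT reports")

    assert request["attributes"]["http.status_code"] == 202
    assert is_descendant(spans, task, request)   # the task was triggered by this request
    assert is_descendant(spans, insert, task)    # the DB write happened inside the task
    assert insert["attributes"]["db.system"] == "postgresql"


check_report_flow(SPANS)
```

A black-box API test could only check the 202 response. With the trace data, the same test can also verify that the asynchronous task ran and wrote to the database as a consequence of that specific request.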

I’m sure OTel will provide lots of new possibilities for developers, which I haven’t listed above. By being able to see their systems like never before and gaining access to data that was hidden, the opportunities for developers are endless.

The Commoditization of OTel

Just like Kubernetes has made deployment of containerized applications easy, I believe OTel will become a commodity, making system observability easy. Soon, OTel will be in widespread usage across organizations. Visibility into system architecture will become the norm. The ability to gather the data and act on the insights it provides will become easier and more common.
We see this as a great opportunity for R&D organizations. As OTel becomes more widely adopted, distributed tracing data will soon become a gold mine of opportunities for the use cases described above, and many more we have yet to imagine and discover. Don’t wait.
If you haven’t checked out OTel yet, I highly recommend you do. And if you need help implementing it or extracting the data to gain advanced insights, reach out.

About Helios

Helios is an applied observability platform that produces actionable security and monitoring insights. We apply our deep runtime data collection capabilities to help Sec, Dev, and Ops teams understand the actual application risk posture, prioritize vulnerabilities, shorten troubleshooting time, and reduce MTTR.

The Author

Ran Nozik

CTO and co-founder of Helios. An experienced R&D leader, and mathematician at heart. Passionate about improving modern software development, and a big fan of and contributor to OpenTelemetry. After serving as an officer in Unit 8200 and leading R&D efforts in the cybersecurity department, working as a Senior Software Developer, and becoming an Engineering Team Leader, Ran co-founded Helios, a production-readiness platform for developers. Ran holds a B.Sc. in Computer Science and Mathematics from the Hebrew University of Jerusalem.
