In distributed applications, solving problems in code means connecting the dots between all the places where that code runs, including platforms like Databricks and Apache Airflow.
The Databricks pipeline may be one of the most crucial places where your code runs, but the visibility you get into it is limited.
Usually, the pipeline is detached from the rest of your architecture – which makes it nearly impossible to test, monitor, and understand how your code is executed. Databricks notebooks are often triggered by microservices, which also consume their output, but all those components are siloed.
How Helios can help your team
Helios provides actionable insight into your end-to-end flows by adapting OpenTelemetry’s context propagation method to fit the specific mechanism of triggering and running Databricks notebooks.
With Helios you can see a flow propagate through the components of your application, including microservices and notebooks: how they are connected, what triggers the notebook, and what the notebook triggers in turn.
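As a rough illustration of what carrying trace context into a notebook run can look like, the sketch below builds a W3C `traceparent` value and places it in the `notebook_params` of a Databricks Jobs `run-now` request body. It is not Helios's actual mechanism, which is not detailed here. The parameter name `traceparent`, the helper function names, and the job ID are assumptions made for the example, and no request is actually sent.

```python
import json
import secrets


def new_traceparent() -> str:
    """Build a W3C Trace Context `traceparent` value for a new, sampled trace."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def build_run_now_payload(job_id: int, traceparent: str) -> dict:
    """Request body for the Databricks Jobs `run-now` endpoint.

    Passing the context as a notebook parameter named "traceparent" is an
    assumption of this sketch, not a documented Helios convention.
    """
    return {"job_id": job_id, "notebook_params": {"traceparent": traceparent}}


def parse_traceparent(value: str) -> dict:
    """Split a `traceparent` value into its four fields (notebook side)."""
    version, trace_id, parent_span_id, flags = value.split("-")
    if len(trace_id) != 32 or len(parent_span_id) != 16:
        raise ValueError(f"malformed traceparent: {value!r}")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_span_id": parent_span_id,
        "flags": flags,
    }


# Microservice side: start a trace and attach it to the job trigger.
traceparent = new_traceparent()
payload = build_run_now_payload(job_id=123, traceparent=traceparent)  # hypothetical job ID
print(json.dumps(payload, indent=2))

# Notebook side: in a real notebook the value would typically be read with
# dbutils.widgets.get("traceparent"); here it is taken from the payload.
context = parse_traceparent(payload["notebook_params"]["traceparent"])
print(context["trace_id"])
```

Because the notebook continues the trace ID it received, spans created inside the notebook can be attached to the same trace as the microservice that triggered it.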
Helios enables you to:
- See how services interact with their downstream services and how they are connected
- See how data flows through your entire application with interactive trace visualization
- Understand issues and where they occur, and easily resolve them before deploying to any environment
- Generate tests across the end-to-end flow, including your Databricks notebooks
Use case scenario
A daily scheduled job pulls data from a database, pushes it into a few Kafka streams, and then triggers a Databricks Job which works on top of these streams to produce a certain result.
This type of flow can be hard to monitor and track.
The participating components are deployed separately, and they interact only through APIs and messaging systems (Kafka in this case).
What you can do with Helios
- OpenTelemetry lets you trace this type of flow by attaching the context of the current run to the data that passes through the different components.
Helios builds on OpenTelemetry so that you can track these kinds of complex jobs out of the box once it is installed in your stack.
- Helios provides visibility into the data in these flows, lets you troubleshoot issues, and lets you build tests on top of the flows.
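To make the idea of attaching run context to data more concrete, here is a minimal sketch of the use case above using only the Python standard library. Messages are modeled as plain dictionaries, headers as lists of `(key, bytes)` pairs, and the topic names `stream-a` and `stream-b` are hypothetical. In a real setup, OpenTelemetry's propagation API and instrumentation would normally perform the inject and extract steps; this example only shows the principle.

```python
import secrets
from typing import List, Optional, Tuple

Headers = List[Tuple[str, bytes]]


def inject(headers: Headers, trace_id: str, span_id: str) -> Headers:
    """Return a copy of `headers` with a W3C `traceparent` entry appended."""
    value = f"00-{trace_id}-{span_id}-01"
    return headers + [("traceparent", value.encode("ascii"))]


def extract_trace_id(headers: Headers) -> Optional[str]:
    """Return the trace ID carried in the headers, or None if absent."""
    for key, value in headers:
        if key == "traceparent":
            return value.decode("ascii").split("-")[1]
    return None


# Scheduled job: one trace ID for the whole run, a fresh span ID per message.
run_trace_id = secrets.token_hex(16)
messages = [
    {
        "topic": topic,
        "value": b"row",
        "headers": inject([], run_trace_id, secrets.token_hex(8)),
    }
    for topic in ("stream-a", "stream-b")
]

# Consumer side (for example the Databricks job): every message, whichever
# stream it arrived on, maps back to the same run.
seen = {extract_trace_id(m["headers"]) for m in messages}
print(seen == {run_trace_id})  # True
```

Since the context travels with each message rather than living in any one component, components that are deployed separately can still be stitched into a single trace.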
“Helios is the first product we’ve used that enables us to understand and troubleshoot a complete, complex end-to-end flow – starting from our Databricks jobs, through our K8s jobs, and all the way to our own microservices. Helios has helped us cut 3 days of work into less than a day.”
Gal Moyal, Director of Risk Engineering at BitSight