How we solved a versioning issue our customer faced by contributing to an open-source project
One Friday afternoon we got a Slack message from one of our early customers, letting us know that there was an issue with updating our SDK. For the impatient readers in the group: an unwanted dependency was to blame. But let us tell you the full story.
Apparently, our SDK introduced a new dependency to the code that their team wasn’t able to meet: graphql@16.0.1.
This made it impossible for the customer to upgrade to a newer version of our SDK and became a blocker for them.
Before diving into why this happened and what we did, here’s a little bit of relevant background about Helios and OpenTelemetry. OpenTelemetry (OTel) is an open-source solution that provides a collection of SDKs, APIs, and tools for collecting and correlating telemetry data (i.e., logs, traces, and metrics) from different interactions (API calls, messaging frameworks, DB queries, and more) between components in cloud-native, distributed systems.
The Helios SDK leverages OpenTelemetry to provide developers with visibility into their architecture components and provides the ability to debug, troubleshoot, and test distributed applications. Each new release of the Helios SDK introduces new features and fixes and so it’s highly important for us to ensure customers can continuously update the SDK without it affecting their environment and causing any breakage.
It is a well-known best practice that instrumentation libraries (like OTel) do not introduce the packages they instrument as dependencies in the code. This ensures that customers aren’t forced to use packages they don’t intend to. Going back to our customer, this “bad” reference caused an incompatibility with the Node version the customer was running (Node v10), since that Node version did not support the newly introduced package version (graphql@16.0.1).
So where had this dependency suddenly come from?
To find out, we set out to investigate the issue.
We created a local environment matching the customer’s, using the same Node version and the dependencies listed in their package.json. Then we ran the npm ls graphql command to see which versions of that dependency were installed, so we could trace back where they came from.
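To make that tracing step concrete, here is a minimal sketch of how the tree that npm ls graphql --json prints could be walked to list every chain leading to graphql. It is an illustration rather than the script we actually used: the function name, the example-sdk package, and every version other than graphql@16.0.1 are hypothetical placeholders.

```typescript
// Shape of one node in the JSON tree printed by `npm ls <pkg> --json`.
interface NpmLsNode {
  version?: string;
  dependencies?: Record<string, NpmLsNode>;
}

// Collect every dependency chain in the tree that ends at `target`.
function findDependencyPaths(
  node: NpmLsNode,
  target: string,
  trail: string[] = [],
): string[] {
  const paths: string[] = [];
  const deps = node.dependencies ?? {};
  for (const name of Object.keys(deps)) {
    const child = deps[name];
    if (child === undefined) continue;
    const label = `${name}@${child.version ?? "unknown"}`;
    const next = [...trail, label];
    if (name === target) {
      paths.push(next.join(" > "));
    }
    // Keep descending: the target may appear again deeper in the tree.
    paths.push(...findDependencyPaths(child, target, next));
  }
  return paths;
}

// Hypothetical tree; only graphql@16.0.1 comes from the story above.
const sampleTree: NpmLsNode = {
  version: "1.0.0",
  dependencies: {
    "example-sdk": {
      version: "0.0.0-example",
      dependencies: {
        "@opentelemetry/instrumentation-graphql": {
          version: "0.0.0-example",
          dependencies: {
            "@types/graphql": {
              version: "0.0.0-example",
              dependencies: {
                graphql: { version: "16.0.1" },
              },
            },
          },
        },
      },
    },
  },
};

const paths = findDependencyPaths(sampleTree, "graphql");
console.log(paths);
```

In a real project, the tree would come from parsing the command’s JSON output instead of a hard-coded object.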
The findings: Dependency
The official OpenTelemetry graphql instrumentation, which is leveraged by the Helios OpenTelemetry SDK, included a main reference to @types/graphql, which in turn included the culprit reference to a newer version of graphql.
This scenario demonstrates why having a @types package depending on the package it references is considered a bad practice. Incidentally, this package was already deprecated, as graphql’s newer versions provide their own type definitions.
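A check along these lines could catch this kind of problem before release. The sketch below flags runtime dependencies that an instrumentation library would normally keep out of its main dependencies: @types packages and the packages being instrumented. The function name, the manifest contents, and the example-helper package are hypothetical, and this is not part of the OpenTelemetry tooling.

```typescript
// Minimal subset of package.json that matters for this check.
interface PackageManifest {
  name: string;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

// Return runtime dependencies that are either @types/* packages or
// one of the packages the library is meant to instrument.
function findRiskyRuntimeDependencies(
  manifest: PackageManifest,
  instrumentedPackages: string[],
): string[] {
  const runtimeDeps = Object.keys(manifest.dependencies ?? {});
  return runtimeDeps.filter(
    (dep) =>
      dep.indexOf("@types/") === 0 || instrumentedPackages.indexOf(dep) !== -1,
  );
}

// Hypothetical manifest: the name and version ranges are placeholders.
const exampleManifest: PackageManifest = {
  name: "example-graphql-instrumentation",
  dependencies: {
    "example-helper": "*",
    "@types/graphql": "*",
  },
  devDependencies: {
    graphql: "*",
  },
};

const risky = findRiskyRuntimeDependencies(exampleManifest, ["graphql"]);
console.log(risky);
```

Entries under devDependencies are deliberately ignored, since they are not installed for consumers of the library.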
The fix
The solution was clear at that point: the dependency in the graphql OpenTelemetry instrumentation needed to be eliminated.
Once we got to the root cause of this issue, we had a couple of options for how to handle it:
- Write a patch in our code to remove the undesired dependency – an immediate solution that requires in-house maintenance.
- Fix the issue through a PR to the OpenTelemetry open source project – a longer time-to-market (and a bit more overhead) but the right thing to do for the community.
Naturally, we opted for the second alternative, as it aligns well with one of our core beliefs at Helios: the strategic importance of the OpenTelemetry open source community. We opened a PR that removed the dependency in the graphql OpenTelemetry instrumentation and started an email correspondence with some maintainers of the OpenTelemetry project to explain the importance of the changes and make sure they went live as quickly as possible. Shortly afterward, the PR was approved and merged, and a few days later it was released.
Final thoughts
We believe in the open-source community and we are proud that our SDK leverages OpenTelemetry. Contributing code to the community is a core part of our DNA. Thanks to the alertness of our customer, we were able to identify an open-source-related issue that affected not only them but potentially all developers who use OpenTelemetry for instrumentation. After our investigation, we issued a PR that quickly fixed the issue for our customer and also contributed back to the OpenTelemetry community. This is a great example of how we contribute to the open-source community, and we plan to continue being active contributors as we grow and develop Helios.
Reference to GitHub project can be found here.
Want to give Helios a try? Start here.