How we slashed detection and resolution time in half (Salt Security)

Salt Security had deployed OpenTelemetry but found it insufficient. So the company's engineers evaluated Helios, which visualizes distributed tracing for fast troubleshooting. My role as the Director of Platform Engineering at Salt Security lets me pursue my passion for cloud-native tech and for solving difficult system-design challenges. One of the recent challenges we solved had to […]

Debugging and troubleshooting microservices in production—All you need to know

What do you do when things break in production? Debugging microservices isn’t a walk in the park. Microservices are designed to be loosely coupled, which makes them more scalable and resilient, but also more difficult to debug. When a problem occurs in a microservices app, it can be difficult to track down the source of […]

Lambda monitoring: Combining the three pillars of observability to reduce MTTR

A developer shares real-world examples of how bringing together metrics, logs, and traces makes Lambda monitoring much more effective and helps reduce the time it takes to identify the root cause of production issues. Observability and monitoring can be challenging for distributed applications, with serverless architectures being a typical example. […]

API latency in microservices – Trace based troubleshooting

In microservices architectures, apps are broken down into small, independent services that communicate with each other using APIs in a synchronous or asynchronous way. Microservices carry many advantages, such as increased flexibility and scalability (microservices can be scaled independently of each other, and APIs help to scale microservices by adding or removing instances of the […]

OpenTelemetry .NET Distributed Tracing – A Developer’s Guide

Modern applications are becoming increasingly distributed because of benefits such as enhanced scalability, high availability, fault tolerance, and better geographical distribution. But distribution also makes the overall system more complex, which makes it challenging to understand how it functions internally. Distributed tracing helps address this by tracking how requests flow through various system […]

Serverless observability, monitoring, and debugging – Overview and best practices

Serverless, as you may already know, is a cloud computing model where the cloud provider dynamically manages and allocates resources to execute code, with no server provisioning or infrastructure management required from the developer. This article gives an overview of serverless observability, monitoring, and debugging, based on distributed tracing and OpenTelemetry (OTel). It is gaining […]

API monitoring vs. observability in microservices - Troubleshooting guide

Monitoring APIs through enhanced observability has gained traction with the popularity of microservices. Since microservice applications are built as independent and scalable modules, the number of microservices can grow dramatically as the application grows, increasing the complexity drastically. Since APIs work as the connective tissue between microservices, the number of APIs also grows in parallel. […]

Distributed tracing Node.js - OpenTelemetry-based monitoring

As the trend toward microservices-based architectures continues to gain momentum, it’s becoming increasingly clear that distributed tracing will be a crucial tool for monitoring and debugging these complex systems in the future. When designing a microservices-based architecture, breaking extensive services into smaller, more manageable components is standard practice. Communication between these components becomes crucial, but […]