Engineering Blog

Observability best practices, anomaly detection techniques, incident postmortems, and product updates.

Anomaly Detection · Mar 25, 2026 · 8 min read

The SRE Guide to Anomaly Detection: Beyond Static Thresholds

Static thresholds fail when your traffic patterns change seasonally. Learn how AI-powered anomaly detection adapts to your data's natural rhythms and catches the anomalies that fixed rules miss.

Engineering · 12 min

How We Reduced Mean Time to Detection from 47 Minutes to 12 Seconds

A deep dive into our streaming architecture that processes 500K data points per second while maintaining sub-100ms anomaly scoring. Rust, SIMD, and quantized models.

Mar 20, 2026

Postmortem · 10 min

Incident Postmortem: The Cascade Failure That Took Down 3 Regions

How a subtle memory leak in a cache layer cascaded into a multi-region outage. What we learned, how AnomalyWatch detected the early signals, and our remediation steps.

Mar 15, 2026

AI/ML · 15 min

TimesFM vs Chronos: Choosing the Right Foundation Model for Your Data

We benchmarked Google's TimesFM and Amazon's Chronos on 50 real-world time series datasets. Here's when each model excels and how we ensemble them for better accuracy.

Mar 10, 2026

Best Practices · 6 min

5 Observability Anti-Patterns That Lead to Alert Fatigue

Alert fatigue is the silent killer of incident response. We analyzed 10,000 alert configurations and found 5 patterns that cause 80% of false positives.

Mar 5, 2026

Product Update · 4 min

Launching Forecasting: Predict Anomalies Before They Happen

Introducing AnomalyWatch Forecasting: use AI models to predict metric values up to 7 days ahead. Set proactive alerts on forecasted anomalies to prevent incidents before they start.

Feb 28, 2026

Engineering · 18 min

Building a Real-Time Anomaly Detection Pipeline with Rust

A technical walkthrough of our data ingestion pipeline, from HTTP endpoint to anomaly score in under 100ms. Covers async I/O, ring buffers, and zero-copy deserialization.

Feb 20, 2026

Case Study · 9 min

IoT Anomaly Detection at Scale: Lessons from 100K Sensors

How a manufacturing customer uses AnomalyWatch to monitor 100,000 IoT sensors. Covers edge-based scoring, bandwidth optimization, and detecting equipment degradation patterns.

Feb 12, 2026