EU: Elevated error rate on data capture

Incident Report for PostHog

Resolved

We resolved the issue and everything is operational again.

One of our reverse-proxy instances scaled in ungracefully, which caused routing errors.
After we manually terminated it, the services recovered.

We saw elevated errors from 16:01 UTC to 16:48 UTC. A good part of the failed requests was recovered by internal retries, but we cannot yet rule out that some events were lost.

We will analyze the cause and provide a long-term fix so that this does not happen again.

We apologize that we were not able to capture all data during this time.

Posted Feb 19, 2025 - 17:09 UTC

Identified

We found an issue in networking, and it seems to be recovering now. We are monitoring the situation.

Posted Feb 19, 2025 - 16:47 UTC

Investigating

We've spotted that something has gone wrong. We're seeing elevated error rates on capture on the web app.

We're currently investigating the issue, and will provide an update soon.

Posted Feb 19, 2025 - 16:17 UTC

This incident affected: EU Cloud 🇪🇺 (App, Event and Data Ingestion Lag).