Performance issues

Incident Report for PostHog

Resolved

We believe the impact from the earlier performance regression is over. All systems are operating nominally.
Posted Feb 28, 2024 - 22:50 UTC

Monitoring

The decide endpoint (feature flags) has recovered. The cause was a PR that introduced a performance regression, followed by a thundering herd of retries. We're back up.

No data loss occurred, and we're ingesting the backlog of events now.
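For readers unfamiliar with the term, a thundering herd happens when many callers retry a slow or failing endpoint at the same moment, which keeps it overloaded. A common mitigation is exponential backoff with jitter. The sketch below is a generic illustration only and does not describe PostHog's actual client or server code; `backoff_delay`, `call_with_retries`, and their parameters are hypothetical names chosen for this example.

```python
import random
import time


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Return a randomized delay in seconds for a zero-based retry attempt.

    "Full jitter": pick uniformly between 0 and an exponentially growing
    ceiling, so that many callers do not all retry at the same instant.
    """
    ceiling = min(cap, base * 2 ** attempt)
    return random.uniform(0.0, ceiling)


def call_with_retries(request, max_retries=5, sleep=time.sleep):
    """Call request() and retry on any exception, up to max_retries times.

    `request` is a hypothetical zero-argument callable standing in for an
    HTTP call; `sleep` is injectable so the function can be tested without
    actually waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return request()
        except Exception:
            if attempt == max_retries:
                # Out of retries: surface the last error to the caller.
                raise
            sleep(backoff_delay(attempt))
```

Because each caller draws a different random delay, retries are spread over time instead of arriving in synchronized waves. The cap keeps the longest wait bounded.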
Posted Feb 28, 2024 - 18:10 UTC

Update

We are continuing to investigate the issue and work on restoring all services.
Posted Feb 28, 2024 - 17:03 UTC

Update

The entire US app and feature flags are impacted. We're identifying the cause.

There is no data loss, but event ingestion is delayed.
Posted Feb 28, 2024 - 16:16 UTC

Update

The issue seems to be isolated to our decide endpoint. We're investigating.
Posted Feb 28, 2024 - 15:56 UTC

Investigating

We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
Posted Feb 28, 2024 - 15:34 UTC
This incident affected: US Cloud 🇺🇸 (App, Event and Data Ingestion Lag).