All Systems Operational

PostHog.com: Operational, 100.0 % uptime over the past 90 days

US Cloud 🇺🇸: Operational, 99.83 % uptime over the past 90 days
- App: Operational, 99.89 % uptime
- Event and Data Ingestion Success: Operational, 99.98 % uptime
- Event and Data Ingestion Lag: Operational, 99.39 % uptime
- Feature Flags and Experiments: Operational, 99.93 % uptime
- Session Replay Ingestion: Operational, 99.66 % uptime
- Destinations: Operational, 100.0 % uptime
- API /query Endpoint: Operational, 100.0 % uptime

EU Cloud πŸ‡ͺπŸ‡Ί: Operational, 99.95 % uptime over the past 90 days
- App: Operational, 99.99 % uptime
- Event and Data Ingestion Success: Operational, 100.0 % uptime
- Event and Data Ingestion Lag: Operational, 100.0 % uptime
- Feature Flags and Experiments: Operational, 99.96 % uptime
- Session Replay Ingestion: Operational, 99.71 % uptime
- Destinations: Operational, 100.0 % uptime
- API /query Endpoint: Operational, 100.0 % uptime

Support APIs: Operational, 100.0 % uptime over the past 90 days

Update Service: Operational, 100.0 % uptime over the past 90 days
AWS US πŸ‡ΊπŸ‡Έ Operational
AWS ec2-us-east-1 Operational
AWS elb-us-east-1 Operational
AWS rds-us-east-1 Operational
AWS elasticache-us-east-1 Operational
AWS kafka-us-east-1 Operational
AWS EU πŸ‡ͺπŸ‡Ί Operational
AWS elb-eu-central-1 Operational
AWS elasticache-eu-central-1 Operational
AWS rds-eu-central-1 Operational
AWS ec2-eu-central-1 Operational
AWS kafka-eu-central-1 Operational
System metrics (values were still loading when this page was captured):
- US Ingestion End to End Time
- US Decide Endpoint Response Time
- US App Response Time
- US Event/Data Ingestion Response Time
- EU Ingestion End to End Time
- EU App Response Time
- EU Decide Endpoint Response Time
- EU Event/Data Ingestion Endpoint Response Time
Dec 16, 2025

No incidents reported today.

Dec 15, 2025
Resolved - We are fully caught up and operational; realtime destination delays have cleared.
Dec 15, 14:15 UTC
Monitoring - We're monitoring while the rollback progresses: feature flags have returned to normal values, and destinations are still catching up.
Dec 15, 13:49 UTC
Update - The rollback is progressing and performance and error rates have recovered.

Due to this faulty config change, realtime destinations accumulated a 2-3 minute delay and are currently catching up.

Dec 15, 13:46 UTC
Identified - We identified a faulty change that we are currently rolling back. We're monitoring and seeing signs of recovery.
Dec 15, 13:42 UTC
Investigating - We've spotted that something has gone wrong and are observing an elevated error rate for feature flags. We're currently investigating the issue and will provide an update soon.
Dec 15, 13:33 UTC
Dec 14, 2025

No incidents reported.

Dec 13, 2025

No incidents reported.

Dec 12, 2025
Resolved - Lag has recovered and the cluster is now stable. Everything is working as expected.
Dec 12, 22:05 UTC
Update - Lag has recovered and the cluster is now stable. Everything is working as expected.
Dec 12, 22:04 UTC
Update - Query endpoint latency and errors have recovered, but we are still addressing increased lag for event ingestion.
Dec 12, 20:57 UTC
Update - Query endpoints are experiencing higher-than-typical latency, which could result in timeouts.

The system is also experiencing some delays in event processing.

Operators are investigating the root cause and will update when the issue has been remedied.

Dec 12, 20:06 UTC
Investigating - We're experiencing an elevated level of API errors in our /query endpoints and are currently looking into the issue. You may see errors when trying to load insights or run SQL queries.
Dec 12, 19:44 UTC
Resolved - Duplicate incident; follow "Elevated query API timeouts" for updates.
Dec 12, 19:47 UTC
Investigating - We're experiencing an elevated level of analytics query latency and errors and are currently looking into the issue.
Dec 12, 19:46 UTC
Completed - The scheduled maintenance has been completed.
Dec 12, 10:00 UTC
Update - The main maintenance operations are done.

As expected, the database failover caused a brief increase in error rates that recovered within seconds.

We're now performing cleanup that should have no customer-facing impact and will keep monitoring.

Dec 12, 09:30 UTC
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Dec 12, 09:00 UTC
Scheduled - We will be undergoing scheduled maintenance of our database infrastructure for the app and /query.

This includes a scheduled failover of our writer instance, which will cause a slight increase in error rates for a short window, usually just a few seconds. Our services will retry most requests.

We will send a notification once this is done and post updates as necessary.

Dec 12, 08:39 UTC
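The retry behavior mentioned in the maintenance notice above can be sketched as client-side retry with exponential backoff for transient errors during a failover window. This is a minimal illustration only; the function names, error class, and timing parameters are assumptions, not PostHog's actual implementation.

```python
import random
import time

def call_with_retry(request_fn, max_attempts=4, base_delay=0.2):
    """Retry a request that may fail transiently (e.g. while a
    database writer fails over), with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request_fn()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # back off: 0.2 s, 0.4 s, 0.8 s, ... plus a little jitter
            time.sleep(base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.05))

# Hypothetical request that fails twice (simulating the failover window),
# then succeeds once the new writer is promoted.
attempts = {"n": 0}
def flaky_request():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("writer unavailable during failover")
    return "ok"

print(call_with_retry(flaky_request))  # "ok" on the third attempt
```

With a few-second failover and backoff on this scale, most callers would see a slightly slower response rather than an error, which matches the behavior the notice describes.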
Dec 11, 2025
Resolved - We identified a faulty compute node as the root cause of this incident.

After moving all workloads off this node and removing it from our infrastructure, error rates are consistently back within expected values.

Thank you for your patience.

Dec 11, 14:02 UTC
Monitoring - We found a few faulty web servers responsible for this and removed them from load balancing.

We're seeing signs of recovery and continuing to monitor.

Dec 11, 13:46 UTC
Investigating - We're experiencing an elevated level of API errors in the web portal and parts of the API and are currently looking into the issue.
Dec 11, 13:38 UTC
Dec 10, 2025

No incidents reported.

Dec 9, 2025
Resolved - Ingestion has caught up and everything is back to normal.
Dec 9, 21:32 UTC
Update - The query load issue is resolved and queries should be loading as normal. We're still working through the backlog of events, so some charts might be showing data that is ~30 minutes old. That should be resolved soon.

No data was lost.

Dec 9, 18:46 UTC
Monitoring - The cluster is looking stable now and we have been able to resume event ingestion.

Query performance should now be back to normal, as we are no longer hitting the limits.

There is still an events backlog that we are working through, so the data shown won't be fully up to date yet. We'll send an update once it has completely recovered.

No data has been lost during this period.

Dec 9, 18:36 UTC
Update - We are continuing to see failing queries and ingestion lag. We are bringing more capacity online, which should resolve the failing queries (though they'll show data that is ~30-60 minutes out of date).

This is not impacting workflows or CDP, and no data has been lost.

We will keep you up to date.

Dec 9, 17:54 UTC
Update - We are under heavy load and seeing query outages and delayed event ingestion. No data has been lost.
Dec 9, 17:06 UTC
Identified - Our ClickHouse cluster is under heavy load right now and event ingestion is being delayed.

We have found the root cause and are now working to bring the cluster into a stable state so we can catch up on ingestion.

Dec 9, 15:28 UTC
Dec 8, 2025
Resolved - The email delivery delays have been resolved.
Dec 8, 22:30 UTC
Update - The Customer.io incident has been resolved, and we have re-enabled verification emails.
Dec 8, 21:40 UTC
Monitoring - We're seeing delays with verification and password reset emails due to an ongoing incident with Customer.io:
https://status.customerio.com/incidents/x5hwkcw0ddds
To unblock logins, we've temporarily disabled verification emails until the issue is resolved.

Dec 8, 20:14 UTC
Update - We've spotted that something has gone wrong. We're currently investigating the issue, and will provide an update soon.
Dec 8, 19:51 UTC
Investigating - We're seeing an unusual spike in tickets about login verification emails and password reset emails not being received. We're currently investigating.
Dec 8, 19:44 UTC
Resolved - This incident has been resolved.
Dec 8, 19:50 UTC
Investigating - We're experiencing an elevated level of API errors and are currently looking into the issue.
Dec 8, 19:45 UTC
Dec 7, 2025

No incidents reported.

Dec 6, 2025

No incidents reported.

Dec 5, 2025

No incidents reported.

Dec 4, 2025
Resolved - The brief spike in errors was due to maintenance work on the queueing system. The maintenance is over and all systems have recovered.
Dec 4, 18:10 UTC
Investigating - The capture endpoint for ingesting events in the EU region is experiencing an elevated rate of errors. Operators are investigating the cause and will provide updates shortly.
Dec 4, 17:54 UTC
Resolved - Monitoring data tells us this issue is now resolved.
Dec 4, 17:55 UTC
Monitoring - We're experiencing an elevated level of API errors in PostHog AI and are currently looking into the issue.
Dec 4, 17:42 UTC
Dec 3, 2025
Resolved - This incident has been resolved.
Dec 3, 21:53 UTC
Monitoring - A fix has been deployed to address the root cause of the elevated error rates.
Dec 3, 21:05 UTC
Identified - We've identified the cause of the elevated level of API errors and are currently deploying a fix.
Dec 3, 20:38 UTC
Investigating - We're experiencing an elevated level of API errors and are currently looking into the issue.
Dec 3, 18:10 UTC
Dec 2, 2025

No incidents reported.