Resolved -
We have attached the new disk to the failing instance and restored all the data onto it.
Ingestion lag has recovered.
Sep 12, 15:45 UTC
Identified -
One of our ClickHouse disks needs to be replaced.
Ingestion lag has recovered for now, and we are working on attaching a new disk and filling it with all the data.
No data has been lost at any time during the process.
Sep 12, 10:52 UTC
Investigating -
One of our ClickHouse instances is constrained on disk, leading to an increase in ingestion lag.
We are investigating the cause of this.
Sep 12, 07:53 UTC