Resolved -
This issue is now resolved.
The main impact to customers was a delay in our modeling and alerting pipeline, which could have delayed any alert by up to several hours.
Our custom cron scheduling system was also impacted, causing us to miss observing custom cron monitors from 12:17 EST -> 8AM EST.
Jan 29, 13:52 UTC
Monitoring -
Alerts are being sent in real time again.
Jan 29, 13:46 UTC
Identified -
From 12:17 EST -> 8AM EST, monitoring for customers was heavily delayed, causing alerts to be late or missed.
Metrics with a custom schedule also failed to run.
We have identified the issue and are implementing a fix so we can recover and alert on as much data as possible.
Jan 29, 13:13 UTC