Deployment nodes reporting unhealthy

Incident Report for ESS (Public)

Resolved

This incident has been resolved.
Posted Apr 16, 2021 - 10:19 UTC

Update

We are continuing to monitor the issue and will provide further updates in two hours.
Posted Apr 16, 2021 - 08:13 UTC

Monitoring

Regional roll-out has completed and we're seeing unhealthy nodes return to normal. We'll continue to monitor and will provide a further update in 2 hours.
Posted Apr 16, 2021 - 06:01 UTC

Update

Testing has completed successfully and we're proceeding with the regional roll-out. We'll post an update as soon as the roll-out has concluded.
Posted Apr 16, 2021 - 00:51 UTC

Update

Testing of the fix is ongoing. We'll have another update in 2 hours.
Posted Apr 15, 2021 - 23:08 UTC

Identified

We have identified a code fix to resolve this problem and are putting it through our testing process. We will have another update in four hours.
Posted Apr 15, 2021 - 19:35 UTC

Investigating

We have discovered an issue with deployments reporting nodes as unhealthy. In a highly available deployment, nodes marked as unhealthy no longer receive traffic, which may decrease performance on the remaining nodes. Single-node deployments continue to work as usual. All impacted clusters will show an alert in the cloud console for the deployment, reporting that one or more nodes are unhealthy. Our team is working on a fix and we will have an update within two hours.
Posted Apr 15, 2021 - 18:08 UTC
This incident affected: Global services (Cloud console).