Issues with Azure southeastasia-2 AZ
Incident Report for ESS (Public)
Resolved
This incident has been resolved.
Posted Feb 09, 2023 - 06:28 UTC
Monitoring
The migration of the impacted deployment hosts has been completed. The full restore of replica shards on clusters where unhealthy instances were replaced may take additional time to complete. We are moving the incident to monitoring status and will provide updates as necessary.
Posted Feb 09, 2023 - 01:10 UTC
Update
The new capacity has been provisioned. Our engineers have started migrating the impacted deployment hosts. The next update will be provided in an hour.
Posted Feb 08, 2023 - 23:59 UTC
Update
Azure reports that all storage resources have been restored and monitors are healthy. Significant progress has been made on compute resources, with nearly all nodes restored. We will provide a status update in an hour or as we get more information. Please see https://azure.status.microsoft/en-us/status for more detailed information.
Posted Feb 08, 2023 - 21:43 UTC
Update
Azure reports that all storage resources have been restored. Around 70% of compute resources have been restored, and the team is making good progress on the remaining resources. Please see https://azure.status.microsoft/en-us/status for more detailed information. We will provide another update in about an hour or when we receive more information.
Posted Feb 08, 2023 - 20:27 UTC
Update
Azure has recovered most of the impacted storage resources and is performing post-recovery checks. Recovery of compute resources is making good progress. Azure engineers are actively working on restoring the remaining resources and services. We will provide another update in an hour. Please see https://azure.status.microsoft/en-us/status for more details.
Posted Feb 08, 2023 - 18:55 UTC
Update
Azure engineers are continuing through the structured power-up process for storage and compute resources. More details at https://azure.status.microsoft/en-us/status. The next update will be provided in 2 hours or as soon as we have more to share.
Posted Feb 08, 2023 - 16:57 UTC
Update
Azure engineers have successfully restored cooling systems in the impacted areas of the datacenter. They are commencing a structured power-up sequence for previously powered-down compute and storage resources. More details at https://azure.status.microsoft/en-us/status. The next update will be provided in 2 hours or as soon as we have more to share.
Posted Feb 08, 2023 - 14:41 UTC
Update
Azure engineers are actively working on restoring cooling units to mitigate issues in the datacenter. Once temperatures have stabilized within operational thresholds, they will begin restoring compute and storage resources. More details at https://azure.status.microsoft/en-us/status. The next update will be provided in 2 hours or as soon as we have more to share.
Posted Feb 08, 2023 - 13:13 UTC
Update
Azure engineers are actively working to mitigate issues in the datacenter to provide capacity for deployment host migration. The next update will be provided in 2 hours or as soon as we have more to share.
Posted Feb 08, 2023 - 11:12 UTC
Update
Azure engineers are continuing to work actively to mitigate issues in the datacenter to provide capacity for deployment host migration. The next update will be provided in 2 hours or as soon as we have more to share.
Posted Feb 08, 2023 - 09:10 UTC
Update
Azure engineers are continuing to work actively to mitigate issues in the datacenter to provide capacity for deployment host migration. The next update will be provided in an hour.
Posted Feb 08, 2023 - 08:00 UTC
Update
Azure engineers are continuing to work actively to mitigate issues in the datacenter to provide capacity for deployment host migration. The next update will be provided in an hour.
Posted Feb 08, 2023 - 06:55 UTC
Update
Azure engineers are continuing to work actively to mitigate issues in the datacenter to provide capacity for deployment host migration. The next update will be provided in an hour.
Posted Feb 08, 2023 - 05:45 UTC
Update
Azure engineers are continuing to work actively to mitigate issues in the datacenter. We have moved non-HA Elasticsearch, Kibana, APM, and Enterprise Search clusters out of the southeastasia-2 availability zone, restoring them from the latest snapshot, to recover availability of those clusters. The next update will be provided in an hour.
Posted Feb 08, 2023 - 04:41 UTC
Identified
Azure engineers are reporting a cooling event in the southeastasia-2 AZ datacenter. We are observing degraded performance for clusters that have instances allocated in this AZ. Azure engineers are continuing to work actively to mitigate the temperature issues in the datacenter. There is currently no ETA to share for restoration of the impacted scale units.
Posted Feb 08, 2023 - 03:09 UTC
This incident affected: Azure Singapore (azure-southeastasia) (Deployment orchestration (Create/Edit/Restart/Delete): Azure azure-southeastasia, Deployment hosts: Azure azure-southeastasia).