The Register: Azure networking snafu enters day 2, some services still limping

Source URL: https://www.theregister.com/2025/01/10/microsoft_azure_networking_snafu/
Source: The Register
Title: Azure networking snafu enters day 2, some services still limping

Feedly Summary: Struggling to connect to the cloud? You’re not alone
Microsoft on Friday warned Azure cloud service users may continue to experience “intermittent errors,” blaming the problem on a US East regional networking service configuration change.…

AI Summary and Description: Yes

Summary: Microsoft has warned Azure users of ongoing intermittent errors caused by a networking configuration change in the East US 2 region. The outage affected a range of services. Microsoft rerouted traffic and patched the affected components, but as of January 10 not all services had fully recovered.

Detailed Description:
Microsoft’s Azure cloud platform suffered a multi-day disruption that began on January 8 and stemmed from a networking configuration change. The incident was concentrated in a single zone of the East US 2 region and caused connectivity failures and service disruptions for users.

Key points regarding the incident include:

– **Cause of Outage**: A network configuration issue within a specific zone led to three storage partitions becoming unhealthy, causing widespread connectivity problems.

– **Affected Services**: Multiple Azure services were impacted, including:
    – Azure Databricks
    – Azure Container Apps
    – Azure Function Apps
    – Azure App Service
    – SQL Managed Instances
    – Azure Data Factory
    – Azure Container Instances
    – Power BI
    – Virtual Machine Scale Sets (VMSS)
    – PostgreSQL flexible servers
    – Other services communicating with Private Endpoint Network Security Groups

– **Mitigation Efforts**:
    – Microsoft rerouted traffic away from the affected zone, which relieved some of the problems for non-zonal services.
    – It patched the problems affecting Private Link, allowing dependent services to resume operation.

– **Ongoing Recovery**:
    – As of January 10, the impacted partitions had been brought back online, but not all services had fully recovered.
    – Users may still experience intermittent errors and performance degradation as the recovery process continues.

– **Customer Guidance**: Microsoft recommended that affected customers execute disaster recovery strategies for their cloud services to mitigate further disruptions.
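
The article does not say what those disaster recovery strategies look like in practice. One common client-side pattern for riding out intermittent errors is to retry with exponential backoff and, if the primary region keeps failing, fall back to a deployment in another region. The sketch below is only an illustration of that pattern under stated assumptions: `TransientError`, `call_with_failover`, and the endpoint names are hypothetical and are not part of any Azure SDK or of Microsoft's guidance.

```python
import random
import time
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for a retryable failure (timeout, reset, HTTP 5xx)."""


def call_with_failover(
    operation: Callable[[str], T],
    endpoints: Sequence[str],
    attempts_per_endpoint: int = 3,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Try each endpoint in order, retrying transient failures with backoff.

    `endpoints` is ordered by preference (primary region first). The `sleep`
    parameter exists so tests can skip the real delays.
    """
    last_error: Optional[Exception] = None
    for endpoint in endpoints:
        for attempt in range(attempts_per_endpoint):
            try:
                return operation(endpoint)
            except TransientError as exc:
                last_error = exc
                if attempt == attempts_per_endpoint - 1:
                    break  # out of attempts here; move on to the next endpoint
                # Exponential backoff with full jitter, capped at max_delay.
                delay = min(max_delay, base_delay * 2 ** attempt)
                sleep(random.uniform(0, delay))
    raise RuntimeError("all endpoints failed") from last_error
```

A caller would pass its own request function and an ordered list of regional endpoints, for example `call_with_failover(fetch, ["https://eastus2.example.com", "https://centralus.example.com"])`, where both URLs are placeholders. Client-side failover of this kind only helps if the workload and its data are already replicated to the secondary region.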

This incident underscores the importance of disciplined network configuration management in cloud platforms: a misconfiguration in a single zone can cascade into outages across many dependent services. It also highlights the need for cloud providers to communicate transparently about outages and recovery progress so that customers can manage their own operational continuity.