Google Workspace Status Dashboard

This page provides status information on the services that are part of Google Workspace. Check back here to view the current status of the services listed below. If you are experiencing an issue not listed here, please contact Support. Learn more about what's posted on the dashboard in this FAQ. For additional information on these services, please visit https://workspace.google.com/. For incidents related to Google Analytics, visit the Google Ads Status Dashboard.

Incident affecting Gmail

Incident began at 2024-03-05 15:25 and ended at 2024-03-05 17:22 (times are in Coordinated Universal Time (UTC)).

Date Time Description
Mar 12, 2024 7:52 PM UTC

Incident Report

Summary

On Tuesday, 5 March 2024, starting at 07:25 US/Pacific, 0.23% of Gmail UI requests failed over a period of 61 minutes, and 1.1% of email messages were delayed by up to five minutes over a period of 1 hour, 57 minutes.

Root Cause

This report includes a revised root cause for the service issue Gmail customers experienced on 5 March 2024.

After further investigation, we identified that a sudden and unexpected surge of email traffic initiated the issue, rather than an increase in authentication traffic as shared in the previous report.

Gmail frontend services (HTTP, POP, and IMAP) use an internal lookup service to route requests to the specific backend instances that contain the user data. When this internal lookup service fails, users experience issues with Gmail UI interactions.

The same internal lookup service is also consulted during email delivery. When lookup requests fail during email delivery, they are retried with exponential backoff, which appears to end users as delayed email.
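The retry behavior described above can be illustrated with a minimal sketch. It is not Gmail's implementation: the names `LookupUnavailable`, `backoff_delays` and `retry_with_backoff`, the one-second base delay, the 60-second cap and the use of random jitter are all assumptions chosen for illustration.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class LookupUnavailable(Exception):
    """Hypothetical error raised when the lookup service rejects a request."""


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> List[float]:
    """Return the un-jittered delay before each retry: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


def retry_with_backoff(
    operation: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation; on LookupUnavailable, wait and retry with doubling delays."""
    for delay in backoff_delays(max_retries, base, cap):
        try:
            return operation()
        except LookupUnavailable:
            # Full jitter (an assumption): sleep a random amount up to the
            # current delay so retrying clients do not all return in lockstep.
            sleep(random.uniform(0, delay))
    # Final attempt after max_retries retries: let any error propagate.
    return operation()
```

With these assumed defaults the delay ceiling grows as 1, 2, 4, 8 and 16 seconds, so a message whose lookup fails several times in a row is delivered late rather than lost, which matches the delayed-delivery symptom in this report.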

On 5 March 2024 at 07:25 US/Pacific, Gmail began experiencing a sudden and unexpected increase in real user traffic that coincided with a third-party service outage outside of Google’s infrastructure. During the traffic increase, the internal lookup service in some geographic locations was overloaded, which resulted in errors for a small fraction of Gmail UI requests and delayed email message delivery.

Remediation and Prevention

Google engineers were alerted to the increase in traffic via internal systems on 5 March 2024 at 07:32 US/Pacific and immediately started an investigation. Once the scope and nature of the issue were identified, at 08:11 US/Pacific, Google engineers redirected traffic away from the affected locations. This immediately stopped the Gmail UI errors and prevented delivery delays for new messages.

By 09:22 US/Pacific the backlog of emails in the delivery pipeline had cleared completely.

If your service or application was affected, we apologize. This is not the level of quality and reliability we strive to offer you. Google is committed to preventing a repeat of this issue in the future and is completing the following action:

  • Enhance the traffic surge protection layer of Gmail’s lookup server so that it better handles traffic surges of this kind.
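The report does not say how the surge protection layer works. As a purely illustrative sketch, one common way to bound the load a service accepts is a token bucket admission check, shown below. The `TokenBucket` class, its parameters and the decision to reject requests once the bucket is empty are assumptions, not a description of Gmail's design.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Hypothetical admission control: admit a request only if a token is available."""

    rate: float          # tokens refilled per second (sustained request rate)
    capacity: float      # maximum tokens held (largest burst admitted at once)
    tokens: float = 0.0  # current token count, set to capacity on creation
    last: float = 0.0    # timestamp of the previous check, in seconds

    def __post_init__(self) -> None:
        # Start full so an initial burst up to `capacity` is admitted.
        self.tokens = self.capacity

    def allow(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if possible."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        # Bucket empty: shed this request instead of overloading the backend.
        return False
```

In a design like this, requests beyond the configured rate are rejected quickly and can be retried later, so a surge degrades a bounded share of requests rather than overloading the whole service.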

Detailed Description of Impact

On 5 March 2024, between 07:25 and 09:22 US/Pacific, Gmail experienced increased error rates and delivery delays. During this time:

  • Approximately 0.23% of requests in the Gmail UI resulted in an error message. Affected customers encountered these errors for various actions on Gmail, including login.
  • Approximately 1.1% of email messages experienced delivery delays ranging between one and five minutes.
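The report states times in US/Pacific, while the dashboard header gives the incident window in UTC. The short sketch below converts the reported milestones and derives the intervals between them. It assumes a fixed UTC-8 offset, which holds for 5 March 2024 because US daylight saving time began on 10 March 2024; the helper `pacific` and the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US/Pacific was on standard time (UTC-8) on 5 March 2024.
PST = timezone(timedelta(hours=-8), "PST")


def pacific(hhmm: str) -> datetime:
    """Build a timezone-aware datetime on 5 March 2024 from an 'HH:MM' string."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2024, 3, 5, hour, minute, tzinfo=PST)


# Milestones taken from the incident report.
start = pacific("07:25")       # traffic surge begins
alerted = pacific("07:32")     # engineers alerted
redirected = pacific("08:11")  # traffic redirected away from affected locations
cleared = pacific("09:22")     # delivery backlog fully cleared

start_utc = start.astimezone(timezone.utc)
end_utc = cleared.astimezone(timezone.utc)

time_to_alert = alerted - start
time_to_redirect = redirected - start
total_duration = cleared - start

print("Window (UTC):", start_utc.strftime("%H:%M"), "to", end_utc.strftime("%H:%M"))
print("Time to alert:", time_to_alert)
print("Time to redirect:", time_to_redirect)
print("Total duration:", total_duration)
```

The computed window agrees with the 15:25 to 17:22 UTC range and the 1 hour, 57 minute duration stated on this page.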

Mar 6, 2024 12:19 AM UTC

Mini Incident Report

We apologize for the inconvenience this service disruption may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced an impact outside of what is listed below, please reach out to Google Workspace Support using the help article https://support.google.com/a/answer/1047213.

(All Times US/Pacific)

Incident Start: 05 March, 2024 07:25

Incident End: 05 March, 2024 09:22

Duration: 1 hour, 57 minutes

Affected Services and Features:

Gmail

Regions/Zones: Global

Description:

Gmail experienced a 0.23% UI error rate for 61 minutes, and email delivery delays of one to five minutes for just under two hours.

From preliminary analysis, the trigger of the issue was a sudden and unexpected increase in authentication traffic caused by user re-login attempts following device restarts. The device restarts were a result of users attempting to mitigate a third-party application issue outside of Google’s infrastructure. This additional load led to elevated errors in our backend authentication service, which in turn impacted Gmail.

Google engineers responded promptly to the increased error rates and mitigated the issue by adding compute resources to handle the additional load. By 08:26 US/Pacific, Gmail UI errors were largely resolved, and the error rate returned to baseline by 08:51. Delivery delays continued until 09:22, at which point the impact to Gmail was completely mitigated.

Customer Impact:

  • Affected customers would have noticed an “oops” error message in the Gmail UI. Customers encountered these errors for various actions on Gmail, including login.

  • Some customers would have experienced delays with email delivery ranging between one and five minutes.

Mar 5, 2024 5:34 PM UTC

We saw a surge in traffic starting around 7:25 AM Pacific Time on March 5, and we scaled up our systems to serve the additional load.

Mar 5, 2024 4:35 PM UTC

Our team is continuing to investigate this issue. We will provide an update by Mar 5, 2024, 5:15 PM UTC with more information about this problem. Thank you for your patience.

Symptoms:

Some Gmail users are experiencing elevated error rates while performing various actions and may also see delays in email delivery.

Mar 5, 2024 4:22 PM UTC

We're investigating reports of an issue with Gmail. We will provide more information shortly.