Corefinity - Corefinity investigating high loads across a number of GCP environments – Incident details

Corefinity investigating high loads across a number of GCP environments

Resolved
Degraded performance
Started about 2 years ago
Lasted about 2 hours

Affected

Google Cloud Platform Stacks (GCP)

Degraded performance from 11:17 AM to 1:21 PM

Updates
  • Resolved

    We have also found a large (3x to 5x) spike in traffic across many of our Magento 2 applications at the same time as this incident. We are investigating this separately; however, there is no cause for concern.

    We have implemented a fix and are currently monitoring the result. Everything is healthy and green.

    We will provide further updates.

  • Identified

    We have identified the root cause of the issue. Unfortunately, it is completely unrelated to the recent events we have had, although it has most likely contributed to the high load and increased the impact.

    We've had an immediate response to our P1 request with GCP, and the root cause of the issue has been identified as high CPU usage across the majority of client nodes by a process named "/home/kubernetes/bin/gcfsd". This is a GCP-managed process that provides virtual mounts to the servers.

    The line below shows this process using around 8 CPU cores (peaking at 20 and 22 cores on other clients).

    2114 root 20 0 23.5g 15.9g 21088 S 757.0 8.5 1962:35 /home/kubernetes/bin/gcfsd --mountpoint=/run/gcfsd/mnt --maxcontentcachesizemb=213 --maxlargefilescachesizemb=213 --layercachedir=/var/lib/containerd/io.containerd.snap+

    We are working on an immediate mitigation, and another update will be provided within 10 minutes.
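For readers who want to see how the "around 8 cores" figure follows from the process line quoted in this update, here is a minimal sketch. It assumes the line comes from `top` with its default column layout (PID, USER, PR, NI, VIRT, RES, SHR, S, %CPU, %MEM, TIME+, COMMAND) and its default mode, in which 100% corresponds to one fully used core. The function name `cores_from_top_line` is hypothetical and used only for illustration.

```python
def cores_from_top_line(line: str) -> float:
    """Estimate CPU cores in use from one process row of `top` output.

    Assumes the default column order, where %CPU is the 9th field,
    and that 100% equals one fully busy core.
    """
    fields = line.split()
    cpu_percent = float(fields[8])  # %CPU column (0-based index 8)
    return cpu_percent / 100.0


# Leading part of the process line quoted in the update (arguments shortened).
line = (
    "2114 root 20 0 23.5g 15.9g 21088 S 757.0 8.5 1962:35 "
    "/home/kubernetes/bin/gcfsd --mountpoint=/run/gcfsd/mnt"
)

print(f"{cores_from_top_line(line):.2f} cores")  # prints "7.57 cores"
```

A reading of 757.0 in the %CPU column therefore corresponds to roughly 7.6 cores, which matches the "around 8 cores" stated above.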

  • Investigating

    We are currently investigating high loads across a number of GCP servers and environments.
    We will provide another update within 30 minutes.