Login Service - Redis Cache - Cluster Failure
Incident Report for Happeo
Postmortem

Date: 2019-06-15

Start Time: 5:06

Resolution Time: 9:50

Affected: Login to the app did not work for anyone

Root cause:

The cluster was not updated early enough, so GCP auto-upgraded the master node. This resulted in lost node pools, and the cluster workloads became unavailable. The cluster hosts Redis, which the Login service relies on.

Posted Oct 18, 2019 - 06:18 UTC

Resolved
The cluster was not updated early enough, so GCP auto-upgraded the master node. This resulted in lost node pools, and the cluster workloads became unavailable. The cluster hosts Redis, which the Login service relies on. As a result, login did not work.
Posted Jun 06, 2019 - 02:00 UTC