Failure Modes and Continuous Resilience
A resilient system continues to operate successfully in the presence of failures. There are many possible failure modes, and each exercises a different aspect of resilience. The system needs a safety margin that can absorb failures through defense in depth, and failure modes need to be prioritized so that the most likely and highest-impact risks are addressed first.
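As a rough illustration of the prioritization step, the sketch below ranks failure modes by a simple likelihood times impact score. The scoring scheme, the 1 to 5 scales, the FailureMode class, the prioritize function, and the example entries are all hypothetical and chosen for illustration; they are not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One way a system can fail (hypothetical structure for illustration)."""
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def risk(self) -> int:
        # Assumed scoring: risk is likelihood multiplied by impact.
        return self.likelihood * self.impact


def prioritize(modes):
    """Return failure modes sorted from highest to lowest risk.

    Ties on risk are broken in favor of the higher-impact mode.
    """
    return sorted(modes, key=lambda m: (m.risk, m.impact), reverse=True)


# Hypothetical entries; real values would come from incident history
# and a review of the system's architecture.
modes = [
    FailureMode("single instance crash", likelihood=5, impact=1),
    FailureMode("dependency timeout", likelihood=4, impact=3),
    FailureMode("region outage", likelihood=1, impact=5),
]

for mode in prioritize(modes):
    print(f"{mode.risk:>2}  {mode.name}")
```

With these made-up numbers the dependency timeout ranks first, because it is both fairly likely and moderately damaging. The region outage and the instance crash tie on score, and the tie-break puts the higher-impact outage ahead.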
A good article by Adrian Cockcroft:
https://medium.com/@adrianco/failure-modes-and-continuous-resilience-6553078caad5