From the course: Complete Guide to the AWS Well-Architected Framework

Failure management

- [Instructor] When it comes to managing failure, the one thing we never want to do is go to our boss and say, "We lost data." So the first thing we have to do is ensure that we're backing up all of our data. Data can be the functions that we've created, the data that we've stored, our AMIs. Whatever it is, all the system components, all the small scripts, everything has to be backed up in duplicate. We also need a resilient architecture to protect our workload: from the management services, which have failover, to our application servers, web servers, and database servers, which have to be hosted on multiple subnets in different Availability Zones at a minimum. We have to ensure that the applications are designed to withstand component failures. If it's storage, I want three copies of my storage. Is it images? I want them copied to another region. Is it S3 buckets? Replicate that information to another region. I…
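
The "multiple subnets in different Availability Zones at a minimum" rule can be checked mechanically. What follows is only a minimal sketch in plain Python, not an AWS API call: the function name `tiers_lacking_az_spread`, the `tier` and `az` keys, and the sample inventory are all hypothetical, made up for illustration. In practice the inventory would come from whatever tooling lists your subnets.

```python
from collections import defaultdict


def tiers_lacking_az_spread(subnets, min_zones=2):
    """Return the tiers whose subnets cover fewer than min_zones Availability Zones.

    `subnets` is a list of dicts with hypothetical keys "tier" and "az".
    """
    zones_by_tier = defaultdict(set)
    for subnet in subnets:
        # Collect the distinct zones each tier is deployed in.
        zones_by_tier[subnet["tier"]].add(subnet["az"])
    return sorted(
        tier for tier, zones in zones_by_tier.items() if len(zones) < min_zones
    )


# Hypothetical inventory: the database tier sits in a single zone.
inventory = [
    {"tier": "web", "az": "us-east-1a"},
    {"tier": "web", "az": "us-east-1b"},
    {"tier": "app", "az": "us-east-1a"},
    {"tier": "app", "az": "us-east-1b"},
    {"tier": "database", "az": "us-east-1a"},
]

print(tiers_lacking_az_spread(inventory))  # prints ['database']
```

Any tier the function returns is a single point of failure at the zone level and would need a subnet in a second Availability Zone to meet the minimum described above.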
