What's Wrong with Facebook (Updated 2019)

Early today Facebook was down or unreachable for many of you for approximately 2.5 hours. This is the worst outage we've had in over four years, and we wanted to first of all apologize for it. We also wanted to provide more technical detail on what happened and share one big lesson learned.

The key flaw that caused this outage to be so severe was an unfortunate handling of an error condition. An automated system for verifying configuration values ended up causing much more damage than it fixed.

The intent of the automated system is to check for configuration values that are invalid in the cache and replace them with updated values from the persistent store. This works well for a transient problem with the cache, but it doesn't work when the persistent store itself is invalid.
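
The check-and-repair behavior described above can be sketched roughly as follows. This is an illustrative sketch only, not the actual system: the names `CountingStore`, `get_config`, and `is_valid` are hypothetical, and the cache is modeled as a plain dictionary.

```python
from typing import Any, Callable, Dict


class CountingStore:
    """Hypothetical stand-in for the persistent store; counts reads."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self.data = dict(data)
        self.reads = 0

    def read(self, key: str) -> Any:
        self.reads += 1
        return self.data[key]


def get_config(
    key: str,
    cache: Dict[str, Any],
    store: CountingStore,
    is_valid: Callable[[Any], bool],
) -> Any:
    """Return the cached value, repairing it from the store if invalid."""
    value = cache.get(key)
    if value is not None and is_valid(value):
        return value
    # Cached value is missing or invalid: replace it from the persistent store.
    fresh = store.read(key)
    cache[key] = fresh
    return fresh
```

With a valid value in the store, a corrupted cache entry costs one store read and is then repaired. If the store itself holds a value that `is_valid` rejects, every call falls through to the store, which is the failure mode the rest of this post describes.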

Today we made a change to the persistent copy of a configuration value that was interpreted as invalid. This meant that every single client saw the invalid value and attempted to fix it. Because the fix involves making a query to a cluster of databases, that cluster was quickly overwhelmed by hundreds of thousands of queries a second.

To make matters worse, every time a client got an error attempting to query one of the databases, it interpreted the error as an invalid value and deleted the corresponding cache key. This meant that even after the original problem had been fixed, the stream of queries continued. As long as the databases failed to service some of the requests, they were causing even more requests to themselves. We had entered a feedback loop that didn't allow the databases to recover.
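
A toy model can make the loop concrete. The sketch below is a deliberate simplification with assumed numbers and a hypothetical `simulate` function: one shared cache key, a fixed number of clients, a database cluster that can serve only `capacity` queries per step, and a persistent store that already holds a valid value again. In the worst-case ordering modeled here, any errored query in a step leaves the key deleted.

```python
from typing import List


def simulate(clients: int, capacity: int, steps: int,
             delete_on_error: bool) -> List[int]:
    """Return the number of database queries issued in each step."""
    cache_valid = False  # the shared cache key starts out missing or invalid
    queries_per_step = []
    for _ in range(steps):
        # Clients only query the databases when the cache key is unusable.
        queries = 0 if cache_valid else clients
        queries_per_step.append(queries)
        if queries == 0:
            continue
        served = min(queries, capacity)
        errors = queries - served
        if served > 0:
            cache_valid = True  # a successful query repopulates the key
        if errors > 0 and delete_on_error:
            cache_valid = False  # an errored client deletes the key again
    return queries_per_step
```

Under these assumptions, `simulate(1000, 100, 5, delete_on_error=True)` keeps issuing 1000 queries in every step, while the same call with `delete_on_error=False` issues 1000 queries once and then none, because the first successful query is allowed to stick.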

The way to stop the feedback cycle was quite painful - we had to stop all traffic to this database cluster, which meant turning off the site. Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site.
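
The post does not say how people were let back in. One common way to ramp traffic gradually, shown here purely as an assumption, is to map each user to a stable bucket from 0 to 99 and admit only the buckets below a percentage that operators raise over time. The function names `bucket` and `is_admitted` are hypothetical.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user id to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def is_admitted(user_id: str, percent: int) -> bool:
    """Admit the user if their bucket falls under the current ramp level."""
    return bucket(user_id) < percent
```

Because the bucket is derived from a hash rather than a random draw, a user admitted at 10 percent stays admitted as the level rises to 50 or 100 percent, so nobody is let in and then turned away again during the ramp.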

This got the site back up and running today, and for now we've turned off the system that attempts to correct configuration values. We're exploring new designs for this configuration system, following design patterns of other systems at Facebook that deal more gracefully with feedback loops and transient spikes.
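
The new designs are not described here, so the following is only a sketch of two widely used patterns that address this class of problem, not what Facebook adopted: treat a query error differently from an invalid value (keep the last cached value instead of deleting it), and space out retries with capped exponential backoff plus jitter. `StoreError`, `refresh`, and `backoff_delay` are hypothetical names.

```python
import random
from typing import Any, Callable, Dict, Optional


class StoreError(Exception):
    """Hypothetical error raised when the persistent store cannot answer."""


def backoff_delay(attempt: int, base: float = 0.1, cap: float = 30.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Seconds to wait before retry number `attempt` (full jitter)."""
    return rng() * min(cap, base * 2 ** attempt)


def refresh(key: str, cache: Dict[str, Any],
            read_store: Callable[[str], Any],
            is_valid: Callable[[Any], bool]) -> Optional[Any]:
    """Try to refresh a cached value without ever making things worse."""
    try:
        fresh = read_store(key)
    except StoreError:
        # An error says nothing about the value: keep what is cached.
        return cache.get(key)
    if is_valid(fresh):
        cache[key] = fresh
    # If the store's value is invalid, fall back to the cached value.
    return cache.get(key)
```

The design choice is that a failed or invalid read never removes a cache entry, so an overloaded store cannot generate more load for itself, and the jittered delay keeps many clients from retrying at the same instant.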

We apologize again for the site outage, and we want you to know that we take the performance and reliability of Facebook very seriously.