3 shades of latency: How Netflix built a data architecture around timeliness

Like other companies operating at webscale, Netflix knows that processing and serving up lots of data — some to customers, some for use on the backend — doesn’t have to happen either right away or never. It’s more like a gray area, and Netflix detailed the uses for three shades of gray — online, offline and nearline processing — in a post on its tech blog on Wednesday.

The Netflix way of processing data online, offline and nearline.

The whole point of its data architecture is to tackle latency by pointing workloads and tasks toward systems designed to work at their speed. People love to think about Hadoop when they think about web data, but the reality is that relying solely on batch processing means data can get stale and applications probably don’t include the newest user input.

Netflix uses online processing for receiving information from users in real time and serving up responses right away, such as looking at a new rating or some other customer action to change the set of movies shown to the customer. Real-time processing works best when algorithms are relatively simple and when data is on the smaller side. The data feeding in to computations must also be available right away.

Nearline processing happens when the data needs to be computed in real time but can be stored for serving up at a later point in time. This option makes sense when computations are more complex and are amenable to a more-traditional database-oriented approach. Netflix uses a variety of databases, including MySQL, the NoSQL Cassandra database and its own homemade EVcache system.

Offline processing in Netflix’s world might also be called batch processing — think bigger and longer-term Hadoop jobs. It also fits for compute-heavy projects to train new models that will come into use at a later date. And it’s a backup for situations when real-time processing isn’t possible.

This online-nearline-offline approach is fairly common among web companies that understand that different applications can tolerate different latencies. LinkedIn has built its data infrastructure with the same general theory in mind. Facebook, too, has thought deeply about this. The social network recently detailed a new memcached-like data store called McDipper that foregoes DRAM for flash in order to cut costs for tasks that can live with slightly higher latency.