Even Netflix struggles to identify and understand the cost of its AWS estate
If you have trouble keeping track of your various streaming subscriptions, you're gonna love the irony
Keeping track of the amount of cloudy resources an org uses, and the cost of doing so, is notoriously tricky – so tricky, indeed, that even Netflix isn't on top of it.
We know this because on Wednesday US time the vid-streamer blogged about its cloud efficiency measures.
The post – penned by senior analytics engineer "Jennifer H" and Pallavi Phadnis, who describes her role as "Data" – opens by noting Netflix's well-known use of Amazon Web Services (AWS) for its cloud infrastructure needs, and that its engineering teams have self-service tools they can use to provision apps in the cloud.
The pair also reveal that Netflix operates a Platform DSE (Data Science Engineering) team, which helps engineering teams "to understand what resources they're using, how effectively and efficiently they use those resources, and the cost associated with their resource usage."
The Platform DSE team's goal is to help "downstream consumers to make cost conscious decisions using our datasets."
To assist in that goal, it's created two tools:
- Foundational Platform Data (FPD), which "provides a centralized data layer for all platform data, featuring a consistent data model and standardized data processing methodology."
- A Cloud Efficiency Analytics (CEA) tool that is built on top of FPD and "offers an analytics data layer that provides time series efficiency metrics across various business use cases."
FPD is fed data from applications like Apache Spark, which records how long cores are allocated to jobs and the amount of data read. CEA is then sent "inventory, ownership, and usage data and applies the appropriate business logic to produce cost and ownership attribution at various granularities," the post explains.
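To make that flow concrete, here is a minimal sketch of usage-based cost attribution in Python. It is not Netflix's code: the post doesn't publish its business logic, so the record fields, the `attribute_cost` function, the proportional-to-core-seconds heuristic, and the even split between co-owners are all assumptions made purely for illustration.

```python
from collections import defaultdict


def attribute_cost(usage_records, total_cost, owners):
    """Split total_cost across owners in proportion to core-seconds used.

    Hypothetical heuristic: a job's share of the bill equals its share of
    core-seconds, and a job with several owners is split evenly among them.
    """
    core_seconds = defaultdict(float)
    for rec in usage_records:
        core_seconds[rec["job"]] += rec["cores"] * rec["seconds"]

    total = sum(core_seconds.values())
    if total == 0:
        return {}

    cost_by_owner = defaultdict(float)
    for job, used in core_seconds.items():
        job_cost = total_cost * used / total
        # Jobs with no known owner land in an "unattributed" bucket.
        job_owners = owners.get(job, ["unattributed"])
        for owner in job_owners:
            cost_by_owner[owner] += job_cost / len(job_owners)
    return dict(cost_by_owner)


# Made-up usage records of the kind a Spark-style platform might emit.
usage = [
    {"job": "etl_daily", "cores": 4, "seconds": 100},
    {"job": "reco_train", "cores": 8, "seconds": 150},
]
owners = {"etl_daily": ["team_a"], "reco_train": ["team_a", "team_b"]}

costs = attribute_cost(usage, 100.0, owners)
print(costs)
```

Even this toy version has to decide what to do with shared ownership and unowned jobs, which hints at why the real thing is harder than it sounds.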
The datasets Netflix generates are highly complex "due to the breadth and scope of the business infrastructure and platform specific features."
"Services can have multiple owners, cost heuristics are unique to each platform, and the scale of infra data is large," Jennifer H and Pallavi Phadnis wrote, before explaining Netflix's platforms often have customizations that mean the Platform DSE team always has plenty to do – including regular audits.
"Maintaining data completeness while ensuring correctness becomes challenging due to upstream latency and required transformations to have the data ready for consumption," they explained.
Their work therefore continues, with both FPD and CEA under development and Netflix "striving for nearly complete cost insight coverage in the upcoming year."
It gets better. The post concludes by revealing Netflix's intention to "move towards proactive approaches via predictive analytics and ML for optimizing usage and detecting anomalies in cost."
You read that right: Netflix, one of the most famous users of public cloud, isn't in total control of its cloud spend and needs to get better at detecting anomalies.
So you're not alone if you struggle to do so, too. ®