Kubernetes’ Case Study: Spotify
What is Kubernetes?
Kubernetes is a portable, extensible, open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.
The name Kubernetes originates from Greek, meaning helmsman or pilot. Google open-sourced the Kubernetes project in 2014. Kubernetes combines over 15 years of Google’s experience running production workloads at scale with best-of-breed ideas and practices from the community.
Why you need Kubernetes?
Containers are a good way to bundle and run your applications. In a production environment, you need to manage the containers that run the applications and ensure that there is no downtime. For example, if a container goes down, another container needs to start. Wouldn’t it be easier if this behavior was handled by a system?
That’s how Kubernetes comes to the rescue! Kubernetes provides you with a framework to run distributed systems resiliently. It takes care of scaling and failover for your application, provides deployment patterns, and more. For example, Kubernetes can easily manage a canary deployment for your system.
Kubernetes provides you with:
- Service discovery and load balancing Kubernetes can expose a container using the DNS name or using their own IP address. If traffic to a container is high, Kubernetes is able to load balance and distribute the network traffic so that the deployment is stable.
- Storage orchestration Kubernetes allows you to automatically mount a storage system of your choice, such as local storage, public cloud providers, and more.
- Automated rollouts and rollbacks You can describe the desired state for your deployed containers using Kubernetes, and it can change the actual state to the desired state at a controlled rate. For example, you can automate Kubernetes to create new containers for your deployment, remove existing containers, and adopt all their resources to the new container.
- Automatic bin packing You provide Kubernetes with a cluster of nodes that it can use to run containerized tasks. You tell Kubernetes how much CPU and memory (RAM) each container needs. Kubernetes can fit containers onto your nodes to make the best use of your resources.
- Self-healing Kubernetes restarts containers that fail, replaces containers, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready to serve.
- Secret and configuration management Kubernetes let you store and manage sensitive information, such as passwords, OAuth tokens, and SSH keys. You can deploy and update secrets and application configuration without rebuilding your container images, and without exposing secrets in your stack configuration.
Challenges faced by Spotify
Launched in 2008, the audio-streaming platform has grown to over 200 million monthly active users across the world.” Our goal is to empower creators and enable a really immersive listening experience for all of the consumers that we have today — and hopefully the consumers we’ll have in the future,” says Jai Chakrabarti, Director of Engineering, Infrastructure, and Operations. An early adopter of microservices and Docker, Spotify had containerized microservices running across its fleet of VMs with a homegrown container orchestration system called Helios. By late 2017, it became clear that “having a small team working on the features was just not as efficient as adopting something that was supported by a much bigger community,” he says.
“We saw the amazing community that had grown up around Kubernetes, and we wanted to be part of that,” says Chakrabarti. Kubernetes was more feature-rich than Helios. Plus, “we wanted to benefit from added velocity and reduced cost, and also align with the rest of the industry on best practices and tools.” At the same time, the team wanted to contribute its expertise and influence in the flourishing Kubernetes community. The migration, which would happen in parallel with Helios running, could go smoothly because “Kubernetes fit very nicely as a compliment and now as a replacement to Helios,” says Chakrabarti.
The team spent much of 2018 addressing the core technology issues required for a migration, which started late that year and is a big focus for 2019. “A small percentage of our fleet has been migrated to Kubernetes, and some of the things that we’ve heard from our internal teams are that they have less of a need to focus on manual capacity provisioning and more time to focus on delivering features for Spotify,” says Chakrabarti.
An early adopter of microservices and Docker, Spotify had containerized microservices running across its fleet of VMs since 2014. The company used an open-source, homegrown container orchestration system called Helios, and in 2016–17 completed a migration from on-premise data centers to Google Cloud. But by late 2017, it became clear that having a small team working on the Helios features was just not as efficient as adopting something that was supported by a much bigger community. They saw the amazing community that had grown up around Kubernetes and wanted to be part of that. They wanted to be benefitted from added velocity and reduced cost, and also align with the rest of the industry on best practices and tools.” At the same time, the team wanted to contribute its expertise and influence to the flourishing Kubernetes community. Kubernetes fitted very nicely as a compliment and now as a replacement to Helios, so they have it running alongside Helios to mitigate the risks. A small percentage of Spotify’s fleet, containing over 150 services, has been migrated to Kubernetes so far. The biggest service currently running on Kubernetes takes over 10 million requests per second as an aggregate service and benefits greatly from autoscaling