AWS Case Study: Pearson
What is Cloud Computing?
Cloud computing is the on-demand delivery of IT resources over the Internet with pay-as-you-go pricing. Instead of buying, owning, and maintaining physical data centers and servers, you can access technology services, such as computing power, storage, and databases, on an as-needed basis from a cloud provider.
The three main types of cloud computing include Infrastructure as a Service, Platform as a Service, and Software as a Service. Each type of cloud computing provides different levels of control, flexibility, and management so that you can select the right set of services for your needs.
Amazon Web Services
Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform, offering over 175 fully-featured services from data centers globally. Millions of customers — including the fastest-growing startups, largest enterprises, and leading government agencies — are using AWS to lower costs, become more agile, and innovate faster.
AWS has the most extensive global cloud infrastructure. No other cloud provider offers as many Regions with multiple Availability Zones connected by low latency, high throughput, and highly redundant networking. AWS has 77 Availability Zones within 24 geographic regions around the world and has announced plans for nine more Availability Zones and three more AWS Regions in Indonesia, Japan, and Spain. The AWS Region/Availability Zone model has been recognized by Gartner as the recommended approach for running enterprise applications that require high availability.
AWS Use Cases
AWS is used in nearly every industry. Below are some of those areas, along with well-known companies that use AWS in each.
👉 Aerospace (NASA, Maxar, ESA, etc.)
👉 Gaming (MPL, FanFight, Gammation, etc.)
👉 Education (Coursera, BYJU’s, etc.)
👉 Telecommunication (Vodafone, Aircel, etc.)
👉 Entertainment (Netflix, Hotstar, etc.)
👉 Media (BBC, The Hindu, Punjab Kesri, etc.)
👉 Software (ShareChat, Slack, etc.)
Global educational media company Pearson needed a more efficient way to analyze and gain insights from its log data. With a number of teams in various locations using Elasticsearch — the popular open-source tool for search and log analytics — Pearson found that keeping track of log data and managing updates led to high operating costs. Faced with this, as well as increasingly complex security log management and analysis, the company found a solution on Amazon Web Services (AWS). Pearson quickly saw improvements by migrating from its self-managed open-source Elasticsearch architecture to Amazon Elasticsearch Service, a fully managed service that makes it easy to deploy, secure, and run Elasticsearch cost-effectively at scale. Rather than spending considerable time and resources on managing the Elasticsearch clusters on its own, Pearson used the managed Amazon Elasticsearch Service as part of its initiative to modernize its products.
Some of the AWS services used by Pearson are:
1. Amazon Elasticsearch Service
Amazon Elasticsearch Service is a fully managed service that makes it easy for you to deploy, secure, and run Elasticsearch cost-effectively at scale. You can build, monitor, and troubleshoot your applications using the tools you love, at the scale you need. The service provides support for open-source Elasticsearch APIs, managed Kibana, integration with Logstash and other AWS services, and built-in alerting and SQL querying. Amazon Elasticsearch Service lets you pay only for what you use — there are no upfront costs or usage requirements. With Amazon Elasticsearch Service, you get the ELK stack you need, without the operational overhead.
1. Easy to deploy and manage
With Amazon Elasticsearch Service you can deploy your Elasticsearch cluster in minutes. The service simplifies management tasks such as hardware provisioning, software installation and patching, failure recovery, backups, and monitoring. To help you monitor your clusters, Amazon Elasticsearch Service includes built-in event monitoring and alerting, so you can be notified of changes to your data and proactively address issues.
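As an illustration, the cluster configuration for such a deployment can be expressed as the parameters of the service's CreateElasticsearchDomain API call. The domain name, instance counts, and volume sizes below are hypothetical, not values from the case study; verify the field names against the current AWS API reference before use.

```python
import json

# Sketch: parameters for a CreateElasticsearchDomain request.
# All concrete values here are illustrative assumptions.
domain_config = {
    "DomainName": "pearson-logs",              # hypothetical domain name
    "ElasticsearchVersion": "7.9",
    "ElasticsearchClusterConfig": {
        "InstanceType": "r5.large.elasticsearch",
        "InstanceCount": 3,
        "ZoneAwarenessEnabled": True,          # spread data nodes across AZs
    },
    "EBSOptions": {
        "EBSEnabled": True,
        "VolumeType": "gp2",
        "VolumeSize": 100,                     # GiB per data node
    },
}

print(json.dumps(domain_config, indent=2))
```

The same structure can be passed to the AWS SDKs or rendered into an infrastructure-as-code template; the managed service handles provisioning from there.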
2. Highly scalable and available
Amazon Elasticsearch Service lets you store up to 3 PB of data in a single cluster, enabling you to run large log analytics workloads via a single Kibana interface. You can easily scale your cluster up or down via a single API call or a few clicks in the AWS console. Amazon Elasticsearch Service is designed to be highly available using multi-AZ deployments, which let you replicate data across three Availability Zones in the same Region.
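The "single API call" for scaling amounts to changing the instance count in an UpdateElasticsearchDomainConfig request. The helper below is a minimal sketch of building such a request payload; the domain name is hypothetical.

```python
def scale_cluster(current_count: int, factor: int = 2) -> dict:
    """Build an UpdateElasticsearchDomainConfig payload that scales the
    data-node count by `factor` (illustrative sketch; domain name assumed)."""
    return {
        "DomainName": "pearson-logs",  # hypothetical domain name
        "ElasticsearchClusterConfig": {
            "InstanceCount": current_count * factor,
        },
    }

# Doubling a three-node cluster yields a six-node request payload
print(scale_cluster(3))
```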
3. Highly secure
For your data in Amazon Elasticsearch Service, you can achieve network isolation with Amazon VPC, encrypt data at rest and in transit using keys you create and control through AWS KMS, and manage authentication and access control with Amazon Cognito and AWS IAM policies. Amazon Elasticsearch Service is also HIPAA eligible, and compliant with PCI DSS, SOX, ISO, and FedRAMP standards to help you meet industry-specific or regulatory requirements.
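As a sketch of the IAM side of this, a resource-based access policy for a domain might look like the following. The account ID, role name, and domain name are placeholders for illustration, not values from the case study.

```python
import json

account_id = "123456789012"  # placeholder account ID
# Hypothetical domain ARN for a domain named "pearson-logs"
domain_arn = f"arn:aws:es:us-east-1:{account_id}:domain/pearson-logs/*"

# Allow only one assumed IAM role to issue HTTP requests to the domain
access_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": f"arn:aws:iam::{account_id}:role/log-analytics"  # hypothetical role
            },
            "Action": "es:ESHttp*",
            "Resource": domain_arn,
        }
    ],
}

print(json.dumps(access_policy, indent=2))
```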
4. Pay only for what you use
With Amazon Elasticsearch Service, you pay only for the resources you consume. You can select on-demand pricing with no upfront costs or long-term commitments, or achieve significant cost savings via Reserved Instance pricing. As a fully managed service, Amazon Elasticsearch Service further lowers your total cost of operations by eliminating the need for a dedicated team of Elasticsearch experts to monitor and manage your clusters.
2. Amazon S3
Amazon Simple Storage Service (Amazon S3) is an object storage service that offers industry-leading scalability, data availability, security, and performance. This means customers of all sizes and industries can use it to store and protect any amount of data for a range of use cases, such as data lakes, websites, mobile applications, backup and restore, archive, enterprise applications, IoT devices, and big data analytics. Amazon S3 provides easy-to-use management features so you can organize your data and configure finely-tuned access controls to meet your specific business, organizational, and compliance requirements. Amazon S3 is designed for 99.999999999% (11 9’s) of durability, and stores data for millions of applications for companies all around the world.
1. Industry-leading performance, scalability, availability, and durability
Scale your storage resources up and down to meet fluctuating demands, without upfront investments or resource procurement cycles. Amazon S3 is designed for 99.999999999% (11 9’s) of data durability because it automatically creates and stores copies of all S3 objects across multiple systems. This means your data is available when needed and protected against failures, errors, and threats. Amazon S3 also delivers strong read-after-write consistency automatically, at no cost, and without changes to performance or availability.
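The 11-nines figure can be made concrete with a quick back-of-the-envelope calculation, mirroring AWS's own example: treating the durability design target as an annual per-object loss probability, storing 10,000,000 objects implies losing, on average, a single object once every 10,000 years.

```python
# Back-of-the-envelope durability arithmetic (design target, not a guarantee)
durability = 0.99999999999   # 11 nines, per object per year
objects = 10_000_000

expected_losses_per_year = objects * (1 - durability)
years_per_loss = 1 / expected_losses_per_year

print(f"expected losses: ~{expected_losses_per_year:.6f} objects/year")
print(f"i.e. one object roughly every {years_per_loss:,.0f} years")
```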
2. Wide range of cost-effective storage classes
Save costs without sacrificing performance by storing data across the S3 Storage Classes, which support different data access levels at corresponding rates. You can use S3 Storage Class Analysis to discover data that should move to a lower-cost storage class based on access patterns, and configure an S3 Lifecycle policy to execute the transfer. You can also store data with changing or unknown access patterns in S3 Intelligent-Tiering, which tiers objects based on changing access patterns and automatically delivers cost savings. With the S3 Outposts storage class, you can meet data residency requirements and store data on-premises in your Outposts environment using S3 on Outposts.
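A lifecycle policy of the kind described can be expressed as the JSON passed to S3's PutBucketLifecycleConfiguration API. The rule ID, key prefix, and day thresholds below are illustrative choices, not values from the case study.

```python
import json

# Sketch: transition log objects to cheaper storage classes over time,
# then expire them after a year. All thresholds are assumptions.
lifecycle = {
    "Rules": [
        {
            "ID": "archive-log-data",          # hypothetical rule name
            "Filter": {"Prefix": "logs/"},     # hypothetical key prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}

print(json.dumps(lifecycle, indent=2))
```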
3. Unmatched security, compliance, and audit capabilities
Store your data in Amazon S3 and secure it from unauthorized access with encryption features and access management tools. S3 is the only object storage service that allows you to block public access to all of your objects at the bucket or the account level with S3 Block Public Access. S3 maintains compliance programs, such as PCI-DSS, HIPAA/HITECH, FedRAMP, EU Data Protection Directive, and FISMA, to help you meet regulatory requirements. S3 integrates with Amazon Macie to discover and protect your sensitive data. AWS also supports numerous auditing capabilities to monitor access requests to your S3 resources.
4. Easily manage data and access controls
S3 gives you robust capabilities to manage access, cost, replication, and data protection. S3 Access Points make it easy to manage data access with specific permissions for your applications using a shared data set. S3 Replication manages data replication within the region or to other regions. S3 Batch Operations helps manage large-scale changes across billions of objects. S3 Storage Lens delivers organization-wide visibility into object storage usage and activity trends. Since S3 works with AWS Lambda, you can log activities, define alerts, and automate workflows without managing additional infrastructure.
3. AWS Lambda
AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers, creating workload-aware cluster scaling logic, maintaining event integrations, or managing runtimes. With Lambda, you can run code for virtually any type of application or backend service — all with zero administration. Just upload your code as a ZIP file or container image, and Lambda automatically and precisely allocates compute execution power and runs your code based on the incoming request or event, for any scale of traffic. You can set up your code to trigger automatically from over 140 AWS services or call it directly from any web or mobile app. You can write Lambda functions in your favorite language (Node.js, Python, Go, Java, and more) and use both serverless and container tools, such as AWS SAM or the Docker CLI, to build, test, and deploy your functions.
1. No servers to manage
AWS Lambda automatically runs your code without requiring you to provision or manage infrastructure. Just write the code and upload it to Lambda either as a ZIP file or container image.
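To make "just write the code and upload it" concrete, here is a minimal hypothetical Python handler that reacts to S3 put-object notification events (the event shape follows the standard S3 notification format); it is invoked locally with a sample event purely for illustration.

```python
import json
import urllib.parse

def lambda_handler(event, context):
    """Minimal sketch of a Lambda handler for S3 object-created events.
    Collects the decoded object keys from the notification records."""
    keys = []
    for record in event.get("Records", []):
        # S3 URL-encodes object keys in notifications; decode them
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        keys.append(key)
    return {"statusCode": 200, "body": json.dumps({"processed": keys})}

# Local invocation with a hand-written sample event (context unused here)
sample_event = {"Records": [{"s3": {"object": {"key": "logs/app%2F2021.log"}}}]}
print(lambda_handler(sample_event, None))
```

Uploaded as a ZIP file or container image, the same function runs unchanged in Lambda once an S3 bucket notification is configured to trigger it.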
2. Continuous scaling
AWS Lambda automatically scales your application by running code in response to each event. Your code runs in parallel and processes each trigger individually, scaling precisely with the size of the workload, from a few requests per day to hundreds of thousands per second.
3. Cost-optimized with millisecond metering
With AWS Lambda, you pay only for the compute time you consume, so you’re never paying for over-provisioned infrastructure. You are charged for every millisecond your code executes and for the number of times your code is triggered. With Compute Savings Plans, you can additionally save up to 17 percent.
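To see what per-millisecond metering means in practice, here is a rough cost estimate. The rates used are example us-east-1 prices at the time of writing and the free tier is ignored; check current AWS pricing before relying on these numbers.

```python
# Example on-demand rates (assumed; verify against current AWS pricing)
PRICE_PER_GB_SECOND = 0.0000166667       # duration charge
PRICE_PER_REQUEST = 0.20 / 1_000_000     # invocation charge

def monthly_cost(invocations: int, duration_ms: float, memory_mb: int) -> float:
    """Rough monthly Lambda bill: duration is billed in GB-seconds
    (memory allocated x execution time), plus a flat per-request fee."""
    gb_seconds = invocations * (duration_ms / 1000) * (memory_mb / 1024)
    return gb_seconds * PRICE_PER_GB_SECOND + invocations * PRICE_PER_REQUEST

# e.g. 5M invocations/month, 120 ms average duration, 512 MB memory
print(f"${monthly_cost(5_000_000, 120, 512):.2f}")  # roughly $6 under these assumptions
```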
4. Consistent performance at any scale
With AWS Lambda, you can optimize your code execution time by choosing the right memory size for your function. You can also keep your functions initialized and hyper-ready to respond within double-digit milliseconds by enabling Provisioned Concurrency.