Data Science on AWS: Implementing End-to-End, Continuous AI and Machine Learning Pipelines 1st Edition
With this practical book, AI and machine learning practitioners will learn how to successfully build and deploy data science projects on Amazon Web Services. The Amazon AI and machine learning stack unifies data science, data engineering, and application development to help level up your skills. This guide shows you how to build and run pipelines in the cloud, then integrate the results into applications in minutes instead of days. Throughout the book, authors Chris Fregly and Antje Barth demonstrate how to reduce cost and improve performance.
- Apply the Amazon AI and ML stack to real-world use cases for natural language processing, computer vision, fraud detection, conversational devices, and more
- Use automated machine learning to implement a specific subset of use cases with SageMaker Autopilot
- Dive deep into the complete model development lifecycle for a BERT-based NLP use case including data ingestion, analysis, model training, and deployment
- Tie everything together into a repeatable machine learning operations pipeline
- Explore real-time ML, anomaly detection, and streaming analytics on data streams with Amazon Kinesis and Managed Streaming for Apache Kafka
- Learn security best practices for data science projects and workflows including identity and access management, authentication, authorization, and more
- ISBN-10: 1492079391
- ISBN-13: 978-1492079392
- Edition: 1st
- Publisher: O'Reilly Media
- Publication date: May 11, 2021
- Language: English
- Dimensions: 7 x 1.05 x 9.19 inches
- Print length: 521 pages
Sharing the knowledge of experts
O'Reilly's mission is to change the world by sharing the knowledge of innovators. For over 40 years, we've inspired companies and individuals to do new things (and do them better) by providing the skills and understanding that are necessary for success.
Our customers are hungry to build the innovations that propel the world forward. And we help them do just that.
From the Publisher
Who Should Read This Book
This book is for anyone who uses data to make critical business decisions. The guidance here will help data analysts, data scientists, data engineers, ML engineers, research scientists, application developers, and DevOps engineers broaden their understanding of the modern data science stack and level up their skills in the cloud.
The Amazon AI and ML stack unifies data science, data engineering, and application development to help users level up their skills beyond their current roles. We show how to build and run pipelines in the cloud, then integrate the results into applications in minutes instead of days.
Ideally, and to get the most out of this book, we suggest readers have the following knowledge:
- Basic understanding of cloud computing
- Basic programming skills with Python, R, Java/Scala, or SQL
- Basic familiarity with data science tools such as Jupyter Notebook, pandas, NumPy, or scikit-learn
Overview of the Chapters
Chapter 1 provides an overview of the broad and deep Amazon AI and ML stack, an enormously powerful and diverse set of services, open source libraries, and infrastructure to use for data science projects of any complexity and scale.
Chapter 2 describes how to apply the Amazon AI and ML stack to real-world use cases for recommendations, computer vision, fraud detection, natural language understanding (NLU), conversational devices, cognitive search, customer support, industrial predictive maintenance, home automation, Internet of Things (IoT), healthcare, and quantum computing.
Chapter 3 demonstrates how to use AutoML to implement a specific subset of these use cases with SageMaker Autopilot.
Chapters 4–9 dive deep into the complete model development life cycle (MDLC) for a BERT-based NLP use case, including data ingestion and analysis, feature selection and engineering, model training and tuning, and model deployment with Amazon SageMaker, Amazon Athena, Amazon Redshift, Amazon EMR, TensorFlow, PyTorch, and serverless Apache Spark.
Chapter 10 ties everything together into repeatable pipelines using MLOps with SageMaker Pipelines, Kubeflow Pipelines, Apache Airflow, MLflow, and TFX.
Chapter 11 demonstrates real-time ML, anomaly detection, and streaming analytics on real-time data streams with Amazon Kinesis and Apache Kafka.
Chapter 12 presents a comprehensive set of security best practices for data science projects and workflows, including IAM, authentication, authorization, network isolation, data encryption at rest, post-quantum network encryption in transit, governance, and auditability.
Throughout the book, we provide tips to reduce cost and improve performance for data science projects on AWS.
Editorial Reviews
Review
to production. Chris and Antje have covered all of the important concepts and the
key AWS services, with plenty of real-world examples to get you started
on your data science journey."
--Jeff Barr,
Vice President & Chief Evangelist,
Amazon Web Services
"It's very rare to find a book that comprehensively covers the full end-to-end process of
model development and deployment! If you're an ML practitioner, this book is a must!"
--Ramine Tinati,
Managing Director/Chief Data Scientist Applied Intelligence,
Accenture
"This book is a great resource for building scalable machine learning solutions on AWS
cloud. It includes best practices for all aspects of model building, including training,
deployment, security, interpretability, and MLOps."
--Geeta Chauhan,
AI/PyTorch Partner Engineering Head,
Facebook AI
"The landscape of tools on AWS for data scientists and engineers can be absolutely
overwhelming. Chris and Antje have done the community a service by providing a map
that practitioners can use to orient themselves, find the tools they need to get the
job done and build new systems that bring their ideas to life."
--Josh Wills,
Author, Advanced Analytics with Spark (O'Reilly)
"Successful data science teams know that data science isn't just modeling but needs a
disciplined approach to data and production deployment. We have an army of tools for all
of these at our disposal in major clouds like AWS. Practitioners will appreciate this
comprehensive, practical field guide that demonstrates not just how to apply
the tools but which ones to use and when."
--Sean Owen,
Principal Solutions Architect,
Databricks
From the Inside Flap
This is the most extensive resource I know about ML on AWS, unequaled in breadth and depth. While ML literature often focuses on science, Antje and Chris take a different approach and dive deep into practical architectural concepts needed to serve science in production, such as security, data engineering, monitoring, CI/CD, and cost management. The book is state-of-the-art on the science as well: It presents advanced concepts such as Transformer architectures, AutoML, online learning, distillation, compilation, Bayesian model tuning, and bandits. It stands out by providing both a business-friendly description of services and concepts and low-level implementation tips and instructions. A must-read for individuals and organizations building ML systems on AWS or willing to improve their knowledge of the AWS data science stack.
Olivier Cruchant
Principal ML Specialist Solutions Architect at AWS
This book is a great resource to not only understand the end-to-end machine learning workflow in detail but also build operationally efficient machine learning workloads at scale on AWS. Highly recommend this book for anyone building machine learning workloads on AWS!
Shelbee Eigenbrode
AI/ML Specialist Solutions Architect, Amazon Web Services
This book is a comprehensive resource for diving into data science on AWS. The authors provide a good balance of theory, discussion, and hands-on examples to guide the reader through implementing all phases of machine learning applications using AWS services. A great resource to not just get started but to scale and secure end-to-end ML applications.
Sireesha Muppala, PhD
Principal Solutions Architect, AI/ML, Amazon Web Services
Implementing a robust end-to-end machine learning workflow is a daunting challenge, complicated by the wide range of tools and technologies available; the authors do an impressive job of guiding both novice and expert practitioners through this task leveraging the power of AWS services.
Brent Rabowsky
Data Scientist, AWS
Using real-world examples, Chris and Antje provide indispensable and comprehensive guidance for building and managing ML and AI applications in AWS.
Dean Wampler
Author, Programming Scala
Doing MLOps and Data Science on AWS is exciting and intimidating due to the vast quantity of services and methodologies available. This book is a welcome guide to getting Machine Learning into production on the AWS platform, whether you want to do ML with AWS Lambda or with Amazon SageMaker.
Noah Gift
Duke Faculty and Founder Pragmatic AI Labs
Data Science on AWS provides an in-depth look at the modern data science stack on AWS. Machine learning practitioners will learn about the services, open-source libraries, and infrastructure they can leverage during each phase of the ML pipeline and how to tie it all together using MLOps. This book is a great resource and a definite must-read for anyone looking to level up their ML skills using AWS.
Kesha Williams, A Cloud Guru
As AWS continues to generate explosive growth, the data science practitioner today needs to know how to operate in the cloud. This book takes the practitioner through key topics in cloud-based data science such as SageMaker, AutoML, Model Deployment, and MLOps cloud security best practices. It's a bookshelf must-have for those looking to keep pace with machine learning on AWS.
Josh Patterson
Author, Kubeflow Operations Guide
AWS is an extremely powerful tool, a visionary and leader in cloud computing. The variety of available services can be impressive, which is where this book is a big deal. Antje and Chris have crafted a complete AWS guide to building ML/AI pipelines complying with best-in-class practices. Allow yourself to keep calm and go to production.
Andy Petrella
CEO and Founder of Kensu
This book is a must-have for anyone who wants to learn how to organize a data science project in production on AWS. It covers the full journey from research to production and covers the AWS tools and services that could be used for each step along the way.
Rustem Feyzkhanov
Machine Learning Engineer at Instrumental, AWS ML Hero
Chris and Antje manage to compress all of AWS AI in this great book. If you plan to build AI using AWS, this book has you covered from the beginning to the end and more. Well done!
Francesco Mosconi
Author and Founder @ Zero to Deep Learning
Chris and Antje expertly guide ML practitioners through the complex and sometimes overwhelming landscape of managed cloud services on AWS. Because this book serves as a comprehensive atlas of services and their interactions toward the completion of end-to-end data science workflows from data ingestion to predictive application, you'll quickly find a spot for it on your desk as a vital quick reference!
Benjamin Bengfort
Rotational Labs
This book covers the different AWS tools for data science and how to select the right ones and make them work together.
Holden Karau,
Author, Learning Spark and Kubeflow for Machine Learning
Chris and Antje have done a phenomenal job in translating business use cases and case studies into implementable solutions using AWS. The book is written in such a way that it is easy to read along with a tested and well managed code base to accompany it. This book is highly recommended to anyone interested in Data Science, Data Engineering and Machine Learning Engineering at Scale.
Shreenidhi Bharadwaj
Sr Principal, Private Equity/Venture Capital Advisory (M&A), West Monroe Partners
"Chris and Antje have written the definite guide on how to build AI/ML & Data Science solutions using AWS. They describe all the different steps of the AI/ML life cycle from development through production in comprehensive detail, and AI/ML practitioners of all levels will greatly benefit from reading this book. Highly recommended!"
Jan Neumann,
Executive Director, Machine Learning, Comcast
From the Back Cover
Cloud computing enables the on-demand delivery of IT resources via the internet
with pay-as-you-go pricing. So instead of buying, owning, and maintaining our own
data centers and servers, we can acquire technology such as compute power, storage,
databases, and other services on an as-needed basis. Similar to a power company
sending electricity instantly when we flip a light switch in our home, the cloud provisions
IT resources on-demand with the click of a button or invocation of an API.
"There is no compression algorithm for experience" is a famous quote by Andy Jassy,
CEO, Amazon Web Services. The quote expresses the company's long-standing experience
in building reliable, secure, and performant services since 2006.
AWS has been continually expanding its service portfolio to support virtually any
cloud workload, including many services and features in the area of artificial intelligence
and machine learning. Many of these AI and machine learning services stem
from Amazon's pioneering work in recommender systems, computer vision, speech/text,
and neural networks over the past 20 years. A paper from 2003 titled
"Amazon.com Recommendations: Item-to-Item Collaborative Filtering" recently won the
Institute of Electrical and Electronics Engineers award as a paper that withstood the
"test of time." Let's review the benefits of cloud computing in the context of data science
projects on AWS.
Agility
Cloud computing lets us spin up resources as we need them. This enables us to
experiment quickly and frequently. Maybe we want to test a new library to run data-quality
checks on our dataset, or speed up model training by leveraging the newest
generation of GPU compute resources. We can spin up tens, hundreds, or even thousands
of servers in minutes to perform those tasks. If an experiment fails, we can
always deprovision those resources without any risk.
Cost Savings
Cloud computing allows us to trade capital expenses for variable expenses. We only
pay for what we use with no need for upfront investments in hardware that may
become obsolete in a few months. If we spin up compute resources to perform our
data-quality checks, data transformations, or model training, we only pay for the time
those compute resources are in use. We can achieve further cost savings by leveraging
Amazon EC2 Spot Instances for our model training. Spot Instances let us take advantage
of unused EC2 capacity in the AWS cloud and come with up to a 90% discount
compared to on-demand instances. Reserved Instances and Savings Plans allow us to
save money by prepaying for a given amount of time.
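
To make the pay-for-what-we-use arithmetic concrete, here is a minimal sketch of how a training job's cost might be compared under on-demand and Spot pricing. The hourly rate and job duration are hypothetical values chosen for illustration, and the 90% figure is the upper bound quoted above, not a guaranteed discount; actual prices vary by instance type and Region.

```python
def training_cost(hours: float, hourly_rate: float, discount: float = 0.0) -> float:
    """Cost of one instance running for `hours`, with a fractional discount applied."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return hours * hourly_rate * (1.0 - discount)


# Hypothetical inputs: a 10-hour training job on an instance billed at $4.00 per hour.
on_demand = training_cost(hours=10, hourly_rate=4.00)

# Best case under the "up to 90%" Spot discount; real discounts are often lower.
spot_best_case = training_cost(hours=10, hourly_rate=4.00, discount=0.90)

print(f"On-demand: ${on_demand:.2f}")
print(f"Spot (best case): ${spot_best_case:.2f}")
```

The sketch ignores interruptions: Spot capacity can be reclaimed, so a real comparison would also account for checkpointing and any repeated work.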
Elasticity
Cloud computing enables us to automatically scale our resources up or down to
match our application needs. Let's say we have deployed our data science application
to production and our model is serving real-time predictions. We can now automatically
scale up the model hosting resources in case we observe a peak in model
requests. Similarly, we can automatically scale down the resources when the number
of model requests drops. There is no need to overprovision resources to handle peak
loads.
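
The scaling decision described above can be sketched as a simple calculation. This is an illustrative simplification, not the algorithm AWS uses; the function name, the per-instance capacity, and the minimum and maximum bounds are all assumptions made for the example.

```python
import math


def desired_instances(requests_per_second: float,
                      capacity_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 10) -> int:
    """Return how many hosting instances are needed for the current request rate.

    Hypothetical helper: assumes each instance can serve `capacity_per_instance`
    requests per second, and clamps the result to the configured bounds.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Peak in model requests: scale up.
peak = desired_instances(requests_per_second=450, capacity_per_instance=100)

# Request rate drops: scale back down to the minimum.
quiet = desired_instances(requests_per_second=20, capacity_per_instance=100)
```

Because the count follows the request rate in both directions, there is no need to keep peak capacity running during quiet periods.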
Innovate Faster
Cloud computing allows us to innovate faster as we can focus on developing applications
that differentiate our business, rather than spending time on the undifferentiated
heavy lifting of managing infrastructure. The cloud helps us experiment with
new algorithms, frameworks, and hardware in seconds versus months.
Deploy Globally in Minutes
Cloud computing lets us deploy our data science applications globally within
minutes. In our global economy, it is important to be close to our customers. AWS
has the concept of a Region, which is a physical location around the world where
AWS clusters data centers. Each group of logical data centers is called an Availability
Zone (AZ). Each AWS Region consists of multiple, isolated, and physically separate
AZs within a geographic area. The number of AWS Regions and AZs is continuously
growing.
We can leverage the global footprint of AWS Regions and AZs to deploy our data science
applications close to our customers, improve application performance with
ultra-fast response times, and comply with the data-privacy restrictions of each
Region.
Smooth Transition from Prototype to Production
One of the benefits of developing data science projects in the cloud is the smooth
transition from prototype to production. We can switch from running model prototyping
code in our notebook to running data-quality checks or distributed model
training across petabytes of data within minutes. And once we are done, we can
deploy our trained models to serve real-time or batch predictions for millions of
users across the globe.
Prototyping often happens in single-machine development environments using
Jupyter Notebook, NumPy, and pandas. This approach works fine for small datasets.
When scaling out to work with large datasets, we will quickly exceed the single
machine's CPU and RAM resources. Also, we may want to use GPUs--or multiple
machines--to accelerate our model training. This is usually not possible with a single
machine.
The next challenge arises when we want to deploy our model (or application) to production.
We also need to ensure our application can handle thousands or millions of
concurrent users at global scale.
Production deployment often requires a strong collaboration between various teams
including data science, data engineering, application development, and DevOps. And
once our application is successfully deployed, we need to continuously monitor and
react to model performance and data-quality issues that may arise after the model is
pushed to production.
Developing data science projects in the cloud enables us to transition our models
smoothly from prototyping to production while removing the need to build out our
own physical infrastructure. Managed cloud services provide us with the tools to
automate our workflows and deploy models into a scalable and highly performant
production environment.
About the Author
Chris is also the Founder of many AI-focused global meetups including the global "Data Science on AWS" Meetup. He regularly speaks at AI and Machine Learning conferences across the world including O'Reilly AI, Open Data Science Conference (ODSC), and Nvidia GPU Technology Conference (GTC).
Previously, Chris was Founder at PipelineAI where he worked with many AI-first startups and enterprises to continuously deploy ML/AI Pipelines using Spark ML, Kubernetes, TensorFlow, Kubeflow, Amazon EKS, and Amazon SageMaker.
Antje Barth, Senior Developer Advocate, AI and Machine Learning @ AWS (Dusseldorf)
Antje Barth is a Senior Developer Advocate for AI and Machine Learning at Amazon Web Services (AWS) based in Düsseldorf, Germany. She is co-author of the O'Reilly Book, "Data Science on AWS."
Antje is also co-founder of the Düsseldorf chapter of Women in Big Data. She frequently speaks at AI and Machine Learning conferences and meetups around the world, including the O'Reilly AI and Strata conferences. Besides ML/AI, Antje is passionate about helping developers leverage Big Data, container and Kubernetes platforms in the context of AI and Machine Learning.
Previously, Antje worked in technical evangelism and solutions engineering at MapR and Cisco where she worked with many companies to build and deploy cloud-based AI solutions using AWS and Kubernetes.
Product details
- Publisher : O'Reilly Media; 1st edition (May 11, 2021)
- Language : English
- Paperback : 521 pages
- ISBN-10 : 1492079391
- ISBN-13 : 978-1492079392
- Item Weight : 1.82 pounds
- Dimensions : 7 x 1.05 x 9.19 inches
- Best Sellers Rank: #79,023 in Books (See Top 100 in Books)
- #7 in Data Warehousing (Books)
- #17 in Business Intelligence Tools
- #22 in Cloud Computing (Books)
About the authors
Antje Barth is a Principal Developer Advocate for generative AI at AWS. She is co-author of the O’Reilly books – Generative AI on AWS and Data Science on AWS. Antje frequently speaks at AI/ML conferences, events, and meetups around the world. She also co-founded the Düsseldorf chapter of Women in Big Data.
Chris Fregly is a Principal Solutions Architect for Generative AI at Amazon Web Services (AWS) based in San Francisco, California. Chris holds every AWS certification. He is also co-founder of the global Generative AI on AWS Meetup. Chris regularly speaks at AI and Machine Learning meetups and conferences across the world. Previously, Chris was an engineer at Databricks and Netflix where he worked on scalable big data and machine learning products and solutions. He is also co-author of the O'Reilly book, Data Science on AWS.
Customer reviews
Top reviews from the United States
- Reviewed in the United States on March 22, 2023
This book is loaded with lots of practical knowledge on how to use the ML services of AWS. This is not a dry cookbook, but also explains what and why. Of note, the book is closely tied to the "Practical Data Science on the AWS Cloud" course on Coursera. The authors are also part of the instructor team for this course. Everything in the course is in the book. But the book has more depth and additional material. Reading the section of the book really enriched the lectures and helped in working on the assignments. But the examples in the book and the assignments in the class are not the same. But it does help to have another example.
Overall, I would recommend this book, and the course, to anyone who has a basic familiarity of ML concepts and needs to learn how to implement a MLOps pipeline in an AWS cloud environment.
- Reviewed in the United States on July 7, 2024
Got sagemaker n udemy classes!!
AWS mls-c01 sagemaker Studio udemy classes and exams!!
- Reviewed in the United States on December 2, 2024
4.0 out of 5 stars It’s not just a book you read once; it’s a reference guide
If you’re looking to learn how to build machine learning workflows using AWS, this book is a fantastic choice. It covers a wide range of AWS services like SageMaker, Lambda, and Step Functions, showing how to use them together to create powerful data science pipelines. The explanations are clear and easy to understand, even for topics that can be quite technical.
What stands out most is how the book is organized. It starts with the basics and gradually moves to advanced topics, making it great for readers at all levels. The authors include step-by-step examples and practical projects, which make it easy to follow along and apply what you learn to real-world tasks.
Another highlight is the focus on scalability and automation. These are essential when putting machine learning models into production, and the book goes beyond just teaching the tools—it also explains best practices for optimizing workflows, tracking performance, and keeping your models reliable.
Whether you’re new to AWS or experienced in data science, this book has something for everyone. Highly recommended!
- Reviewed in the United States on May 1, 2021
Very well written, this book covers many AWS services across the entire Amazon AI/ML data science stack. After clearly explaining the value proposition of doing data science in the cloud, the authors navigate the reader through a complete end-to-end machine learning pipeline using the latest in natural language processing techniques including BERT, HuggingFace transformers, and Amazon SageMaker. The authors demonstrate how to implement automated pipelines using TensorFlow, PyTorch, MXNet, Python, and even Java! This book has both technical depth and practical breadth. This book helped me prepare for - and complete - my AWS ML Specialty certification. It was a true delight to read!
- Reviewed in the United States on April 5, 2022
I like how the authors present the contents. There is a good balance between sample code and explanation.
There are also many related insights such as Parquet format diagram, compression consideration, performance consideration, etc. The code repository is being actively maintained as well.
- Reviewed in the United States on May 26, 2021
I have been following this book since it was in beta thanks to an O'Reilly subscription. I have also attended a workshop put on by the authors. It has greatly helped my overall understanding of how to practically implement in AWS. You can use this as a framework to figure out how and what services to implement.
- Reviewed in the United States on July 23, 2023
3.0 out of 5 stars It just arrived but the front page and first pages were folded
- Reviewed in the United States on August 15, 2021
Loved the way the book is structured and very well written.
Follows the end to end approach on performing machine learning, specifically on tools available on AWS for moving the ml models in production following the industry best practices.
Top reviews from other countries
- Client de Amzn.
Reviewed in Mexico on May 23, 2024
5.0 out of 5 stars Good material
I used it for AWS practice exercises at school and it helped me a lot.
My cat approves of it too.
- Frank Morales
Reviewed in Canada on January 6, 2024
5.0 out of 5 stars Outstanding Textbook.
Good book, which provides good guidelines, based on a vast number of practical examples, on how to use the AWS ecosystem for data science correctly.
- Philippe Modard
Reviewed in Belgium on July 19, 2024
3.0 out of 5 stars Code not working
The book is great to discover the different services and tools for AI/ML in AWS. But the code in the GitHub repository is very messy, and doesn't work anymore, which goes against the goal of the book.
- JLB
Reviewed in Germany on April 26, 2023
5.0 out of 5 stars Awesome
This book is great!
- Ualter
Reviewed in Spain on August 13, 2022
2.0 out of 5 stars It falls short, very expensive book for what it gives you back
It is interesting, mainly for those who want to get used to some data science jargon and get an overview of some processes and tools. But for the price, it isn't worth the money; it is very superficial, with too much high-level explanation. If the price of the book were much lower, maybe it would deserve more stars, but for that price, it falls short.