Title: Big Data: Principles and best practices of scalable realtime data systems
Authors: Nathan Marz, James Warren
Publisher: Manning Publications
ISBN: 1617290343
Year: 2015
Pages: 328
Language: English
Format: PDF (true), EPUB
Size: 15.3 MB
Big Data teaches you to build big data systems using an architecture that takes advantage of clustered hardware along with new tools designed specifically to capture and analyze web-scale data. It describes a scalable, easy-to-understand approach to big data systems that can be built and run by a small team. Following a realistic example, this book guides readers through the theory of big data systems, how to implement them in practice, and how to deploy and operate them once they're built.
About the Book
Web-scale applications like social networks, real-time analytics, or e-commerce sites deal with large amounts of data whose volume and velocity exceed the limits of traditional database systems. These applications require architectures built around clusters of machines to store and process data of any size or speed. Fortunately, scale and simplicity are not mutually exclusive.
Big Data teaches you to build big data systems using an architecture designed specifically to capture and analyze web-scale data. This book presents the Lambda Architecture, a scalable, easy-to-understand approach that can be built and run by a small team. You'll explore the theory of big data systems and how to implement them in practice. In addition to discovering a general framework for processing big data, you'll learn specific technologies like Hadoop, Storm, and NoSQL databases.
This book requires no previous exposure to large-scale data analysis or NoSQL tools. Familiarity with traditional databases is helpful.
What's Inside
Introduction to big data systems
Real-time processing of web-scale data
Tools like Hadoop, Cassandra, and Storm
Extensions to traditional database skills
About the Authors
Nathan Marz is the creator of Apache Storm and the originator of the Lambda Architecture for big data systems. James Warren is an analytics architect with a background in machine learning and scientific computing.
Table of Contents
Preface
Acknowledgments
About this Book
Chapter 1. A new paradigm for Big Data
Part 1. Batch layer
Chapter 2. Data model for Big Data
Chapter 3. Data model for Big Data: Illustration
Chapter 4. Data storage on the batch layer
Chapter 5. Data storage on the batch layer: Illustration
Chapter 6. Batch layer
Chapter 7. Batch layer: Illustration
Chapter 8. An example batch layer: Architecture and algorithms
Chapter 9. An example batch layer: Implementation
Part 2. Serving layer
Chapter 10. Serving layer
Chapter 11. Serving layer: Illustration
Part 3. Speed layer
Chapter 12. Realtime views
Chapter 13. Realtime views: Illustration
Chapter 14. Queuing and stream processing
Chapter 15. Queuing and stream processing: Illustration
Chapter 16. Micro-batch stream processing
Chapter 17. Micro-batch stream processing: Illustration
Chapter 18. Lambda Architecture in depth
Index
List of Figures
List of Tables
List of Listings