Title: Streaming Data Mesh: A Model for Optimizing Real-Time Data Services (Final Release)
Authors: Hubert Dulay, Stephen Mooney
Publisher: O’Reilly Media, Inc.
Year: 2023
Pages: 223
Language: English
Format: epub
Size: 10.2 MB
Data lakes and warehouses have become increasingly fragile, costly, and difficult to maintain as data gets bigger and moves faster. Data meshes can help your organization decentralize data, giving ownership back to the engineers who produced it. This book provides a concise yet comprehensive overview of data mesh patterns for streaming and real-time data services.
Authors Hubert Dulay and Stephen Mooney examine the vast differences between streaming and batch data meshes. Data engineers, architects, data product owners, and those in DevOps and MLOps roles will learn steps for implementing a streaming data mesh, from defining a data domain to building a good data product. Through the course of the book, you'll create a complete self-service data platform and devise a data governance system that enables your mesh to work seamlessly.
Data mesh is one of the most popular data platform architectures being explored today. This book will help you fully understand this self-service data platform in a streaming context. Today, batch processing dominates extract, transform, and load (ETL) processes in most businesses. This book offers a different perspective on data pipelines, applying the concepts you already understand from batch ETL to streaming ETL in the context of a data mesh.
This book is designed to help you understand the essential concepts, architectures, and technologies at the core of streaming data mesh. It covers the essential topics related to streaming data mesh, from the basics of data architecture, to the use of big data tools for data warehousing, to business-oriented approaches for streaming data mesh architectures. Additionally, we will look at a stack of services involved in a successful streaming data mesh project.
This book does not require prior knowledge of the pillars that make up a data mesh. We will briefly introduce the pillars at a very high level, then define them specifically with streaming in mind.
With this book, you will:
Design a streaming data mesh using Kafka
Learn how to identify a domain
Build your first data product using self-service tools
Apply data governance to the data products you create
Learn the differences between synchronous and asynchronous data services
Implement self-services that support decentralized data
Who Should Read This Book
This book is written for anyone interested in learning more about streaming data mesh, which combines the work done in data mesh with real-time streaming for data transformation, data product definition, and data governance. It is also useful for data engineers, data analysts, data scientists, software architects, and product owners who want to implement a streaming data architecture for their projects, and for those who wish to become familiar with streaming data technologies and best practices for integrating them, at scale, into their projects.
Contents:
Preface
1. Data Mesh Introduction
2. Streaming Data Mesh Introduction
3. Domain Ownership
4. Streaming Data Products
5. Federated Computational Data Governance
6. Self-Service Data Infrastructure
7. Architecting a Streaming Data Mesh
8. Building a Decentralized Data Team
9. Feature Stores
10. Streaming Data Mesh in Practice
Index