Integrity Constraints on Rich Data Types
Title: Integrity Constraints on Rich Data Types
Author: Shaoxu Song, Lei Chen
Publisher: Springer
Year: 2023
Pages: 154
Language: English
Format: pdf (true), epub
Size: 12.4 MB
This book examines the recent trend of extending data dependencies to adapt to rich data types in order to address variety and veracity issues in Big Data. Readers are guided through the full range of rich data types where data dependencies have been successfully applied, including categorical data with equality relationships, heterogeneous data with similarity relationships, numerical data with order relationships, sequential data with timestamps, and graph data with complicated structures. The text also discusses the constraints on ordering or similarity relationships found in novel classes of data dependencies, in addition to the equality relationships considered in, e.g., functional dependencies (FDs). Besides exploring the concepts behind these data dependency notations, the book investigates the extension relationships between data dependencies, such as conditional functional dependencies (CFDs), which extend conventional FDs. Together these extensions form a family tree, mostly rooted in FDs, that helps illuminate the expressive power of the various data dependencies. Moreover, the book points to work on the discovery of dependencies from data, since the huge volume and high variety of Big Data make it unlikely that dependencies can be specified manually in the traditional way. It further outlines the applications of the extended data dependencies, in particular in data quality practice. Altogether, this book provides a comprehensive guide for readers to select data dependencies that have sufficient expressive power and reasonable discovery cost for their applications. Finally, the book concludes with several directions for future studies on emerging data.
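To make the FD-to-CFD extension mentioned above concrete, here is a minimal Python sketch. It is not taken from the book: the function name `fd_violations`, the toy address table, and the rule "zip determines street" are all illustrative assumptions. The sketch checks an FD over every row, then checks the same rule restricted to rows matching a constant pattern, which is the basic idea behind a CFD.

```python
from collections import defaultdict

def fd_violations(rows, lhs, rhs, condition=None):
    """Return groups of rows that agree on `lhs` but disagree on `rhs`.

    If `condition` (attribute -> required constant) is given, only rows
    matching it are checked; this mimics the pattern part of a CFD.
    All names here are hypothetical, chosen for illustration only.
    """
    groups = defaultdict(list)
    for row in rows:
        if condition and any(row[a] != v for a, v in condition.items()):
            continue  # row is outside the scope of the conditional rule
        groups[tuple(row[a] for a in lhs)].append(row)
    return {
        key: grp
        for key, grp in groups.items()
        if len({tuple(r[a] for a in rhs) for r in grp}) > 1
    }

# Toy data (assumed): zip determines street for UK rows, but not for US rows.
rows = [
    {"country": "UK", "zip": "EH4", "street": "Mayfield"},
    {"country": "UK", "zip": "EH4", "street": "Mayfield"},
    {"country": "US", "zip": "10001", "street": "5th Ave"},
    {"country": "US", "zip": "10001", "street": "W 31st St"},
]

# Plain FD zip -> street, checked on all rows.
fd = fd_violations(rows, ["zip"], ["street"])
# CFD-style rule: zip -> street, but only where country = "UK".
cfd = fd_violations(rows, ["zip"], ["street"], condition={"country": "UK"})
```

On this toy table the plain FD is violated by the US rows, while the conditional version holds. This illustrates why a conditional rule can capture regularities that an unconditional FD cannot.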
Graph data are widely observed in real-world applications; for example, knowledge bases and social networks can be modeled as graphs. In such scenarios, entities are represented by the vertices of the graph, each of which has a class tag (such as person), attribute values, and connections with other entities. Based on these three main components (class, relation, and attribute), metalanguages for graph models have been introduced to define how graph data are serialized and compiled in files or databases, e.g., the eXtensible Markup Language (XML) and the Resource Description Framework (RDF).
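The three components described above (class, attribute, relation) can be sketched as a small in-memory model. This is an illustrative assumption rather than the book's formalism: the `Vertex` and `Graph` classes, the `"type"` predicate, and the sample entities are hypothetical, and the triples only follow the spirit of RDF rather than any concrete RDF syntax.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    vid: str                                   # entity identifier
    cls: str                                   # class tag, e.g. "Person"
    attrs: dict = field(default_factory=dict)  # attribute name -> value

@dataclass
class Graph:
    vertices: dict = field(default_factory=dict)  # vid -> Vertex
    edges: list = field(default_factory=list)     # (source vid, relation, target vid)

    def add_vertex(self, vertex):
        self.vertices[vertex.vid] = vertex

    def add_edge(self, src, relation, dst):
        # Relations may only connect entities that already exist.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((src, relation, dst))

    def triples(self):
        """Flatten the graph into (subject, predicate, object) triples."""
        for v in self.vertices.values():
            yield (v.vid, "type", v.cls)       # class component
            for name, value in v.attrs.items():
                yield (v.vid, name, value)     # attribute component
        yield from self.edges                  # relation component

g = Graph()
g.add_vertex(Vertex("alice", "Person", {"name": "Alice"}))
g.add_vertex(Vertex("bob", "Person", {"name": "Bob"}))
g.add_edge("alice", "knows", "bob")
triples = list(g.triples())
```

Flattening everything into triples shows how class tags, attribute values, and connections can share one uniform serialized form.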