Title: Taming Big Data
Author: Tanveer A.
Publisher: Amazon.com Services LLC
Year: 2020
Pages: 344
Language: English
Format: pdf, epub
Size: 30.6 MB
Data has intrinsic value. But it’s of no use until that value is discovered. Equally important: How truthful is your data—and how much can you rely on it? Today, Big Data has become capital. Think of some of the world’s biggest tech companies. A large part of the value they offer comes from their data, which they’re constantly analyzing to produce more efficiency and develop new products.
Recent technological breakthroughs have sharply reduced the cost of data storage and compute, making it easier and less expensive to store more data than ever before. With larger volumes of data now cheaper to keep and easier to access, you can make more accurate and precise business decisions.
Finding value in big data isn’t only about analyzing it (which is a whole other benefit). It’s an entire discovery process that requires insightful analysts, business users, and executives who ask the right questions, recognize patterns, make informed assumptions, and predict behavior.
Hadoop (an open-source framework created specifically to store and analyze big data sets) was developed in the mid-2000s. NoSQL also began to gain popularity around the same time.
The development of open-source frameworks such as Hadoop (and, more recently, Spark) was essential for the growth of big data because they make big data easier to work with and cheaper to store. In the years since then, the volume of big data has skyrocketed. Users are still generating huge amounts of data, but it's not just humans who are doing it.
With the advent of the Internet of Things (IoT), more objects and devices are connected to the internet, gathering data on customer usage patterns and product performance. The emergence of machine learning has produced still more data.
While big data has come far, its usefulness is only just beginning. Cloud computing has expanded big data possibilities even further. The cloud offers truly elastic scalability, where developers can simply spin up ad hoc clusters to test a subset of data.
You are surely familiar with the terms Hadoop, Big Data, and Data Science, and you may already know how important they are today. But do you know what Hadoop Developers, Hadoop Administrators, Hadoop Testers, and Data Scientists actually do? By the time you finish reading this book, you will know the roles and responsibilities of these professionals.
Introduction to Big Data
Hadoop Introduction
Hadoop Ecosystem
Starting HDFS
Installation of Hadoop
MapReduce in Hadoop
YARN in Hadoop
Pig in Hadoop
Hadoop Hive
Hadoop Streaming
Sqoop
Impala
Oozie in Hadoop
Apache Flume in Hadoop
Zookeeper
Hue
Kafka overview
Apache Atlas Overview
Spark vs MapReduce
How Does Spark Have an Edge over MapReduce
Hadoop NoSQL
Apache HBase
Apache Cassandra
MongoDB
Hadoop Security Overview
Hadoop Security Features
Security Administration
Top 5 Big Data Vendors
Cloudera & Hortonworks
Amazon Web Services Elastic MapReduce Hadoop Distribution
Microsoft Hadoop Distribution
HPE Ezmeral Data Fabric (formerly MapR Data Platform)
IBM InfoSphere Insights
8 Applications of Big Data in Real Life
Big Data in Education Industry
Big Data in Healthcare Industry
Big Data in Government Sector
Big Data in Media and Entertainment Industry
Big Data in Weather Patterns
Big Data in Transportation Industry
Big Data in Banking Sector
Big Data in Transforming Real Estate
Kick-start Your Career in Big Data and Hadoop
Career prospects upon completion of Hadoop Certification
Most Valuable Data Science Skills Of 2020
Non-Technical Expertise
Big Data Job Responsibilities and Skills
Hadoop Developer Roles and Responsibilities
Hadoop Architect Roles and Responsibilities
Hadoop Administrator Roles and Responsibilities
Hadoop Tester Roles and Responsibilities
Data Scientist Roles and Responsibilities
Big Data Terminologies You Must Know