IS2021

Part 10 - Future trends of Big Data

  • Edge Computing
    • Explosive growth in data generated by cloud systems, sensors, smart devices and video streaming is driving adoption of edge computing. Data is processed at the periphery of the network, as close to the originating source as possible.
  • Cloud and hybrid cloud computing
    • Cloud computing enables organizations to process nearly limitless amounts of data. Hybrid cloud approaches enable companies in regulated industries to take advantage of the cloud’s economic and technical advantages.
  • Data lakes
    • These large repositories store structured and unstructured data in its native format. Data scientists often extract just what’s needed for a project, avoiding the costly ETL (Extract, Transform, Load) processes required by centralized data warehouses.
  • Machine learning and AI technologies
    • Machine learning and other AI technologies are revolutionizing big data analytics. AI’s ability to ingest and analyze massive amounts of structured and unstructured data is being used by companies to optimize and improve business operations.
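The edge computing idea from the list above can be made concrete with a minimal Python sketch. It assumes a hypothetical sensor that emits temperature readings; the function name `summarize_at_edge` and the window size are illustrative and not taken from any particular platform. The point is only that raw readings are reduced near the source, and a small summary is what travels upstream.

```python
from statistics import mean

def summarize_at_edge(readings, window=5):
    """Reduce raw readings to one summary per window (hypothetical edge-side step)."""
    summaries = []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        summaries.append({
            "count": len(chunk),
            "min": min(chunk),
            "max": max(chunk),
            "mean": round(mean(chunk), 2),
        })
    return summaries

# Ten raw readings from an assumed sensor; only two summaries leave the device.
raw = [20.1, 20.3, 20.2, 25.9, 20.0, 19.8, 19.9, 20.1, 20.0, 20.2]
summaries = summarize_at_edge(raw, window=5)
print(len(raw), "raw readings reduced to", len(summaries), "summaries")
```

The summary keeps the maximum, so an outlier such as the 25.9 reading is still visible upstream even though the individual readings are not sent.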

More data, increased data diversity drive advances in processing and the rise of edge computing.

Big data storage needs spur innovations in cloud and hybrid cloud platforms, growth of data lakes.

Adoption of advanced analytics, machine learning and other AI technologies increases dramatically.

DataOps and data stewardship move to the forefront of big data management strategies.

The NoSQL Takeover

NoSQL technologies, commonly associated with unstructured data, have seen significant adoption over the past decade.

Going forward, the shift to NoSQL databases as a leading piece of the enterprise IT landscape becomes clear as the benefits of schema-less database concepts become more pronounced.
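As a rough illustration of what "schema-less" means, the sketch below uses plain Python dictionaries as stand-ins for documents. It is not the API of MongoDB or any other NoSQL product; the `find` helper and the sample fields are hypothetical.

```python
# Documents in the same collection need not share the same fields.
collection = [
    {"id": 1, "name": "Alice", "email": "alice@example.com"},
    {"id": 2, "name": "Bob", "tags": ["vip"], "address": {"city": "Helsinki"}},
    {"id": 3, "name": "Carol", "tags": ["trial", "vip"]},
]

def find(docs, field, value):
    """Return documents whose field equals value, or contains it if the field is a list."""
    matches = []
    for doc in docs:
        if field not in doc:
            continue  # a missing field is simply skipped; there is no schema to violate
        stored = doc[field]
        if stored == value or (isinstance(stored, list) and value in stored):
            matches.append(doc)
    return matches

vip_names = [doc["name"] for doc in find(collection, "tags", "vip")]
print(vip_names)
```

A new field can be added to one document without altering the others, which is the flexibility the schema-less approach trades against the guarantees of a fixed relational schema.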

Nothing shows the picture more starkly than looking at Gartner’s Magic Quadrant for Operational Database Management Systems, which in the past was dominated by Oracle, IBM, Microsoft and SAP.

In contrast, in the most recent Magic Quadrant, the NoSQL companies, including MongoDB, DataStax, Redis Labs and MarkLogic, are set to outnumber the traditional database vendors in the Leaders quadrant of the report.

Hadoop Projects Mature

Enterprises continue their move from Hadoop as a proof of concept to production.

In a survey of 2,200 Hadoop customers, only 3% of respondents anticipated doing less with Hadoop in the next couple of months, and 76% of those who already used Hadoop planned on doing more within the next three months.

Big Data Grows Up

Hadoop adds to enterprise standards. As further evidence of the growing trend of Hadoop becoming a core part of the enterprise IT landscape, investment will grow in the components surrounding enterprise systems, such as security.

The Apache Sentry project provides a system for enforcing fine-grained, role-based authorization to data and metadata stored on a Hadoop cluster.

These are the types of capabilities that customers expect from their enterprise-grade RDBMS platforms. They are now coming to the forefront of the emerging big data technologies, eliminating one more barrier to enterprise adoption.
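The general idea of fine-grained, role-based authorization can be sketched in a few lines of Python. This is a conceptual model only and does not reproduce Apache Sentry's actual configuration or API; the role names, user names, resource names, and the `is_allowed` helper are all made up for the example.

```python
# Roles map to (resource, action) privileges; users map to roles.
ROLE_PRIVILEGES = {
    "analyst": {("sales.orders", "select")},
    "etl_admin": {("sales.orders", "select"), ("sales.orders", "insert")},
}
USER_ROLES = {
    "maria": {"analyst"},
    "jonas": {"etl_admin"},
}

def is_allowed(user, resource, action):
    """Return True if any of the user's roles grants the action on the resource."""
    for role in USER_ROLES.get(user, set()):
        if (resource, action) in ROLE_PRIVILEGES.get(role, set()):
            return True
    return False

print(is_allowed("maria", "sales.orders", "select"))
print(is_allowed("maria", "sales.orders", "insert"))
```

Because privileges attach to roles rather than to individual users, changing what analysts may do means editing one role instead of every analyst's account.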

Big Data Gets Faster

Options are expanding to add speed to Hadoop. With Hadoop gaining more traction in the enterprise, there will be growing demand from end users for the same fast data exploration capabilities they’ve come to expect from traditional data warehouses.

To meet that end-user demand, adoption will grow for technologies such as Cloudera Impala, AtScale, Actian Vector and Jethro Data that bring the business user’s old friend, the OLAP (Online Analytical Processing) cube, to Hadoop, further blurring the lines between “traditional” BI (Business Intelligence) concepts and the world of big data.
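For readers who have not met the OLAP cube before, the following Python sketch shows the core idea: pre-aggregating a measure along chosen dimensions so that queries read totals instead of scanning rows. The sample data and the `build_cube` function are hypothetical and say nothing about how Impala, AtScale, or the other products implement this.

```python
from collections import defaultdict
from itertools import combinations

sales = [
    {"region": "EU", "year": 2015, "amount": 100},
    {"region": "EU", "year": 2016, "amount": 150},
    {"region": "US", "year": 2015, "amount": 200},
    {"region": "US", "year": 2016, "amount": 250},
]

def build_cube(rows, dimensions, measure):
    """Pre-compute the sum of the measure for every subset of the dimensions."""
    cube = defaultdict(int)
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            for row in rows:
                key = tuple((d, row[d]) for d in dims)
                cube[key] += row[measure]
    return dict(cube)

cube = build_cube(sales, ["region", "year"], "amount")
print(cube[(("region", "EU"),)])   # total for one region across all years
print(cube[(("year", 2016),)])     # total for one year across all regions
print(cube[()])                    # grand total
```

The cost of this approach is that the number of pre-computed cells grows quickly with the number of dimensions, which is why cube design usually limits the dimensions to those users actually query.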

The Number of Options is Growing

Self-service data preparation tools are exploding in popularity. This is in part due to the shift toward business-user-generated data discovery tools that reduce time to analyze data.

Business users want to reduce the time and complexity of preparing data for analysis, something that is especially important in the world of big data when dealing with a variety of data types and formats.
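A small Python sketch of what such data preparation involves is shown here, using only the standard library. The two inputs, their field names, and the `normalize` function are assumptions chosen for illustration; real self-service tools wrap this kind of cleanup in a graphical interface.

```python
import csv
import io
import json

# Two sources with different formats, key casing, and stray whitespace.
csv_text = "customer,amount\nalice, 12.50\nBob,7\n"
json_text = '[{"Customer": "carol", "Amount": "3.25"}]'

def normalize(record):
    """Lower-case the keys, trim whitespace, and cast the amount to float."""
    cleaned = {key.strip().lower(): str(value).strip() for key, value in record.items()}
    return {"customer": cleaned["customer"].title(), "amount": float(cleaned["amount"])}

rows = list(csv.DictReader(io.StringIO(csv_text)))
rows += json.loads(json_text)
prepared = [normalize(row) for row in rows]
print(prepared)
```

Once every record has the same keys and types, the combined list can be analyzed as a single data set regardless of where each record came from.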

MPP Data Warehouse Growth is Heating up the Cloud

The “death” of the data warehouse has been overhyped for some time now, but it is no secret that growth in this segment of the market has been slowing.

The term “data lake” has been adopted to replace “data warehouse” in big data ecosystems.

A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed for analytics applications.

While a traditional data warehouse stores data in hierarchical dimensions and tables, a data lake uses a flat architecture to store data, primarily in files or object storage. That gives users more flexibility on data management, storage, and usage.
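The flat, store-first approach can be sketched in Python with a temporary directory standing in for the lake. The file names, field names, and the `read_clicks` function are hypothetical; a production lake would typically sit on a distributed file system or object storage rather than a local folder.

```python
import csv
import json
import tempfile
from pathlib import Path

def read_clicks(lake_dir):
    """Apply a schema only at read time, pulling just the fields we need."""
    clicks = []
    for path in sorted(Path(lake_dir).iterdir()):
        if path.suffix == ".json":
            for line in path.read_text().splitlines():
                event = json.loads(line)
                clicks.append((event["user"], event["page"]))
        elif path.suffix == ".csv":
            with path.open(newline="") as handle:
                for row in csv.DictReader(handle):
                    clicks.append((row["user"], row["page"]))
        # Files in other formats stay in the lake untouched until someone needs them.
    return clicks

with tempfile.TemporaryDirectory() as lake:
    # Raw data lands in the lake as-is, in whatever format the source produced.
    Path(lake, "app_events.json").write_text(
        '{"user": "u1", "page": "/home", "ua": "Firefox"}\n'
        '{"user": "u2", "page": "/cart", "ua": "Chrome"}\n'
    )
    Path(lake, "legacy_export.csv").write_text("user,page,ts\nu3,/home,1700000000\n")
    Path(lake, "notes.txt").write_text("free-form text nobody has parsed yet")
    clicks = read_clicks(lake)

print(clicks)
```

Nothing was transformed when the files were written; the structure is imposed only by the reader, and fields such as `ua` and `ts` are ignored because this particular analysis does not need them.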

Analysts estimate that 90% of companies that have adopted Hadoop will also keep their data warehouses. With these new cloud offerings, those customers can dynamically scale the storage and compute resources in the data warehouse up or down relative to the larger amounts of information stored in their Hadoop data lake.

Impact of AI on Big Data

Big data and artificial intelligence have a synergistic relationship. AI requires data at massive scale to learn and to improve decision-making processes, and big data analytics leverages AI for better data analysis.

The Buzzwords Converge

IoT, cloud and big data come together.

The technology is still in its early days, but the data from devices in the Internet of Things (IoT) will become one of the “killer apps” for the cloud and a driver of a petabyte-scale data explosion.

For this reason, leading cloud and data companies such as Google, Amazon Web Services (AWS) and Microsoft will bring IoT services to life so the data can move seamlessly to their cloud-based analytics engines.

Though these changes and trends may seem disparate, they’re all linked by the need to work with data quickly and conveniently.

As big data changes and new ways of working with that data pop up, the details shift, but the song remains the same.

Everyone is a data analyst, and there has never been a more exciting job to have.

Summary

Big Data is characterized by the 3Vs:

  • High Velocity (affordable high speed)
    • Handled by MPP, Hadoop and its components e.g. MapReduce.
  • High Volume
    • Handled by cloud computing.
  • High Variety
    • Handled by NoSQL and Hadoop.
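Since MapReduce is named in the summary, a single-process Python sketch of its three phases may help fix the idea. This is the classic word-count example written with plain functions; it does not use the Hadoop API, and the function names are chosen for illustration. On a real cluster the map and reduce calls would run in parallel across many machines.

```python
from collections import defaultdict

def map_phase(document):
    """Emit (word, 1) pairs for each word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Group values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

documents = ["big data big ideas", "data lakes hold big data"]
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle_phase(mapped).items())
print(counts)
```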