EXECUTIVE SUMMARY | The amount of data we are able to collect has been exploding, and "Big Data" has become a new buzzword in information technology. Storing, managing, and analyzing Big Data is challenging, and will soon become a major differentiator between high-performing and low-performing organizations. This article discusses Big Data, including its four dimensions and the opportunities and challenges they create. It also discusses various Big Data analytics applications.
Every day, we generate large amounts of data through many different devices and activities: searching online, making purchases on e-commerce web sites, completing transactions at the supermarket, collecting readings from sensors, using social media to interact with our friends, and using GPS. All of these data are accumulated and stored somewhere, and together they make up what we call "Big Data."
WHAT IS BIG DATA?
According to the McKinsey Global Institute, "Big Data refer to datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyze." The EdTech Report to the nation in 2013 states, "Every day, we create 2.5 quintillion (2.5 × 10^18) bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. These data come from everywhere: sensors used to gather climate information, posts to social media sites, digital pictures and videos, purchase transaction records, and cell phone GPS signals, to name a few. These data are 'Big Data.'" The concept of Big Data is not actually new. We have been accumulating data since the beginning of recorded time. However, as technology advances, data are accumulating at an alarming rate.
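To put the quoted daily figure in more familiar units, the arithmetic can be sketched as below. This is only an illustration: it assumes "quintillion" is meant in the short-scale sense (10^18) and uses decimal storage units, and the constant and function names are invented for this example.

```python
# Decimal (SI) storage units, in bytes.
TERABYTE = 10**12
EXABYTE = 10**18

# Assumption: "2.5 quintillion bytes" per day, with quintillion = 10**18.
BYTES_PER_DAY = 25 * 10**17
SECONDS_PER_DAY = 24 * 60 * 60


def to_exabytes(n_bytes: int) -> float:
    """Convert a byte count to exabytes (decimal units)."""
    return n_bytes / EXABYTE


def per_second_terabytes(bytes_per_day: int) -> float:
    """Average terabytes per second implied by a daily byte count."""
    return bytes_per_day / SECONDS_PER_DAY / TERABYTE


if __name__ == "__main__":
    print(f"{to_exabytes(BYTES_PER_DAY):.1f} EB per day")
    print(f"{per_second_terabytes(BYTES_PER_DAY):.1f} TB per second on average")
```

Under these assumptions, the quoted figure works out to 2.5 exabytes per day, or roughly 29 terabytes every second on average.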
FOUR DIMENSIONS OF BIG DATA
IBM data scientists break Big Data down into four dimensions: volume, velocity, variety, and veracity (the 4-Vs). The volume dimension refers to the scale of the data. From the beginning of recorded time until 2003, we created 5 billion gigabytes (5 exabytes) of data. In 2011, the same amount was created nearly every two days. In 2013, the same amount was created every 10 minutes. Velocity refers to the analysis of streaming data. Because new data are accumulated every second, data quickly become out-of-date. Therefore, it is important to use the data as...