What's Driving the Data Boom?
Around the world, servers hum in the dark, churning through data that may have started as store security footage, a tweet, or a blood pressure reading. In the end, this data tells a story about people — how they think, how they make choices, and what risks they’re willing to take. But first, you have to sort through the noise.
How does a company turn a massive stream of data into usable insights? Big Data enables companies to pluck time- and money-saving intelligence from massive information sets. In a world where seemingly every moment of our lives is being recorded and analyzed, that resource would go untapped if it weren’t for Big Data.
Big Data, defined as the process of analyzing large-scale data sets that are too massive or varied for traditional data-processing software, has become the de facto tool companies turn to in order to make decisions, reach consumers, and increase their operations’ efficiency. It involves taking huge amounts of raw data, extracting the pieces useful for a specific business or individual, and using those pieces to make decisions.
The cornerstones of Big Data are volume, variety, and velocity. Variety refers to the different types of data, and velocity measures how fast data is created and moved through online services. As for volume, Facebook stores an estimated 250 billion photos. Google processes 3.5 billion requests a day. Amazon’s Big Data engines take inventory of its 1.5 billion items in 200 warehouses worldwide every 30 minutes. For companies to gain actionable insights from data, that kind of scale is required.
A study by analysis firm International Data Corporation stated the big data and analytics software market was valued at $60.7 billion worldwide in 2018 and was expected to grow at a five-year compound annual growth rate of 12.5%. This growth expectation is due to an increase in Internet of Things (IoT) and artificial intelligence (AI) utilization within the data field, a shift to public cloud infrastructure, and “the increasing importance of data in the modern enterprise.”
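To put that growth rate in perspective, the short Python sketch below compounds the 2018 figure at 12.5% a year for five years, which works out to roughly $109 billion. The function name is made up for illustration, and the result is simple arithmetic on the two numbers quoted above, not a figure taken from the IDC study.

```python
def project_value(base: float, annual_rate: float, years: int) -> float:
    """Compound a starting value at a fixed annual growth rate."""
    return base * (1 + annual_rate) ** years


# Inputs quoted in the article: $60.7 billion in 2018, 12.5% compound annual growth.
# Treating 2018 as the base year is an assumption of this sketch.
projected = project_value(60.7, 0.125, 5)
print(f"Implied market size after five years: ${projected:.1f} billion")
```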
“Increasingly, data-driven insights will be required to make rapid business decisions. Successful businesses will take advantage of these insights to adapt quickly to changes in the global economy,” said John Walicki, chief technology officer of IBM’s IoT advocacy group.
The Cornerstones of Big Data
Big Data is often seen as an umbrella term that covers AI, machine learning, IoT sensors, deep learning, and edge computing. These are the tools the data industry uses to process and analyze data, turning raw inputs into gleaming insights.
AI is the use of algorithms to train computers to mimic human behavior. Machine learning, a branch of AI, finds trends and patterns within large amounts of data. Combined with IoT sensors like home assistant speakers, health tracking watches, and internet-connected lightbulbs, machine learning can extract key insights about consumers’ habits and preferences. It can predict when someone will next need to buy a product or, better yet, predict what will convince them to buy it now.
Connected devices also provide insight into the machines that manage production. On the manufacturing floor, IoT devices streamline data collection processes to evaluate machines’ efficiency.
Rather than relying only on routine maintenance checks, IoT devices can alert an operator in real time when a machine is running outside of its stated parameters, catching issues earlier and minimizing expensive downtime. Neural networks have also been trained to estimate when a machine may need maintenance based on past performance, wrote Maciej Iwaniuk, IoT/OT integration leader at Ernst & Young’s Europe, Middle East, India, and Africa Advisory Center, in a company post.
“We have observed increased mean time between failures, shorter planned downtime, and a positive impact on overall equipment effectiveness. The results depend on multiple factors, such as the quality of historical data, the relevance of the collected information, and the impact of the applied solution on the maintenance strategy and procedures,” he wrote. “But based on the results, it is safe to say that a combination of data from sensors with machine learning can revolutionize the efficiency of operations and significantly increase asset utilization on the shop floor.”
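The real-time alerting described above can be pictured with a few lines of Python. This is a minimal sketch, not any vendor’s system: the `Reading` type, the metric names, and the operating limits are all hypothetical, and a production setup would layer the kind of trained models Iwaniuk describes on top of a simple range check like this one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    machine_id: str
    metric: str
    value: float


# Hypothetical operating limits per metric, as (low, high).
LIMITS = {"temperature_c": (10.0, 85.0), "vibration_mm_s": (0.0, 7.1)}


def find_alerts(readings, limits=LIMITS):
    """Return the readings whose value falls outside the stated limits."""
    alerts = []
    for reading in readings:
        bounds = limits.get(reading.metric)
        if bounds is None:
            continue  # no stated limit for this metric, so nothing to check
        low, high = bounds
        if not (low <= reading.value <= high):
            alerts.append(reading)
    return alerts
```

Fed a stream of sensor readings, `find_alerts` returns only the ones an operator would need to look at.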
Big Data became a practical tool for businesses and government agencies once algorithms were created that made it possible to analyze data “in ways that people were never able to,” said Nikolas Kairinos, CEO and founder of Fountech, an AI company focused on Big Data solutions. “We can analyze big obvious patterns, but AI can pick up subtle, seemingly obscure trends that lead to positive results.”
For example, when companies, organizations, and governments need to answer a question — such as the quickest route for shipping a package — they take data from sensors on cars, traffic cameras, other shipping vehicles that have completed drop-offs, and other sources of data. Then they sift through it to find situations that match their problem. From there, they use that relevant data to make decisions like changing routes, adjusting the weight of a vehicle, or adjusting drivers’ schedules to get packages to their intended destinations faster.
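Stripped to its core, the shipping example comes down to comparing observed travel times and picking the best option. The sketch below assumes the sifting has already happened and that each candidate route comes with a list of recent trip durations in minutes. The route names and the choice of a plain average are illustrative assumptions, not a description of how any particular carrier works.

```python
from statistics import fmean


def fastest_route(observations):
    """Pick the route with the lowest average observed travel time.

    observations maps a route name to a list of recent trip times in minutes.
    """
    averages = {
        route: fmean(times) for route, times in observations.items() if times
    }
    if not averages:
        raise ValueError("no travel-time observations")
    return min(averages, key=averages.get)


recent_trips = {"highway": [30, 50], "side_streets": [35, 37]}
print(fastest_route(recent_trips))
```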
Putting Computational Power in More Hands
Until recently, those kinds of insights were out of reach for many companies. The ability to process massive data sets required expensive hardware that was difficult to secure outside of large companies, academic institutions, and government settings. Such capability was elusive for the average person until Apache Hadoop, an open-source data processing framework, hit the internet in 2006, followed by Apache Spark in 2014. These open-source tools made it possible to process large amounts of data off site by pooling the computational power of a network of ordinary computers, rather than requiring a supercomputer.
“With the introduction of Hadoop and Spark at the beginning of this decade, techniques to process and mine large data sets became more efficient,” IBM’s Walicki said.
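The idea behind these tools is to split a large job into pieces, work on each piece separately, and then merge the partial results. The Python sketch below imitates that pattern with a word count. It runs on a single machine and does not use the Hadoop or Spark APIs. On a real cluster, each partition would sit on a different computer and the first step would run in parallel. The function names are invented for illustration.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count the words in one partition of the data."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def combine(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(partitions):
    # On a cluster, each partition would be processed on a separate machine.
    partials = [map_partition(partition) for partition in partitions]
    return reduce(combine, partials, Counter())
```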
The ability to process large amounts of data made its collection more important, and the cycle of increasing collection capabilities rolled along as computing power increased. With greater ability to turn raw data into useful insights, data became an essential tool for companies, and more resources were put into reducing the barriers to seeing what that data was trying to tell them. Even issues like network latency improved as access to the internet increased through the 2010s and most residential and industrial areas got connected.
“Data is everywhere,” Walicki said, “but the ability to extract value is still evolving.”
Big Data’s impact may be invisible to the average consumer, but its effects are apparent nearly everywhere in the U.S. economy.
“AI, right now, needs as much data as you can give it,” Kairinos said.
Speed Advances From Cloud Technology
Cloud computing’s emergence has also contributed to the rapid rise in Big Data over the past three or so years, said Dr. Julie Rosen, chief scientist and technical fellow at Leidos, a Reston-based research company with focuses in defense, aviation, biomedical, and information technology research. Cloud computing is typically performed at data centers with substantial computing resources. But demand for faster processing without exceptionally high bandwidth began pushing data processing away from data centers and to distributed networks. This led to the rise of edge computing as a complement to data centers.
“With access to cloud-based servers, system architectures now are moving the processing of data to the data [collection device or sensor], rather than moving the data to the computing processors,” Rosen said. “Computation is moved away from centralized data centers toward the edge of the network, where tasks are performed on devices such as mobile phones and network gateways. Conducting the tasks ‘closer’ to the consumer improves response time, or latency, as well as addresses bandwidth from node to node in the distributed network.”
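One way to picture what Rosen describes is a device that condenses its own readings and sends only a short summary upstream instead of every raw sample. The Python sketch below is an illustration under that assumption. The function name and the fields in the summary are hypothetical, and real edge deployments vary widely in what they compute locally.

```python
from statistics import fmean


def summarize_window(samples):
    """Reduce a window of raw sensor samples to a small summary record."""
    if not samples:
        raise ValueError("cannot summarize an empty window")
    return {
        "count": len(samples),
        "min": min(samples),
        "max": max(samples),
        "mean": fmean(samples),
    }


# The device would transmit this one record rather than every raw sample.
print(summarize_window([1.0, 2.0, 3.0]))
```

Sending four numbers instead of the full window is the bandwidth saving, and doing the math on the device is what removes the round trip to a data center.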
Today’s economy is built upon the tools Big Data has provided. E-commerce, bill payment, person-to-person transactions, and in-store digital transactions require secure, encrypted systems. This is only possible with large-scale data processing.
“The ability to perform trusted, secure, private transactions is at the core of every economy,” Walicki said. “No one would use their credit card or smartphone for commerce — typical retail and banking transactions — if their financial information and assets were at risk of being stolen. Secure data transactions, sometimes on a blockchain [system], have increased trust and reduced the friction of commerce.”
Rosen said the ability to extract insights from data is just as important as the ability to collect it. Without that, it doesn’t matter how much data you have. It may as well be useless.
“This massive collection of data is not solely a demonstration of ‘the possible’ from developers of sensing mechanisms. As with most successful businesses, data collection must be conducted in close coordination with a mission or business need,” she said.
“Data sitting in a repository or data lake will grow stale quickly unless it is explored, investigated, and analyzed in the context of a business need, a consumer’s question, or a researcher’s hypothesis. And this is where the greatest transformations are happening in the industry today. It’s the accessible data retrieval that is critical to inform decision-making.”
Use Cases Across Industries
It may not be obvious how Big Data is playing a role in many traditional industries. But IoT and AI are touching every part of today’s industries.
The real data deep dives are in emerging markets like unmanned vehicles. Cameras, light detection and ranging (lidar) systems, and other sensors on self-driving and driver-assisting cars catalog each person, car, stop sign, trash can, and bicycle as a unique element. They track traffic data, construction information, habits of pedestrians, and other variables the vehicle needs to know to get from one place to another quickly and safely.
That mapped data is highly valuable for other industries, such as logistics or software development. The information tracked can give a sense of how many people are walking on a street at a given time, how many have children or dogs with them, what types of cars are popular in an area, and which areas have a high concentration of bicycles. Insights gleaned from that information can also help the unmanned-vehicle industry deploy its products safely and efficiently. That’s just the start of potential data uses since real-time surveillance has picked up speed with the proliferation of IoT and AI.
The aerospace industry uses Big Data in numerous ways: autonomous aircraft, engine IoT sensors that measure fuel efficiency, and engineering decisions to increase safety, to name a few.
“Big Data, a lot of people call it the next oil, the next currency,” said Sumeet Vij, principal and director of artificial intelligence and data science at Booz Allen Hamilton, headquartered in McLean. “By itself, it’s going to be crude unless you can refine it. … Intelligence — the ability to process and add the edge itself and only get the [data] that’s valuable — is important.”
This real-time insight is solving problems that would inevitably arise if this data had to be parsed by a team of humans. Leidos uses AI to predict movements in wartime theaters as well as developments in medical cases.
Big Data is currently used in circumstances such as determining which advertising video to display, evaluating which medical therapy could have the best efficacy for a given patient, or alerting operators to electrical grid components that need to be repaired. It’s particularly useful in situations where time is of the essence, whether in a boardroom or an operating room.
Where Does Big Data Go From Here?
Experts say there’s potential for more affordable services and products, more autonomous services, and ways to make life safer. In a world where household IoT sensors, personal computers, smartphones, and other devices are constantly monitoring consumers, there’s an avalanche of data ready for companies to turn into smart insights. Before those products even reach customers, the manufacturing process also provides massive amounts of data enabling companies to create greater efficiencies.
“Turnarounds become quicker, maintenance becomes more effective, productivity goes up, and all those things are clearly driven by the connectivity of devices in the field with what you have in your enterprise resource system and your maintenance system,” said Daryl Roberts, senior vice president and chief operations and engineering officer at DuPont.
Fountech’s Kairinos imagines a future where personal vehicles are unnecessary because AI and IoT within self-driving cars have become so efficient as to outpace human drivers’ safety records. Autonomous machines will be useful in myriad situations on and off highways, including farming equipment, landscape trimming, and drones for surveillance and other uses.
In general, the factors that make Big Data useful for businesses are consumer satisfaction, cost reduction, and safety, regardless of the industry being targeted.
Kairinos said that in the future, “artificial intelligence” will be as common a term as the internet is today: “Twenty years from now, we won’t be talking about AI. … It’ll just be weaved into how things are done.”