Fortnite Lessons: How Data Lakes Can Help Democratize Data

Forbes Technology Council
POST WRITTEN BY
Shant Hovsepian

The internet is aglow with tales of Fortnite, the hottest online multiplayer video game around. The game’s parent company, Epic Games, processes millions of events each minute, and its mountain of data grows steadily. Processing and analyzing this data -- petabytes worth -- must happen somewhere.

As it turns out, Epic uses a data lake for this massive undertaking. Chris Dyl, director of platform for Epic, referred to the company’s Amazon S3 deployment as its “data lake” during his talk at AWS Summit just this month. The reference is one piece of evidence that data lakes are not only useful pieces of data architecture but are also crucial building blocks.

That thought may come as a surprise given warnings about data lakes turning into unmanageable data swamps and hand-wringing over big data overhype. To uncover the truth, we surveyed hundreds of organizations about their data lake usage. The results: Most are bullish on data lakes. In particular, it seems organizations are excited by how well data lakes enable nontechnical users (“casual” or “business” users) to analyze data. In the business world, the ability to democratize data -- empowering employees with intelligence, raising the collective IQ of the organization -- is a cornerstone of competitive advantage.

Though Data Lake Technology Has Improved, Growing Pains Exist

Data lakes have been a playground for data scientists and highly skilled analysts for a while now, but they have been difficult puzzles to decipher for those who are less technically inclined. Unfortunately, it’s exactly those roles -- managers, field employees, and even end customers and supply chain partners -- that need access to digestible analytics quickly. These days, it’s more commonplace to encounter point-and-click visual analytics and reporting tools that also perform well with vast quantities of data. But those types of solutions and the access they provide aren’t ubiquitous yet.

Performance Analysis For Everyone -- Not Just Data Scientists

One perceived drawback of data lakes is that they began as big data dumping grounds for data of unknown value, which organizations hoped to explore and turn into valuable insights. These wastelands were nearly impossible for anyone without a PhD in data science to access (I exaggerate, but the point remains).

While data lakes have become more and more performant, that performance means nothing if business users without mathematics or SQL backgrounds cannot easily access the data they need for analysis. Solutions with the features and speeds that enable data access for everyone are rare beasts.

Respondents in our survey, however, seemed to believe they have entered a new era of data lake technology: Nearly two-thirds felt business users can use tools to explore data to get the views they want, and half said business users can blend data sets located within or outside the lake. And more than half said users can view complex correlations within their data. Our survey data isn’t the only piece of evidence pointing toward the rise of data lake usage. MarketsandMarkets estimates the market for data lakes will be worth nearly $9 billion by 2021, up from just $2.5 billion in 2016. Other projections are similarly bullish.

The message is clear: Data lakes are becoming a great way to give your customer-facing staff, managers and other nontechnical employees a way to turn data into something useful.

Close The Usage Gap

Organizations understand the power of data lakes as cornerstones for data architectures, and they are starting to invest in giving their business users more access to that information. Despite the forward progress, we still have a usage gap to close before we can hail data lake-enabled BI and analytics as the solutions of choice.

According to our survey, only half of all data lake users are business users. Nearly half of all organizations have fewer than 100 business users accessing data lake-based analytics. And although many respondents said their business users can access complex views and correlations, those endorsements hold true for half of all respondents at most.

Organizations would be wise to invest in closing that gap with accessible analytics. Leaders can democratize data for their organization by following these tips:

• Avoid moving big data: Moving massive quantities of data is expensive and time-consuming. To democratize data, leverage analytics and other tools as close to the data as possible.

• Look for unified security models and tools that can leverage the security model already in place: If you have to reimplement your whole security model -- in your storage layer, in your database layer and in your application BI layer -- you’re either not going to follow through with the implementation or you’re going to lose information along the way.

• Build apps, not reports: It’s important to think about the developer side first. Rails really made it easy for a bunch of people to build web applications. To make data more accessible, you want similar functionality from your data analytics apps. Democratizing data requires data-driven apps that, just as with web apps, fetch data asynchronously from multiple sources so that results arrive quickly.
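
To make the third tip concrete, here is a minimal sketch of an app that requests data from several sources at once rather than one after another. The source names, delays and numbers are made up for illustration, and `fetch_source` is a hypothetical stand-in for whatever query engine or API an organization actually uses.

```python
import asyncio


async def fetch_source(name: str, delay: float, rows: list[int]) -> tuple[str, int]:
    """Hypothetical data source: waits to simulate query latency, then returns a total."""
    await asyncio.sleep(delay)  # stand-in for a network or query round trip
    return name, sum(rows)


async def load_dashboard() -> dict[str, int]:
    """Request all sources concurrently and collect their results by name."""
    results = await asyncio.gather(
        fetch_source("batch", 0.03, [1, 2, 3]),
        fetch_source("stream", 0.01, [10, 20]),
        fetch_source("search", 0.02, [5]),
    )
    return dict(results)


if __name__ == "__main__":
    print(asyncio.run(load_dashboard()))
```

Because the requests run concurrently, the total wait is roughly that of the slowest source rather than the sum of all three.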

To truly democratize data, it is essential to leave it where it naturally resides without impeding the types or forms of analysis that are possible. A data lake allows the same type of data to be analyzed for machine learning, search, streaming, batch and BI use cases without forcing lock-in into a special-purpose system. Democratization requires enabling data access for everyone, and building a data infrastructure directly around a data lake provides that accessibility.
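
As a rough illustration of leaving data where it resides, the sketch below compares pulling every row into the application with asking the store for just the summary. An in-memory SQLite database stands in for the data lake here, which is purely an assumption for the sake of a runnable example; the table and column names are hypothetical as well.

```python
import sqlite3


def build_store() -> sqlite3.Connection:
    """Create a tiny in-memory table standing in for event data in a lake."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (player TEXT, score INTEGER)")
    conn.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [("ana", 10), ("ana", 5), ("bo", 7)],
    )
    conn.commit()
    return conn


def totals_moved(conn: sqlite3.Connection) -> dict[str, int]:
    """Move every row to the application, then aggregate there."""
    totals: dict[str, int] = {}
    for player, score in conn.execute("SELECT player, score FROM events"):
        totals[player] = totals.get(player, 0) + score
    return totals


def totals_in_place(conn: sqlite3.Connection) -> dict[str, int]:
    """Aggregate inside the store so only one row per player is returned."""
    query = "SELECT player, SUM(score) FROM events GROUP BY player"
    return dict(conn.execute(query))


if __name__ == "__main__":
    store = build_store()
    print(totals_moved(store), totals_in_place(store))
```

Both functions give the same answer, but the second transfers only the summary rows, which is the difference that matters once the table holds billions of events.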

Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.