A data-driven culture drives superior stakeholder experiences and empowers businesses with a deep understanding of customers’ needs. The caveat, however, is working with the right data. Analytics is only as accurate as the data fed into the engine. For most enterprises, “Dark Data” creates a severe distortion.
Gartner defines “Dark Data” as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes”.
Dark data includes data from instant messages, emails, partly developed applications, log files, PDF downloads, and more. Such data, usually stored across a distributed IT environment with no single owner, is difficult to manage. Worse, a large proportion of dark data consists of difficult-to-classify video, audio, and image files.
IDC and Dell EMC estimate 40 zettabytes of data in existence across the world in 2020. A 2016 Veritas Global Databerg Survey indicates that about 85% of all data is dark data. A survey by IDG estimates that dark data is growing at a rate of 62% every year. By 2022, 93% of all data will be unstructured. While the importance of analytics and data-based decisions is obvious, these statistics indicate that the vast majority of the world’s consumer and business data remains inaccessible.
The ‘save everything, just in case’ culture fuels the creation of dark data. Most dark data thus created never sees the light of day again. The saved data becomes redundant as employees leave and take the passwords with them, customers move on, and business priorities change.
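To get a feel for the scale, the quoted figures can be combined in a quick back-of-the-envelope calculation. The sketch below is only illustrative: it takes the survey numbers at face value, mixes figures from different sources, and assumes the 62% yearly growth rate stays constant, which none of the surveys claim. The function names are made up for this example.

```python
import math

def dark_share(total_zb: float, dark_percent: float) -> float:
    """Volume of dark data, given a total volume and a percentage share."""
    return total_zb * dark_percent / 100

def project_growth(volume: float, yearly_growth_percent: float, years: int) -> float:
    """Compound a volume forward at a fixed yearly growth rate (an assumption)."""
    return volume * (1 + yearly_growth_percent / 100) ** years

def doubling_time_years(yearly_growth_percent: float) -> float:
    """Years needed for a volume to double at a fixed yearly growth rate."""
    return math.log(2) / math.log(1 + yearly_growth_percent / 100)

# Figures quoted in the article: 40 ZB in total, about 85% of it dark,
# dark data growing at 62% per year.
dark_zb = dark_share(40, 85)
print(f"Dark data today: {dark_zb:.1f} ZB")
print(f"After 3 years at 62%/year: {project_growth(dark_zb, 62, 3):.1f} ZB")
print(f"Doubling time: {doubling_time_years(62):.2f} years")
```

At that growth rate, the volume of dark data would double in well under two years, which is why leaving it unmanaged gets expensive quickly.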
Why Enterprises Need to Control Dark Data
Dark data makes predictions go awry, inflates client lists, and creates several distortions enterprises would like to do without. If nothing else, it creates unnecessary bloat, which is not just costly but also runs against the basic principles of today's prevailing lean philosophy. Storing dark data also multiplies security risks.
Enterprises now have an even more compelling reason to get rid of dark data. The new European GDPR requires businesses to have tight control over their data, including how data flows across the enterprise. Enterprises are obliged to share, on request, the data they hold on a “data subject,” such as a client, employee, or other stakeholder. Another key feature of the regulation is that consumers can exercise their “right to be forgotten,” which obliges the enterprise to delete all personal data pertaining to the customer.
Controlling Dark Data
A key step in controlling dark data is giving structure to unstructured data. Standard artificial intelligence tasks, such as teaching computers to recognize images or human speech, go a long way in properly labeling unstructured dark data. A powerful AI engine may unlock unstructured dark data from social feeds, categorize it, apply predictive analytics, subject it to heuristic analysis, and finally decide whether to retain or discard the data on its merits.
Another key exercise in taming dark data is sorting the wheat from the chaff, that is, identifying which data is beneficial and which is obsolete. AI tools enable the enterprise to understand the implications of each piece of data and take appropriate action. Determining what each document is about requires identifying a business problem or question, and pinpointing where, if anywhere, the information relevant to that problem or question sits within a specific document. Artificial Intelligence applies context and background knowledge to the analytics, allowing enterprises to perform the exercise with a high level of accuracy.
AI branches such as natural language processing (NLP), computational linguistics, and machine learning enable the analytic engine to understand the context of information and pull the relevant data for each business problem or question. For instance, applying computational linguistics helps to extract relevant information based on the linguistic context of the surrounding words.
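The final retain-or-discard step can be pictured with a small rule-based triage. This is a deliberately simple stand-in for the AI-driven decision described above, not an AI engine: the `DataAsset` fields, the category labels, and the two-year idle threshold are all hypothetical, and the category is assumed to have been assigned by an upstream classifier.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DataAsset:
    name: str
    category: str        # label assumed to come from an upstream classifier
    last_accessed: date
    has_owner: bool      # whether anyone in the business still owns the asset

def triage(asset: DataAsset, today: date, max_idle_days: int = 730,
           valuable_categories: frozenset = frozenset({"contract", "customer_record"})) -> str:
    """Return 'retain', 'discard', or 'review' for one asset (hypothetical rules)."""
    idle_days = (today - asset.last_accessed).days
    if asset.category in valuable_categories:
        return "retain"      # business-critical categories are always kept
    if idle_days > max_idle_days and not asset.has_owner:
        return "discard"     # long idle and orphaned: a candidate for deletion
    return "review"          # everything else goes to a human

today = date(2024, 1, 1)
assets = [
    DataAsset("msa_2015.pdf", "contract", date(2015, 1, 1), False),
    DataAsset("old_chat_export.zip", "chat_log", date(2020, 1, 1), False),
    DataAsset("recent_notes.txt", "chat_log", date(2023, 12, 1), False),
]
for asset in assets:
    print(asset.name, "->", triage(asset, today))
```

In practice the rules would be replaced or supplemented by model scores, but the shape of the decision (classify first, then apply a retention policy) stays the same.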
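A toy version of context-based extraction is a keyword-in-context filter: find each mention of a term and keep it only when a cue word appears within a few words on either side. The sketch below uses only the Python standard library; the function name, the cue words, and the window size are illustrative assumptions, and real NLP pipelines rely on far richer linguistic models.

```python
import re

def mentions_with_cue(text: str, keyword: str, cues: set, window: int = 3) -> list:
    """Return snippets around `keyword` whose nearby words include a cue word."""
    tokens = re.findall(r"\w+", text.lower())
    keyword = keyword.lower()
    snippets = []
    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        left = tokens[max(0, i - window):i]     # up to `window` words before
        right = tokens[i + 1:i + 1 + window]    # up to `window` words after
        if cues & set(left + right):
            snippets.append(" ".join(left + [token] + right))
    return snippets

text = ("The invoice is overdue by ten days. "
        "Please file the invoice copy in the archive.")
print(mentions_with_cue(text, "invoice", {"overdue"}))
# → ['the invoice is overdue by']
```

Only the first mention of "invoice" survives, because only there does the surrounding context signal the business question being asked (which invoices are overdue).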
Artificial Intelligence tools allow enterprises to do these tasks easily, at scale, with the bulk of the process automated.
Extracting Value from Dark Data
Across the vast swamps of dark data may lie gems of strategic intelligence. Analyzing unstructured data also offers the opportunity to extract invaluable business insight that would otherwise lie dormant. Here again, Artificial Intelligence offers a solution.
Artificial Intelligence tools allow enterprises to glean insights from dark data. For instance, server log files give clues to website visitor behavior, customer call detail records indicate consumer sentiment, and mobile geolocation data reveals traffic patterns. Subjecting such data to analytics allows employees to act on insights rather than on intuition or hunches.
Retailers traditionally apply psychology in product placement. They understand how customers move around a store, and place products accordingly. Applying Artificial Intelligence tools to video images of shoppers’ posture, facial expressions, or gestures offers a deeper understanding of customer psychology, allowing retailers and other customer-facing enterprises to position their products and services even better.
In healthcare, it is common for doctors to take handwritten notes and capture voice recordings during consultations. Collecting all such information and making it accessible greatly improves treatments and insights, and also contributes to the wider field of medical research. Without AI tools that properly classify such data and give it context, it remains siloed, inaccessible, and pretty much worthless in the wider scheme of things.
Data is now a critical source of competitive advantage, and success depends on the ability of enterprises to identify relevant data and extract the optimum value from it. A strong AI-backed analytical engine, delivered to key business managers and other stakeholders in the form of user-friendly apps, is the way to go.
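As a small illustration of the server-log case, the sketch below counts page views and reconstructs each visitor's path through a site. It assumes the logs follow the widely used Common Log Format and treats the client address as the visitor, which is a simplification; the sample lines and the `summarize` function are made up for this example.

```python
import re
from collections import Counter, defaultdict

# Common Log Format: host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

def summarize(lines):
    """Return (page hit counts, per-visitor page sequences) for successful requests."""
    page_hits = Counter()
    journeys = defaultdict(list)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue                      # skip lines that are not in the expected format
        if match["status"].startswith("2"):
            page_hits[match["path"]] += 1
            journeys[match["host"]].append(match["path"])
    return page_hits, dict(journeys)

SAMPLE = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.5 - - [10/Oct/2023:13:56:02 +0000] "GET /signup HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Oct/2023:13:57:10 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2023:13:57:40 +0000] "GET /missing HTTP/1.1" 404 0',
    'not a log line',
]

hits, journeys = summarize(SAMPLE)
print(hits.most_common())
print(journeys)
```

Even this crude summary shows which pages draw attention and which visitors move from the pricing page to signup, the kind of behavioral clue that otherwise stays buried in log files.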
I have been programming since 2000, and professionally since 2007. I currently lead the Open Source team at Fingent as we work on different technology stacks, ranging from the “boring” (read: tried and trusted) to the bleeding edge. I like building, tinkering with, and breaking things, not necessarily in that order.
Hit me up at: https://www.linkedin.com/in/futuregeek/