Dark Data, a subset of Big Data, has been finding its way into the spotlight.
We live in the midst of a numinous yet docile data revolution. Our bromides about Big Data say a lot about our penchant for it. A single personal object like the smartphone epitomizes the synthesis of Herculean amounts of information. It reminds us that data is woven into the air we breathe. Our love for social conversations, for more privileged, inside information, pales before the colossal volume of data contained in the web. Analyst firm IDC observes that the world’s data is doubling every two years, an unprecedented rate. People are creating, sending and storing more unstructured content within documents and files, including email, social media, apps, file-sharing, mobile and cloud storage. This information is generated computationally by programs designed by us, frail human beings. Add a dollop of consumerism to it, and we’re ready to hail the Big Data taxi.
Today, Big Data is measured in terabytes and petabytes; within a decade, even those numbers may seem quaint. To boot, the rise of the “Internet of Things” has increased the use of analytic and business intelligence tools to extract data from mobile and sensor devices and provide accurate support to consumers. At this rate of data production, a vast pool of data that was preserved for a number of standardized uses reaches the end of the line when it is not put to immediate use. Such uncategorized, unmanaged and unknown data is widespread in most companies actively planning Big Data initiatives.
Hidden Value of Dark Data
Described as ‘Dark Data’, it represents abundant opportunity. Dark Data is usually left behind by the retention programs, classification schemes and retrieval systems that companies rely upon to meet compliance obligations, ease electronic discovery and ensure that decisions are made using relevant and accurate information.
Huge data sets and registers in companies are weighed down by Dark Data, on top of the vulnerability and data protection concerns that IT departments already face. The costs and risks of Dark Data rise sharply during regulatory investigations and cybersecurity events. Yet Dark Data is not as two-dimensional as it looks. Often wrongly dismissed as uneconomical and good-for-nothing, it is quite the opposite.
In today’s remotely connected marketplace, a business needs to innovate to radically optimize productivity, look for market opportunities and influence customers. The key to this lies in Dark Data. Just as an overwhelming majority of the Universe is made of mysterious substances like dark matter and dark energy, Big Data is surrounded by data that cannot be seen or comprehended. This data offers the potential to shape the businesses of the future. One needs to extract the juice from Dark Data to understand it fully and categorize it accordingly.
Let’s look at the spelling correction algorithm used by search engines. A few years ago Google significantly improved spell-checking by mining colossal databases of users’ self-corrections, where spell-checkers had previously relied on hand-built algorithms that dug into the complexities of the English language and the psychology of typing. Google’s new trick wouldn’t have worked if it had used only a few user searches to build the algorithm. The search engine used a language model to find possible keywords and an error model to find possible errors, and combined them to suggest the best alternative. The most important ingredient in building something like this is data. Google took cues from trillions of web searches from millions of users, across 146 languages. A stroke of pure genius, the technique can now be rapidly applied to other languages with little manual labor. This is perhaps just one of hundreds of innovations driven by the sheer mass of data.
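To make the idea concrete, here is a minimal sketch of how a language model and an error model might be combined. It is not Google’s implementation: the toy logs, the function names and the simple scoring rule P(word) × P(typo | word) are all assumptions chosen for illustration, and a real system would learn from vastly more data.

```python
from collections import Counter, defaultdict

# Hypothetical toy logs standing in for real search data (illustration only).
SEARCHES = ["weather"] * 50 + ["whether"] * 20 + ["leather"] * 10

# Hypothetical (typed, retyped) pairs: a query followed by the user's own fix.
SELF_CORRECTIONS = (
    [("wether", "weather")] * 6
    + [("wether", "whether")] * 3
    + [("wheather", "weather")] * 4
)


def build_language_model(searches):
    """Estimate P(word) from how often each word is searched."""
    counts = Counter(searches)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def build_error_model(pairs):
    """Estimate P(typed | intended) from observed self-corrections.

    Simplification: probabilities are normalized over observed typos only.
    """
    by_intended = defaultdict(Counter)
    for typed, intended in pairs:
        by_intended[intended][typed] += 1
    model = {}
    for intended, typos in by_intended.items():
        total = sum(typos.values())
        for typed, n in typos.items():
            model[(typed, intended)] = n / total
    return model


def suggest(typed, language_model, error_model):
    """Return the candidate with the highest P(word) * P(typed | word)."""
    if typed in language_model:
        return typed  # already a known word; leave it alone
    scores = {
        intended: language_model.get(intended, 0.0) * p
        for (seen, intended), p in error_model.items()
        if seen == typed
    }
    if not scores:
        return typed  # no evidence in the logs; make no suggestion
    return max(scores, key=scores.get)


LANGUAGE_MODEL = build_language_model(SEARCHES)
ERROR_MODEL = build_error_model(SELF_CORRECTIONS)
```

With this toy data, `suggest("wether", LANGUAGE_MODEL, ERROR_MODEL)` picks “weather” over “whether”, because the more popular word outweighs the slightly stronger error evidence for the rarer one. Nothing in the sketch is specific to English, which is why an approach of this kind carries over to other languages once the logs exist.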
Dark Data in the Digital Era
Businesses would have a very limited and inaccurate view of consumers if they were not able to combine and reflect on their data. According to McKinsey research, a retailer using data to the full could increase its operating margin by more than 60 percent. Moreover, services enabled by personal-location data could generate $600 billion in consumer surplus.
Products like Google search and Siri depend heavily on data. As companies like Facebook and Google have shown, more data means better and more innovative solutions to perennial problems.
For instance, Cukier and Mayer-Schonberger’s book describes how AI researcher Oren Etzioni developed Farecast (later sold to Microsoft, and now a part of Bing Travel) to scrape data from the Internet and make accurate guesses about whether an airline fare would rise or fall.
Dark Data is the watchword of 2015. It has become a tagline for everything that is uneconomical and good-for-nothing, although it is quite the opposite.
A bank or retailer that looks only at CRM or transactional information in order to target a customer with a promotion is seeing only one part of the whole picture, and probably understands only a fraction of the consumer’s interests and preferences. Immeasurably valuable data about who these customers are interacting with and what they’re saying on social channels (how they feel about brands, what they’re looking for, where they shop, their personal network, a recent customer service experience in a bar or a store) is left dark.
For a lot of companies, the usual types of unused data include website, mobile and social media data. Tech Goliaths like Google and Amazon bank on this kind of data to gain a foothold in consumer intelligence. In recent years, consumer-centric businesses with a strong footing in consumer intelligence have been able to gather far more customer data than long-established businesses like banks.
This type of unstructured data, generated from social media and the web, can hold an abundance of information on consumers’ browsing preferences. For instance, social networking sites detail particular consumers’ preferences, affiliations and interests, and mobile technologies such as geolocation or mobile wallet applications add further signals; pooled together, these can help businesses target a buyer on the go. Basically, there are billions of data points circulating at any given time which, when used, can help companies succeed.
Here is another telling example: a digital wallet promotion for beer, based on previous transaction data, shows up on your smartphone. However, you receive this promotion only after you’ve left the convenience store with a bag full of other items. The offer is useless to you, to the store, to the brand and to your credit or debit card provider. If, however, the card provider or the brand were able to properly use real-time data from social media channels showing key terms like “Fourth of July” or “Kegger on Saturday,” or geolocation data showing where you are, then a well-timed offer would be relevant to you.
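The timing logic in this example can be sketched in a few lines. Everything here is hypothetical: the `ShopperContext` fields, the list of key terms and the rule itself are assumptions made for illustration, not a description of any real wallet or card provider’s system.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical key terms that hint at an upcoming occasion for buying beer.
PARTY_TERMS = ("fourth of july", "kegger")


@dataclass
class ShopperContext:
    """Signals a provider might hold about a shopper (all fields assumed)."""
    recent_posts: List[str]     # recent public social media posts
    bought_beer_before: bool    # derived from previous transaction data
    in_store: bool              # derived from real-time geolocation


def should_send_beer_offer(ctx: ShopperContext) -> bool:
    """Send the offer only when it is both relevant and well timed."""
    mentions_party = any(
        term in post.lower()
        for post in ctx.recent_posts
        for term in PARTY_TERMS
    )
    relevant = ctx.bought_beer_before or mentions_party
    timely = ctx.in_store  # too late once the shopper has left the store
    return relevant and timely
```

In this sketch, a shopper who posted “Kegger on Saturday!” and is currently in the store would receive the offer, while the same shopper who has already walked out would not, which is exactly the failure described above.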
Dark Data for Action Driven Insights
Companies are now starting to gain a competitive advantage by casting the spotlight on Dark Data: data that was earlier abjured because traditional data management technologies were constrained in handling its volume, variety and velocity. The growing spotlight on Dark Data has accelerated the adoption of new analytic tools and techniques that empower new-age businesses like Apple and Google not only to query their data more efficiently, but also to develop new queries that have never been considered before.
Amazon Web Services, IBM, Microsoft and Hortonworks have Big Data solutions in place, such as Hadoop. A lot of the time, though, these are not used to their full potential. Companies need to start investing in consumer intelligence solutions that sit on top of such massive databases in order to search, index and provide real-time assessments of consumer activity. More and more companies need to prepare a mobile wallet strategy in order to meet their consumers on their mobile devices. Transactions from web traffic and operational systems also need to be embraced. Businesses need to mix enterprise data from classic sources such as CRM and ERP with other Big Data sources such as social media and web traffic. Relevant consumer data can help spur a sale.
A projected 200 billion devices will be connected by 2020. Such interconnectivity will lead to an eruption of accumulated data. In fact, in the coming years, data will be richer: easier to use, more accessible and more widespread, helping create an opulent single-customer view. Turning data, including Dark Data, into actionable intelligence will help businesses compete by being more in tune with the present and future needs of individual customers. Today, it’s not just about data; it’s more about the people, process and technology elements that allow companies to use the data available to make insight-driven decisions.