Silicon Valley

The total data accumulated in the past two years, a zettabyte, eclipses the entire prior record of human civilization. Big data is fast becoming mainstream, with companies across all industries combining a plethora of digital data with software to analyze it. Software makers are eager to drill into the big data goldmine, with analysts predicting rapid growth for the overall market.

However, building a successful business exploiting the benefits of big data is anything but easy, and requires dexterity along with the right amount of patience.

Let’s take an actual case. Cask is a Silicon Valley start-up founded in 2011, backed by pre-eminent VCs and headed by former Facebook and Yahoo engineers. In September 2014, the upbeat young company changed both its name and its business model, moving to supply open-source software and earn its revenue from technical support and consulting rather than from proprietary products.

Today, major technology companies are chasing big data. For start-ups, however, it is a hard-knock life, especially for those that lack the financial anchor of the large tech corporations. Every start-up wants to grab a foothold in an emerging market quickly, turn into a profitable business, and ultimately grow into a large software company like Microsoft.

Here’s why: software companies only have to deliver code over the Internet as cloud software, or offer technical support subscriptions for open-source software that is available for free, rather than selling hardware or their time as consultants.

Data science, although promising, is still in its infancy. It will have to be applied in every company and every industry across the globe before it becomes truly prominent. For now, it requires a lot of craftsmanship as a foundation rather than simple software automation.

Hence, aspiring software companies often find themselves acting more as service companies, providing training, advising and building pilot projects for their commercial customers, than they had hoped. Most young start-ups today are building a market among commercial users in industries such as retail, finance, consumer goods and healthcare, to whom they will eventually sell software.

“They are setting up and doing first deployments to get corporate customers up to speed,” said Merv Adrian, an analyst at Gartner.

The reward for the young companies that survive the shakeout among suppliers will be all the more promising.

According to research firm IDC, the total market for big data technology is estimated to triple in the coming five years, reaching $41.5 billion by 2018.

These optimistic forecasts will of course cover only those businesses that survive by adopting sustainable business models.

In truth, chasing the big data opportunity is dauntingly expensive. This is evident from the documents Hortonworks filed when it decided to go public.

Hortonworks is a leading distributor of the well-known open-source Hadoop software. Hadoop is a framework for storing and processing large volumes of unstructured data from the web, sensors and devices, and it is widely used in big data analytics.

According to the company’s financial statement, its revenue grew rapidly, more than doubling over a nine-month span to reach $33.4 million. However, its costs also increased, leaving a net loss of $86.7 million, which is more than double its total revenue.

In December 2014, on the first day of trading, investors bid the stock up to roughly $26.38 per share, about 65 percent above the offering price of $16 per share.

The big operating loss at Hortonworks, said Mike Gualtieri, an analyst at Forrester Research, “shows that it is still early for this market, and that you need to spend a lot.”

Many big data start-ups have raised noteworthy amounts of funding from VCs and corporate investors, but are still shying away from going public. By remaining private, they can weather market shifts, grow and develop sustainable businesses before having to report quarterly results and face extreme pressure from shareholders.

Cloudera, a distributor of Hadoop, raised around $900 million in 2014, with $740 million coming from Intel. Analysts estimate that Cloudera will be twice as large as Hortonworks.

Palantir Technologies, which was founded in 2004, has raised around $900 million to develop its hybrid model of data analytics software and services. Initially, it worked for American intelligence agencies, but it now has a growing clientele that includes banks, hedge funds, insurance companies and healthcare agencies.

Over the years, the company has refined the foundations of its big data software. However, all of its work consists of custom projects for clients, which require a lot of consulting and advising. Palantir is expected to generate revenue of more than $1 billion in 2014.

Today, companies often need help that goes beyond data handling and basic know-how.

Silicon Valley Data Science is a consulting firm that was founded in 2013 by a group of data experts. In its early stages, the company expected to offer data skills to bolster corporate projects. However, it came across many companies that, in spite of having data, could not develop projects to explore and experiment with it. Soon, setting data strategy and identifying a road map became one of Silicon Valley Data Science’s many roles at such companies.

Cask is creating software tools to make it even easier for corporate developers to write big data programs that run on Hadoop. However, a software tool by itself is not how data science will become mainstream in the corporate world. Programmers who work in industries such as banking, retail, healthcare and media will have to build these applications according to their own need for innovation.

Until 2013, Cask sold its software tools as a proprietary subscription service delivered in the cloud, which helped the company generate some revenue. But to gain wider adoption of its services, the start-up changed its name and made its software open source. Like many open-source companies in the market, Cask may charge for added features such as management and security software.

Today, revenue for Silicon Valley big data start-ups comes from assisting large corporate clients with writing big data applications. The work amounts to setting up and doing initial deployments to get customers up to speed.