My last two companies have both been in the business of aggregating large data sets.

At the company I co-founded, Ten Ton Labs, we collected and normalized music reviews from across the web to support our music search engine, Squishr.

Our original plan, however, was to collect all reviews, of anything and everything, and then build an application that analyzed and tracked various metrics from the data to find out how consumers felt about certain products. We scaled the idea back to music so that we could keep building the underlying review platform while having a simpler application to implement on top of it to prove the platform out.

At my current company, Collaborative Drug Discovery, we collect huge amounts of data about experiments in the field of chemistry (there are about thirty million known chemical compounds) and allow people to search through the data in various ways, as well as add their own data. Our users can use our toolset to keep their own data private while still searching across the aggregated data set.

Both of these companies have a lot in common. When I was at Ten Ton Labs, our music search engine held a huge amount of short-term value for us. It was the cool thing that we could show to investors to get them excited about the company, and it gave us attainable short-term goals to go after. CDD is very similar: we currently put most of our effort into the cheminformatics application that runs on top of our data, because it's what we sell to people and show to our investors.

Long term, though, I don't think that our music search engine or the CDD cheminformatics app is really where most of our value lies.

It lies in the cloud.


A huge amount of value is created by building tools to make it easy to get data into the cloud, and to make it easy for people to process the data and get it back out in a manner that they're used to. Once you have those tools, especially if you're allowing people to build applications and communities on top of your data, you're creating real value.

You've created an app that people will come back to every day to see what's new.

And you've created an ecosystem around your data that won't allow it to die. People will build tools to interact with it in ways that you never even thought of, and again, more value will be created because of it.

I think that building a platform for data like this, not just a one-off application, is where the opportunity for huge value creation eventually lies.