BY SUSAN LAHEY
Special Contributor to Silicon Hills News
Dhruv Bansal and Flip Kromer, two of Infochimps’ founders, were budding research scientists, graduate students at the Center for Nonlinear Dynamics at the University of Texas Physics Department. They had no real thought of building a startup. But it did occur to them that not only they, but lots of other people, had the daunting task of looking for answers in giant sets of data—Big Data. Data sets too big to be accessed by normal computers in normal time frames. Sets that require tons of storage and processing capacity.
Bansal, for example, had a school project that required him to laboriously collect and assimilate demographic information on five million students who had taken the Texas Assessment of Knowledge and Skills.
Kromer understood Big Data not only as a scientist but as someone who held a degree in computer science.
So they suspected that the people who dealt in Big Data would rejoice if someone created a marketplace where you could find whatever chunk of Big Data you might need, like stock prices over the last 30 years or weather patterns over the last 100. They just couldn’t figure out how to monetize it.
“We didn’t perceive it to be a business project,” said Bansal. “It was just two graduate students building this service that would make our lives easier…but it was difficult to garner the resources needed to do this right. How do you get funding if you’re not planning to make money?”
Along came Joe Kelly, who responded to a Craigslist ad placed by Bansal and Kromer, seeking a developer for the physics department’s website. They didn’t hire Kelly. But he didn’t go away, either. Kelly was fascinated by chaos theory and data sets. Not a physicist, he had taken a year of business school, run a Chinese import company and an adventure travel company, and traveled around the Caribbean in a sailboat for three years. The way Bansal put it, Kelly kept bugging Bansal and Kromer, wanting to hang out with them and learn more about what they were working on. One day it dawned on them he might be just the guy to turn Infochimps from a graduate school project into a real business.
Now the guy who wasn’t quite up to par to build a website is the CEO.
Infochimps started as a data marketplace: a place where you could sell all the data you compiled on coniferous plants of the Northern Hemisphere or incidents of actual injury involving slipping on a banana peel. It’s also a place where you could buy someone else’s research on geologic findings about a particular igneous rock.
In 2010, about a year after it started, Infochimps got $1.2 million in funding from venture capital firm DFJ Mercury. This followed $375,000 in seed financing from angel investors. The company said it would use the money to increase the amount of data available to its customers. Currently it manages about 15,000 public and proprietary data sets for download and API access.
As a business model, Bansal said, that worked fine. But customers kept asking if Infochimps would help them turn their tidal waves of data into actionable information sets.
“A lot of our customers were saying ‘We already have too much data internally. We can’t handle it. We’d love to be able to take advantage of the data we have.’” At first, Bansal said, they said no.
“Then we realized it was better to say ‘Yes.’ There’s immediate revenue.”
So, over the last several months, the company has been adding a whole new set of skills to its business model. It acquired Data Marketplace, a data company, and Keepstream, which curated tweet data. The second company was, Bansal said, a talent acquisition. It replaced its original CEO, attorney and co-founder Nick Ducoff, with Kelly in November 2011. When a company’s vision changes, Bansal said, not everyone sees the future the same way. Ducoff and Infochimps “parted amicably,” according to public reports.
And last month, Infochimps introduced its platform for helping customers use data more meaningfully. The company has developed specific tools: Ironfan, for handling stack data; Wukong, which simplifies Hadoop streaming; Swineherd, which runs scripts and workflows for file systems; and Wonderdog, a Hadoop interface for Elasticsearch.
With these tools, Bansal said, and some customization, Infochimps can help companies of various sizes from multiple industries translate their Big Data into actionable information.
“Every major company I talk to is looking at ways to use Big Data technology to extract insights,” said Paul D’Arcy, who is connected in the Austin Big Data community because of his role as executive director for America’s Marketing for Dell. But he’s offering his personal opinions here.
“None of them has the expertise to piece together open source technology to develop the components to do this. It takes time and investment…. Big data is one of the three or four biggest trends in technology right now and Infochimps is innovative in that they’ve built one of the first systems with all the pieces for organizations of any size to take advantage of all these technologies.”
One of Infochimps’ customers is Austin startup Black Locus, which provides pricing information on thousands or millions of products across retailers. The service helps retailers make adjustments to boost their place in the market.
Infochimps was able to speed Black Locus’s implementation of its service by months, as it does for many startups, Bansal said. Black Locus said Infochimps helps it help its customers.
“Infochimps provides us with a scalable infrastructure for dealing with the sheer quantities of data we collect and process,” said Trebor Carpenter, director of engineering for Black Locus. “This allows us the ability to focus on our core technology and algorithms. As trendy as Big Data has become, there are plenty of people claiming to be data scientists simply because they can correctly spell ‘hadoop.’ But the Infochimps platform helps us transform a firehose of data into insight our customers can use to win in the marketplace.”
The percentage of companies that can really leverage their Big Data is tiny, Bansal said. But the number of companies that use it is growing fast. Infochimps aims to help companies across the spectrum. Startups are a big target market because of Infochimps’ capacity to speed their process to market by months. But its prospective customer base is broad, especially with newer open stack technology that allows companies to get cloudlike technology from their computers.
“A lot of bigger companies started trying to solve their own data problems and came up with their own solutions and shot themselves in the foot,” Bansal said, referring to clients like the one that realized it was running more than 150 servers that weren’t producing anything.
Now everyone from Mom-and-Pop operations to giant corporations needs more efficient ways to pull valuable information from the giant, growing waves of data being created through the internet, social media and other sources.
“When we first started,” Bansal said, “we had to explain what Big Data was. Now it’s everywhere.”