[Photograph by WDnet under Creative Commons]
Growing your business without a large stash of funds is like trying to grow crops on parched land. Still, if you want to make it happen, you have to find the little pools of water buried deep inside the earth. You’ve got to dig deep to tap one small reservoir, then go find another. Again. And again.
The Little Data approach I have advocated in the previous two articles (here and here) is like that. If Big Data, with its mighty investments and equally spectacular returns, is not an option, you can turn your attention to those little data pools, both within the organization and outside it.
The insights you can derive from those small but rich data veins can deliver a lot of value. And because the data sets are smaller, they are easier to work with. You don’t need a lot of money to get going.
What you need are a few common-sense hacks. Here’s what works wonderfully for us.
Automation is not the only answer
Let’s start with the most uncool part. We do a lot of manual digging for data. Yes, it is tedious. Yes, it is not scalable. And yes, it is more prone to errors.
And yet, it has a place in our scheme of things. The reason is quite simple. Often, we are trying to map content trends in very niche topics, on content platforms that do not “talk” well to external tools through clean APIs or exports.
We don’t have deep data-management skills. We know MySQL (an open-source relational database management system) and some PHP (a scripting language for creating interactive web pages), but that isn’t adequate in many contexts. So we go manual: we extract the data by hand, then write some robust Excel macros that throw up all sorts of interesting results. It works.
Done any other way, that initial step of extracting data from data-unfriendly sites would be quite time- and resource-intensive. Perhaps because we don’t have recourse to sophisticated data-handling capabilities, we ended up with this hack.
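For flavor, here is a rough Python stand-in for the kind of pass those Excel macros make over manually extracted data. Everything in it is an illustrative assumption, not our actual schema: the file name, the columns (topic, platform, post_date), and the “rising topic” rule.

```python
# Rough Python equivalent of a trend-counting spreadsheet pass.
# Assumptions for illustration only: manually extracted rows saved as a CSV
# with columns topic, platform, post_date.
import pandas as pd

df = pd.read_csv("manual_extract.csv", parse_dates=["post_date"])
df["week"] = df["post_date"].dt.to_period("W")

# Count mentions of each niche topic per week to see how it is trending.
trend = df.groupby(["topic", "week"]).size().unstack(fill_value=0)

# Flag topics whose latest week beats their own weekly average.
rising = trend[trend.iloc[:, -1] > trend.mean(axis=1)]
print(rising)
```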
We are currently launching a beta version of a social index tracker for recently launched cars, one that will highlight the launches generating the most social buzz. The data we are seeing is pretty insightful, and while some of it is generated through scripts, a large part is manually extracted.
How accurate does this data need to be? That depends on your objective. We just want a useful marker of current conversations around cars, more a ‘general trend’ than ‘nailing down the actual numbers’.
There is no such index in the market, so I guess, as a start, it serves a purpose.
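I won’t describe our actual formula here, but a minimal sketch of such an index could assume a simple weighted sum of likes, shares, and comments, normalized by days since launch. The weights and column names below are made up for illustration:

```python
# Minimal sketch of a social buzz index for recent car launches.
# Assumptions: a CSV with columns model, launch_date, likes, shares,
# comments; the weights are illustrative, not our real ones.
import pandas as pd

WEIGHTS = {"likes": 1.0, "shares": 3.0, "comments": 2.0}  # assumed weights

def buzz_index(csv_path: str) -> pd.DataFrame:
    df = pd.read_csv(csv_path, parse_dates=["launch_date"])
    # Raw engagement: a weighted sum of the social signals tracked.
    df["score"] = sum(df[col] * w for col, w in WEIGHTS.items())
    # Normalize by days since launch so older launches don't dominate.
    days = (pd.Timestamp.today() - df["launch_date"]).dt.days.clip(lower=1)
    df["buzz_index"] = df["score"] / days
    return df.sort_values("buzz_index", ascending=False)

print(buzz_index("car_launch_social.csv")[["model", "buzz_index"]])
```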
Always start with a prototype
Yes, this applies not just to product build-outs, but to data build-outs as well.
Often, we are trying something new and so the questions being thrown at us are also novel. Seeing the problem for the first time, we may have a hunch about how to solve it.
As a first step, we gather some tentative data. That initial data set either strengthens the hunch or sends us off on a new iteration.
For instance, we are currently in the middle of testing a new content format. We haven’t fully worked out the template yet, so we are gathering some data from used-car transaction sites first. And, by the way, we are picking up this data manually.
This process will take our data team a couple of weeks, by the end of which we will have a sample size large enough to answer our questions. That preliminary analysis will feed into our content template tweaks, and also set the stage for more comprehensive data gathering later.
It always starts with tentative data forays like this. In the past 12 months or so, we have introduced at least three content innovations in this manner. Each has been a fabulous hit with our readers. Today, it isn’t surprising for us to get half a million page views in a day with just three new stories published.
Tap easiest-to-access data sources first
This one often surprises me. Most of us have a tendency to take on the more complex problem first; I guess the lure of a stimulating challenge is hard to resist. But is that always smart? I am not so sure.
Tracking user intent online is one of the most attractive problems to solve, especially for e-commerce sites. Fashion-product recommendation sites, for instance, use formidable algorithms to calculate intent from social signals. How much weight should a like or a share get? How do you map that to user profiles? And so on.
There’s a lot of dynamic data crunching behind that list of trending products. The more accurate the list, the better the conversions. And these sites do get tremendous results.
But here’s another way you could gather social signals. Identify a bunch of social media pages relevant to your product. Pull the likes and shares data from those pages through their APIs (application program interfaces). Arrange that data in a useful context (often the tricky part), decide how frequently you want to mine it (daily or weekly), and run some common-sense sanity checks on the numbers.
And that’s it. You will start seeing some very useful insights. A minimal sketch of such a pipeline follows below.
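To make the steps concrete: the endpoint shape, field names, and page list in this sketch are hypothetical (loosely modeled on a Graph-style API), not a real integration. For the real thing, follow the platform’s current API documentation.

```python
# Sketch of pulling likes/shares from a set of social pages, sanity-checking
# the records, and ranking posts by engagement. API_BASE, the field names,
# and PAGES are hypothetical placeholders, not a real API.
import requests

API_BASE = "https://graph.example.com/v1"       # hypothetical Graph-style API
ACCESS_TOKEN = "YOUR_TOKEN"                     # placeholder credential
PAGES = ["car_news_page", "auto_reviews_page"]  # pages relevant to your product

def fetch_posts(page_id: str) -> list[dict]:
    """Pull recent posts with their likes and shares for one page."""
    resp = requests.get(
        f"{API_BASE}/{page_id}/posts",
        params={"fields": "message,likes,shares", "access_token": ACCESS_TOKEN},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("data", [])

def passes_sanity_check(post: dict) -> bool:
    """Common-sense checks: drop records with missing or absurd counts."""
    counts = (post.get("likes"), post.get("shares"))
    return all(isinstance(n, int) and 0 <= n < 10_000_000 for n in counts)

posts = [p for page in PAGES for p in fetch_posts(page)]
posts = [p for p in posts if passes_sanity_check(p)]

# Rank by combined engagement to surface what's trending right now.
posts.sort(key=lambda p: p["likes"] + p["shares"], reverse=True)
for p in posts[:10]:
    print(p["likes"] + p["shares"], (p.get("message") or "")[:60])
```

Run something like this on whatever mining cadence you settled on, daily or weekly, and the top of that list becomes your trending-content signal.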
We do something like this to figure out trending content for cars, in India and globally. It’s a big input when the content team decides its content plan. We must be getting some of it right, considering that 18 million people see our content posts on Facebook every month, and those posts generate more than 100 million impressions monthly.
All this may seem a little trivial. You could argue that we resort to such hacks because we don’t have the resources to do it any other way. That’s true.
But in the end, these hacks get us the results. In less time, and with a lot less money.