Anirban Sen | First Published: Mon, Apr 15 2013. 11:57 PM IST
Bangalore: An expert on web search, big data, webpage optimization and search recommender technology, Yahoo Labs chief data scientist Ronny Lempel works on solutions that personalize the consumption of online content and improve recommendation technology. During a brief visit to Bangalore, Lempel spoke in an interview about the evolution of big data, the challenges of maintaining a balance between personalization and contextualization of web search, and life under Yahoo chief executive Marissa Mayer. Edited excerpts:
When it comes to big data, a tremendous amount of data is now generated from sources that go beyond the traditional channels. Has the world been able to keep pace with the rapid evolution of this technology?
People talk about big data as if it’s a new thing—it’s actually not so new. Financial institutions and weather forecasters have always had a lot of data to analyse... Whether they needed forecasting on mainframes or supercomputers, they had an abundance of meteorological signals and they tried to process them. Same with banks and insurance companies—those guys had transactional data on large populations over many years.
What has changed is the democratization of big data. Today, every three folks with a popular smartphone app suddenly start receiving signals about millions of users. So it’s not just the physicists who have these problems or the big bankers, it’s now everybody. It’s not just the large Internet companies, it’s also the startups. They’re also realizing the value in collecting signals, instrumenting their experiences and analysing them to understand how people are using their applications or services, so they start collecting the data and analysing it.
So, what’s the next big shift in big data analytics?
Even small companies today hire data centres. They’ve seen the value or understood the value of analytics beyond what might be called simple aggregation. And you see people with machine learning and data mining expertise being in demand in smaller shops that traditionally wouldn’t have hired those sort of people.
Startups that would’ve traditionally hired software engineers and focused on coding experiences and that sort of work are now looking to hire data scientists to actually personalize and analyse and squeeze every last drop of monetization from their services. It’s not going to be just the big companies that are looking for this sort of expertise; every mom-and-pop shop will have a data scientist at their disposal.
How does India rank in the scheme of things of big data analytics?
India is not behind in anything. If you look at this town, it’s all about the leading-edge companies... I know the top schools here have students who are very much on top of these technologies, who publish in the right conferences and get an extremely good education in these fields. These guys will all find good jobs, whether it’s in the big multinationals or smaller startups. There’s absolutely no reason for what’s happening in the rest of the world to not happen here.
What are the biggest challenges that you see when it comes to maintaining a balance between personalization and contextualization on web search?
In web search, because of its peculiar nature, where a person comes in with an information need, personalization has not yet played as major a role as it has in leisure and entertainment consumption experiences, where recommender technology is the driving force. This taxonomy of web search queries has been known for a decade now. People have navigational needs: they know what website they want to go to, but instead of typing the URL, they just go to the search engine and say American Airlines or Coca Cola, and that leads them to the homepage. And then there’s a transactional need, where you want to print a boarding pass from an airline site or book a hotel; there’s not too much to personalize there.
Then there are the wide-open informational queries, where people want to learn about stuff that matters to them. And that’s where disambiguation comes in. You and I might be interested in the same stream, but it’s actually different entities; it happens a lot with sports team names. The same name reappears in different countries, in different areas... So in the information space, you’ll start to see more personalization; in the transactional space you could see a little of it; and none at all in navigation. Navigational is pretty much a solved problem.
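The three-way taxonomy Lempel describes (navigational, transactional, informational) can be illustrated with a toy sketch. This is not how Yahoo or any search engine classifies queries. The site list, the transaction keywords and the function name `classify` are hypothetical, chosen only to mirror the examples in the interview, and the personalization labels simply restate his remarks.

```python
from enum import Enum


class Intent(Enum):
    NAVIGATIONAL = "navigational"
    TRANSACTIONAL = "transactional"
    INFORMATIONAL = "informational"


# Hypothetical lookup data, for illustration only.
KNOWN_SITES = {"american airlines", "coca cola"}
TRANSACTION_WORDS = {"book", "print", "buy", "order"}

# Rough level of personalization the interview suggests for each intent.
PERSONALIZATION = {
    Intent.NAVIGATIONAL: "none",
    Intent.TRANSACTIONAL: "a little",
    Intent.INFORMATIONAL: "more",
}


def classify(query: str) -> Intent:
    """Assign a query to one of the three intents with naive keyword rules."""
    text = query.lower().strip()
    # A query that is just a known site name is treated as navigational.
    if text in KNOWN_SITES:
        return Intent.NAVIGATIONAL
    # A query containing an action verb is treated as transactional.
    if any(word in TRANSACTION_WORDS for word in text.split()):
        return Intent.TRANSACTIONAL
    # Everything else falls into the open-ended informational bucket.
    return Intent.INFORMATIONAL


if __name__ == "__main__":
    for q in ["American Airlines", "print boarding pass", "rangers"]:
        intent = classify(q)
        print(q, "->", intent.value, "| personalization:", PERSONALIZATION[intent])
```

An ambiguous team name such as "rangers" lands in the informational bucket, which is where disambiguation by user context would matter most.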
How has life changed for you since Marissa Mayer took over the company?
We’ve always been at the forefront of technology regardless of the chief executive and we’re continuing that right now. There have been several great launches of Yahoo products over the past few months and there will surely be more launches in the near future and we’re just excited with the technological challenges that lie ahead.