eBay Inc. has been steadily building its capabilities to draw important insights from large sets of data.
There are dedicated leaders at the company focused on data, and eBay Research Labs has statistical and data scientists working on significant projects as well.
David Draper, Interim Director of the Center of Excellence in Statistical Research in eBay Research Labs, is one of the world’s most distinguished statistical scientists.
In addition to Draper's work on behalf of eBay Inc., he’s a faculty member in, and was founding chair of, the Department of Applied Math and Statistics at the University of California, Santa Cruz. We caught up with Draper to learn about some of the significant data-centric projects going on in the labs. Here are his thoughts:
I’m currently focused on two joint eBay Research Labs/eBay Inc. projects that are intensely statistical. The first is called Good-to-Great in Experimentation and Measurement.
The idea is that over the next 18 months we’ll take a comprehensive look at how we do experimentation and how we measure business outcomes.
Most people who use popular websites (Google, Facebook, eBay) don’t realize that every time they visit them, they get randomized into one of a series of treatment groups, each of which offers a slightly different version of the basic web experience. This form of experimentation, when done well, can produce good insights about how to increase user satisfaction and the rate of successful transactions.
So we’re working to improve the quality with which we perform this important task.
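To make the randomization idea concrete, here is a minimal sketch of one common way a website can place each visitor into a treatment group. It is an illustration only and not a description of eBay's actual system: the function name `assign_group`, the experiment label, and the group names are all hypothetical. The sketch hashes the user ID together with the experiment name, so the assignment looks random across users but stays the same for any one user on every visit.

```python
import hashlib


def assign_group(user_id: str, experiment: str, groups: list[str]) -> str:
    """Deterministically map a user to one treatment group (hypothetical helper).

    Hashing the experiment name together with the user ID gives each
    experiment its own independent split of the same user population.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Interpret the hash as a large integer and reduce it to a group index.
    return groups[int(digest, 16) % len(groups)]


if __name__ == "__main__":
    # Hypothetical experiment with a control and two variants.
    groups = ["control", "variant_a", "variant_b"]
    for uid in ["user-1", "user-2", "user-3"]:
        print(uid, assign_group(uid, "search-layout-test", groups))
```

A hash-based split is used here instead of a call to a random number generator because it needs no stored state: the same user always lands in the same group, which keeps the experience consistent between visits.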
I’m also excited about a project called The Hulk, which aims to provide employees at eBay with a dashboard that can instantly monitor any quantitative metric of particular interest to them.
They can visit the dashboard whenever they want, and a red light shows up if there’s a significant change in an important outcome. This will provide excellent real-time analysis of what’s going well for the business and what isn’t.
For example, if there’s a sudden decline in buying or selling activity in some segment of eBay’s marketplace, we’d like the dashboard to generate an immediate alert, so that a search can begin for the cause of the decline.
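The alerting behavior described above can be sketched with a very simple rule: compare the latest value of a metric against its recent history and raise a flag when it sits unusually far from the historical mean. This is an assumed, simplified technique (a z-score threshold) chosen for illustration; the interview does not say how The Hulk decides what counts as a significant change, and the function name `is_alert`, the threshold of 3, and the sample numbers are all hypothetical.

```python
from statistics import mean, stdev


def is_alert(history: list[float], latest: float, threshold: float = 3.0) -> bool:
    """Return True if `latest` deviates strongly from recent history.

    Uses a z-score: the distance from the historical mean, measured in
    sample standard deviations. This is an illustrative rule only.
    """
    if len(history) < 2:
        # Not enough data to estimate variability.
        return False
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat history: any change at all is notable.
        return latest != mu
    return abs(latest - mu) / sigma > threshold


if __name__ == "__main__":
    # Hypothetical hourly transaction counts for one marketplace segment.
    recent = [100, 102, 98, 101, 99, 100, 103, 97]
    print(is_alert(recent, 101))  # ordinary fluctuation
    print(is_alert(recent, 80))   # sudden decline
```

A production monitor would also need to handle trends, seasonality, and the fact that checking many metrics at once produces false alarms by chance, none of which this sketch attempts.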
I’ve been working on applied statistical problems for 35 years. Statistics is a field where every problem you’ve worked on can help you with the next one.
For instance, I’ve done applied work in the medical realm, designing and analyzing clinical trials to improve outcomes for hospitalized patients. I’ve also done work on environmental risk analysis, studying things like the safest way to store radioactive waste from nuclear power plants.
That work may sound like it doesn’t have a lot to do with ecommerce, but many of the statistical problems I’ve tackled in the past are similar in structure to problems that need solving at eBay. For example, in all walks of life, people would like to be able to predict the future better than they can now, and good statistical work is central to this process.
Decision making is another area where we can make a conscious effort to improve performance by applying basic principles drawn from statistics. Statistics can help quantify how much uncertainty there is when making decisions, and can improve the likelihood of arriving at good choices.
We’re not far from an era where your refrigerator can send you a text saying that your milk has just gone sour, or your house can tell you that a light bulb is about to blow. As soon as we equip individual devices like these with smart sensors and the capacity to talk to the network, they’ll be able to transmit all kinds of useful information.
I would personally like to interact with an eBay robot that I can speak to in natural language, telling it that I want to order a certain book, after which it will scour the web for where I can get it at the best price, or where I can get it fastest. The robot might also tell me that I don’t need something I’m asking about, because I already have one but forgot about it. The principal new feature here is much better natural-language processing than we currently know how to provide; that would be really useful.
We say that we’re in the Big Data era now, but people in 25 years will laugh at that. They’ll think that the amount of data that we’re dealing with now seems trivial. We’re rapidly becoming able to capture more data and improve our ability to find meaning in large data sets. The need for better and better data science will be ongoing.