Facebook's 'Deep Learning' Guru Reveals the Future of AI

New York University professor Yann LeCun has spent the last 30 years exploring artificial intelligence, designing "deep learning" computing systems that process information in ways not unlike the human brain. And now he's bringing his work to Facebook.
Yann LeCun. Photo: WIRED/Josh Valcarcel

Earlier this week, the social networking giant told the world it had hired the French-born scientist to head its new artificial intelligence lab, which will span operations in California, London, and New York. From Facebook's new offices on Manhattan's Astor Place, LeCun will oversee the development of deep-learning tools that can help Facebook analyze data and behavior on its massively popular social networking service -- and ultimately revamp the way the thing operates.

With deep learning, Facebook could automatically identify faces in the photographs you upload, automatically tag them with the right names, and instantly share them with friends and family who might enjoy them too. Using similar techniques to analyze your daily activity on the site, it could automatically show you more stuff you wanna see.

In some ways, Facebook and AI is a rather creepy combination. Deep learning provides a more effective means of analyzing your most personal of habits. "What Facebook can do with deep learning is unlimited," says Abdel-rahman Mohamed, who worked on similar AI research at the University of Toronto. "Every day, Facebook is collecting the network of relationships between people. It's getting your activity over the course of the day. It knows how you vote -- Democrat or Republican. It knows what products you buy."

But at the same time, if you assume the company can balance its AI efforts with your need for privacy, this emerging field of research promises so much for the social networking service -- and so many other web giants are moving down the same road, including Google, Microsoft, and Chinese search engine Baidu. "It's scary on one side," says Mohamed. "But on the other side, it can make our lives even better."

This week, LeCun is at the Neural Information Processing Systems Conference in Lake Tahoe -- the annual gathering of the AI community where Zuckerberg and company announced his hire -- but he took a short break from the conference to discuss his new project with WIRED. We've edited the conversation for reasons of clarity and length.

WIRED: We know you're starting an AI lab at Facebook. But what exactly will you and the rest of your AI cohorts be working on?

LeCun: Well, I can tell you about the purpose and the goal of the new organization: It's to make significant progress in AI. We want to do two things. One is to really make progress from a scientific point of view, from the side of technology. This will involve participating in the research community and publishing papers. The other part will be to, essentially, turn some of these technologies into things that can be used at Facebook.

But the goal is really long-term, more long-term than work that is currently taking place at Facebook. It's going to be somewhat isolated from the day-to-day production, if you will -- so that we give people some breathing room to think ahead. When you solve big problems like this, technology always comes out of it, along the way, that's pretty useful.

WIRED: What might that technology look like? What might it do?

LeCun: The set of technologies that we'll be working on is essentially anything that can make machines more intelligent. More particularly, that means things that are based on machine learning. The only way to build intelligent machines these days is to have them crunch lots of data -- and build models of that data.

The particular set of approaches that have emerged over the last few years is called "deep learning." It's been extremely successful for applications such as image recognition, speech recognition, and a little bit for natural language processing, although not to the same extent. Those things are extremely successful right now, and even if we just concentrated on this, it could have a big impact on Facebook. People upload hundreds of millions of pictures to Facebook each day -- and short videos and signals from chats and messages.

But our mission goes beyond this. How do we really understand natural language, for example? How do we build models for users, so that the content that is being shown to the user includes things that they are likely to be interested in or that are likely to help them achieve their goals -- whatever those goals are -- or that are likely to save them time or intrigue them or whatever? That's really the core of Facebook. It's currently to the point where a lot of machine learning is already used on the site -- where we decide what news to show people and, on the other side of things, which ads to display.

Mark Zuckerberg calls it the theory of the mind. It's a concept that has been floating in AI and cognitive science for a while. How do we model -- in machines -- what human users are interested in and are going to do?

WIRED: The science at the heart of this is actually quite old, isn't it? People like you and Geoff Hinton, who's now at Google, first developed these deep learning methods -- known as "back-propagation" algorithms -- in the mid-1980s.

LeCun: That's the root of it. But we've gone way beyond that. Back-propagation allows us to do what's called "supervised learning." So, you have a collection of images, together with labels, and you can train the system to map new images to labels. This is what Google and Baidu are currently using for tagging images in user photo collections.

That we know works. But then you have things like video and natural language, for which we have very little labeled data. We can't just show a video and ask a machine to tell us what's in it. We don't have enough labeled data, and it's not clear that we could -- even by spending a lot of time getting users to provide labels -- achieve the same level of performance that we do for images.

So, what we do is use the structure of the video to help the system build a model -- the fact that some objects are in front of each other, for example. When the camera moves, the objects that are in front move differently from those in the back. A model of the object spontaneously emerges from this. But it requires us to invent new algorithms, new "unsupervised" learning algorithms.

This has been a very active area of research within the deep learning community. None of us believe we have the magic bullet for this, but we have some things that sort of work and that, in some cases, improve the performance of purely supervised systems quite a lot.
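
For readers who want to see the supervised setup LeCun describes in code, here is a minimal sketch, not the system used at Facebook, Google, or Baidu. It assumes toy four-pixel "images" with made-up labels (1 for bright, 0 for dark) and a tiny network with one hidden layer; every name in it (`LABELED_IMAGES`, `init_params`, `forward`, `train_step`) is hypothetical. Back-propagation here simply means pushing the output error back through the hidden layer to adjust every weight.

```python
import math
import random

# Invented toy data: each "image" is four pixel intensities with a label
# (1 = bright, 0 = dark). Real systems train on millions of labeled photos.
LABELED_IMAGES = [
    ([0.9, 0.8, 0.9, 1.0], 1),
    ([0.1, 0.0, 0.2, 0.1], 0),
    ([0.8, 1.0, 0.7, 0.9], 1),
    ([0.2, 0.1, 0.0, 0.3], 0),
]


def init_params(n_in, n_hidden, seed=0):
    """Small random weights for one hidden layer and one output unit."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(p, x):
    """Return hidden activations and the predicted probability of label 1."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(p["W1"], p["b1"])]
    z = sum(w * h for w, h in zip(p["W2"], hidden)) + p["b2"]
    return hidden, 1.0 / (1.0 + math.exp(-z))


def mean_loss(p, data):
    """Average cross-entropy between predictions and labels."""
    total = 0.0
    for x, label in data:
        _, y = forward(p, x)
        y = min(max(y, 1e-12), 1.0 - 1e-12)  # avoid log(0)
        total -= label * math.log(y) + (1 - label) * math.log(1.0 - y)
    return total / len(data)


def train_step(p, x, label, lr):
    """One back-propagation update on a single labeled example."""
    hidden, y = forward(p, x)
    d_out = y - label  # error signal at the output unit
    for j, h in enumerate(hidden):
        # Propagate the error back through hidden unit j (tanh derivative).
        d_hidden = d_out * p["W2"][j] * (1.0 - h * h)
        p["W2"][j] -= lr * d_out * h
        p["b1"][j] -= lr * d_hidden
        for i, xi in enumerate(x):
            p["W1"][j][i] -= lr * d_hidden * xi
    p["b2"] -= lr * d_out


params = init_params(n_in=4, n_hidden=3)
loss_before = mean_loss(params, LABELED_IMAGES)
for _ in range(500):
    for x, label in LABELED_IMAGES:
        train_step(params, x, label, lr=0.1)
loss_after = mean_loss(params, LABELED_IMAGES)

# Map a new, unlabeled image to a label probability.
_, p_bright = forward(params, [0.95, 0.85, 0.9, 0.8])
print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
print(f"estimated probability the new image is bright: {p_bright:.3f}")
```

The point of the sketch is the dependence on labels: every update needs a known answer for the example, which is exactly what is scarce for video and natural language.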

WIRED: You mentioned Google and Baidu. Other web companies, such as Microsoft and IBM, are doing deep learning work as well. From the outside, it seems like all this work has emerged from a relatively small group of deep learning academics, including you and Google's Geoff Hinton.

LeCun: You're absolutely right -- though it is quickly growing, I have to say. You have to realize that deep learning -- I hope you will forgive me for saying this -- is really a conspiracy between Geoff Hinton and myself and Yoshua Bengio, from the University of Montreal. Ten years ago, we got together and thought we were really starting to address this problem of learning representations of the world, for vision and speech.

Originally, this was for things like controlling robots. But we got together and got some funding from a Canadian foundation called CIFAR, the Canadian Institute For Advanced Research. Geoff was the director, and I was the chair of the advisory committee, and we would get together twice a year to discuss progress.

It was a bit of a conspiracy in that the majority of the machine learning and computer vision communities were really not interested in this yet. So, for a number of years, it was confined to those workshops. But then we started to publish papers and we started to garner interest. Then things started to actually work well, and that's when industry started to get really interested.

The interest was much stronger and much quicker than from the academic world. It's very surprising.

WIRED: How do you explain the difference between deep learning and ordinary machine learning? A lot of people are familiar with the sort of machine learning that Google did over the first ten years of its life, where it would analyze large amounts of data in an effort to, say, automatically identify web-spam.

LeCun: That's relatively simple machine learning. There's a lot of effort that goes into creating those machine learning systems, in the sense that the system is not able to really process raw data. The data has to be turned into a form that the system can digest. That's called a feature extractor.

Take an image, for example. You can't feed the raw pixels into a traditional system. You have to turn the data into a form that a classifier can digest. This is what a lot of the computer vision community has been trying to do for the last twenty or thirty years -- trying to represent images in the proper way.

But what deep learning allows us to do is learn this representation process as well, instead of having to build the system by hand for each new problem. If we have lots of data and powerful computers, we can build a system that can learn what the appropriate data representation is.

A lot of the limitations of AI that we see today are due to the fact that we don't have good representations for the signal -- or the ones that we have take an enormous amount of effort to build. Deep learning allows us to do this more automatically. And it works better too.
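
The traditional pipeline LeCun contrasts with deep learning, a hand-built feature step followed by a simple classifier, can be sketched roughly as follows. This is an illustration under stated assumptions, not any company's production system: the 2x2 toy images, the two hand-picked features, and the names `extract_features`, `train_perceptron`, and `classify` are all invented for the example, and the classifier is a basic perceptron.

```python
def extract_features(image):
    """Hand-designed features for a 2x2 image given as
    [top_left, top_right, bottom_left, bottom_right]."""
    top = (image[0] + image[1]) / 2.0
    bottom = (image[2] + image[3]) / 2.0
    brightness = (top + bottom) / 2.0
    vertical_contrast = top - bottom
    return [brightness, vertical_contrast]


def predict(weights, bias, features):
    """Linear classifier: 1 if the weighted sum is positive, else 0."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(examples, epochs=10, lr=1.0):
    """Perceptron rule applied to the hand-built features, never raw pixels."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for image, label in examples:
            features = extract_features(image)
            error = label - predict(weights, bias, features)
            weights = [w + lr * error * f for w, f in zip(weights, features)]
            bias += lr * error
    return weights, bias


# Invented labels: 1 if the top half of the image is brighter than the bottom.
EXAMPLES = [
    ([0.9, 0.8, 0.1, 0.2], 1),
    ([1.0, 0.7, 0.3, 0.0], 1),
    ([0.1, 0.2, 0.9, 0.8], 0),
    ([0.0, 0.3, 0.7, 1.0], 0),
]

WEIGHTS, BIAS = train_perceptron(EXAMPLES)


def classify(image):
    """Run the full pipeline: feature extraction, then the classifier."""
    return predict(WEIGHTS, BIAS, extract_features(image))


print(classify([0.8, 0.9, 0.2, 0.1]))
print(classify([0.2, 0.1, 0.9, 0.8]))
```

Everything the classifier knows about an image comes from `extract_features`, which a person had to design for this one task. In a deep learning system, that step would itself be learned from data rather than written by hand.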