Inside Facebook's Bet On An Augmented Reality Future

This article is more than 6 years old.

Courtesy of Facebook

Mark Zuckerberg got his first taste of the Oculus Rift, the pioneering virtual reality headset, in January 2014. Standing in one of the few Facebook offices equipped with blinds, with the brick-like device strapped to his face, he was suddenly transported to the ruins of a medieval castle, thick snowflakes falling all around him as gargoyles spouted lava from their beaks. The dazzling virtual escapade immediately convinced him that VR would one day become a major computing platform. Two months later, he backed that conviction with Facebook’s $2 billion purchase of Oculus.

But Zuckerberg left the demo with another, far less talked about hunch: a younger, related technology called augmented reality (AR) had a shot at leapfrogging VR, with its potential to bring digital overlays of contextual information or special effects onto the physical world through a simple smartphone. While there was no headline-making billion-dollar deal, Zuckerberg ordered his engineers to begin building toward an AR and VR future at the same time. The dual push made sense given that the two mediums share so much of the same underlying technology, from hardware components to sophisticated computer vision software.

“Mark [Zuckerberg] was the one who really pushed us to invest in AR right around that time,” Facebook CTO Mike Schroepfer told Forbes. Together, AR and VR rank among Facebook’s top three tech priorities – along with connectivity and artificial intelligence – Schroepfer said. Inside Facebook, hundreds of engineers are working on underlying technologies like computer vision, which will enable a phone to do everything from tracking facial movements in real time to identifying a coffee mug or recommending context-specific image effects. Artificial intelligence is so fundamental to powering AR that Facebookers often call their in-app camera an “AI camera.” The overall effort involves “significant capital outlays,” Schroepfer said.

Today, Facebook is fighting its fellow technology powerhouses, Apple and Google -- and still to some extent, Snap -- in a high-stakes battle to rule as the platform of choice for AR developers. The technology itself, while still in its infancy, has exploded in popularity, confirming Zuckerberg’s more recent intuition that AR could sprint toward mass adoption even while VR remained an awkward technology whose appeal is largely limited to hardcore gamers. AR’s key advantage is that it doesn’t depend on a pricey, bulky headset that isolates its users. It works on a device already owned by more than one fourth of the world’s population.

“The big epiphany is that you can use your phone for AR, and we have about 1.3 billion people who use Facebook on the phone,” said Joaquin Candela, Facebook’s head of applied machine learning, the group building the AI backbone behind the company’s AR efforts. “One hundred times more people have phones than VR headsets. That makes AR really interesting and obvious to focus on.”

Obvious, too, because early evidence suggests AR has the power to draw consumers in, sometimes fundamentally changing how they interact with their phones. Witness the popularity of pioneering AR applications, like puppy masks on Snapchat and virtual treasure hunts on games like Pokémon Go. They make clear that AR, not VR, is the next major phase in our mixed reality future. The smartphone interactions that power AR -- using a mask to turn into a singing rock star or pointing a phone to capture a Pokémon – are already acceptable social behavior.

But the reason tech powerhouses are investing so much in AR is that the applications go far beyond social media, games and goofy special effects. The technology could give rise to practical applications in areas ranging from navigation to e-commerce, where virtual dressing rooms could grease the wheels of online apparel purchases. An app from IKEA already helps shoppers virtually arrange furniture in their home. Pharmaceutical companies are playing with the idea of using AR to display real time information about drugs. Hyundai uses an AR app to walk consumers through the features of some cars. AR could also make its way into customer support tutorials that integrate with chat bots. “The smartphone can basically be a magic lens that you hold up to the world,” said Facebook Camera team engineering manager Tom Meyer.

Inside Facebook, engineers and executives acknowledge the stakes are high. Without getting its AR push right, the company risks seeing a decline in usage of its apps. Its battle with Snapchat for younger users shows loyalty to social apps can be fleeting. Users will quickly migrate to those that have the most alluring tools, and those that can turn communication, especially through images and videos, into engaging, constantly-evolving experiences.

Facebook didn’t launch its AR effects until several years after Snapchat, eons in Internet time. However, Facebook pulled off the gargantuan task of offsetting its late debut thanks to the social media giant's powerful in-house AI, which supported more advanced effects at scale, and its strong product design. Together, these assets enabled Facebook to rapidly match Snapchat's features and hamper its younger competitor's growth, saving Facebook from the fate of other tech giants such as Google (with Google Plus), whose late product timing cost the search company its shot at becoming a social media player.

Success in AR could bring big rewards for Facebook. The company's advertising business, fueled by activity and time spent across its apps, translated into $26.9 billion in revenue in 2016. AR effects generate a growing portion of overall time spent on the social network, encouraging users to message more frequently and spend more time viewing friends’ posts and making their own. The average Facebook user already spends about 50 minutes per day across its main app, Instagram and Messenger – and Facebook needs to continually roll out new product features that capture eyeballs and foster growth in image-heavy communication to keep this metric high and stave off competition, particularly as the company seeks to minimize other news feed thumb-stoppers like click bait and hoaxes.

Getty Images

The prevalence of masks and filters across Facebook’s apps marks the company’s first major step in transitioning from being a “past camera” for still images and albums to a “future camera,” as Meyer calls it -- one that’s powered by AI for intriguing effects and attached to a network of friends and family. Masks might seem frivolous, but the implications of getting AR right for Facebook are existential. The social network’s ongoing shift toward image-heavy features is as critical to the company’s future as was the transition from desktop to mobile, according to Facebook’s chief product officer Chris Cox.

“As a broader story, Facebook has to get really good at AR if we want to be relevant in the next 10 or 20 years,” Schroepfer said.

From ‘Past’ To ‘Future’ Camera

On a morning last May, Zuckerberg appeared on the social network wearing taped glasses with math equations swirling above his head. The nerdy digital accessories were Zuckerberg’s attire of choice for announcing Instagram Stories’ first “face filters” (known as face “masks” on Snapchat). “This is my favorite one so far,” Zuckerberg said looking wide-eyed into the camera with a smile.

The effects, which include colorful confetti, a bubbly underwater scene, twitching koala noses and bunny ears, are inspiring people to experiment: 300 million people, everyone from celebrities like actress Reese Witherspoon and model Karlie Kloss to teenagers lounging at home, use Instagram’s AR-heavy Stories feature each month. Buoyed by Instagram Stories (a series of photo and video clips that disappear after 24 hours) and its analogues on WhatsApp and Facebook’s main app, Facebook is now the largest social AR ecosystem in the world, just four years after Zuckerberg’s Oculus demo and six years after the company started picking away at the core AI technology.

Hundreds of developers are building apps atop Facebook’s AR camera platform using software called “AR Studio,” which the company opened widely in December. Now anyone with a Facebook account can create AR effects for the social network, including masks, animations and 3D objects.

Even though Facebook is cracking social AR, it wasn’t the first to popularize it. That claim belongs to one of its top rivals, Snapchat, which was originally written off by many as a sexting tool. It was Snapchat that pioneered “Stories” -- the primary home of AR effects -- three years before Facebook launched a near carbon copy of the format in 2016. Until a little over a year ago, Snapchat's AR features were well ahead of Facebook's. However, Facebook had been intently preparing the core technology behind the scenes for years because of Zuckerberg’s early hunch that AR would someday be a mainstream communication tool.

Snapchat’s first AR effects were full-screen overlays for photos and videos, followed by location-specific art called “geofilters” in 2014, “lenses,” like its famous barfing rainbow mask in 2015, and customizable avatars called “Bitmojis” in 2016. Snapchat’s effects were instant hits, helping the app garner 150 million daily users. Teens, perhaps the most valuable and elusive demographic, came to view it as an everyday chatting essential. Now, Facebook executives avoid discussing its younger competitor in interviews, but at the time, the social network anxiously and attentively tracked Snapchat’s growth.

As Facebook monitored its new rival, it was well aware of a trend on its own app. Posts on the social network were increasingly veering toward photos, GIFs and video and away from text. More striking was the response to its live video launch in April 2016. Executives -- including Zuckerberg -- were stunned by its hockey-stick-like adoption, which clearly signaled that people wanted to speak through images and animations on Facebook, and they wanted to do it in real time.

Upping the investment in image-based sharing was inevitable. Facebook’s product roadmap hit a major turning point when Facebook was rolling out live video and took note of the Belarusian selfie-mask app called Masquerade (MSQRD). It had taken off in Eastern Europe and was gaining steam in the U.S. with a total base of 16 million users. Its mask tool looked nearly identical to Snapchat lenses, thanks to 3D graphic rendering technology MSQRD cofounder Eugen Zatepyakin had spent nearly three years building. Facebook’s VP of design Julie Zhou recalls watching Facebook executives try out the app at the company’s sprawling Menlo Park, Calif. headquarters. They were quickly captivated.


“There was something instantly appealing about trying to transform yourself -- you want to be someone else, to dress up,” Zhou said. “It’s harkening back to being a kid again and having fun, but it doesn’t stay a toy forever.” (Read more about future applications for augmented reality in "Six Ways AR Will Matter Beyond Puppy Selfies.")

Facebook acquired the startup for an undisclosed amount in March 2016 and began working feverishly to make up for lost time. Soon after, Zuckerberg posted a video of himself using MSQRD’s Iron Man mask on the social network, and quietly launched a major internal effort to rapidly build the technological backbone to support a more sophisticated in-app camera.

AI Breakthrough

Facebook had been working on AI such as computational photography that would later support AR since 2015. However, Zuckerberg didn’t begin an overt AR effort until he called for the formation of a dedicated “Camera Group” in the summer of 2016. The group began as a handful of AI engineers and researchers within Facebook’s Applied Machine Learning Group, which sits in the open-floor plan of “Building 20,” near Zuckerberg’s usual glass-walled workspace.

Over the next year and a half, the Camera Group expanded to more than 100 people, including designers nabbed from Hollywood and gaming companies, who work with camera product heads within standalone Facebook apps like Messenger and Instagram to help them launch features and iterate quickly. The group has been busy doing everything from advancing the underlying visual identification and deep learning technology, to running focus groups outside of Facebook to experiment with AR tools.


In one of the Camera team’s research studies, Dantley Davis, the Applied Machine Learning team’s head of product design who previously led mobile design efforts at Netflix, recalls visiting people to talk to them about the experience of wishing “Happy Birthday” to someone they care about. The team gave individuals in the study a range of tools they could use to send special effects and messages. One man in their study group, Davis noted, was relieved to have animated effects to wish happy birthday to his wife because it was easier and faster than trying to express himself through spoken word or text.

“He created an experience based on tools that we provided that was very intimate and personable, based on communication with his wife,” Davis said. “The AR tools gave him a shorthand to express himself in ways that he found a lot of value in. They allow people to feel more confident about being goofy to communicate emotion.”

Soon after the Camera team formed, the success of a single app by game developer Niantic served as major confirmation that AR could have mass consumer appeal. Nineteen days after debuting in July 2016, Niantic’s game Pokémon Go grew to 50 million players, who would walk miles around cities (and gyms), using their smartphones to catch AR characters. Zuckerberg himself was a fan of the game, which he mentioned he was enjoying along with “everyone else” when Facebook reported quarterly earnings that month. The game served as affirmation for Facebook executives that AR wouldn’t just be limited to a few messaging apps like Snapchat.

Meanwhile, Facebook’s Camera team engineers were heads down building in-house AI processing software that they hoped would serve as the backbone for future AR effects. While competitors at the time, like Snapchat, relied on outside servers to power AR features, reducing their speed and complexity, Facebook sought to invent a system for processing AI directly on the smartphone, a technology it ultimately named Caffe2Go. Facebook didn’t want to launch AR effects until it had built the infrastructure to ensure its effects would scale seamlessly, and with better renderings, face tracking and speed than those offered by Snapchat.

While Caffe2Go was in the works, product teams at Facebook experimented with AR tools, with the help of MSQRD. In August 2016, ahead of the Summer Olympics, for example, Facebook tested opening its flagship app directly into a full-screen camera for the first time (Snapchat-style) in Brazil and Canada, and launched Olympic-themed frames and face-paint masks for profile pictures. That month, Facebook laid the groundwork for AR effects on Instagram by debuting its first version of “Stories” on Instagram, a near-clone of Snapchat’s signature feature for casual, disappearing posts.

That fall, Facebook completed Caffe2Go, creating the first system that could capture and analyze pixels in real-time by processing AI directly on the smartphone. After Facebook tested the technology in fall 2016 with “style transfer,” a process that transforms a photo or video into the style of an artist like Picasso or Van Gogh, it was ready to use Caffe2Go to power AR across its apps, beginning with Messenger in December 2016. Soon, Messenger looked strikingly Snapchat-like, with masks, filters and frames for photos and videos via a “Stories”-like feature called “My Day.” In March, Facebook expanded the AR effects to its flagship app, letting one swipe take users to a full screen camera with a central effects button for masks and animated overlays. (Read more about the technology powering augmented reality on Facebook in "Five Breakthroughs Behind Facebook's AR Play.")


The next month at Facebook’s annual F8 developer conference in April 2017, Zuckerberg’s key announcement was the unveiling of the first “Camera Platform,” which allowed a handful of developers to build AR features atop the social network. Standing on stage in his standard gray t-shirt and jeans, Zuckerberg prefaced his camera announcement by philosophizing about the future of work. Ultimately, Zuckerberg argued, technology will free up people’s time to socialize, be more creative and make more art.

“In the future, more of us are going to contribute to culture and society in ways that are not measured by traditional economics or GDP,” said Zuckerberg, in front of a screen displaying the company’s 10-year road map. “A lot of us are going to do what today is considered the arts, and that’s going to form the basis of a lot of our communities.”

“That’s why I’m so excited about augmented reality,” Zuckerberg continued, gesturing. “It’s going to make it so that we can create all kinds of things that until today have only been possible in the digital world, and we’re going to be able to interact with them and explore them together.”

For the first time, Zuckerberg pitched Facebook’s in-app camera as the heart of communication on Facebook, riffing on Snapchat’s motto of being “a camera company.” While AR glasses and contacts will likely be the first AR wearable devices down the road, Zuckerberg predicted, people are starting to enjoy an AR heyday now on their smartphones.

Since Facebook began launching AR effects in late 2016, Snapchat’s growth has stagnated. The rise of Facebook’s most popular AR platform, Instagram Stories, correlates directly with a decline in user growth on Snapchat. Now only about 190 million people use Snapchat per day, while 500 million people use Instagram every day.

A Broader AR Race

Even though the biggest consumer use of AR is on social media, just about every technology giant is racing to build AR functionality into their products and ecosystems for developers, most of whom don’t yet know how to build for AR. Two months after Facebook debuted its AR camera platform, Apple launched its own developer tool set called ARKit, for iOS 11, which makes it easy for developers and marketers to integrate AR into their existing apps. Already 400 million devices are estimated to be compatible with Apple’s ARKit, according to research firm Forrester. In September, Apple also unveiled “Animoji” for iMessage, which uses face recognition on the iPhone X to let people customize emoji with their facial expressions.


Google, however, may have been working on AR the longest of any tech company, launching smart glasses (Google Glass) as early as 2013. The product was a high-profile flop, intriguing technophiles but failing among consumers driven away by concerns about privacy and social acceptability. Google then launched an AR platform called Tango in 2016, which uses depth sensors to map indoor spaces, but is only compatible with a few narrowly used devices. In a push to bring Tango’s abilities to more phones without adding cameras and sensors, Google rolled out its own version of ARKit in August, called ARCore. The developer kit is built to reach existing and future Android devices, including the Samsung Galaxy S8.

By the end of winter, ARCore is expected to run on 100 million Android devices, according to a Google spokesperson. Like Facebook, Google is naturally also interested in using AR for search. In May, Google announced “Lens,” a computer vision tool that sorts albums in Google Photos, and now lets Pixel and Pixel 2 owners point their camera at objects such as storefronts to get information in real time. Not only do Apple and Google have the benefit of owning their own operating systems, on which developers can build a wide swath of AR apps, they also have a history of successful hardware products that Facebook so far hasn’t matched. Facebook doesn’t have nearly the same track record of building devices, which makes its hardware ambitions a bigger stretch.

“Facebook’s camera platform will enable developers and marketers to reach greater audiences over time, but Facebook’s limitation is that it doesn’t control the hardware,” said Forrester analyst Thomas Husson. “To truly deliver an amazing AR experience it’s about software and hardware integration.”

While Facebook, Apple and Google are focused on smartphones, Microsoft has been focusing on business customers with wearables. The company launched HoloLens, a $3,000 visor-like headset, in 2016. HoloLens runs on an operating system called Windows Mixed Reality, which can be used by other VR and AR hardware makers. It could take years, however, for an AR device to be affordable and appealing to a mass market.

Facebook may not ever own its own operating system or the next AR device (although it’s trying), but it does have key unique assets: First, it reaches more people than any other social or messaging app like Snap, Kakao, Line and WeChat. (Data about people’s interests and networks is powerful for personalizing AR effects.) Second, Facebook has one of the largest computer vision teams of any company in the world, which can help it create better features than newcomers. Even if Facebook never makes its own AR hardware popular, Facebook will likely remain one of the largest ecosystems of AR content for years.

The Next Trend In Advertising

During the seventh season finale weekend of the hit show “Game of Thrones” this July, fans around the world turned to social media per usual to share reactions and grievances over characters’ fates in the most-watched HBO premiere to date. But a more eye-catching type of post, alongside traditional status updates, also went viral: More than 1 million people created video clips of themselves on Facebook gradually transforming into a terrifying, icy blue-eyed Night King villain, complete with horns, a voice changer and a backdrop of falling snow. The face-tracking mask adjusted in real time to people’s faces as they roared or broke into song, and the clips circulated on news feed and in messages to friends.

The mask’s transformational power wasn’t its only twist. The mask was made by HBO, not Facebook (and featured the “GOT” logo in the upper left corner). It quickly became one of the most successful AR campaigns on Facebook and showed that high-quality AR effects can become advertisements that people choose to send to their friends. AR is a tantalizing format for many marketers, as the Night King mask showed, because it offers new ways to engage smartphone holders for longer amounts of time. Facebook is positioning itself to become the default platform for AR marketing campaigns built around encouraging someone to play a game with friends or post an animated selfie.

Courtesy of HBO, Facebook

“Game of Thrones fans are dying for new engagement every day of the year,” Emily Giannusa, director of digital media and marketing at HBO, told Forbes. “The Facebook camera platform appealed to us because it was such a simple and simultaneously sophisticated avenue that allowed fans to transport themselves directly into the Game of Thrones universe. All fans needed was a smartphone.”

Only an estimated 5% of marketers are using AR technology now. However, 17% of marketers plan to use the technology this year, according to a recent Forrester study, and AR is poised to be more relevant to marketers than VR for at least the next three years. While VR experiences make more sense for aspirational brands that court customers over long periods of time, Forrester’s Husson noted, vastly more marketers will benefit from experimenting with AR now.

An ‘Exploration Phase’

As AR extends from smartphones to wearables, the technology could become a nearly always-on enhancer of the human senses and a routine replacement for the search bar tailored to our location, interests and social network. But consumer uses for AR are still nascent. For the next several years, tech giants will be busy improving the core technology and building AR for the smartphone, with Facebook focusing on its forte, messaging and personal expression. With time, AR will likely make Facebook and the news feed look entirely different -- more immersive, video intensive and interactive, although the specifics are fuzzy.

Facebook will need to continually improve its AI to become faster and more precise, for example, at identifying objects in video, understanding how a scene is pixelated when it’s viewed from different angles and mapping the relationships between objects in a scene. To reach a wider range of users in the long term, including in developing markets, Facebook will need to make its AR features more compatible with weaker cell networks and older phones.

“We’re still at the basic exploration phase and we’re still building basic technology,” said Facebook’s Candela. “So my bias right now is on execution. At the same time, in the back burner we’re going to have some crazy exploratory projects as well.”

In the meantime, Facebook is “heavily investing” in hardware to support AR and build more social tools, according to Schroepfer. Beyond protecting Facebook’s relevance, improving the underlying AI has other benefits too, like helping fight spammers and problematic content with better visual recognition and language understanding tools, as well as advancing games and robotics.

“It would be misleading to think, ‘This is all about building space cat masks,’” said Candela. “You take our social infrastructure, the tech we’re pushing, and it’s hard to imagine what the applications would be. But I know for sure that this will unlock things that we haven’t thought about today -- the meaningful AR experiences will be very social, where you are yourself.”
