Return of the Borg: How Twitter Rebuilt Google's Secret Weapon

Illustration: Ross Patton

John Wilkes says that joining Google was like swallowing the red pill in The Matrix.

Four years ago, Wilkes knew Google only from the outside. He was among the millions whose daily lives so deeply depend on things like Google Search and Gmail and Google Maps. But then he joined the engineering team at the very heart of Google's online empire, the team of big thinkers who design the fundamental hardware and software systems that drive each and every one of the company's web services.

These systems span a worldwide network of data centers, responding to billions of online requests with each passing second, and when Wilkes first saw them in action, he felt like Neo as he downs the red pill, leaves the virtual reality of the Matrix, and suddenly lays eyes on the vast network of machinery that actually runs the thing. He was gobsmacked at the size of it all – and this was a man who had spent more than 25 years as a researcher at HP Labs, working to push the boundaries of modern computing.

"I'm an old guy. Megabytes were big things," Wilkes says, in describing the experience. "But when I came to Google, I had to add another three zeros to all my numbers." Google is a place, he explains, where someone might receive an emergency alert because a system that stores data is down to its last few petabytes of space. In other words, billions of megabytes can flood a fleet of Google machines in a matter of hours.

‘I prefer to call it the system that will not be named.’ John Wilkes

Then, as he was still trying to wrap his head around the enormity of Google's data-center empire, John Wilkes went to work on the software system that orchestrates the whole thing.

This software system is called Borg, and it's one of the best-kept secrets of Google's rapid evolution into the most dominant force on the web. Wilkes won't even call it Borg. "I prefer to call it the system that will not be named," he says. But he will tell us that Google has been using the system for a good nine or 10 years and that he and his team are now building a new version of the tool, codenamed Omega.

Borg is a way of efficiently parceling work across Google's vast fleet of computer servers, and according to Wilkes, the system is so effective, it has probably saved Google the cost of building an extra data center. Yes, an entire data center. That may seem like something from another world – and in a way, it is – but the new-age hardware and software that Google builds to run its enormous online empire usually trickles down to the rest of the web. And Borg is no exception.

At Twitter, a small team of engineers has built a similar system using a software platform originally developed by researchers at the University of California at Berkeley. Known as Mesos, this software platform is open source – meaning it's freely available to anyone – and it's gradually spreading to other operations as well.

The Borg moniker is only appropriate. Google's system provides a central brain for controlling tasks across the company's data centers. Rather than building a separate cluster of servers for each software system – one for Google Search, one for Gmail, one for Google Maps, etc. – Google can erect a cluster that does several different types of work at the same time. All this work is divided into tiny tasks, and Borg sends these tasks wherever it can find free computing resources, such as processing power or computer memory or storage space.

Wilkes says it's like taking a massive pile of wooden blocks – blocks of all different shapes and sizes – and finding a way to pack all those blocks into buckets. The blocks are the computer tasks. And the buckets are the servers. The trick is to make sure you never waste any of the extra space in the buckets.

"If you just throw the blocks in the buckets, you'll either have a lot of building blocks left over – because they didn't fit very well – or you'll have a bunch of buckets that are full and a bunch that are empty, and that's wasteful," Wilkes says. "But if you place the blocks very carefully, you can have fewer buckets."

‘Mesos makes it easier for Twitter engineers to think about running their applications across a data center. And that’s really powerful.’ Ben Hindman

There are other ways of doing this. You could use what's known as server virtualization. But virtualization adds an extra layer of complexity you may not need, and in cutting this out, Wilkes says, Google can reduce the size of its infrastructure by a few percent. At Google's size, that amounts to an entire facility. "It's another data center we don't have to build," Wilkes says. "A few percent here, a few percent there, and all of a sudden, you're talking about huge amounts of money."

At Twitter, Mesos doesn't have quite the same effect. Twitter's operation is significantly smaller. But the Twitterverse is always growing, and Mesos gives the company a better way to handle that growth. Borg and Mesos don't just wring extra computing power out of a server cluster. They let companies like Google and Twitter treat a data center like a single machine.

Google and Twitter can run software on these massive computing facilities in much the same way you run software on your desktop PC – and that simplifies the lives of all those engineers who build things like Gmail and Google Maps and any number of Twitter applications.

"Mesos makes it easier for Twitter engineers to think about running their applications across a data center," says Ben Hindman, who founded the Mesos project at UC Berkeley and now oversees its use at Twitter. "And that's really powerful."

It's a Data Center. But It Looks Like a Chip

Borg and Mesos are big things. But to understand them, it's best to think small, and a good place to start is one of the experimental computer chips Intel would send to Ben Hindman.

This was about five years ago, when Hindman was still at UC Berkeley, working on a computer science Ph.D., and the chips were "multi-core processors." Traditionally, the computer processor – the brain at the center of a machine – ran one task at a time. But a multi-core processor lets you run many tasks in parallel. Basically, it's a single chip that includes many processors, or processor cores.

At UC Berkeley, Ben Hindman's aim was to spread computing tasks across these chips as efficiently as possible. Intel would send him chips. He would wire them together, creating machines that spanned 64 or even 128 cores. And then he worked to build a system that could take multiple software applications and run them evenly across all those cores, sending each task wherever it could locate free processing power.

"What we found is that applications were smart about scheduling their computations across these computing resources, but they were also greedy. They would ignore other applications that might be running and just grab everything for themselves," Hindman says. "So we built a system that would only give an application access to a certain number of cores, and give others to another application. And those allocations might change over time."

‘Sixty-four cores or 128 cores on a single chip looks a lot like 64 machines or 128 machines in a data center.’ Ben Hindman

Hindman was working with a single computer. But as it turns out, he could apply the basic system to an entire data center. "Sixty-four cores or 128 cores on a single chip looks a lot like 64 machines or 128 machines in a data center," he says. And that's what he did. But it happened by accident.

While Hindman was working with his multi-core processors, some friends of his – Andy Konwinski and Matei Zaharia – were in another part of the Berkeley computer science department, working on software platforms that run across massive data centers. These are called "distributed systems," and they now provide the backbone for most of today's big web services. They include things like Hadoop, a way of crunching data using a sea of servers, and various "NoSQL" databases, which store information across many machines.

Then, Hindman and his friends decided they should work on a project together – if only because they liked each other. But they soon realized their two areas of research – which seemed so different – were completely complementary.

Traditionally, you run a distributed system like Hadoop on one massive server cluster. Then, if you want to run another distributed system, you set up a second cluster. But Hindman and his pals soon found that they could run distributed systems more efficiently if they applied the lessons learned from Hindman's chip project. Just as Hindman had worked to run many applications on a multi-core processor, they could build a platform that could run many distributed systems across a single server cluster.

The result was Mesos.
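
The design Mesos settled on – described in the Berkeley team's published paper – is a two-level scheduler built around "resource offers": a master offers the spare CPU and memory on each machine to the frameworks, and each framework decides for itself what to run there. The sketch below illustrates that loop; it is not the real Mesos API, and the node sizes and the HadoopScheduler class are invented.

```python
class HadoopScheduler:
    """Framework-side half: accept any offer big enough for one map task."""

    def consider(self, node, spare):
        if spare["cpus"] >= 2 and spare["mem_gb"] >= 4:
            return {"name": "map-task", "cpus": 2, "mem_gb": 4}
        return None  # decline; the master will offer the node elsewhere

def offer_loop(nodes, frameworks):
    """Master-side half: offer each node's spare resources to frameworks."""
    for node, spare in nodes.items():
        for fw in frameworks:
            task = fw.consider(node, spare)  # the framework decides, not the master
            if task:
                spare["cpus"] -= task["cpus"]
                spare["mem_gb"] -= task["mem_gb"]
                print(f"launching {task['name']} on {node}")
                break

nodes = {"node1": {"cpus": 8, "mem_gb": 32}, "node2": {"cpus": 1, "mem_gb": 2}}
offer_loop(nodes, [HadoopScheduler()])
# node1 accepts a map task; node2's offer is declined and its resources stay free
```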

'We Miss Borg'

In March 2010, about a year into the Mesos project, Hindman and his Berkeley colleagues gave a talk at Twitter. At first, he was disappointed. Only about eight people showed up. But then Twitter's chief scientist told him that eight people was a lot – about 10 percent of the company's entire staff. And then, after the talk, three of those people approached him.

These were Twitter engineers who had once worked at Google: John Sirois, Travis Crawford, and Bill Farner. They told Hindman that they missed Borg, and that Mesos seemed like the perfect way to rebuild it.

Soon, Hindman was consulting at Twitter, working hand-in-hand with those ex-Google engineers and others to expand the project. Then he joined the company as an intern. And, a year after that, he signed on as a full-time employee. "My boss at the time said: 'You could have vested a year's worth of Twitter stock! What are you thinking!?'" Hindman remembers. He and his fellow engineers continued to run Mesos as an open source software project, but at Twitter, he also worked to move the platform into the company's data center and fashion something very similar to Google's Borg.

Google wasn't officially part of this effort. But the company helps fund the Berkeley AMP Lab, where the Mesos project gestated, and those working on Mesos have regularly traded ideas with Googlers like John Wilkes. "We discovered they were doing it – and I started arranging for them to come down here every six months or so, just to have a chat," Wilkes says.

‘But there was a lot of very helpful feedback — at a high-level — about what the problems were, what we should be looking at.’ Andy Konwinski

Andy Konwinski, one of the other founders of the Mesos project, also interned at Google and spent part of that time working under Wilkes. "There was never any explicit information exchanged about specific systems run inside of Google – because Google is pretty secretive about those things," Konwinski says. "But there was a lot of very helpful feedback – at a high-level – about what the problems were, what we should be looking at."

Mesos is a little different from Borg, which is several years older. But the fundamental ideas are the same. And according to Hindman, Google's new version of Borg – Omega, which Wilkes has publicly discussed – is even closer to the Mesos model.

These are known as "server cluster management systems," following in the footsteps of similar tools built in years past to run supercomputers and services like the Sun Grid Engine. Both Omega and Mesos let you run multiple distributed systems atop the same cluster of servers. Rather than run one cluster for Hadoop and one for Storm – a tool for processing massive streams of data in real time – you can move them both onto one collection of machines. "This is the way to go," Wilkes says. "It can increase efficiency – which is why we do it."

The tools also provide an interface that software designers can then use to run their own applications atop Borg or Mesos. At Twitter, this interface is codenamed Aurora. A team of engineers, for example, can use Aurora to run Twitter's advertising system. At the moment, Hindman says, about 20 percent of the company's services run atop Mesos in this way.
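The article doesn't detail what Aurora looks like to the engineers who use it, but the general shape of such an interface is easy to sketch: you declare what to run and how much of each resource it needs, and the scheduler decides where it lands. Everything below – the class, the fields, the job itself – is hypothetical, not Aurora's actual API.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    command: str     # what to run
    cpus: float      # resources needed per instance
    ram_gb: float
    instances: int   # the scheduler decides which machines these land on

ads = Job(
    name="ad-server",
    command="./ad_server --port=8080",
    cpus=2.0,
    ram_gb=4.0,
    instances=200,
)
# Handing this description to the scheduler is the whole interaction. No
# machine names appear anywhere, which is the point: the cluster behaves
# like one computer.
```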

Currently, Wilkes says, Google provides all sorts of dials that engineers can use to allot resources to their applications. But with Omega, the aim is to handle more of this behind the scenes, so that engineers needn't worry about the details. "Think of it as an automatic car versus a manual," he says. "You want to go fast. You shouldn't have to tune the engine's compression ratio or the inlet manifold for the turbocharger."

Omega is still under development, but the company is beginning to test prototypes in its live data centers.

Attack of the Clones

According to Wilkes, Google plans to publish a research paper on Borg (though he still won't use the name). The web giant often keeps a tight lip when it comes to the systems that underpin its online empire – it sees these technologies as among its most important advantages over the competition – but once these tools have reached a certain maturity, the company will pull back the curtain.

Between this planned paper and the rise of Mesos at Twitter, the Borg model is poised to spread even further across the web. Other companies are already using Mesos – including Airbnb and Conviva, another company with close ties to UC Berkeley – and Wilkes believes the basic idea could significantly change the way companies run distributed systems.

Yes, there are other ways of efficiently spreading workloads across a cluster of servers. You could use virtualization, where you run virtual servers atop your physical machines and then load them with whatever software you like. But with Borg and Mesos, you don't have to worry about juggling all those virtual machines.

"The interface is the most important thing. The interface that virtualization gives you is a new machine. We didn't want that. We wanted something simpler," Hindman says. "We wanted people to be able to program for the data center just like they program for their laptop."

‘We wanted people to be able to program for the data center just like they program for their laptop.’ Ben Hindman

Wilkes says much the same thing. "If you're an engineer and you bring up a virtual machine, you get something that looks like just another piece of hardware. You have to bring up an operating system on it. You have to administer it. You have to update it. You have to do all the stuff you have to do with a physical machine," he says.

"But maybe that's not the most useful way for an engineer to spend their time. What they really want to do is run their application. And we give them a way to do that – without dealing with virtual machines."

Clearly, many engineers prefer working with raw virtual machines. This is what they get from Amazon EC2, and Amazon's cloud computing service has become a hugely popular way to build and run software applications – so popular that countless companies are trying to provide developers and businesses with similar tools.

Charles Reiss – a Berkeley graduate student who interned at Google under John Wilkes and has seen Borg in action – doesn't believe this existing system offers an enormous advantage over the alternatives. "I don't think it's super impressive – beyond having just tons of engineering hours poured into it," he says. But Omega, he adds, is another matter.

With Omega, Google aims to make the process ever smoother – much like Twitter has done with Mesos and Aurora – and in the long term, others will surely follow their lead. Google and Twitter treat the data center like one big computer, and eventually, that's where the world will end up. This is the way computer science always progresses. We start with an interface that's complicated and we move to one that's not. It happens on desktops and laptops and servers. And now, it's happening with data centers too.