
Microsoft’s bid to bring AI to every developer is starting to make sense

The APIs are getting good enough to be built into production systems.


SEATTLE—For the third year in a row, Microsoft is heavily promoting machine-learning services at its Build developer conference. Over the three years, some of the language used around the services has changed—the "machine learning" term seems to have fallen out of favor, being replaced by the better-known "artificial intelligence," and Microsoft has added many more services. But the bigger change is that ubiquitous intelligence now seems a whole lot more feasible than it did three years ago.

Three years ago, the service selection was narrow—a language service that identified important elements from natural language, speech-to-text and text-to-speech, an image-recognition service, a facial recognition service. But outside of certain toy applications, such as Microsoft's age-guessing website, the services felt more than a little abstract. They felt disconnected from real-world applications.

Last year, the services took shape a little more. The bot bandwagon was just getting started, with Microsoft offering a framework for developers to build their own chatbots, and the right plumbing components had been published to hook those bots up to things like Skype and Teams. The appeal of the bots seemed perhaps limited, but other components that were displayed, such as a training user interface to help refine the language-understanding service, looked more promising. They showed ways in which a developer who wasn't an expert in machine learning or artificial intelligence could not only build systems that used machine-learning components, but also tailor those components to the specific problem area the developer was interested in.

This year, the machine-learning story is improving once again. More services have been added, to make the platform able to do more things. Some of these are similar to the old services; for example, there's an image recognition service, "Custom Vision." The difference between this and the old vision service is that the new one is trainable. The old service has a corpus of objects that it understands, and if it sees them in a picture, it'll tell you. But if that corpus doesn't match the needs of your application, there's no way to add to it. The new service lets you upload small amounts of training data—about 20 representations of each object, typically—to generate a new image recognition model. The model generation itself, however, is entirely handled by the service; developers don't need to understand how it works.

Microsoft also has what it calls "Cognitive Services Labs," where developers can create more experimental AI-like services. The first of these is a gesture-recognizing service.

Beyond building more trainable services, Microsoft is also training its bots to recognize certain standard processes, such as specifying a date or taking payment information.

Actually starting to be useful

These various machine-learning components are starting to become versatile enough and useful enough that they can solve problems that couldn't be solved before. Last year, Rolls-Royce, for example, developed a system that takes two buzzwords, "Internet of Things" and "machine learning," and does something useful with them. Rolls-Royce makes jet engines used in commercial airliners, and its latest jet engines are Internet of Things jet engines: they collect tons of telemetry data about operating conditions and upload it to Azure. The telemetry data is then combined with plane-level information such as altitude and flight plan.

Rolls-Royce has used machine learning to build a model that takes all this data and estimates when engine components will fail. This, in turn, allows preventative maintenance to be performed; the system can make estimates of which components are near the end of their lifetime (even if that lifetime has been prematurely shortened, as would be the case for an engine used on a plane only used for short flights). The system then advises that maintenance be performed to swap out the parts before they actually fail. This is even tied into inventory management, so the system can suggest making a replacement a little sooner than otherwise necessary, if it knows that the plane is flying somewhere that doesn't have the right parts available.
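To make the inventory tie-in concrete, here is a minimal sketch of that kind of decision rule. It is not Rolls-Royce's system: the `Component` type, the `should_replace_now` function, the `safety_margin` parameter, and the idea that the model's output is a simple count of remaining flights are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    # Hypothetical model output: flights left before the part is likely to fail.
    remaining_flights: int


def should_replace_now(component, upcoming_stops, parts_in_stock, safety_margin=2):
    """Return True if the part should be swapped before the next departure.

    upcoming_stops: airport codes in flight order, one flight per stop.
    parts_in_stock: maps airport code -> set of part names held there.
    safety_margin: flights of headroom to keep (an illustrative assumption).
    """
    # Flights the part can still fly while keeping the safety margin.
    safe_flights = component.remaining_flights - safety_margin
    if safe_flights <= 0:
        return True
    # Only stops reached within the safe window are candidates for a later swap.
    reachable = upcoming_stops[:safe_flights]
    # If none of those stops stocks the part, replace it now, earlier than
    # wear alone would require.
    return not any(
        component.name in parts_in_stock.get(stop, set()) for stop in reachable
    )
```

In this sketch, a part with five flights of estimated life stays on the engine if any of the next three stops holds a spare, and is swapped immediately if none does. A real system would presumably weigh costs and probabilities rather than apply a hard cutoff.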

Hand-in-hand with these intelligent services, Microsoft has promoted its bot framework. Many people have misgivings about the industry-wide focus on bots, finding it hard to envisage a world in which we routinely type or talk to computer programs. However, Microsoft says that the bots have been instrumental in letting people learn how to use the cognitive services, and the company has seen substantial growth in developer interest in bots, especially in business-to-consumer roles. Using text chat on the Web to talk to a low-level sales rep or tech support person is a pretty common activity, for example, and some of this workload is a good match for bots with a suitable understanding of the problem domain.

Culture appears to play a significant role. We all remember Microsoft's neo-Nazi chatbot, Tay, but what's often forgotten is that Redmond had a different chatbot, XiaoIce, that spoke Chinese to Chinese users. That chatbot didn't have any of the problems that Tay did, and the Chinese market uses XiaoIce in a very different way; as well as using the bot's interactive or conversational features, Microsoft has found that people will just talk to it, unwinding from the day's stresses or using it as a sounding board of sorts.

Some of these differences are obvious when explained; for example, we were told that adoption of speech-to-text was much higher in China than in other countries because keyboard entry of Chinese text is much more awkward. Others were a little more surprising. Microsoft has found that even when the input modality is the same, audience demographics change the kind of language that's used with bots, and the things people ask the bots to do. While Facebook Messenger and Kik are both text chat, the older audience on Messenger uses bot services differently than the younger Kik crowd.

Even bot-averse users might find that they're more amenable to the concept in, for example, Teams or Slack. The conceptual shift from typing to your colleagues to typing to a bot feels much smaller.

Beyond bots

But the cognitive services don't live or die on the success of bots anyway. We're already seeing hints of more subtle interfaces, such as Cortana reading your e-mails and figuring out if you have committed to any particular actions within them; she'll remind you to call people if you previously promised to do something by a given date. Doing this effectively requires natural language parsing comparable to a chatbot's, but it transforms the intelligence from a system that must be explicitly interacted with into one that's altogether more transparent.

It's still early days for machine learning, and these capabilities are far from ubiquitous. The shift to "artificial intelligence" terminology is also unfortunate, as it sets users up for disappointment—these systems are still a long way short of rivaling Lt. Cmdr. Data or the Terminator, and these fictional characters arguably define the widespread perception and understanding of "artificial intelligence."

But the overall movement is positive. Over the last couple of years, Microsoft's cognitive services have gone from abstract and somewhat impenetrable to a useful set of tools that developers of all kinds can integrate into their apps, all without having to be experts in machine learning or artificial intelligence.
