On March 30th, during the BUILD 2016 keynote presentation, Microsoft CEO Satya Nadella talked about “Conversations as a platform.” That means conversations between two or more people, between people and digital assistants, and between people and bots, which are intelligent applications that humans can converse with. Eventually this platform will include communication between bots and digital assistants.
In his presentation, Nadella also reiterated that Cortana is all about the user, not the devices. Marcus Ash of the Cortana team then took the stage and revealed that users ask Cortana one million voice questions every day. Microsoft’s personal assistant will receive an upgrade in the “Anniversary Update” planned for this summer (aka Redstone 1), allowing it to work more closely with Office, for example by sending a document the user created the night before.
In addition to improved Office integration, Cortana will be able to track location details on Cortana-enabled devices. Ash explained that Cortana will also rely on a “team of experts,” which are essentially apps worked into the Cortana ecosystem.
Ash also shared that the updated Cortana in the Anniversary Update will be made available in 13 countries this summer. The Cortana Developer Preview is available now, however, so developers can start working with the new features.
The subsequent Skype presentation showed that Cortana will also play a major role in the future of Microsoft’s popular chat client. The virtual assistant will be located in the upper right-hand corner of the application.
In a demo, Cortana sent the user a private message in Skype asking if it’s okay for a third party to get her location in order to send a shipment of cupcakes. The demo also showed the user reserving a block of time within the client for a trip to Dublin while instant messaging a friend located in that city. The user was even able to book a hotel through a Westin Hotel bot within Skype, and received a suggestion from Cortana to visit another friend while in Dublin based on another Skype chat.
Moving away from Cortana, the BUILD 2016 presentation revealed that intelligence is also being injected into real-time video via Skype Video bots. Developers now have access to the Skype Bot Platform SDK, and the latest Skype app for the general public will provide access to Skype bots.
Also mentioned during the session was the Microsoft Bot Framework SDK. This will enable developers to more easily build secure bots that can communicate with each other. The SDK provides a reusable chat control that can be inserted into webpages to test the bot. There’s also the Bot Builder SDK, which does all the work of walking the end-user through tasks, like ordering pizza from Domino’s.
In a demo, Lili Cheng revealed a few Bot Framework tools for building natural language rules. With a simple click of a mouse, developers can expand a dictionary for a simple verb. They can even teach a bot to understand slang, like the use of “crib” to mean home. If the bot doesn’t understand what the end-user is trying to communicate, it can call in a real person to fix the problem so that it can finish the order correctly. As an example, if the bot doesn’t understand what you want on the pizza, it can alert a local worker to correct the order before moving on.
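The behavior described in the demo can be sketched in a few lines of Python. This is only an illustration of the idea, not the Bot Framework API: the `SYNONYMS` dictionary, the `KNOWN_TOPPINGS` set, the `OrderResult` type, and the `parse_order` function are all hypothetical names invented for this sketch. It assumes a bot that maps slang to a canonical word and flags the order for a human when it cannot make sense of the request.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical synonym dictionary: slang or variants mapped to a canonical term.
SYNONYMS = {
    "crib": "home",
    "house": "home",
    "place": "home",
}

# Hypothetical list of toppings the bot has been taught to recognize.
KNOWN_TOPPINGS = {"pepperoni", "mushrooms", "olives", "cheese"}


@dataclass
class OrderResult:
    destination: Optional[str] = None
    toppings: List[str] = field(default_factory=list)
    needs_human: bool = False


def normalize(word: str) -> str:
    """Lowercase a word, drop trailing punctuation, and resolve slang."""
    word = word.lower().strip(".,!?")
    return SYNONYMS.get(word, word)


def parse_order(message: str) -> OrderResult:
    """Extract toppings and a destination; escalate if either is missing."""
    result = OrderResult()
    for token in (normalize(w) for w in message.split()):
        if token in KNOWN_TOPPINGS:
            result.toppings.append(token)
        elif token == "home":
            result.destination = "home"
    # Assumed escalation rule: an incomplete order goes to a real person.
    if not result.toppings or result.destination is None:
        result.needs_human = True
    return result


if __name__ == "__main__":
    print(parse_order("Send a pepperoni pizza to my crib"))
    print(parse_order("Send a pizza with anchovies to my crib"))
```

In this sketch the first message is understood in full because “crib” resolves to “home,” while the second is flagged for a human because “anchovies” is not a topping the bot knows.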
The platform contains a natural language backend that helps the bot understand what the end-user is asking. It can also show developers the user’s request alongside what it understood from it, so that they can tweak what the bot recognizes and how it responds. This platform will be made available for applications such as Slack, SMS, Skype, and more.
Finally, Cornelia Carapcea took the stage to talk about Microsoft’s Cognitive Services located at Microsoft.com/cognitive. Twenty-two APIs are available to developers today including Search, Vision, Language, and so on. This platform will link together a number of Microsoft capabilities and machine learning services.
In one example provided by Carapcea, the Vision API was used to recognize objects in a just-taken photograph. In another demo, a CaptionBot was capable of adding a funny caption to any image pulled from Bing without any prior knowledge of that image. Another demo took bad audio and correctly transcribed the spoken words from the file into text using a service called CRIS.
The Intelligence section of the keynote ended on a high note with a video about a software developer from London named Saqib Shaikh, who has been blind since the age of seven. He built an app that’s capable of audibly describing the people and places around him. The app runs not only on a pair of shades, but on his smartphone as well. That means he can take photos of his surroundings, and the AI image recognition program can describe what it sees to him.