Making AI Possible
Today, three converging megatrends are making AI possible:
1. Big Compute
2. Powerful Algorithms
3. Massive Data
Microsoft is in a unique position to help you take advantage of these trends with three important assets:
1. Microsoft Azure, providing the best cloud for developers
2. Breakthrough AI innovations, brought to you as a developer through Microsoft Azure and its AI resources
3. Data. Microsoft Graph gives you access to the most important data for your business and/or application: your data!
Microsoft has a strong vision that AI should be democratized and available to everyone: developers, data scientists, enterprises, and yes, even your dog. Microsoft has been conducting AI research for the last couple of decades and infusing the results into its products and services (Bing, Xbox, Office 365, Skype, Cortana, LinkedIn, etc.). This research eventually found its way into a product known as Microsoft Cognitive Services.
Introducing Microsoft Cognitive Services
Microsoft Cognitive Services, formerly known as “Project Oxford”, was first announced at the Build 2016 conference and released as a preview. It is a rich collection of cloud-hosted APIs that lets developers add AI capabilities such as vision, speech, language, knowledge, and search to any application on any platform (Windows, Mac, iOS, Android, and Web) using simple RESTful APIs and/or SDKs (NuGet packages). Rather than having to deal with the complexities that come with machine learning, you call simple APIs that handle common use cases, such as recognizing speech or performing facial recognition on an image. These APIs are built on machine learning and fit perfectly into the conversation-as-a-platform philosophy.
With Microsoft Cognitive Services, you can give your applications a human side. To date there are 29 APIs across 5 categories: Vision, Speech, Language, Knowledge, and Search. Let’s take a look at each of these categories, along with the experimental Labs offering:
Vision – From faces to feelings, allow your apps to understand images and videos
Speech – Hear and speak to your users by filtering noise, identifying speakers, and understanding intent
Language – Process text and learn how to recognize what users want
Knowledge – Tap into rich knowledge amassed from the web, academia, or your own data
Search – Access billions of web pages, images, videos, and news with the power of the Bing APIs
Labs – Microsoft Cognitive Services Labs is an early look at emerging technologies that you can discover, try, and provide feedback on before they become generally available
Why Use Microsoft Cognitive Services?
So why choose these APIs? It’s simple: they just work, they’re easy to work with, they’re flexible enough to fit into any application or platform, and they’re tested.
Easy – The APIs are easy to implement because they’re simple REST calls.
Flexible – These APIs work with whatever language, framework, or platform you choose. This means you can easily incorporate them into your Windows, iOS, Android, and Web apps using the tools and frameworks you already use and love (.NET, Python, Node.js, Xamarin, etc.).
Tested – Tap into the ever-growing collection of APIs developed by the experts. As a developer, you can trust the quality and expertise built into each API by experts in their field from Microsoft Research, Bing, and Azure Machine Learning.
What’s also nice to know is that Microsoft Cognitive Services now uses the same terms as other Azure services. Under these new terms, you as a Microsoft Cognitive Services customer own your data and can manage and delete it.
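To make the "simple REST calls" point concrete, here is a minimal Python sketch that builds, but does not send, a request to the Computer Vision analyze operation. The region, the v1.0 path, the placeholder key, and the helper name build_analyze_request are my assumptions for illustration; check the API reference for the endpoint that matches your own subscription.

```python
import json
import urllib.parse
import urllib.request

# Assumed values for illustration only: substitute your own region and key.
REGION = "westus"
SUBSCRIPTION_KEY = "YOUR_SUBSCRIPTION_KEY"


def build_analyze_request(image_url, features=("Categories", "Color", "Adult")):
    """Build a POST request asking the service to analyze an image by URL."""
    query = urllib.parse.urlencode({"visualFeatures": ",".join(features)})
    endpoint = (
        f"https://{REGION}.api.cognitive.microsoft.com/vision/v1.0/analyze?{query}"
    )
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # The subscription key travels in this request header.
        "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


request = build_analyze_request("https://example.com/photo.jpg")
print(request.get_method(), request.full_url)

# With a real key in place, sending it is one more call:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Nothing here is specific to Python: any language that can issue an HTTPS POST with a JSON body and a custom header can make the same call.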
Cognitive Services Real-World Applications
The following is a set of possible real-world application scenarios:
The Computer Vision API extracts rich information from images so you can categorize and process visual data and protect your users from unwanted content. Here, the API is able to tell us what the photo contains, indicate the most common colors, and confirm that the content would not be considered inappropriate for users.
The Bing Speech API is capable of converting audio to text, understanding intent, and converting text back to speech for natural responsiveness. This case shows us that the user has asked for directions verbally, the intent has been extracted, and a map with directions has been provided.
Language Understanding Intelligent Service, known as LUIS, can be trained to understand user language contextually, so your app communicates with people in the way they speak. The example we see here demonstrates Language Understanding’s ability to understand what a person wants, and to find the pieces of information that are relevant to the user’s intent.
Knowledge Exploration Service adds interactive search over structured data to reduce user effort and increase efficiency. The Knowledge Exploration API example here demonstrates the usefulness of this API for answering questions posed in natural language in an interactive experience.
Bing Image Search API enables you to add a variety of image search options to your app or website, from trending images to detailed insights. Users can do a simple search, and this API scours the web for thumbnails, full image URLs, publishing website info, image metadata, and more before returning results.
These APIs are available as stand-alone solutions or as part of the Cortana Intelligence Suite. They can also be used in conjunction with the Microsoft Bot Framework.
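As a rough illustration of the Computer Vision scenario above, the following Python sketch reduces an analysis result to the three facts mentioned there: what the photo contains, its dominant colors, and whether it was flagged as inappropriate. The sample response is hand-written for this post, not real service output, and its field names reflect my understanding of the analyze response shape, so treat them as assumptions.

```python
import json

# Hand-written sample, NOT real service output; field names are assumed.
SAMPLE_RESPONSE = json.loads("""
{
  "categories": [
    {"name": "outdoor_mountain", "score": 0.92},
    {"name": "sky_cloud", "score": 0.41}
  ],
  "color": {"dominantColors": ["Blue", "White"]},
  "adult": {"isAdultContent": false, "isRacyContent": false}
}
""")


def summarize(analysis):
    """Pick out the top category, the dominant colors, and the content flag."""
    categories = analysis.get("categories", [])
    top = max(categories, key=lambda c: c["score"]) if categories else None
    adult = analysis.get("adult", {})
    return {
        "contains": top["name"] if top else None,
        "colors": analysis.get("color", {}).get("dominantColors", []),
        # Missing adult data is treated as "not flagged" in this sketch.
        "flagged": bool(adult.get("isAdultContent") or adult.get("isRacyContent")),
    }


print(summarize(SAMPLE_RESPONSE))
```

In a real application you would probably treat a missing "adult" section as unknown rather than safe; the sketch keeps it simple.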
Use Case: How Uber is Using Driver Selfies to Enhance Security
Uber uses Microsoft Cognitive Services to offer a real-time ID check. Drivers are prompted to verify their identity by taking a selfie, which the Face API then compares with the photo Uber has on file. The Face API is smart enough to recognize whether you’re wearing glasses or a hat, so the app can ask users to remove them and retry the verification process. Uber has made rides safer by giving its clients peace of mind that drivers have been verified.
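The decision logic behind such a check could look something like the Python sketch below. This is not Uber's implementation; it is a hypothetical next_step helper that takes the face attributes from a detection call and the result of a verification call and decides what the app should do. The attribute names ("glasses", "accessories" with a "headwear" type) and the verification fields ("isIdentical", "confidence") reflect my understanding of the Face API responses, and the 0.5 threshold is an arbitrary assumption.

```python
def next_step(face_attributes, verify_result, min_confidence=0.5):
    """Decide whether a selfie check passed, should be retried, or failed."""
    # Collect anything on the face that might explain a failed match.
    obstructions = []
    if face_attributes.get("glasses", "NoGlasses") != "NoGlasses":
        obstructions.append("glasses")
    for item in face_attributes.get("accessories", []):
        if item.get("type") == "headwear":
            obstructions.append("hat")

    matched = bool(
        verify_result.get("isIdentical")
        and verify_result.get("confidence", 0.0) >= min_confidence
    )
    if matched:
        return "verified"
    if obstructions:
        # A mismatch with an obstruction present is worth a second attempt.
        return "retry: please remove " + " and ".join(obstructions)
    return "rejected"


# Hypothetical inputs: sunglasses and a hat, and the faces did not match.
attributes = {"glasses": "Sunglasses", "accessories": [{"type": "headwear"}]}
print(next_step(attributes, {"isIdentical": False, "confidence": 0.31}))
```

Asking for a retry only when an obstruction is detected keeps genuine mismatches from looping forever, while still giving a driver in sunglasses a fair second try.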
Dig Deeper into AI
If you’re interested in learning more about Microsoft AI then be sure to check out these two websites:
In my next post I’ll dig deeper into one of these APIs and walk through the code to show how easy it is to incorporate into your applications.