Using the Azure Custom Vision API

Machine learning has enormous potential, and we are only beginning to discover applications for it. Even in its infancy, it is already being used for financial market predictions, medical diagnosis, and retail. It’s a complex topic that even the most experienced technologist might have difficulty navigating. Tasks like training models (the components that actually make the predictions) often require a deep understanding of data science.

Fortunately, Microsoft has a suite of services called Azure Cognitive Services that provides the benefits of machine learning without requiring you to understand its intricacies. Azure Cognitive Services covers many common applications of machine learning, such as decision making, speech analysis, and computer vision. In this article, we’ll look at how the Custom Vision Service can help us classify images based on their contents.

The Azure Cognitive Services Vision APIs include services for detecting common objects. For example, you can upload an image of a cheeseburger to the Computer Vision Service and it will correctly identify it as a cheeseburger. However, there might be some circumstances where you need to identify objects or scenarios that the Computer Vision Service does not recognize. This is where the Custom Vision Service comes into play. While the Computer Vision Service is already trained by Microsoft to recognize common objects, it’s up to you to train the Custom Vision Service for the specific things you want it to recognize.

Example Application

I have a large collection of video games, but it’s very unorganized. My favorite video game characters are Mario, Sonic, and Yoshi. I’d like to make a list of games that feature those characters. Now, I could look at each game title and determine this myself, but why would I do that when Azure can do it for me? I’ll train the Custom Vision Service to recognize these characters so that it can categorize these games for me based on their covers.

What You’ll Need

If you don’t have one already, create a free account on Microsoft Azure.
Go to https://www.customvision.ai and create a project.
If you would like to follow along, you’ll need Visual Studio for Windows or Mac.

Training the Model

First, we’ll need to train the service to recognize our characters. To train your first model, head over to https://www.customvision.ai which offers a UI for uploading, tagging, and training models. I’ve uploaded about 10 pictures of each character and tagged them by name (although Microsoft recommends uploading at least 50 for each tag you make).

I have also created negative tags to exclude things like company logos and box art. This makes it so that, for example, if we upload a picture of Sonic with the SEGA logo, the service won’t think the logo is Sonic. Remember, the computer is not a person, so without training it can’t tell the difference between a logo and a blue hedgehog.

After we’ve uploaded and tagged our images, we’ll click on Train. There are two types of training: Fast and Advanced. Fast Training is usually much quicker and works well when you have lots of good samples. Advanced Training takes anywhere from 1 to 24 hours but can be far more accurate. Note that you only get 1 free hour of training and, as of this writing, it costs $20 per hour thereafter. In other words, only use Advanced Training when you absolutely need it.
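To get a feel for what that pricing means, here is a rough back-of-the-envelope sketch in Python. It assumes the free hour is deducted once and the remaining time is billed linearly at the $20 rate quoted above; the function name and the billing model are my assumptions, so check the current Azure pricing page before budgeting around it.

```python
def advanced_training_cost(hours, free_hours=1.0, rate_per_hour=20.0):
    """Estimate the charge (in US dollars) for a number of training hours.

    Assumption: the free allowance is subtracted once and everything
    beyond it is billed linearly at rate_per_hour.
    """
    billable_hours = max(0.0, hours - free_hours)
    return billable_hours * rate_per_hour


# Compare a short run, a medium run, and the 24-hour maximum.
for hours in (1, 4, 24):
    print(f"{hours:>2} h -> ${advanced_training_cost(hours):.2f}")
```

Under those assumptions, a full 24-hour Advanced Training run is a real expense, which is why Fast Training is the sensible default.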

Once training is complete, which took about 20 seconds in this case, you will get metrics for the model overall and also per tag. Since this is a beginner’s guide, I won’t go over each metric in detail, but suffice it to say that you want all three metrics (precision, recall, and AP) to be as high as possible, although high numbers alone don’t necessarily mean the model was trained well. If the metrics look low for a specific tag, upload more sample images or make sure the ones you have are consistent. For example, I had some issues because I had a lot of 3D pictures of Mario but not many 2D pictures. Remember, this is a computer looking at images rather than a person.

You can use the Quick Test feature to send a test image for prediction. Sending the cover of Super Mario Bros. 3 gives me a 93% probability that Mario is on the cover. Not bad for our first iteration, and the ears and tail on raccoon Mario didn’t throw it off too much.

When you’re satisfied, publish the model so that it can be used by the program we’ll write.
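As for the metrics reported after training, two of them, precision and recall, are simple ratios, and seeing them computed makes them less mysterious. The sketch below is in Python for brevity, and the counts in it are invented for illustration; they are not numbers from my project. Average precision (AP) summarizes precision across different probability thresholds and is not computed here.

```python
def precision(true_positives, false_positives):
    """Of the images the model tagged as X, what fraction really were X?"""
    predicted = true_positives + false_positives
    return true_positives / predicted if predicted else 0.0


def recall(true_positives, false_negatives):
    """Of the images that really were X, what fraction did the model find?"""
    actual = true_positives + false_negatives
    return true_positives / actual if actual else 0.0


# Hypothetical counts: the model tagged 8 images as Mario and 6 of those
# were right, while the test set actually contained 10 Mario images.
mario_precision = precision(true_positives=6, false_positives=2)
mario_recall = recall(true_positives=6, false_negatives=4)
print(f"precision={mario_precision:.2f} recall={mario_recall:.2f}")
```

A tag with high precision but low recall is rarely wrong when it fires but misses many images, which is often a hint that the training samples for that tag are not varied enough.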

Calling the Service to Make a Prediction

We’ll be making a simple .NET Console application that uploads an image stored on the local file system to the Custom Vision Service for prediction.

Microsoft offers an SDK for the Custom Vision Service for several platforms. For .NET Framework and .NET Core projects, the SDK is available via NuGet.

To follow along, create a new Console application in Visual Studio. Then, install the Microsoft.Azure.CognitiveServices.Vision.CustomVision.Prediction NuGet package.

You’ll need to grab some keys from the Custom Vision UI. Go to settings and get the project ID, the prediction key, and the domain for the prediction endpoint (just the domain, not the entire URL). Also get the name of the iteration of the model that was trained. I named mine “Iteration1”. Also, locate an image on your machine you want to upload to the service and get its path.
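To see how those values fit together, it can help to look at the HTTP request that sits underneath the SDK. The following is a minimal sketch in Python (standard library only) rather than the .NET code this article builds. The URL path and the v3.0 version segment are assumptions based on the prediction URL shown in the Custom Vision UI, and the domain, project ID, and key are placeholders; compare against the URL in your own project before relying on it.

```python
import urllib.request


def build_prediction_request(endpoint_domain, project_id, published_name,
                             prediction_key, image_bytes, api_version="v3.0"):
    """Build (but do not send) an image classification request.

    Assumption: the URL layout mirrors the prediction URL displayed in the
    Custom Vision UI; api_version may differ for your project.
    """
    url = (
        f"https://{endpoint_domain}/customvision/{api_version}/Prediction/"
        f"{project_id}/classify/iterations/{published_name}/image"
    )
    headers = {
        "Prediction-Key": prediction_key,            # prediction key from settings
        "Content-Type": "application/octet-stream",  # raw image bytes in the body
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers,
                                  method="POST")


# Placeholder values for illustration only.
request = build_prediction_request(
    endpoint_domain="southcentralus.api.cognitive.microsoft.com",
    project_id="00000000-0000-0000-0000-000000000000",
    published_name="Iteration1",
    prediction_key="<your-prediction-key>",
    image_bytes=b"",  # in practice, the bytes read from the image file
)
print(request.full_url)
# Sending it would look like: urllib.request.urlopen(request).read()
```

This is also why only the domain is needed: the rest of the URL is assembled from the project ID and the iteration name.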

In the main block of the program, we initialize the connection to the service and point to the file. In a real-life scenario, you could point to a folder and send predictions in a batch or allow users to upload images.
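For the folder scenario, the first step is simply collecting the image files to send. Here is a small sketch of that step, written in Python for brevity rather than .NET. The function name and the list of extensions are my own choices for illustration; adjust the extensions to whatever formats your covers are stored in.

```python
from pathlib import Path

# Extensions treated as images in this sketch (an assumption, not an
# exhaustive list of what the service accepts).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def find_cover_images(folder):
    """Return the image files directly inside folder, sorted by path."""
    return sorted(
        path
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )
```

Each path returned by find_cover_images could then be read and sent to the service one at a time, which keeps the prediction code identical for one image or a thousand.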

We then send a MemoryStream of the image to the service, which will return a prediction. In this case, I am sending an image of the cover of Super Mario All-Stars, which produces this result:
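A prediction comes back as a list of tags, each with a probability between 0 and 1. As a rough sketch of how such a response might be read (in Python for brevity), the code below keeps only the tags above a chosen threshold. The sample JSON is invented for illustration and is not the service’s actual output for any cover; the 0.5 threshold and the function name are likewise my assumptions.

```python
import json

# Invented sample in the shape of a prediction response; the probabilities
# are made up and do not come from a real call.
SAMPLE_RESPONSE = """
{
  "predictions": [
    {"tagName": "Sonic", "probability": 0.01},
    {"tagName": "Yoshi", "probability": 0.62},
    {"tagName": "Mario", "probability": 0.97}
  ]
}
"""


def tags_above(response_text, threshold=0.5):
    """Return (tag name, probability) pairs at or above threshold, best first."""
    predictions = json.loads(response_text)["predictions"]
    confident = [
        (p["tagName"], p["probability"])
        for p in predictions
        if p["probability"] >= threshold
    ]
    return sorted(confident, key=lambda pair: pair[1], reverse=True)


for name, probability in tags_above(SAMPLE_RESPONSE):
    print(f"{name}: {probability:.0%}")
```

Picking the threshold is a judgment call: a higher value means fewer false matches in the final list of games, at the cost of missing covers where the character is small or drawn in an unusual style.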

Going Further

Not only can we use the Custom Vision SDK to make predictions, we can also train and publish iterations without using the UI.

This is useful for situations where you create your own application for managing your models and predictions, but keep in mind that you’ll want to train and publish models sparingly, especially if you are using the free version of the Custom Vision Service.

Conclusion

We live in exciting times, and thanks to Azure Cognitive Services, we can utilize advanced machine learning technology without a deep knowledge of data science. The services Azure Cognitive Services offers are just the tip of the iceberg. If you would like more information about Azure or machine learning and how it can help your business, there are several enthusiastic technologists at Stratus Innovations who would love to help.
