Custom Computer Vision

Caio Gasparine
4 min read · Oct 29, 2020

Computer Vision is a scientific field that deals with how computers can “see” and understand what they are seeing in digital images and videos, sometimes in real time. It tries to reproduce the human ability to see and identify different objects in different places and from different angles.

Custom Vision (customvision.ai) is a cloud-based Artificial Intelligence service and end-to-end platform for applying computer vision to your specific scenario: you use your own data and train your model with the power of a cloud environment.

In this example, I will be using the Microsoft cloud-based service customvision.ai to explore the process and results.

https://azure.microsoft.com/en-ca/services/cognitive-services/custom-vision-service/#security

The scope

Using customvision.ai, create a set of tagged images, train the model, and have the model recognize new images and classify them. This is a good example of supervised machine learning (image classification).

How does it work?

The first step is to load the images and tag each one with a label that describes what it shows.

Then you train the model to associate each image with its tag, so that the model learns what each image represents.

The last step is to evaluate the model and verify the results (% accuracy) for each classification using different photos (not the ones used to train the model).
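
The evaluation step only makes sense if the test photos are kept apart from the training photos. Below is a minimal sketch, in plain Python, of one way to set aside a few photos per tag before uploading anything. The function name split_by_tag, the file names, and the 20% hold-out fraction are assumptions for illustration and are not part of customvision.ai.

```python
import random


def split_by_tag(tagged_images, holdout_fraction=0.2, seed=0):
    """Split {tag: [file names]} into training and test dictionaries.

    Hypothetical helper: at least one photo per tag is held out for testing.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = {}, {}
    for tag, files in tagged_images.items():
        shuffled = sorted(files)
        rng.shuffle(shuffled)
        n_test = max(1, int(len(shuffled) * holdout_fraction))
        test[tag] = shuffled[:n_test]    # photos the model never sees in training
        train[tag] = shuffled[n_test:]   # photos to upload and tag
    return train, test


# Illustrative file names only
images = {
    "airplane": [f"airplane_{i}.jpg" for i in range(10)],
    "ship": [f"ship_{i}.jpg" for i in range(10)],
}
train_set, test_set = split_by_tag(images)
print(len(train_set["ship"]), "training photos,", len(test_set["ship"]), "test photos")
```

The photos in train_set would be the ones uploaded and tagged, while the photos in test_set are kept for checking the results afterwards.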

customvision.ai/projects (you will need an Azure Subscription to access this service)

Create a new project

Adding the info of your project

Add the images

Selected images to train the model

In this case, I selected four different kinds of images: helicopters, airplanes, ships, and trains. The next step is to load a set of images for each kind and tag them with a specific name.

Loading airplanes <TAG>

Loading helicopters <TAG>

Loading ships <TAG>

Loading trains <TAG>

After you load all the photos, train the model

Select training type

Validating the results

Precision: measures how accurate your predictions are, i.e., the percentage of your predictions that are correct.

Recall: measures how well you find all the positives. For example, we can find 70% of the possible positive cases in the top K predictions.
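
To make the two definitions concrete, here is a small sketch that computes precision and recall for a single tag from lists of true and predicted tags. The function name precision_recall and the example labels are made up for illustration; they are not results from this experiment, and the portal computes these values for you.

```python
def precision_recall(true_tags, predicted_tags, tag):
    """Return (precision, recall) for one tag; hypothetical helper."""
    pairs = list(zip(true_tags, predicted_tags))
    tp = sum(1 for t, p in pairs if p == tag and t == tag)  # correctly predicted as tag
    fp = sum(1 for t, p in pairs if p == tag and t != tag)  # wrongly predicted as tag
    fn = sum(1 for t, p in pairs if p != tag and t == tag)  # missed images of tag
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels for five test photos
true_tags = ["airplane", "airplane", "ship", "train", "airplane"]
predicted_tags = ["airplane", "ship", "airplane", "train", "airplane"]

precision, recall = precision_recall(true_tags, predicted_tags, "airplane")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case, two of the three "airplane" predictions are correct, and two of the three real airplanes are found, so both values are 2/3.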

AP (Average Precision): summarizes precision and recall across different probability thresholds in a single number.
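
As a rough illustration of AP, the sketch below uses one common definition: rank the predictions by probability and average the precision measured at each rank where a true positive appears. This is an assumption for illustration; the exact calculation used by customvision.ai may differ, and the function name average_precision and the scores are made up.

```python
def average_precision(scores, is_positive):
    """Average of the precision values at each rank where a positive is found."""
    # Rank predictions from the highest to the lowest probability
    ranked = sorted(zip(scores, is_positive), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, positive) in enumerate(ranked, start=1):
        if positive:
            hits += 1
            precisions.append(hits / rank)  # precision among the top `rank` predictions
    return sum(precisions) / len(precisions) if precisions else 0.0


# Made-up probabilities for the tag "ship" and whether each photo really is a ship
scores = [0.9, 0.8, 0.7, 0.6]
is_ship = [True, False, True, False]
print(f"AP = {average_precision(scores, is_ship):.3f}")
```

Here the positives sit at ranks 1 and 3, giving precisions of 1/1 and 2/3, so the AP is their average, 5/6.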

For more details about precision, recall, and AP, see the great and detailed post on the topic by Jonathan Hui.

Checking the results -> airplane

Checking the results -> train

Checking the results -> ship

Checking the results -> helicopter

Conclusion

As you can see, even when tested with drawings instead of photos for a specific tag, the model is able to recognize the object: the tag with the highest probability among the four is the correct one.
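
Reading a prediction comes down to picking the tag with the highest probability. A tiny sketch of that step follows; the function name top_tag and the probabilities are made-up illustrative values, not the results shown above.

```python
def top_tag(probabilities):
    """Return the (tag, probability) pair with the highest probability."""
    best = max(probabilities, key=probabilities.get)
    return best, probabilities[best]


# Hypothetical probabilities for one test image
prediction = {"airplane": 0.10, "helicopter": 0.70, "ship": 0.15, "train": 0.05}
tag, probability = top_tag(prediction)
print(f"Predicted tag: {tag} ({probability:.0%})")
```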

Service Availability

One important note: this is a cloud-based service, and it may not be available in all regions.

The Purpose

The purpose of this work is ONLY to present a simple solution for the proposed scope using free resources (or at least free for students) available on the web.

There is NO intention to show the most optimized solution or to showcase the best tools. It is also important to recognize that, in some cases, more steps than needed were applied in order to test and validate different features and services.

The main idea of this project is to explore possibilities, resources, and solutions that could be an alternative to solve the presented problem/scope in an effective way.

Thanks! ;-)

Written by Caio Gasparine

Project Manager | Data & AI | Professor
