(4.2) Using APIs

Caio Gasparine
11 min read · Nov 2, 2023


Translator | Face | Computer Vision

This is part of a series of articles called Azure Challenges. You can refer to the Intro Page to understand more about how the challenges work.

Before we start, there are some important points to clarify:

(1) Troubleshooting is IMPORTANT — Practice searching for error messages and their solutions, and finding bugs in your code, environment, IDE, etc.

(2) The code IS JUST code — There are many ways to write code and many languages to write it in. The examples here are just one way to do it.

(3) This IS NOT a prep course — The main goal here is to show the practical application of Azure Resources with a focus on Enterprise AI solutions.

(4) You won’t be graded on the challenges, but they are an important practical component of your learning experience.

(5) Make sure you are using a FREE student account and check your costs!

First Steps

(1) Download the source files:

(2) Copy the folder IMAGES to your root directory — C:\

(There are some files used as examples in this folder)

Translator API

Code snapshots

Keys and Endpoint

The file:

Your Submission

Now try to run your code using the Translator/Language resource.

Add translations (link below) to Filipino, French (Canada), Punjabi, Turkish, and English.
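As one possible way to do it, here is a minimal Python sketch that calls the Translator REST endpoint directly with one `to` parameter per target language. The key, region, and sample text are placeholders you must replace with your own values:

```python
import json
import urllib.parse
import urllib.request

# Global Translator endpoint; KEY and REGION are placeholders for your resource.
ENDPOINT = "https://api.cognitive.microsofttranslator.com"
KEY = ""     # your Translator resource key
REGION = ""  # your resource region, e.g. "eastus"

def build_translate_url(targets):
    """Build /translate with api-version 3.0 and one 'to' parameter per language."""
    query = [("api-version", "3.0")] + [("to", t) for t in targets]
    return ENDPOINT + "/translate?" + urllib.parse.urlencode(query)

def translate(text, targets):
    """POST the text and return the parsed JSON response."""
    body = json.dumps([{"text": text}]).encode("utf-8")
    req = urllib.request.Request(build_translate_url(targets), data=body, headers={
        "Ocp-Apim-Subscription-Key": KEY,
        "Ocp-Apim-Subscription-Region": REGION,
        "Content-Type": "application/json",
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if KEY:
    # Language codes for the challenge: Filipino, French (Canada), Punjabi, Turkish, English
    result = translate("Hello, how are you?", ["fil", "fr-ca", "pa", "tr", "en"])
    for t in result[0]["translations"]:
        print(t["to"], "->", t["text"])
```

The service accepts repeated `to` parameters, so a single request returns all five translations at once.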

Expected result

Face API

The Azure Face service provides AI algorithms that detect, recognize, and analyze human faces in images. Facial recognition software is important in many different scenarios, such as security, natural user interface, image content analysis and management, mobile apps, and robotics.

https://azure.microsoft.com/en-us/services/cognitive-services/face/

Attributes

Attributes are a set of features that can optionally be detected by the Face — Detect API. The following attributes can be detected:

Accessories. Whether the given face has accessories. This attribute returns possible accessories including headwear, glasses, and mask, with confidence score between zero and one for each accessory.

Age. The estimated age in years of a particular face.

Blur. The blurriness of the face in the image. This attribute returns a value between zero and one and an informal rating of low, medium, or high.

Emotion. A list of emotions with their detection confidence for the given face. Confidence scores are normalized, and the scores across all emotions add up to one. The emotions returned are happiness, sadness, neutral, anger, contempt, disgust, surprise, and fear.

Exposure. The exposure of the face in the image. This attribute returns a value between zero and one and an informal rating of underExposure, goodExposure, or overExposure.

Facial hair. The estimated facial hair presence and the length for the given face.

Gender. The estimated gender of the given face. Possible values are male and female.

Glasses. Whether the given face has eyeglasses. Possible values are NoGlasses, ReadingGlasses, Sunglasses, and SwimmingGoggles.

Hair. The hair type of the face. This attribute shows whether the hair is visible, whether baldness is detected, and what hair colors are detected.

Head pose. The face’s orientation in 3D space, described by the roll, yaw, and pitch angles in degrees, which are defined according to the right-hand rule. The angles are reported in roll-yaw-pitch order, and each value ranges from -180 to 180 degrees. See the following diagram for angle mappings:

Makeup. Whether the face has makeup. This attribute returns a Boolean value for eyeMakeup and lipMakeup.

Mask. Whether the face is wearing a mask. This attribute returns a possible mask type, and a Boolean value to indicate whether nose and mouth are covered.

Noise. The visual noise detected in the face image. This attribute returns a value between zero and one and an informal rating of low, medium, or high.

Occlusion. Whether there are objects blocking parts of the face. This attribute returns a Boolean value for eyeOccluded, foreheadOccluded, and mouthOccluded.

Smile. The smile expression of the given face. This value is between zero for no smile and one for a clear smile.
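These attributes are requested through the returnFaceAttributes query parameter of the Detect call. Here is a minimal sketch using the REST endpoint directly; the resource name, key, and image URL are placeholders for your own values:

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://<your-face-resource>.cognitiveservices.azure.com"  # placeholder
KEY = ""  # your Face resource key

def detect_params(attributes):
    """Query string asking Detect to return the given optional attributes."""
    return urllib.parse.urlencode({
        "returnFaceAttributes": ",".join(attributes),
        "detectionModel": "detection_01",
    })

def detect_faces(image_url, attributes):
    """POST an image URL to /face/v1.0/detect and return the parsed face list."""
    url = ENDPOINT + "/face/v1.0/detect?" + detect_params(attributes)
    body = json.dumps({"url": image_url}).encode("utf-8")
    req = urllib.request.Request(url, data=body, headers={
        "Ocp-Apim-Subscription-Key": KEY,
        "Content-Type": "application/json",
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if KEY:
    for face in detect_faces("https://example.com/people.jpg",
                             ["age", "glasses", "smile", "headPose"]):
        print(face["faceRectangle"], face["faceAttributes"])
```

Each detected face comes back with its bounding rectangle plus only the attributes you asked for, so request just the ones you need.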

Keys and Endpoint

The file:

The code:

The Result

Your Submission

Analyze this file at the following address:

https://cdn.i-scmp.com/sites/default/files/d8/images/canvas/2023/11/07/b8a6e3e7-3a43-4848-a5aa-b282fff160a9_07c99f1e.jpg

Now try to run your code using the Face API and validate the results!

Please add a screenshot of the result and the image analyzed as part of your submission! ;-)

Computer Vision

Computer Vision // Describe + Categorize

Categorize an image

Identify and categorize an entire image, using a category taxonomy with parent/child hereditary hierarchies. Categories can be used alone or with our new tagging models.

Currently, English is the only supported language for tagging and categorizing images.

Describe an image

Generate a description of an entire image in human-readable language, using complete sentences. Computer Vision’s algorithms generate various descriptions based on the objects identified in the image. The descriptions are each evaluated, and a confidence score is generated. A list is then returned ordered from highest confidence score to lowest.
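Both operations can be combined in a single Analyze call by listing the visual features you want. A minimal sketch, assuming a v3.2 Computer Vision resource (the endpoint, key, and image URL are placeholders):

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://<your-cv-resource>.cognitiveservices.azure.com"  # placeholder
KEY = ""  # your Computer Vision resource key

def analyze_params(features):
    """Query string for /vision/v3.2/analyze, e.g. Description + Categories."""
    return urllib.parse.urlencode({"visualFeatures": ",".join(features),
                                   "language": "en"})

def analyze(image_url, features):
    """POST an image URL to the Analyze operation and return the parsed JSON."""
    url = ENDPOINT + "/vision/v3.2/analyze?" + analyze_params(features)
    body = json.dumps({"url": image_url}).encode("utf-8")
    req = urllib.request.Request(url, data=body, headers={
        "Ocp-Apim-Subscription-Key": KEY,
        "Content-Type": "application/json",
    })
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if KEY:
    result = analyze("https://example.com/photo.jpg", ["Description", "Categories"])
    # Captions come back ordered by confidence, highest first.
    for caption in result["description"]["captions"]:
        print(caption["text"], caption["confidence"])
    for cat in result["categories"]:
        print(cat["name"], cat["score"])
```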

The file:

The code:

Your Submission

Using cognitive services, write your code to analyze your preferred local file (image) and a URL image.

Please add a screenshot of the result and the image analyzed as part of your submission! ;-)

Computer Vision

Computer Vision // Image Tagging REMOTE

Computer vision returns tags based on thousands of recognizable objects, living beings, scenery, and actions.

After uploading an image or specifying an image URL, Computer Vision algorithms output tags based on the objects, living beings, and actions identified in the image.
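A minimal sketch of the Tag operation for a remote image; for a local file, the same endpoint accepts the raw image bytes with Content-Type application/octet-stream. The resource name, key, and file paths are placeholders:

```python
import json
import urllib.request

ENDPOINT = "https://<your-cv-resource>.cognitiveservices.azure.com"  # placeholder
KEY = ""  # your Computer Vision resource key
TAG_URL = ENDPOINT + "/vision/v3.2/tag"

def tag_request(image_url):
    """Build the POST request for tagging a remote image (JSON body with a URL)."""
    return urllib.request.Request(
        TAG_URL,
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={"Ocp-Apim-Subscription-Key": KEY,
                 "Content-Type": "application/json"})

def tag_local_request(path):
    """Build the POST request for tagging a local image (raw bytes body)."""
    with open(path, "rb") as f:
        data = f.read()
    return urllib.request.Request(
        TAG_URL, data=data,
        headers={"Ocp-Apim-Subscription-Key": KEY,
                 "Content-Type": "application/octet-stream"})

if KEY:
    with urllib.request.urlopen(tag_request("https://example.com/photo.jpg")) as resp:
        for tag in json.load(resp)["tags"]:
            print(tag["name"], tag["confidence"])
```

The only difference between REMOTE and LOCAL is the request body and its content type; the response shape is the same.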

Code

Output and image analyzed

Computer Vision // Image Tagging LOCAL

Your Submission

Using Azure AI Services / Cognitive Services, write your code to analyze the following URL and the following local file. URL:

https://wp.en.aleteia.org/wp-content/uploads/sites/2/2020/05/web3-family-large-big-home-brother-sister-mother-father-shutterstock_1197877477.jpg?w=640&crop=1

Local: C:\images\happy-family.jpg

Please add a screenshot of the result and the image analyzed as part of your submission! ;-)

Computer Vision

Computer Vision // Domain Specific Models

Detect Adult or Racy Content
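Adult and racy content detection is just the Analyze operation with the Adult visual feature. A minimal sketch (endpoint, key, and image URL are placeholders):

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://<your-cv-resource>.cognitiveservices.azure.com"  # placeholder
KEY = ""  # your Computer Vision resource key

def adult_url():
    """Analyze URL requesting only the Adult visual feature."""
    return ENDPOINT + "/vision/v3.2/analyze?" + urllib.parse.urlencode(
        {"visualFeatures": "Adult"})

if KEY:
    body = json.dumps({"url": "https://example.com/photo.jpg"}).encode("utf-8")
    req = urllib.request.Request(adult_url(), data=body, headers={
        "Ocp-Apim-Subscription-Key": KEY,
        "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        adult = json.load(resp)["adult"]
    # Boolean flags plus the raw confidence scores behind them.
    print("adult:", adult["isAdultContent"], adult["adultScore"])
    print("racy: ", adult["isRacyContent"], adult["racyScore"])
```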

The file:

The code:

Output and images analyzed

Your Submission

Using Azure AI Services / Cognitive Services, write your code to analyze your preferred local file (image) and a URL image.

Please add a screenshot of the result and the image analyzed as part of your submission! ;-)

Computer Vision

Computer Vision // Domain Specific Models

Celebrities and Landmarks
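Celebrities and landmarks are the two domain-specific models, each exposed through its own analyze route. A minimal sketch (endpoint, key, and image URLs are placeholders):

```python
import json
import urllib.request

ENDPOINT = "https://<your-cv-resource>.cognitiveservices.azure.com"  # placeholder
KEY = ""  # your Computer Vision resource key

def model_url(model):
    """Domain-specific analyze URL; model is 'celebrities' or 'landmarks'."""
    return ENDPOINT + "/vision/v3.2/models/" + model + "/analyze"

def analyze_domain(model, image_url):
    """POST an image URL to the chosen domain model and return its result."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    req = urllib.request.Request(model_url(model), data=body, headers={
        "Ocp-Apim-Subscription-Key": KEY,
        "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["result"]

if KEY:
    for c in analyze_domain("celebrities",
                            "https://example.com/person.jpg")["celebrities"]:
        print(c["name"], c["confidence"])
    for l in analyze_domain("landmarks",
                            "https://example.com/place.jpg")["landmarks"]:
        print(l["name"], l["confidence"])
```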

The file:

The code:

Your Submission

Using Azure AI Services / Cognitive Services, write your code to analyze your preferred local file (image) and a URL image.

Please add a screenshot of the result and the image analyzed as part of your submission! ;-)

Computer Vision

Computer Vision // Domain Specific Models

Brands
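Brand detection is the Analyze operation with the Brands visual feature; each hit comes back with a bounding rectangle. A minimal sketch for a local file (endpoint, key, and path are placeholders):

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://<your-cv-resource>.cognitiveservices.azure.com"  # placeholder
KEY = ""  # your Computer Vision resource key

def brands_url():
    """Analyze URL requesting only the Brands visual feature."""
    return ENDPOINT + "/vision/v3.2/analyze?" + urllib.parse.urlencode(
        {"visualFeatures": "Brands"})

def detect_brands_local(path):
    """Send a local image as raw bytes and return the detected brands."""
    with open(path, "rb") as f:
        req = urllib.request.Request(brands_url(), data=f.read(), headers={
            "Ocp-Apim-Subscription-Key": KEY,
            "Content-Type": "application/octet-stream"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["brands"]

if KEY:
    for brand in detect_brands_local(r"C:\images\brand2.jpg"):
        print(brand["name"], brand["confidence"], brand["rectangle"])
```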

The file:

The code:

Your Submission

Using Azure AI Services / Cognitive Services, write your code to analyze the local file and URL below:

(1) Local file: C:\images\brand2.jpg

(2) URL: https://www.incimages.com/uploaded_files/image/1920x1080/GettyImages-71974463_431181.jpg

Please add a screenshot of the result and the image analyzed as part of your submission! ;-)

Computer Vision

Computer Vision // Thumbnail

A thumbnail is a miniature representation of a page or image that is used to identify a file by its contents. Thumbnails are an option in file managers, such as Windows Explorer, and they are found in photo editing and graphics programs to quickly browse multiple images in a folder.
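Computer Vision exposes this as the generateThumbnail operation: you specify the output width and height, and smart cropping re-centers the crop on the region of interest. Unlike the other operations, the response body is the binary image itself. A minimal sketch (endpoint, key, and image URL are placeholders):

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://<your-cv-resource>.cognitiveservices.azure.com"  # placeholder
KEY = ""  # your Computer Vision resource key

def thumbnail_url(width, height, smart=True):
    """generateThumbnail URL with the requested size and smart-cropping flag."""
    return ENDPOINT + "/vision/v3.2/generateThumbnail?" + urllib.parse.urlencode(
        {"width": width, "height": height, "smartCropping": str(smart).lower()})

if KEY:
    body = json.dumps({"url": "https://example.com/photo.jpg"}).encode("utf-8")
    req = urllib.request.Request(thumbnail_url(100, 100), data=body, headers={
        "Ocp-Apim-Subscription-Key": KEY,
        "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        # The response is the thumbnail image bytes, not JSON.
        with open("thumbnail.jpg", "wb") as out:
            out.write(resp.read())
```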

The file:

The code:

Your Submission

Using Azure AI Services / Cognitive Services, write your code to analyze your preferred local file (image) and a URL image and generate your thumbnails.

Please add a screenshot of the result and the image analyzed as part of your submission! ;-)

Computer Vision

Computer Vision // Text Detection // OCR

Optical Character Recognition (OCR) allows you to extract printed or handwritten text from images, such as photos of street signs and products, as well as from documents — invoices, bills, financial reports, articles, and more. Microsoft’s OCR technologies support extracting printed text in several languages.
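One way to call this is the synchronous OCR operation, which returns text grouped into regions, lines, and words. A minimal sketch that flattens that hierarchy back into plain lines (endpoint, key, and image URL are placeholders):

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://<your-cv-resource>.cognitiveservices.azure.com"  # placeholder
KEY = ""  # your Computer Vision resource key

def ocr_url(language="unk"):
    """Synchronous OCR endpoint; 'unk' lets the service auto-detect the language."""
    return ENDPOINT + "/vision/v3.2/ocr?" + urllib.parse.urlencode(
        {"language": language, "detectOrientation": "true"})

def extract_lines(ocr_result):
    """Flatten the regions/lines/words hierarchy into plain text lines."""
    lines = []
    for region in ocr_result.get("regions", []):
        for line in region.get("lines", []):
            lines.append(" ".join(w["text"] for w in line.get("words", [])))
    return lines

if KEY:
    body = json.dumps({"url": "https://example.com/sign.jpg"}).encode("utf-8")
    req = urllib.request.Request(ocr_url(), data=body, headers={
        "Ocp-Apim-Subscription-Key": KEY,
        "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        for text in extract_lines(json.load(resp)):
            print(text)
```

For longer documents and handwriting, the asynchronous Read operation is the recommended alternative; this sketch uses the simpler synchronous call.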

The file:

The code:

Your Submission

Using Azure AI Services / Cognitive Services, write your code to analyze the following URL.

https://media.wired.com/photos/59327d3b44db296121d6b881/master/w_1600%2Cc_limit/bond_0011_Layer-5.jpg

Please add a screenshot of the result and the image analyzed as part of your submission! ;-)

Computer Vision

Computer Vision // Face Detection

The file:

The code:

ATTENTION!!! This call may fail — your challenge is to try to fix that! ;-)

Retirement Notice!!! Some face detection capabilities of the Computer Vision API have been scheduled for retirement, so check the current Azure documentation before relying on them.

Check your resource costs

Go to your resource and choose Cost analysis

Monitor your Resource utilization

Go to your resource and choose Metrics

Computer Vision // Available APIs returns

Describe an Image

  • This example describes the contents of an image with the confidence score.

Categorize an Image

  • This example extracts (general) categories from a remote image with a confidence score.

Tag an Image

  • This example returns a tag (keyword) for each thing in the image.

Detect Faces

  • This example detects faces in a local image, gets their gender and age, and marks them with a bounding box.

Detect Adult or Racy Content

  • This example detects adult or racy content in a local image and then prints the adult/racy score.

Detect Color

  • This example detects the different aspects of the color scheme of a local image.

Detect Domain-specific Content

  • This example detects celebrities and landmarks in local images.

Detect Image Types

  • This example detects an image’s type (clip art/line drawing).

Detect Objects

  • This example detects different kinds of objects with bounding boxes in a local image.

Detect Brands

  • This example detects common brands like logos and puts a bounding box around them.

Generate Thumbnail

  • This example creates a thumbnail from both a local and URL image.

The architecture so far…

Something to think about…

Written by Caio Gasparine

Project Manager | Data & AI | Professor
