Create your own Vehicle Recognition system with Azure Custom Vision

Mar 16, 2021

Photo: https://blog.automart.co.za/wp-content/uploads/2016/01/cars-driving-on-a-highway.jpg

Vehicle Recognition System Goals

In this tutorial, I will show you how I created my own vehicle recognition system. The goal was to determine three attributes from a static image: the vehicle’s type, color, and make. To do this, I used a couple of different Azure tools.

To start off, I looked at the capabilities of Computer Vision, a pre-built API within Azure’s Cognitive Services. Given an image of one or more vehicles, it returned the vehicle type, the dominant color of the picture, and sometimes the make of the vehicle, all in JSON format. However, the make of the vehicle did not always appear. I needed more flexibility and customization to recognize specific content in the images, for example, whether the vehicle make is a Lexus or not. For this reason, I decided to look into Azure’s Custom Vision service.

What is Azure Custom Vision?

Custom Vision is an image recognition service under Azure Cognitive Services. It allows you to train your own custom models with your own images. There are two types of models that you can train, an object detection model as well as a classification model. Both models will allow you to apply one or more labels to the image. With the object detection model however, you can see the coordinates of the specific label as well. In other words, you can see where the object is in an image through a bounding box.

In this project we will train a total of three custom models — one object detection model to determine the vehicle type as well as two classification models to determine the vehicle’s color and vehicle’s make.

Example Use Case:

What You Need:

Images of vehicles (cars, trucks, RVs, buses, etc.) to train the models
Azure Subscription
Azure Storage Explorer (optional)

Part 1: Create Custom Vision resources in the Azure Portal

We will need to create a training and prediction resource to use the Custom Vision service. Start by going to the Azure portal.

Click on Create a resource -> Type in Custom Vision in the search bar -> Select Custom Vision from the marketplace -> Create -> Fill out the required information.

Part 2: Create a new project on the Custom Vision page

Classification vs. Object Detection project types

Custom Vision allows you to train either a classification or object detection model. Classification allows you to apply one or more labels to an image. Object Detection allows you to do the same, with one difference — you also get the coordinates of where the specified labels are found.

Create three separate projects — Vehicle Type, Vehicle Color, Vehicle Make

Go to the Custom Vision portal page and sign in with your Azure Portal account. To create a new project, we click the New Project button.

Create New Project on Custom Vision — Home

Here we can identify the project type as object detection or classification. Let’s start off with the Vehicle Type Project — an object detection model to determine if the vehicle is a car or a truck.

Choose training images

1. Number of images

To start training the model, you need about 15 images per label. However, I would recommend using many more images per label to get more accurate results.

2. Think about lighting, angles, background.

TIP: Consider the following factors while looking for vehicle images: different vehicle angles, different lighting, and different image backgrounds. These choices significantly affect the results of the model that you train.
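A quick way to check that every label has enough images before uploading is to count the files per label. The sketch below assumes a hypothetical local layout with one sub-folder per label (for example `training_images/car/` and `training_images/truck/`). The folder name, the helper names, and the list of file extensions are illustrative assumptions; the minimum of 15 follows the guideline above.

```python
from pathlib import Path

# Assumption: training images are stored locally as <root>/<label>/<image file>.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}
MIN_IMAGES_PER_LABEL = 15  # guideline mentioned above


def count_images_per_label(root):
    """Return {label: number of image files} for each sub-folder of root."""
    counts = {}
    for label_dir in sorted(Path(root).iterdir()):
        if label_dir.is_dir():
            counts[label_dir.name] = sum(
                1
                for p in label_dir.iterdir()
                if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


def labels_needing_more_images(counts, minimum=MIN_IMAGES_PER_LABEL):
    """Return the labels that have fewer images than the minimum."""
    return [label for label, n in counts.items() if n < minimum]


if __name__ == "__main__":
    root = Path("training_images")  # hypothetical folder name
    if root.is_dir():
        counts = count_images_per_label(root)
        print(counts)
        print("Need more images:", labels_needing_more_images(counts))
```

Running this before each upload makes it easy to spot a label that is under-represented compared with the others.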

Upload and tag images with the right label using bounding boxes.

To add your images, click Add images and then upload them from your local files. All of the uploaded images should appear in the untagged section. Draw a bounding box around each vehicle and assign it the appropriate label. Examples of labels include cars, trucks, RVs, buses, etc.

Photo of Vehicle: https://www.bing.com/images/blob?bcid=T4kFi-Hf13wC5w

Here I have an image with one car. I have drawn a bounding box around the object and labelled it as a car. It is important to ensure every single car or truck present in an image is labelled for better accuracy.
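Custom Vision works with bounding boxes in normalized form, where left, top, width, and height are fractions of the image size (Snippet 6 later converts them back to pixels). As a minimal sketch, a box measured in pixels can be converted like this; the helper name `to_normalized_region` and the example numbers are assumptions made for illustration.

```python
def to_normalized_region(x_min, y_min, x_max, y_max, image_width, image_height):
    """Convert a pixel bounding box to normalized left/top/width/height values."""
    inside_x = 0 <= x_min < x_max <= image_width
    inside_y = 0 <= y_min < y_max <= image_height
    if not (inside_x and inside_y):
        raise ValueError("bounding box must lie inside the image")
    return {
        "left": x_min / image_width,
        "top": y_min / image_height,
        "width": (x_max - x_min) / image_width,
        "height": (y_max - y_min) / image_height,
    }


if __name__ == "__main__":
    # A car occupying the middle of a 1280x720 image (made-up numbers).
    print(to_normalized_region(320, 180, 960, 540, 1280, 720))
```

Because the values are fractions, the same box stays valid if the image is resized.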

Repeat Part 2 for the other two models.

Vehicle Color (classification)

For example: Red vs. Blue

Photo of Vehicle: https://data.1freewallpapers.com/download/ferrari-458-ferrari-sports-car-red.jpg

Vehicle Make (classification)

For example: Lexus or not

Photo of Vehicle: https://ccnwordpress.blob.core.windows.net/journal/2019/09/lexus-nx-beach-main.jpg

Part 3: Train each of the object detection and classification models

Quick vs. Advanced

Click on the Train button at the top of the Custom Vision page to finally train your model. If you want to see quick results from your training, the Quick method is the best way to go. However, for improved performance, advanced training is useful. You can specify a compute time training budget.

Part 4: Evaluate the Models

From the Azure docs, some key definitions are:

Metrics

Precision — This is the fraction of identified classifications that were correct.
Recall — This is the fraction of actual classifications that were correctly identified.
Mean Average Precision — This is the average area under the precision/recall curve.
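To make the first two definitions concrete, here is a small sketch that computes precision and recall from counts of correct and incorrect predictions. The counts in the example are made up for illustration and do not come from the models in this tutorial.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for one label.

    precision = correct predictions / all predictions made for the label
    recall    = correct predictions / all actual instances of the label
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up example: the model tagged 10 images as "car", 8 of them correctly,
    # and missed 4 other images that actually contained a car.
    precision, recall = precision_recall(8, 2, 4)
    print("precision = {0:.2f}, recall = {1:.2f}".format(precision, recall))
```

A model can score high on one metric and low on the other, which is why the portal reports both.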

Other functionalities

Probability Threshold — You can use this slider to select the amount of confidence that a prediction needs to have to be considered correct.
Overlap Threshold — You can use this slider to determine how correct an object/classification prediction must be to be considered correct in training.
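The two sliders can be pictured as two checks on every predicted box. The sketch below assumes that overlap is measured as intersection over union (IoU), which is a common choice but is not stated in the definitions above, and the threshold values used as defaults are illustrative only.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, width, height) boxes."""
    ax1, ay1, aw, ah = box_a
    bx1, by1, bw, bh = box_b
    ax2, ay2 = ax1 + aw, ay1 + ah
    bx2, by2 = bx1 + bw, by1 + bh
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


def is_correct(pred_box, pred_probability, true_box,
               probability_threshold=0.5, overlap_threshold=0.3):
    """A prediction counts only if it is confident enough and overlaps enough."""
    confident = pred_probability >= probability_threshold
    overlapping = iou(pred_box, true_box) >= overlap_threshold
    return confident and overlapping


if __name__ == "__main__":
    tagged = (0.25, 0.0, 0.5, 0.5)     # box drawn by hand (made-up values)
    predicted = (0.0, 0.0, 0.5, 0.5)   # box returned by the model (made-up values)
    print(iou(predicted, tagged), is_correct(predicted, 0.9, tagged))
```

Raising either threshold makes the evaluation stricter, which usually increases precision and lowers recall.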

Part 5: Test your model

If you are not getting the results you want from your Quick or Advanced training, you might have to add more images and retrain your model.

Part 6: Publish Model

Retrieve endpoint

Click Publish -> Prediction URL -> Grab the prediction URL and prediction key
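The prediction URL and key can also be used without the SDK. The sketch below uses only the Python standard library and assumes you pass in the image-URL variant of the prediction URL and the key exactly as copied from the Prediction URL dialog, which asks for a `Prediction-Key` header, a JSON content type, and a body of the form `{"Url": "..."}`. The function names and the placeholder values are hypothetical.

```python
import json
import urllib.request


def build_prediction_request(prediction_url, prediction_key, image_url):
    """Build (but do not send) a POST request for an image that is hosted at a URL."""
    body = json.dumps({"Url": image_url}).encode("utf-8")
    return urllib.request.Request(
        prediction_url,
        data=body,
        headers={
            "Prediction-Key": prediction_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def predict(prediction_url, prediction_key, image_url):
    """Send the request and return the parsed JSON response."""
    request = build_prediction_request(prediction_url, prediction_key, image_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholders only: replace with the values copied from the portal.
    req = build_prediction_request(
        "https://example.invalid/prediction-url-from-portal",
        "your-prediction-key",
        "https://example.invalid/car.jpg",
    )
    print(req.get_method(), req.full_url)
```

The rest of this tutorial uses the Python SDK instead, which wraps the same endpoint.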

Part 7: Upload Test Images to Azure Blob Storage through Portal or Storage Explorer

Test Images

Grab some new images to test the model. Upload them to Blob Storage through the portal or through Storage Explorer.

Through Portal:

Go to your storage account -> Containers -> +Container -> Fill out info -> Upload your images

Azure Storage Explorer:

Find your storage account within your subscription -> Create a container under Blob Containers -> Upload your images

Code

Snippet 1: All required imports.

from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from azure.cognitiveservices.vision.customvision.training.models import ImageFileCreateBatch, ImageFileCreateEntry, Region
from msrest.authentication import ApiKeyCredentials
import time
from msrest.exceptions import HttpOperationError
import math
from PIL import Image
from io import BytesIO
import io
import base64
import requests
import glob
import os
from dotenv import load_dotenv

load_dotenv()

Snippet 2: Variables to store important endpoints and keys, and authenticate the training and prediction clients

credentials = ApiKeyCredentials(in_headers={"Training-key": os.getenv('training_key')})
trainer = CustomVisionTrainingClient(os.getenv('ENDPOINT'), credentials)
prediction_credentials = ApiKeyCredentials(in_headers={"Prediction-key": os.getenv('prediction_key')})
predictor = CustomVisionPredictionClient(os.getenv('ENDPOINT'), prediction_credentials)

Snippet 3: Retrieve the three custom models

# vehicle detection: car vs truck
project = trainer.get_project(project_id=os.getenv('type_project_id'))
iteration_name = "Iteration2"

# vehicle color: blue vs red
project2 = trainer.get_project(project_id=os.getenv('color_project_id'))
iteration_name2 = "Iteration1"

# vehicle make: lexus vs acura
vehicle_make = trainer.get_project(project_id=os.getenv('make_project_id'))
iteration_name3 = "Iteration1"
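Snippets 2 and 3 read every key, endpoint, and project ID with `os.getenv`, which silently returns `None` when a name is missing. As a small optional sketch, the check below reports missing settings before any client is created. The setting names are the ones used in the snippets above; the helper name `missing_settings` is an assumption.

```python
import os

# Names read with os.getenv in Snippets 2 and 3.
REQUIRED_SETTINGS = [
    "ENDPOINT",
    "training_key",
    "prediction_key",
    "type_project_id",
    "color_project_id",
    "make_project_id",
]


def missing_settings(environ=os.environ, required=REQUIRED_SETTINGS):
    """Return the names of required settings that are unset or empty."""
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Missing settings: " + ", ".join(missing))
```

Calling it right after `load_dotenv()` turns a confusing authentication error into a clear message about the `.env` file.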

Snippet 4: Retrieve the test image from Blob Storage using its SAS token, then use the trained endpoints to make predictions

# test_three holds the SAS URL of the test image (red and blue Lexus)
img_url = os.getenv('test_three')

# vehicle type: object detection on the full image,
# using the prediction client authenticated in Snippet 2
results = predictor.detect_image_url(project.id, iteration_name, img_url)

# vehicle make: the make project is a classification project,
# so it is called with classify_image_url rather than detect_image_url
results1 = predictor.classify_image_url(vehicle_make.id, iteration_name3, img_url)

“test_three” is an environment variable that stores the URL of the test image, including its SAS token. With it, we can call the trained endpoints to make predictions on the test image.

Snippet 5: View the Image we are testing

response = requests.get(img_url)
img = Image.open(BytesIO(response.content))
img.show()
width = img.size[0]
height = img.size[1]

Output of Snippet 5:

Main Test Image File https://data.1freewallpapers.com/download/blue-red-lexus-lc-500-convertible-2020-4k-cars-1280x720.jpg

Snippet 6: Takes main image, crops it according to the objects, and then determines the vehicle color and make for the cropped images.

# folder where the cropped vehicle images are written
out_dir = 'C:/Users/dthakar/repos/VehicleRecognition/Images/'
img_count = 1

for prediction in results.predictions:
    # skip detections with less than 10% probability
    if prediction.probability < 0.10:
        continue

    # normalized bbox coordinates -> actual pixel coordinates
    upper_x = math.floor(prediction.bounding_box.left * width)
    upper_y = math.floor(prediction.bounding_box.top * height)
    h = math.ceil(prediction.bounding_box.height * height)
    w = math.ceil(prediction.bounding_box.width * width)
    lower_x = upper_x + w
    lower_y = upper_y + h

    # crop the detected vehicle and save it as file_<n>.jpg
    croppedimg = img.crop((upper_x, upper_y, lower_x, lower_y))
    name = out_dir + 'file_' + str(img_count) + '.jpg'
    croppedimg.save(name)
    img_count += 1

    # classify the color of the cropped vehicle
    with open(name, "rb") as file:
        results2 = predictor.classify_image(project2.id, iteration_name2, file.read())

    # Display the results.
    print("Image " + name)
    print("Vehicle Type: " + prediction.tag_name)
    for prediction2 in results2.predictions:
        if prediction2.probability >= 0.50:
            print("Vehicle Color: " + prediction2.tag_name + " {0:.2f}%".format(prediction2.probability * 100))
    # the make was predicted on the full image in Snippet 4 (results1)
    for prediction1 in results1.predictions:
        if prediction1.probability >= 0.50:
            print("Vehicle Make: " + prediction1.tag_name)

Output of Snippet 6:

Results & Next Steps:

As the code snippets above show, we were able to determine the type of each vehicle in the image along with its color and make. To summarize, the main image was cropped around each vehicle that the object detection model found. In our case, the test image contained two cars, so two new files were created: the first contained a cropped image of the red car and the second a cropped image of the blue car. As a next step, the Custom Vision model for color was called on each cropped image, and the make prediction was reported alongside it. The results were accurate: File 1 contained a red Lexus car and File 2 contained a blue Lexus car.
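The cropping step in Snippet 6 relies on converting a normalized bounding box into pixel coordinates. The sketch below isolates that conversion in a standalone helper so it can be checked without calling the service. The name `to_crop_box` and the clamping to the image edges are additions made for illustration and are not part of the original snippet.

```python
import math


def to_crop_box(left, top, width, height, image_width, image_height):
    """Convert a normalized bounding box to a (left, upper, right, lower) pixel box.

    The result has the shape expected by PIL's Image.crop.
    """
    upper_x = max(0, math.floor(left * image_width))
    upper_y = max(0, math.floor(top * image_height))
    # Round the size up so the whole object stays inside the crop,
    # then clamp to the image edges (an added safety step).
    lower_x = min(image_width, upper_x + math.ceil(width * image_width))
    lower_y = min(image_height, upper_y + math.ceil(height * image_height))
    return upper_x, upper_y, lower_x, lower_y


if __name__ == "__main__":
    # Made-up detection on a 1280x720 image.
    print(to_crop_box(0.25, 0.125, 0.5, 0.75, 1280, 720))
```

With a helper like this, the loop in Snippet 6 could pass `prediction.bounding_box.left`, `top`, `width`, and `height` together with the image size and hand the returned tuple straight to `img.crop`.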

Azure Custom Vision allows for a lot of customization as you train the models. Training on a large number of images with varied backgrounds and angles improves the accuracy of the results. As a next step, one could also extend the classifiers to distinguish between more colors and more makes. That would turn the binary classifiers into multi-class classifiers that can handle more vehicles. I would encourage everyone to build their own custom model through this service!