Explore Python Libraries: An Introductory Guide to Object Detection with Detectron2

By Kimaru Thagana

Sep 1, 2020 • 6 Minute Read

Introduction

Object detection is a subfield of computer vision that deals with identifying instances of semantic objects in digital images and videos. A detected object is usually marked by drawing a bounding box around it. In an image, this box is static; in a video, it moves along with the object it follows.

Object detection technology has several applications, such as face detection, people counting, optical character recognition (OCR), and fault and defect detection, among others. This is an exciting field of research and application, and big tech companies are investing in it and building tools to perform object detection. Examples include Google with the TensorFlow Object Detection API, Facebook with Detectron, Amazon AWS with SageMaker, and ImageAI with its object detection module.

In this guide, you will go through an object detection example using Detectron2 by Facebook.

Assume you are a software developer looking to develop a proof of concept (PoC) of a face/person detection application for a security company. The client requests a PoC consisting of a simple program that, given an image, can draw a bounding box around a face/person if present.

To develop this, you choose to use Detectron2.

Detectron2

Detectron2 is a PyTorch-based modular object detection library. It is the successor to Detectron, which was built on Caffe2. Detectron2 improves on its predecessor, especially in training time, where it is much faster. It also sports new features such as Cascade R-CNN, panoptic segmentation, and DensePose, among others.

This guide assumes you have a fundamental understanding of computer vision and object detection, and at least intermediate knowledge of deep learning with PyTorch.

Setup

There are three main ways to set up.

1. Using Docker: Use these instructions to run Detectron2 in a Docker container. This method requires you to be familiar with Docker.

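2. Building from source: To use this method, run one of the two options in the code block below, with PyTorch already installed. Note that GCC and G++ version 5 or later is required.

    python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
    # Or, to install from a local clone instead:
    git clone https://github.com/facebookresearch/detectron2.git
    python -m pip install -e detectron2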
    

3. Installing Pre-Built Versions for Linux Only: Learn more about this method in the Detectron2 documentation.
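
Whichever setup method you pick, it can help to confirm that the packages this guide relies on are importable before going further. The snippet below is a minimal sketch using only the standard library; the helper name `check_packages` is hypothetical and not part of Detectron2.

```python
import importlib.util


def check_packages(names):
    """Map each package name to True if it can be found in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Packages imported later in this guide
status = check_packages(["torch", "cv2", "detectron2"])
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Any package reported as missing needs to be installed before the later code in this guide will run.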

Using Pre-trained Models

Pre-trained models are good for quick demos and can be downloaded from online resources such as the Detectron2 Model Zoo. To run object detection, pick any image and read it with OpenCV, as in the code below.

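    import detectron2
    import numpy as np
    import os, cv2

    from detectron2 import model_zoo
    from detectron2.engine import DefaultPredictor
    from detectron2.config import get_cfg
    from detectron2.utils.visualizer import Visualizer

    # OpenCV returns the image as a BGR NumPy array, or None if the path is wrong
    im = cv2.imread("./input.jpg")
    # imshow needs a window name; waitKey keeps the window open until a key is pressed
    cv2.imshow("input", im)
    cv2.waitKey(0)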
    

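Next, create a Detectron2 config (configuration variable) and a predictor to run inference (perform object detection) on the image you just loaded. The config must describe the same model architecture as the weights file, so both are taken from the same model zoo entry.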

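    cfg = get_cfg()
    # load the architecture settings that match the weights file
    cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
    # keep only detections with a confidence score of at least 0.5
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
    # load a weights file from online resources such as model zoo
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
    predictor = DefaultPredictor(cfg)
    outputs = predictor(im)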
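
Detectron2's config targets a CUDA device by default, so the predictor may fail to build on a machine without a GPU. The sketch below wraps the configuration steps in a helper that falls back to the CPU. The names `choose_device` and `build_predictor` are hypothetical, and the Detectron2 imports sit inside the function so the file can be loaded even where Detectron2 is not installed.

```python
def choose_device(cuda_available):
    """Return the value to store in cfg.MODEL.DEVICE."""
    return "cuda" if cuda_available else "cpu"


def build_predictor(config_name, score_threshold=0.5):
    """Build a DefaultPredictor for a model zoo entry (hypothetical helper)."""
    # Imported lazily so defining this function does not require Detectron2
    import torch
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    cfg = get_cfg()
    # Architecture settings and weights come from the same model zoo entry
    cfg.merge_from_file(model_zoo.get_config_file(config_name))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_threshold
    # Fall back to the CPU when no GPU is visible to PyTorch
    cfg.MODEL.DEVICE = choose_device(torch.cuda.is_available())
    return DefaultPredictor(cfg)
```

With Detectron2 installed, `build_predictor("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")` would return a predictor that can be called on the loaded image. Inference on the CPU works but is noticeably slower.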
    

At this point, inference has already run and its results are stored in the outputs variable. To check the classes of the predicted objects and the number of identified objects, run the code below.

    print(outputs["instances"].pred_classes)
    print(len(outputs["instances"].pred_classes))
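
For the person-detection PoC, only detections labelled as a person matter. The sketch below filters plain Python lists, which stand in for the predictor's tensors converted with `.tolist()`. It assumes that class ID 0 means "person" in the COCO label set used by the pre-trained model; the helper name `select_people` and the sample values are hypothetical.

```python
# Assumption: class ID 0 is "person" in the COCO labels of the pre-trained model
PERSON_CLASS_ID = 0


def select_people(boxes, class_ids, scores, min_score=0.5):
    """Return (box, score) pairs for confident person detections."""
    return [
        (box, score)
        for box, cls, score in zip(boxes, class_ids, scores)
        if cls == PERSON_CLASS_ID and score >= min_score
    ]


# Hypothetical values standing in for:
#   outputs["instances"].pred_boxes.tensor.tolist()
#   outputs["instances"].pred_classes.tolist()
#   outputs["instances"].scores.tolist()
boxes = [[10, 20, 110, 220], [50, 60, 90, 100], [200, 40, 300, 260]]
class_ids = [0, 16, 0]
scores = [0.98, 0.91, 0.42]

people = select_people(boxes, class_ids, scores)
print(len(people), "person detection(s):", people)
```

Each box is given as `[x1, y1, x2, y2]` pixel coordinates, so the result can be passed straight to a drawing routine.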
    

To visualize your results, you will require a special utility from Detectron2 called Visualizer.

    from detectron2.utils.visualizer import Visualizer
    from detectron2.data import MetadataCatalog
    # Visualizer expects RGB, so reverse OpenCV's BGR channel order
    visualizer = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
    img_output = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
    # convert back to BGR before handing the image to OpenCV
    cv2.imshow("detections", img_output.get_image()[:, :, ::-1])
    cv2.waitKey(0)
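
The `[:, :, ::-1]` slice in the visualization code reverses the channel axis of the image array. OpenCV stores pixels in BGR order while the Visualizer works in RGB, so the image is flipped on the way in and flipped back on the way out. A minimal illustration with a made-up 1x2 image:

```python
import numpy as np

# A hypothetical 1x2 image in OpenCV's BGR order:
# the first pixel is pure blue, the second is pure red
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis turns BGR into RGB
rgb = bgr[:, :, ::-1]

# Applying the same slice again restores the original BGR order
bgr_again = rgb[:, :, ::-1]

print(rgb[0, 0].tolist(), rgb[0, 1].tolist())
```

The slice is its own inverse, which is why the same expression appears both when creating the Visualizer and when displaying the result.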
    

Source: Detectron2 sample code

The result will be the previously loaded image, but with bounding boxes around identified objects and the predicted class names on the boxes. Here's an example output created by the Detectron2 team:

Conclusion

In this guide, you have learned the basic use of Detectron2 by Facebook. This skill is relevant to several exciting job roles, predominantly in the computer vision space. Positions that involve object detection include computer vision engineers, computer vision researchers, and image processing engineers.

To further build on the skills learned in this guide, challenge yourself to develop a custom object detection model that can detect anything you wish. To make it even more challenging, collect the dataset from scratch. For example, you might decide to collect image data on dogs and build an object detector that can identify dogs in images and draw bounding boxes around them. To help you get started, consider this Google Colab tutorial.
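
One way to start on that challenge is to get the collected annotations into the list-of-dicts layout that Detectron2's dataset registration expects. The sketch below builds one such record for a hypothetical dog image. The helper name `make_record`, the file name, the image size, and the box coordinates are all made up, and the fallback string for the box mode exists only so the snippet runs without Detectron2 installed.

```python
try:
    from detectron2.structures import BoxMode
    XYXY_ABS = BoxMode.XYXY_ABS
except ImportError:
    # Placeholder so the sketch runs without Detectron2; not usable for training
    XYXY_ABS = "XYXY_ABS"


def make_record(file_name, image_id, height, width, boxes_xyxy, category_id=0):
    """Build one dataset record with a single class (hypothetical helper)."""
    return {
        "file_name": file_name,
        "image_id": image_id,
        "height": height,
        "width": width,
        "annotations": [
            {"bbox": list(box), "bbox_mode": XYXY_ABS, "category_id": category_id}
            for box in boxes_xyxy
        ],
    }


# Hypothetical image with two labelled dogs, boxes as [x1, y1, x2, y2] in pixels
record = make_record(
    "dogs/0001.jpg", 1, 480, 640,
    [[30, 40, 200, 300], [250, 60, 400, 310]],
)
print(len(record["annotations"]), "annotation(s) for", record["file_name"])
```

A function that returns a list of such records can be passed to `DatasetCatalog.register`, with the class names set through `MetadataCatalog`.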

Kimaru T.

Kimaru is a firm believer in education as a tool for self-sufficiency. As a software development consultant living in Kenya, he mainly works to bring small and medium-sized businesses to the internet with custom solutions ranging from data processing to business digitization. Away from the field of coding and computer science, he participates as a mentor for young university students. In his free time, he prefers peace and quiet, away from screens but close to nature.
