Mount Google Drive
This cell imports the drive library and mounts your Google Drive as a local drive on the Colab VM. You can access your Drive files via the path "/content/gdrive/My Drive/".
from google.colab import drive
drive.mount('/content/gdrive')
Create a Symbolic Link for Google Drive
Execute this cell to create a symbolic link from your Drive project folder to a short local path.
!ln -s "/content/gdrive/My Drive/model_training/detectron2/football-pitch-segmentation" /content/football-pitch-segmentation
Unlink the Symbolic Link for Google Drive
If the symbolic link you created is incorrect or doesn't work, you can remove it with the line below.
!unlink /content/football-pitch-segmentation
Check GPU & CUDA Versions
Check the GPU and CUDA version of the instance allocated by Google Colab.
!nvidia-smi
Install Detectron2
Since Detectron2 is not installed by default in Google Colab, we install it and its dependencies with pip directly from Facebook Research's official GitHub repository.
!python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
Check CUDA, PyTorch, Detectron2 Versions
Check the versions of:
- Detectron2
- CUDA
- PyTorch
import torch, detectron2
!nvcc --version
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)
print("detectron2:", detectron2.__version__)
Import Libraries
Here we import the libraries we are going to use in this notebook:
- os – Used to build paths for saving and retrieving files from directories
- cv2 – Pre-process images before feeding them into Detectron2, for example converting BGR to RGB for training
- cv2_imshow – Used to display images in Google Colab, since the regular imshow doesn't work there
- register_coco_instances – Register a COCO-formatted dataset with Detectron2 so that it understands the annotation format
- DatasetCatalog, MetadataCatalog – Used to load the image data and its metadata
- Visualizer – Used to display predicted images with masks
- model_zoo – Access the Detectron2 model zoo, i.e., the repository from which all pre-trained models and weights can be downloaded
- get_cfg – Get a configuration object holding the model parameters used for training
- DefaultPredictor – Instantiate a predictor object, used to make predictions with the trained model
- DefaultTrainer – Used to train the Detectron2 model
# COMMON LIBRARIES
import os
import cv2
from datetime import datetime
from google.colab.patches import cv2_imshow
# DATA SET PREPARATION AND LOADING
from detectron2.data.datasets import register_coco_instances
from detectron2.data import DatasetCatalog, MetadataCatalog
# VISUALIZATION
from detectron2.utils.visualizer import Visualizer
from detectron2.utils.visualizer import ColorMode
# CONFIGURATION
from detectron2 import model_zoo
from detectron2.config import get_cfg
# EVALUATION
from detectron2.engine import DefaultPredictor
# TRAINING
from detectron2.engine import DefaultTrainer
Check if Everything is Installed Correctly – Run a Pre-trained Detectron2 Model
Before you start training, it’s a good idea to check that everything is working properly. The best way to do this is to perform inference using a pre-trained model.
First, we download the image on which we are going to run inference, read it with OpenCV, and display it.
!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O input.jpg
image = cv2.imread("./input.jpg")
cv2_imshow(image)
Get Configuration File & Predict on Image
In the code below we do the following:
- Instantiate a configuration object
- Pull and merge the Mask R-CNN config YAML file from the Detectron2 model zoo (the place where all models are stored)
- Set the Region of Interest (ROI) score threshold, i.e., the score above which we keep predictions
- Set the pre-trained weights to the Mask R-CNN weights from the model zoo
- Instantiate a Detectron2 predictor for inference
- Run inference on an image
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
Check Bounding Box Coordinates & Predicted Classes
Check the outputs of the predictions, i.e., the predicted classes and the corresponding bounding box coordinates.
print(outputs["instances"].pred_classes)
print(outputs["instances"].pred_boxes)
Visualize Predictions
Get an image with bounding boxes, masks, and class predictions drawn on it.
visualizer = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(out.get_image()[:, :, ::-1])
Football Dataset
We use the `football-pitch-segmentation` dataset as an example. This dataset is in `COCO Segmentation` format.
The structure of your dataset should look like this:
dataset-directory/
├─ README.dataset.txt
├─ README.roboflow.txt
├─ train
│ ├─ train-image-1.jpg
│ ├─ train-image-2.jpg
│ ├─ …
│ └─ _annotations.coco.json
├─ test
│ ├─ test-image-1.jpg
│ ├─ test-image-2.jpg
│ ├─ …
│ └─ _annotations.coco.json
└─ valid
├─ valid-image-1.jpg
├─ valid-image-2.jpg
├─ …
└─ _annotations.coco.json
Get Dataset from Roboflow
Install the Roboflow package and download the football-pitch-segmentation dataset.
api_key – Enter the API key that you get from your Roboflow dashboard.
!pip install roboflow
from roboflow import Roboflow
rf = Roboflow(api_key="______________")
project = rf.workspace("roboflow-jvuqo").project("football-pitch-segmentation")
dataset = project.version(1).download("coco-segmentation")
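The download call returns a dataset object whose `location` attribute (used in the registration cells below) points to the folder where the data was extracted. You can print it to confirm where the files landed:
# Show where Roboflow extracted the dataset on the Colab filesystem
print(dataset.location)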
Register Dataset with Detectron2
Detectron2 requires you to register a dataset before you actually use it to train a model.
To let Detectron2 know how to obtain a dataset named "football-pitch-segmentation", you would normally implement a function that returns the items in your dataset and then tell Detectron2 about that function. Because our dataset is already in COCO format, we can instead use Detectron2's `register_coco_instances` function (imported from `detectron2.data.datasets`) to register it.
Registering, in essence, tells Detectron2 how to obtain your dataset.
DATA_SET_NAME = dataset.name.replace(" ", "-")
ANNOTATIONS_FILE_NAME = "_annotations.coco.json"
# TRAIN SET
TRAIN_DATA_SET_NAME = f"{DATA_SET_NAME}-train"
TRAIN_DATA_SET_IMAGES_DIR_PATH = os.path.join(dataset.location, "train")
TRAIN_DATA_SET_ANN_FILE_PATH = os.path.join(dataset.location, "train", ANNOTATIONS_FILE_NAME)
register_coco_instances(
    name=TRAIN_DATA_SET_NAME,
    metadata={},
    json_file=TRAIN_DATA_SET_ANN_FILE_PATH,
    image_root=TRAIN_DATA_SET_IMAGES_DIR_PATH
)
# TEST SET
TEST_DATA_SET_NAME = f"{DATA_SET_NAME}-test"
TEST_DATA_SET_IMAGES_DIR_PATH = os.path.join(dataset.location, "test")
TEST_DATA_SET_ANN_FILE_PATH = os.path.join(dataset.location, "test", ANNOTATIONS_FILE_NAME)
register_coco_instances(
    name=TEST_DATA_SET_NAME,
    metadata={},
    json_file=TEST_DATA_SET_ANN_FILE_PATH,
    image_root=TEST_DATA_SET_IMAGES_DIR_PATH
)
# VALID SET
VALID_DATA_SET_NAME = f"{DATA_SET_NAME}-valid"
VALID_DATA_SET_IMAGES_DIR_PATH = os.path.join(dataset.location, "valid")
VALID_DATA_SET_ANN_FILE_PATH = os.path.join(dataset.location, "valid", ANNOTATIONS_FILE_NAME)
register_coco_instances(
    name=VALID_DATA_SET_NAME,
    metadata={},
    json_file=VALID_DATA_SET_ANN_FILE_PATH,
    image_root=VALID_DATA_SET_IMAGES_DIR_PATH
)
Check if Pre-trained Detectron2 Model is Working
Before training the model on our dataset, we do a quick check to confirm that the downloaded pre-trained Detectron2 model works: we run inference on an image and see whether the model predicts the masks, bounding boxes, and classes correctly.
Below we download an image from the COCO dataset and display it.
!wget http://images.cocodataset.org/val2017/000000439715.jpg -q -O input.jpg
image = cv2.imread("./input.jpg")
cv2_imshow(image)
In the code snippet below we do the following:
- cfg – Instantiate a configuration object; this holds the model parameters used for training
- merge_from_file – Populate the configuration object from a YAML file that holds the parameters for the model
- ROI_HEADS.SCORE_THRESH_TEST – Set this to 0.5 (50%), which tells the model to only keep predictions with a score of 0.5 or above
- WEIGHTS – Point to the pre-trained checkpoint, i.e., the weights to be used for inference
- DefaultPredictor – Instantiate a predictor object by passing it the configuration object
- predictor(image) – Pass the image to the predictor and get the inference outputs
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
Verify that the prediction worked by printing the tensors containing the predicted classes and bounding boxes.
print(outputs["instances"].pred_classes)
print(outputs["instances"].pred_boxes)
Check if the Custom Dataset Registered Correctly
Check that the custom dataset was registered correctly using MetadataCatalog.
You should see the dataset names for train, test & validation, as below:
[
    data_set
    for data_set in MetadataCatalog.list()
    if data_set.startswith(DATA_SET_NAME)
]
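As an additional check (a small sketch, not part of the original notebook), you can load the training split through DatasetCatalog and print the class names Detectron2 picked up from the COCO annotations; loading the dataset is what populates thing_classes in the metadata:
# Load the training split and inspect its size and class names
train_dicts = DatasetCatalog.get(TRAIN_DATA_SET_NAME)
print("number of training images:", len(train_dicts))
print("classes:", MetadataCatalog.get(TRAIN_DATA_SET_NAME).thing_classes)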
Visualize Training Dataset Image
Let's take a look at one of the training images with the mask, bounding box, and class overlaid on it.
metadata = MetadataCatalog.get(TRAIN_DATA_SET_NAME)
dataset_train = DatasetCatalog.get(TRAIN_DATA_SET_NAME)
dataset_entry = dataset_train[0]
image = cv2.imread(dataset_entry["file_name"])
visualizer = Visualizer(
    image[:, :, ::-1],
    metadata=metadata,
    scale=0.8,
    instance_mode=ColorMode.IMAGE_BW
)
out = visualizer.draw_dataset_dict(dataset_entry)
cv2_imshow(out.get_image()[:, :, ::-1])
Train Model – Set Hyperparameters & Output Directory
The code below configures the hyperparameters in the configuration file and sets the output directory. In our case I have set the output directory to a Google Drive location for persistent storage; the storage allocated in Google Colab is temporary and is deallocated as soon as you disconnect from the allocated Colab instance.
You also have the option to use Google Colab's temporary storage, which is deallocated as soon as you disconnect the notebook from the Colab instance. If you plan to use this, be sure to back up / download the trained model to your computer once training is finished.
# HYPERPARAMETERS
#https://github.com/facebookresearch/detectron2/blob/main/configs/COCO-InstanceSegmentation/mask_rcnn_R_101_FPN_3x.yaml
ARCHITECTURE = "mask_rcnn_R_101_FPN_3x"
CONFIG_FILE_PATH = f"COCO-InstanceSegmentation/{ARCHITECTURE}.yaml"
MAX_ITER = 2000
EVAL_PERIOD = 200
BASE_LR = 0.001
NUM_CLASSES = 3
# OUTPUT DIR – Uncomment to use Google Colab's temporarily allocated storage; however, you'll lose the trained model once you disconnect from Colab
# OUTPUT_DIR_PATH = os.path.join(
#     DATA_SET_NAME,
#     ARCHITECTURE,
#     datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
# )
# OUTPUT DIR – Customized output dir on Google Drive
GDRIVE_PATH = '/content/football-pitch-segmentation'
OUTPUT_DIR_PATH = os.path.join(
    GDRIVE_PATH,
    ARCHITECTURE,
    datetime.now().strftime('%Y-%m-%d-%H-%M-%S')
)
os.makedirs(OUTPUT_DIR_PATH, exist_ok=True)
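Since the whole point of writing to Drive is persistence, an optional sanity check (a sketch, assuming the symlink cell near the top of this notebook was run) is to confirm the path really is a symlink into your Drive:
# Optional: warn if GDRIVE_PATH is not actually the Google Drive symlink,
# in which case checkpoints would be lost when the Colab VM is recycled
if not os.path.islink(GDRIVE_PATH):
    print("Warning: GDRIVE_PATH is not a symlink to Google Drive; checkpoints will not persist.")
print("Checkpoints will be written to:", OUTPUT_DIR_PATH)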
Train Model – Configuration File Setup
We need to provide a configuration file to train the model. In the below code snippet we do the following:
- Instantiate a config object
- Merge the existing config file into this object
- Weights – Provide the path to the pre-trained weights
- Train Dataset – Set the train dataset
- Test Dataset – Set the test dataset
- Set the other parameters as given below
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(CONFIG_FILE_PATH))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(CONFIG_FILE_PATH)
cfg.DATASETS.TRAIN = (TRAIN_DATA_SET_NAME,)
cfg.DATASETS.TEST = (TEST_DATA_SET_NAME,)
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
cfg.TEST.EVAL_PERIOD = EVAL_PERIOD
cfg.DATALOADER.NUM_WORKERS = 2
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.INPUT.MASK_FORMAT='bitmask'
cfg.SOLVER.BASE_LR = BASE_LR
cfg.SOLVER.MAX_ITER = MAX_ITER
cfg.MODEL.ROI_HEADS.NUM_CLASSES = NUM_CLASSES
cfg.OUTPUT_DIR = OUTPUT_DIR_PATH
Train Model – Start Training
- DefaultTrainer(cfg) – To train the model, you pass the configuration object to the trainer
- trainer.resume_or_load(resume=False) – Specify whether to resume training from a checkpoint; if resume=True, training resumes from the last checkpoint
- trainer.train() – Starts the training process
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
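If you want to reload the exact training configuration later (for example, to run inference in a fresh session), one option (a small sketch, not part of the original notebook) is to dump the merged config next to the checkpoints:
# Save the fully merged config alongside the model checkpoints for later reuse
with open(os.path.join(cfg.OUTPUT_DIR, "config.yaml"), "w") as f:
    f.write(cfg.dump())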
TensorBoard – Check Training Progress
Get a graphical representation of training metrics such as:
- Total Loss
- Classification Loss
- Learning Rate
- Mask Loss
# Look at training curves in tensorboard:
%load_ext tensorboard
%tensorboard --logdir $OUTPUT_DIR_PATH
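The default trainer also writes the same scalars as JSON lines to metrics.json in the output directory, so if TensorBoard is inconvenient you can inspect them directly; a minimal sketch:
import json

# Print the iteration and total loss of the last few logged entries
with open(os.path.join(cfg.OUTPUT_DIR, "metrics.json")) as f:
    metrics = [json.loads(line) for line in f]
for m in metrics[-5:]:
    print(m.get("iteration"), m.get("total_loss"))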
Evaluate the Trained Model
Now that the model is trained, we evaluate its performance by performing the following steps:
- Load Weights – Point the configuration to the trained model weights
- Region of Interest (ROI) Threshold – Set the ROI score threshold so that only predictions above 0.7 are considered
- Predictor – Instantiate a predictor object by passing the model configuration to it
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7
predictor = DefaultPredictor(cfg)
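The checks in the next section are visual. If you also want quantitative numbers (COCO-style AP for boxes and masks), Detectron2's COCOEvaluator can be run on the test split registered earlier; a sketch:
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

# Run COCO-style evaluation (bbox and segm AP) on the test split
evaluator = COCOEvaluator(TEST_DATA_SET_NAME, output_dir=cfg.OUTPUT_DIR)
test_loader = build_detection_test_loader(cfg, TEST_DATA_SET_NAME)
print(inference_on_dataset(predictor.model, test_loader, evaluator))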
Visualize & Evaluate the Trained Model
In the code below we loop through each image in the validation dataset, pass it to the model for prediction, and then visualize the result to check whether the predictions are accurate.
dataset_valid = DatasetCatalog.get(VALID_DATA_SET_NAME)
for d in dataset_valid:
    img = cv2.imread(d["file_name"])
    outputs = predictor(img)
    visualizer = Visualizer(
        img[:, :, ::-1],
        metadata=metadata,
        scale=0.8,
        instance_mode=ColorMode.IMAGE_BW
    )
    out = visualizer.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])