
Generating Ground Truth Images

Ground truth images are essential for evaluating object detection and background subtraction techniques. But what is a ground truth image? Consider a video frame captured with a fixed camera, in which some objects are static and others are moving. The ground truth image for that frame assigns one of two labels to every pixel: usually white for pixels that belong to moving objects and black for static pixels. The goal of an object detection technique is to produce a detection image that is as close as possible to the ground truth image. To compare the performance of two or more object detection techniques, their detection images are compared with the corresponding ground truth images.

Most standard datasets come with ground truth images for their test sequences, but the coverage is not uniform across datasets. Some datasets provide a ground truth image for only one test frame, while others provide one for every video frame. And if you want to evaluate performance on your own dataset, there will be no ground truth images at all. A technique for manually generating the ground truth image for a video frame is explained below.
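To make the comparison between a detection image and its ground truth image concrete, here is a minimal sketch in plain Python. It assumes both images are already loaded as lists of rows of pixel values, with any value above 0 treated as "moving" and 0 as "static". The function name `compare_masks` and the choice of precision, recall, and F-measure as scores are illustrative assumptions, not something prescribed by a particular dataset.

```python
def compare_masks(detection, ground_truth):
    """Count pixel-wise agreement between two binary masks of equal size.

    Assumption: a pixel is 'moving' if its value is > 0 (white)
    and 'static' if its value is 0 (black).
    """
    tp = fp = fn = tn = 0
    for det_row, gt_row in zip(detection, ground_truth):
        for det, gt in zip(det_row, gt_row):
            det_moving, gt_moving = det > 0, gt > 0
            if det_moving and gt_moving:
                tp += 1          # moving pixel correctly detected
            elif det_moving:
                fp += 1          # static pixel wrongly marked as moving
            elif gt_moving:
                fn += 1          # moving pixel that was missed
            else:
                tn += 1          # static pixel correctly left black

    # Guard against division by zero when a mask has no white pixels.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall, "f_measure": f_measure}


# Toy 4x4 example: a 2x2 moving object in the ground truth,
# and a detection that misses one pixel and adds one spurious pixel.
ground_truth = [
    [0,   0,   0,   0],
    [0, 255, 255,   0],
    [0, 255, 255,   0],
    [0,   0,   0,   0],
]
detection = [
    [0,   0,   0,   0],
    [0, 255, 255, 255],
    [0, 255,   0,   0],
    [0,   0,   0,   0],
]
scores = compare_masks(detection, ground_truth)
print(scores)
```

Running the same function on the detection images of two different techniques, against the same ground truth image, gives scores that can be compared directly.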

First, make sure that the test video is in image sequence format, i.e., all video frames are split into individual images. Look at the frames before and after the target frame to identify the moving objects in that frame. Then zoom in on the target frame in your image editing software so that you can closely observe the boundaries of the moving objects. Now use a custom (freehand) selection tool to select the objects. Once all objects are selected, press delete; with the editor's background color set to white, this clears the object regions, or in other words fills them with white pixels. Next, use the invert selection tool to select the non-object regions of the frame. Press delete, then fill that region with black using the flood fill tool. You are done. Save the final image, in which the moving and static pixels are labeled white and black respectively.
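If you would rather record the object outlines as coordinates than paint them in an editor, the same white-on-black labeling can be produced programmatically. The sketch below is an alternative to the manual procedure above, not part of it. It assumes each moving object has been traced as a polygon of `(x, y)` vertices, and it fills the polygons with a standard ray-casting point-in-polygon test evaluated at pixel centers. The names `point_in_polygon`, `make_ground_truth`, and `save_pgm`, as well as the choice of the binary PGM format for output, are illustrative assumptions.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is the point (x, y) inside the polygon?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def make_ground_truth(width, height, polygons):
    """Return a mask (list of rows): 255 for moving pixels, 0 for static."""
    mask = [[0] * width for _ in range(height)]   # start all black (static)
    for row in range(height):
        for col in range(width):
            cx, cy = col + 0.5, row + 0.5         # sample at the pixel center
            if any(point_in_polygon(cx, cy, p) for p in polygons):
                mask[row][col] = 255              # white (moving)
    return mask


def save_pgm(mask, path):
    """Write the mask as a binary PGM (P5) grayscale image."""
    height, width = len(mask), len(mask[0])
    with open(path, "wb") as f:
        f.write(f"P5\n{width} {height}\n255\n".encode("ascii"))
        for row in mask:
            f.write(bytes(row))


# Hypothetical outline of one moving object in a 6x5 frame.
outline = [(1, 1), (4, 1), (4, 3), (1, 3)]
mask = make_ground_truth(6, 5, [outline])
for row in mask:
    print(row)
```

Calling `save_pgm(mask, "frame_gt.pgm")` (the file name is only an example) writes the mask to disk. Scanning every pixel against every polygon is slow for large frames, so this is meant as a sketch of the labeling rule rather than an efficient implementation.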