Upload Labeled Images to Object Detection Projects

Updated on 21 Dec 2024

This article applies to these versions of LandingLens:

LandingLens	LandingLens on Snowflake
✓	✓ (see exceptions below)

If you have labeled images you want to upload to Object Detection projects, you can upload them in the Pascal VOC (Visual Object Classes) format. This format involves uploading the original (unlabeled) image and a corresponding XML file. The XML file contains the label (annotation) details of its paired image. The XML file essentially tells LandingLens where each label is on its associated image and what the name of the class is.

The image and XML file must have the same file name (with different extensions). For example:

vehicle_123.png
vehicle_123.xml
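
Before uploading a large batch, it can help to confirm that every image has a matching XML file. The following is a minimal sketch, not part of LandingLens. The function name `find_unpaired` and the list of image extensions are assumptions made for illustration, so adjust the extensions to the image types you actually use.

```python
from pathlib import Path

# Assumed set of image extensions for this sketch; adjust as needed.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def find_unpaired(file_names):
    """Return (images without an XML file, XML files without an image).

    Files are matched by name without the extension, for example
    vehicle_123.png pairs with vehicle_123.xml.
    """
    image_stems = set()
    xml_stems = set()
    for name in file_names:
        path = Path(name)
        suffix = path.suffix.lower()
        if suffix in IMAGE_EXTENSIONS:
            image_stems.add(path.stem)
        elif suffix == ".xml":
            xml_stems.add(path.stem)
    return sorted(image_stems - xml_stems), sorted(xml_stems - image_stems)


if __name__ == "__main__":
    names = ["vehicle_123.png", "vehicle_123.xml", "vehicle_124.png"]
    missing_xml, missing_image = find_unpaired(names)
    print("Images without XML:", missing_xml)
    print("XML files without image:", missing_image)
```

To check a real folder, pass in the names from `os.listdir()` for that folder.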

Note:

If you're using LandingLens on Snowflake, uploading Pascal VOC files is not supported when loading images from Snowflake stages. You can still upload the files from a hard drive.

Pascal VOC XML Files

When you upload an image and its corresponding XML file to an Object Detection project in LandingLens, the XML file must follow the format of the following code sample:

<?xml version='1.0' encoding='UTF-8'?>
<annotation>
  <object>
    <pose>Unspecified</pose>
    <occluded>0</occluded>
    <difficult>0</difficult>
    <truncated>0</truncated>
    <name>Diamond</name>
    <labeler>0f7936ab-7025-482f-9812-4d9c778e00f6</labeler>
    <bndbox>
      <xmax>926</xmax>
      <xmin>652</xmin>
      <ymax>3046</ymax>
      <ymin>2808</ymin>
    </bndbox>
  </object>
</annotation>
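
To see how the class name and box coordinates can be read back out of a file like this, here is a minimal sketch that uses Python's standard `xml.etree.ElementTree` module. It illustrates the structure only and is not how LandingLens itself parses the file. The function name `read_boxes` is hypothetical, and the embedded sample is an abbreviated copy of the XML above.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample annotation (optional elements omitted).
SAMPLE = """<annotation>
  <object>
    <name>Diamond</name>
    <bndbox>
      <xmax>926</xmax>
      <xmin>652</xmin>
      <ymax>3046</ymax>
      <ymin>2808</ymin>
    </bndbox>
  </object>
</annotation>"""


def read_boxes(xml_text):
    """Return one dict per <object>, with the class name and box edges."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        boxes.append({
            "name": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return boxes


if __name__ == "__main__":
    for item in read_boxes(SAMPLE):
        print(item)
```

The lookups are by element name, so the order of the elements inside `<bndbox>` does not matter to this sketch.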

Pascal VOC Elements

The following table explains the elements in the Pascal VOC XML file.

Element	Definition	Example
`<?xml version='1.0' encoding='UTF-8'?>`	The XML preamble that defines the XML standard and encoding method.	`<?xml version='1.0' encoding='UTF-8'?>`
`<annotation>`	This element wraps around all objects.	`<annotation>`
`<object>`	This element wraps around all elements that define the labeled object.	`<object>`
`<pose>`	Identifies if the image is skewed. The default value is `Unspecified`, which means that the image is not skewed.	`<pose>Unspecified</pose>`
`<occluded>`	Indicates whether another labeled object overlaps the labeled object. Values: `0` (the labeled object is not overlapped) or `-1` (the labeled object is overlapped).	`<occluded>0</occluded>`
`<difficult>`	Indicates whether the object is easy to detect. Values: `0` (the object is easy to detect) or `-1` (the object is difficult to detect).	`<difficult>0</difficult>`
`<truncated>`	Indicates whether the bounding box leaves out part of the object. This can happen because the object isn't fully in the frame (for example, the object is a cat that is only halfway in the picture) or because something is in front of the object. Values: `0` (the bounding box includes the full object) or `-1` (the bounding box does not include the full object).	`<truncated>0</truncated>`
`<name>`	The name of the class (label). When the image and the corresponding XML file are uploaded to LandingLens, LandingLens converts this name into a class.	`<name>Diamond</name>`
`<bndbox>`	This element wraps around the dimensions for the edges of the bounding box, which is the rectangle around the labeled object.	`<bndbox>`
`<xmax>`	The distance from the left edge of the image to the right side of the bounding box (in pixels).	`<xmax>2085</xmax>`
`<xmin>`	The distance from the left edge of the image to the left side of the bounding box (in pixels).	`<xmin>1032</xmin>`
`<ymax>`	The distance from the top of the image to the bottom side of the bounding box (in pixels).	`<ymax>2800</ymax>`
`<ymin>`	The distance from the top of the image to the top side of the bounding box (in pixels).	`<ymin>2396</ymin>`
`<filename>`	The file name and image file extension of the image.	`<filename>IMG_6397.jpg</filename>`
`<segmented>`	Indicates whether any of the bounding boxes are not rectangles (for example, if one is a hexagon). Values: `0` (all bounding boxes are rectangles) or `-1` (at least one bounding box is not a rectangle).	`<segmented>0</segmented>`
`<size>`	This element wraps around the dimensions of the image.	`<size>`
`<height>`	The height of the image (in pixels).	`<height>4032</height>`
`<width>`	The width of the image (in pixels).	`<width>3024</width>`
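
Because all four box values are measured from the top-left corner of the image, `xmin` should be smaller than `xmax`, and `ymin` should be smaller than `ymax`. The sketch below is a general sanity check based on those definitions. It is not a documented LandingLens validation rule, and the function name `box_problems` is hypothetical.

```python
def box_problems(xmin, ymin, xmax, ymax, width=None, height=None):
    """Return a list of human-readable problems found in one bounding box.

    width and height are the optional image dimensions in pixels, as given
    in the <size> element when the file includes one.
    """
    problems = []
    if xmin >= xmax:
        problems.append("xmin must be less than xmax")
    if ymin >= ymax:
        problems.append("ymin must be less than ymax")
    if min(xmin, ymin) < 0:
        problems.append("coordinates must not be negative")
    if width is not None and xmax > width:
        problems.append("xmax exceeds image width")
    if height is not None and ymax > height:
        problems.append("ymax exceeds image height")
    return problems


if __name__ == "__main__":
    # A well-formed box inside a 3024 x 4032 image: no problems reported.
    print(box_problems(652, 2808, 926, 3046, width=3024, height=4032))
    # A box whose left and right edges are swapped.
    print(box_problems(926, 2808, 652, 3046))
```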

Convert to Pascal VOC

If you have images annotated in a format that is NOT Pascal VOC, you can use the Python scripts in this section to convert from YOLO or COCO to Pascal VOC.

To convert from other formats, use a third-party conversion tool to convert the files to Pascal VOC. Then upload the images and the XML files to LandingLens.

Convert from YOLO to Pascal VOC

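Use the following Python script to convert YOLO annotations to Pascal VOC. The script requires the opencv-python, lxml, and tqdm packages. It reads the class names from a classes.txt file in the annotations folder, and it deletes the output folder if that folder already exists.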

import argparse
import os
import sys
import shutil

import cv2
from lxml import etree, objectify
from tqdm import tqdm

# Running totals that are printed at the end of the run
images_nums = 0
category_nums = 0
bbox_nums = 0


# Write label information to xml
def save_anno_to_xml(filename, size, objs, save_path):
    E = objectify.ElementMaker(annotate=False)
    anno_tree = E.annotation(
        E.folder("DATA"),
        E.filename(filename),
        E.source(
            E.database("The VOC Database"),
            E.annotation("PASCAL VOC"),
            E.image("flickr")
        ),
        E.size(
            E.width(size[1]),
            E.height(size[0]),
            E.depth(size[2])
        ),
        E.segmented(0)
    )
    for obj in objs:
        E2 = objectify.ElementMaker(annotate=False)
        anno_tree2 = E2.object(
            E.name(obj[0]),
            E.pose("Unspecified"),
            E.truncated(0),
            E.difficult(0),
            E.bndbox(
                E.xmin(obj[1][0]),
                E.ymin(obj[1][1]),
                E.xmax(obj[1][2]),
                E.ymax(obj[1][3])
            )
        )
        anno_tree.append(anno_tree2)
    # Use splitext so that extensions of any length (such as .jpeg) are handled
    anno_path = os.path.join(save_path, os.path.splitext(filename)[0] + ".xml")
    etree.ElementTree(anno_tree).write(anno_path, pretty_print=True)


def xywhn2xyxy(bbox, size):
    # bbox: normalized YOLO box (x_center, y_center, width, height)
    # size: image shape from OpenCV (height, width, channels)
    bbox = list(map(float, bbox))
    size = list(map(float, size))
    xmin = (bbox[0] - bbox[2] / 2.) * size[1]
    ymin = (bbox[1] - bbox[3] / 2.) * size[0]
    xmax = (bbox[0] + bbox[2] / 2.) * size[1]
    ymax = (bbox[1] + bbox[3] / 2.) * size[0]
    box = [xmin, ymin, xmax, ymax]
    # Truncate to whole pixels
    return list(map(int, box))


def parseXmlFilse(image_path, anno_path, save_path):
    global images_nums, category_nums, bbox_nums
    assert os.path.exists(image_path), "ERROR {} does not exist".format(image_path)
    assert os.path.exists(anno_path), "ERROR {} does not exist".format(anno_path)
    # Caution: an existing output folder is deleted and recreated
    if os.path.exists(save_path):
        shutil.rmtree(save_path)
    os.makedirs(save_path)

    # classes.txt lists one class name per line; the line index is the YOLO class ID
    category_set = []
    with open(os.path.join(anno_path, 'classes.txt'), 'r') as f:
        for i in f.readlines():
            category_set.append(i.strip())
    category_nums = len(category_set)
    category_id = dict((k, v) for k, v in enumerate(category_set))

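    images = [os.path.join(image_path, i) for i in os.listdir(image_path)]
    files = [os.path.join(anno_path, i) for i in os.listdir(anno_path)]
    # Map each image name (without extension) to its position in the images list.
    # splitext handles extensions of any length, such as .jpeg
    images_index = dict((os.path.splitext(os.path.basename(v))[0], k) for k, v in enumerate(images))
    images_nums = len(images)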

    for file in tqdm(files):
        if os.path.splitext(file)[-1] != '.txt' or 'classes' in file.split(os.sep)[-1]:
            continue
        # Skip annotation files that have no matching image
        stem = file.split(os.sep)[-1][:-4]
        if stem not in images_index:
            continue
        index = images_index[stem]
        img = cv2.imread(images[index])
        if img is None:
            # cv2.imread returns None when the file cannot be read as an image
            continue
        shape = img.shape
        filename = images[index].split(os.sep)[-1]
        objects = []
        with open(file, 'r') as fid:
            for i in fid.readlines():
                i = i.strip().split()
                if not i:
                    continue  # ignore blank lines
                category = int(i[0])
                category_name = category_id[category]
                bbox = xywhn2xyxy((i[1], i[2], i[3], i[4]), shape)
                obj = [category_name, bbox]
                objects.append(obj)
        bbox_nums += len(objects)
        save_anno_to_xml(filename, shape, objects, save_path)


if __name__ == '__main__':
    """
    Script Description:
        This script converts annotation files (.txt) in YOLO format to annotation files (.xml) in VOC format.
    Parameter Description:
        anno_path: folder that contains the YOLO .txt annotation files and classes.txt.
        save_path: folder where the XML files will be output.
        image_path: folder that contains the images.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('-ap', '--anno-path', type=str, default='./data/labels/yolo', help='yolo txt path')
    parser.add_argument('-s', '--save-path', type=str, default='./data/convert/voc', help='xml save path')
    parser.add_argument('--image-path', default='./data/images')

    opt = parser.parse_args()
    print(opt)
    # argparse supplies the defaults above when no arguments are given
    parseXmlFilse(**vars(opt))
    print("image nums: {}".format(images_nums))
    print("category nums: {}".format(category_nums))
    print("bbox nums: {}".format(bbox_nums))
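
The core of the conversion is the arithmetic in `xywhn2xyxy`: YOLO stores a box as a center point plus a width and height, all normalized to the range 0 to 1, while Pascal VOC stores the pixel positions of the four edges. The following standalone sketch repeats that arithmetic with a worked example. The function name `yolo_to_voc` and the sample numbers are illustrative only.

```python
def yolo_to_voc(x_center, y_center, box_w, box_h, img_w, img_h):
    """Convert one normalized YOLO box to Pascal VOC pixel edges."""
    xmin = int((x_center - box_w / 2) * img_w)
    ymin = int((y_center - box_h / 2) * img_h)
    xmax = int((x_center + box_w / 2) * img_w)
    ymax = int((y_center + box_h / 2) * img_h)
    return xmin, ymin, xmax, ymax


if __name__ == "__main__":
    # A box centered in a 640 x 480 image, a quarter of the image wide
    # and half of the image tall.
    print(yolo_to_voc(0.5, 0.5, 0.25, 0.5, img_w=640, img_h=480))
    # (240, 120, 400, 360)
```

As in the script, `int()` truncates toward zero, so box edges can shift by up to one pixel compared with rounding.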

Convert from COCO to Pascal VOC

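Use the following Python script to convert COCO annotations to Pascal VOC. The script requires the pycocotools, lxml, and tqdm packages, and it deletes the output folder if that folder already exists.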

from pycocotools.coco import COCO
import os
from lxml import etree, objectify
import shutil
from tqdm import tqdm
import sys
import argparse


# index name and id
def catid2name(coco):
    classes = dict()
    for cat in coco.dataset['categories']:
        classes[cat['id']] = cat['name']
    return classes


# write the label to xml
def save_anno_to_xml(filename, size, objs, save_path):
    E = objectify.ElementMaker(annotate=False)
    anno_tree = E.annotation(
        E.folder("DATA"),
        E.filename(filename),
        E.source(
            E.database("The VOC Database"),
            E.annotation("PASCAL VOC"),
            E.image("flickr")
        ),
        E.size(
            E.width(size['width']),
            E.height(size['height']),
            E.depth(size['depth'])
        ),
        E.segmented(0)
    )
    for obj in objs:
        E2 = objectify.ElementMaker(annotate=False)
        anno_tree2 = E2.object(
            E.name(obj[0]),
            E.pose("Unspecified"),
            E.truncated(0),
            E.difficult(0),
            E.bndbox(
                E.xmin(obj[1]),
                E.ymin(obj[2]),
                E.xmax(obj[3]),
                E.ymax(obj[4])
            )
        )
        anno_tree.append(anno_tree2)
    # Use splitext so that extensions of any length (such as .jpeg) are handled
    anno_path = os.path.join(save_path, os.path.splitext(filename)[0] + ".xml")
    etree.ElementTree(anno_tree).write(anno_path, pretty_print=True)


def load_coco(anno_file, xml_save_path):
    if os.path.exists(xml_save_path):
        shutil.rmtree(xml_save_path)
    os.makedirs(xml_save_path)

    coco = COCO(anno_file)
    classes = catid2name(coco)
    imgIds = coco.getImgIds()
    classesIds = coco.getCatIds()
    for imgId in tqdm(imgIds):
        size = {}
        img = coco.loadImgs(imgId)[0]
        filename = img['file_name']
        width = img['width']
        height = img['height']
        size['width'] = width
        size['height'] = height
        size['depth'] = 3
        annIds = coco.getAnnIds(imgIds=img['id'], iscrowd=None)
        anns = coco.loadAnns(annIds)
        objs = []
        for ann in anns:
            object_name = classes[ann['category_id']]
            # bbox:[x,y,w,h]
            bbox = list(map(int, ann['bbox']))
            xmin = bbox[0]
            ymin = bbox[1]
            xmax = bbox[0] + bbox[2]
            ymax = bbox[1] + bbox[3]
            obj = [object_name, xmin, ymin, xmax, ymax]
            objs.append(obj)
        save_anno_to_xml(filename, size, objs, xml_save_path)


def parseJsonFile(data_dir, xmls_save_path):
    assert os.path.exists(data_dir), "data dir:{} does not exist".format(data_dir)

    if os.path.isdir(data_dir):
        # Expect COCO 2017-style annotation files inside the folder
        data_types = ['train2017', 'val2017']
        for data_type in data_types:
            ann_file = os.path.join(data_dir, 'instances_{}.json'.format(data_type))
            # Write each split to its own subfolder of the output folder
            split_save_path = os.path.join(xmls_save_path, data_type)
            load_coco(ann_file, split_save_path)
    elif os.path.isfile(data_dir):
        anno_file = data_dir
        load_coco(anno_file, xmls_save_path)


if __name__ == '__main__':
    """
    Script Description:
        This script converts JSON files in COCO format to XML files in VOC format.
    Parameter Description:
        data_dir: path of the JSON file, or of a folder that contains
            instances_train2017.json and instances_val2017.json.
        save_path: folder where the XML files will be output.
    """

    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--data-dir', type=str, default='./data/labels/coco/train.json', help='json path')
    parser.add_argument('-s', '--save-path', type=str, default='./data/convert/voc', help='xml save path')
    opt = parser.parse_args()
    print(opt)

    # argparse supplies the defaults above when no arguments are given
    parseJsonFile(opt.data_dir, opt.save_path)
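
The key step in `load_coco` is the box conversion: COCO stores a box as `[x, y, width, height]`, where `x` and `y` are the top-left corner in pixels, while Pascal VOC stores the four edges. This standalone sketch mirrors that step, including the truncation to whole pixels before the addition. The function name `coco_to_voc` and the sample values are illustrative only.

```python
def coco_to_voc(bbox):
    """Convert a COCO [x, y, width, height] box to (xmin, ymin, xmax, ymax)."""
    # Truncate each value to a whole pixel first, as the script above does.
    x, y, w, h = (int(value) for value in bbox)
    return x, y, x + w, y + h


if __name__ == "__main__":
    print(coco_to_voc([100.5, 50.0, 200.2, 80.9]))
    # (100, 50, 300, 130)
```

Truncating before adding means the right and bottom edges can land a pixel short of the exact values. If that matters for your data, compute `x + w` and `y + h` first and round afterward.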

Upload Images and Their XML Files

Note:

If you're using LandingLens on Snowflake, uploading Pascal VOC files is not supported when loading images from Snowflake stages. You can still upload the files from a hard drive.

Before uploading labeled images, ensure that they are not in a zipped file. LandingLens doesn't support uploading zipped files.
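
If your labeled images arrive as a ZIP archive, extract them to a regular folder first. This is a minimal sketch that uses Python's standard `zipfile` module. The function name `extract_archive` and the paths in the usage comment are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, out_dir):
    """Extract every file in zip_path into out_dir and return their names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out)
        return sorted(archive.namelist())


# Example usage (hypothetical paths):
# extract_archive("labeled_images.zip", "labeled_images")
```

Only extract archives from sources you trust.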

Once you've located your image files and XML files and confirmed that each pair has the same file name, follow the instructions below to upload the Pascal VOC files to LandingLens:

1. Open the Object Detection project you want to upload images to.
2. Open the Upload pop-up window:
- If you haven't uploaded any images to the project yet, scroll down to the bottom of the page and click Click Here.
- If you've already uploaded images, click the Upload icon.
3. Drag and drop the images and Pascal VOC XML files into the project. You can also select the files from a directory. The image and XML file must have the same file name (with different extensions).
4. A preview of your images displays. If an image has a correctly formatted XML file, LandingLens captions it as Pascal Voc.
5. To confirm your upload, click the Upload button. The pre-labeled images then display in your project.
