Upload Labeled Images to Object Detection Projects
- Updated on 21 Dec 2024
This article applies to these versions of LandingLens:
LandingLens | LandingLens on Snowflake |
---|---|
✓ | ✓ (see exceptions below) |
If you have labeled images you want to upload to Object Detection projects, you can upload them in the Pascal VOC (Visual Object Classes) format. This format involves uploading the original (unlabeled) image and a corresponding XML file. The XML file contains the label (annotation) details of its paired image. The XML file essentially tells LandingLens where each label is on its associated image and what the name of the class is.
The image and XML file must have the same file name (with different extensions). For example:
- vehicle_123.png
- vehicle_123.xml
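Before uploading a large batch, it can help to confirm that every image has a matching XML file. The following is a minimal sketch, not part of LandingLens: the function name `find_unpaired` and the set of image extensions are assumptions made for illustration, so adjust the set to the image types you actually use.

```python
from pathlib import PurePath

# Assumed set of image extensions for this sketch; adjust to your own files.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp"}


def find_unpaired(filenames):
    """Return (images_without_xml, xmls_without_image) as sorted lists of base names."""
    image_stems = {
        PurePath(name).stem
        for name in filenames
        if PurePath(name).suffix.lower() in IMAGE_EXTENSIONS
    }
    xml_stems = {
        PurePath(name).stem
        for name in filenames
        if PurePath(name).suffix.lower() == ".xml"
    }
    return sorted(image_stems - xml_stems), sorted(xml_stems - image_stems)


names = ["vehicle_123.png", "vehicle_123.xml", "vehicle_124.png", "vehicle_125.xml"]
missing_xml, missing_image = find_unpaired(names)
print(missing_xml)    # ['vehicle_124']
print(missing_image)  # ['vehicle_125']
```

To check a real folder, pass `os.listdir(folder)` as the `filenames` argument.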
Pascal VOC XML Files
When you upload an image and its corresponding XML file to an Object Detection project in LandingLens, the XML file must follow the format of the following code sample:
<?xml version='1.0' encoding='UTF-8'?>
<annotation>
    <object>
        <pose>Unspecified</pose>
        <occluded>0</occluded>
        <difficult>0</difficult>
        <truncated>0</truncated>
        <name>Diamond</name>
        <labeler>0f7936ab-7025-482f-9812-4d9c778e00f6</labeler>
        <bndbox>
            <xmax>926</xmax>
            <xmin>652</xmin>
            <ymax>3046</ymax>
            <ymin>2808</ymin>
        </bndbox>
    </object>
</annotation>
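If your labels live in your own data structures rather than in another annotation format, a file shaped like the sample above can be generated with Python's standard library. This is a sketch under assumptions: the helper name `build_annotation` is hypothetical, and it omits the `<labeler>` element because the conversion scripts later in this article also produce files without it.

```python
import xml.etree.ElementTree as ET


def build_annotation(boxes):
    """Build an <annotation> element.

    boxes: iterable of (class_name, xmin, ymin, xmax, ymax), in pixels.
    """
    annotation = ET.Element("annotation")
    for name, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(annotation, "object")
        ET.SubElement(obj, "pose").text = "Unspecified"
        ET.SubElement(obj, "occluded").text = "0"
        ET.SubElement(obj, "difficult").text = "0"
        ET.SubElement(obj, "truncated").text = "0"
        ET.SubElement(obj, "name").text = name
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmax", xmax), ("xmin", xmin), ("ymax", ymax), ("ymin", ymin)):
            ET.SubElement(bndbox, tag).text = str(int(value))
    return annotation


root = build_annotation([("Diamond", 652, 2808, 926, 3046)])
# Serialize with the XML preamble (xml_declaration requires Python 3.8 or later).
xml_bytes = ET.tostring(root, encoding="UTF-8", xml_declaration=True)
print(xml_bytes.decode("utf-8"))
```

Write `xml_bytes` to a file that has the same base name as the image, for example `vehicle_123.xml` for `vehicle_123.png`.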
Pascal VOC Elements
The following table explains the elements in the Pascal VOC XML file.
Element | Definition | Example |
---|---|---|
<?xml version='1.0' encoding='UTF-8'?> | The XML preamble that defines the XML standard and encoding method. | <?xml version='1.0' encoding='UTF-8'?> |
<annotation> | This element wraps around all objects. | <annotation> |
<object> | This element wraps around all elements that define the labeled object. | <object> |
<pose> | Identifies if the image is skewed. The default value is Unspecified, which means that the image is not skewed. | <pose>Unspecified</pose> |
<occluded> | Indicates if the labeled object overlaps another labeled object. Values: 0 (no overlap) or 1 (overlap). | <occluded>0</occluded> |
<difficult> | Indicates if the object is difficult to detect. Values: 0 (not difficult) or 1 (difficult). | <difficult>0</difficult> |
<truncated> | Indicates if a bounding box doesn't include the full object. This could happen because the object isn't fully in the frame (for example, the object is a cat but it's only halfway in the picture) or because there is something in front of the object. Values: 0 (not truncated) or 1 (truncated). | <truncated>0</truncated> |
<name> | The name of the class (label). When the image and the corresponding XML file are uploaded to LandingLens, LandingLens converts this name into a class. | <name>Diamond</name> |
<bndbox> | This element wraps around the dimensions for the edges of the bounding box, which is the rectangle around the labeled object. | <bndbox> |
<xmax> | The distance from the left edge of the image to the right side of the bounding box (in pixels). Must be greater than <xmin>. | <xmax>926</xmax> |
<xmin> | The distance from the left edge of the image to the left side of the bounding box (in pixels). | <xmin>652</xmin> |
<ymax> | The distance from the top of the image to the bottom side of the bounding box (in pixels). | <ymax>2800</ymax> |
<ymin> | The distance from the top of the image to the top side of the bounding box (in pixels). | <ymin>2396</ymin> |
<filename> | The file name and image file extension of the image. | <filename>IMG_6397.jpg</filename> |
<segmented> | Indicates if any of the bounding boxes are not rectangles (for example, if one is a hexagon). Values: 0 (all bounding boxes are rectangles) or 1 (at least one is not a rectangle). | <segmented>0</segmented> |
<size> | This element wraps around the dimensions of the image. | <size> |
<height> | The height of the image (in pixels). | <height>4032</height> |
<width> | The width of the image (in pixels). | <width>3024</width> |
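Because the bounding box elements are distances from the left and top edges of the image, `xmin` should be smaller than `xmax` and `ymin` smaller than `ymax`. The sketch below checks this before upload. It is an illustration rather than a LandingLens tool: the names `check_boxes` and `_to_int` are hypothetical, and the bounds check against `<size>` only runs when that optional element is present.

```python
import xml.etree.ElementTree as ET


def _to_int(text):
    """Convert element text to int; return None if it is missing or not numeric."""
    try:
        return int(float(text))
    except (TypeError, ValueError):
        return None


def check_boxes(xml_text):
    """Return a list of human-readable problems found in one annotation file."""
    root = ET.fromstring(xml_text)
    problems = []
    width = _to_int(root.findtext("size/width"))
    height = _to_int(root.findtext("size/height"))
    for index, obj in enumerate(root.iter("object")):
        name = obj.findtext("name")
        label = f"object {index} ({name or 'unnamed'})"
        if not name:
            problems.append(f"{label}: missing <name>")
        c = {t: _to_int(obj.findtext(f"bndbox/{t}")) for t in ("xmin", "ymin", "xmax", "ymax")}
        if None in c.values():
            problems.append(f"{label}: missing or non-numeric bndbox value")
            continue
        if c["xmin"] >= c["xmax"]:
            problems.append(f"{label}: xmin {c['xmin']} is not less than xmax {c['xmax']}")
        if c["ymin"] >= c["ymax"]:
            problems.append(f"{label}: ymin {c['ymin']} is not less than ymax {c['ymax']}")
        if c["xmin"] < 0 or c["ymin"] < 0:
            problems.append(f"{label}: negative coordinate")
        if width is not None and c["xmax"] > width:
            problems.append(f"{label}: xmax {c['xmax']} exceeds image width {width}")
        if height is not None and c["ymax"] > height:
            problems.append(f"{label}: ymax {c['ymax']} exceeds image height {height}")
    return problems


sample = """<annotation>
  <size><width>3024</width><height>4032</height></size>
  <object>
    <name>Diamond</name>
    <bndbox><xmin>652</xmin><ymin>2808</ymin><xmax>926</xmax><ymax>3046</ymax></bndbox>
  </object>
  <object>
    <name>Diamond</name>
    <bndbox><xmin>2085</xmin><ymin>2396</ymin><xmax>1032</xmax><ymax>2800</ymax></bndbox>
  </object>
</annotation>"""

# The first object passes; the second is reported because xmin is larger than xmax.
for problem in check_boxes(sample):
    print(problem)
```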
Convert to Pascal VOC
If you have images annotated in a format that is NOT Pascal VOC, you can use the Python scripts in this section to convert from YOLO or COCO to Pascal VOC.
To convert from other formats, use a third-party conversion tool to convert the files to Pascal VOC. Then upload the images and the XML files to LandingLens.
Convert from YOLO to Pascal VOC
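Use the following Python script to convert annotations from YOLO to Pascal VOC. The script requires the opencv-python, lxml, and tqdm packages, and it expects a classes.txt file (one class name per line) in the same folder as the YOLO .txt annotation files: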
import argparse
import os
import shutil
import sys

import cv2
from lxml import etree, objectify
from tqdm import tqdm

# Counters that are printed at the end of the run
images_nums = 0
category_nums = 0
bbox_nums = 0


# Write label information to xml
def save_anno_to_xml(filename, size, objs, save_path):
    # size is the OpenCV image shape: (height, width, channels)
    E = objectify.ElementMaker(annotate=False)
    anno_tree = E.annotation(
        E.folder("DATA"),
        E.filename(filename),
        E.source(
            E.database("The VOC Database"),
            E.annotation("PASCAL VOC"),
            E.image("flickr")
        ),
        E.size(
            E.width(size[1]),
            E.height(size[0]),
            E.depth(size[2])
        ),
        E.segmented(0)
    )
    for obj in objs:
        E2 = objectify.ElementMaker(annotate=False)
        anno_tree2 = E2.object(
            E.name(obj[0]),
            E.pose("Unspecified"),
            E.truncated(0),
            E.difficult(0),
            E.bndbox(
                E.xmin(obj[1][0]),
                E.ymin(obj[1][1]),
                E.xmax(obj[1][2]),
                E.ymax(obj[1][3])
            )
        )
        anno_tree.append(anno_tree2)
    # splitext handles extensions of any length (for example, .jpeg)
    anno_path = os.path.join(save_path, os.path.splitext(filename)[0] + ".xml")
    etree.ElementTree(anno_tree).write(anno_path, pretty_print=True)


# Convert a normalized YOLO box (x_center, y_center, width, height)
# to pixel corners [xmin, ymin, xmax, ymax]
def xywhn2xyxy(bbox, size):
    bbox = list(map(float, bbox))
    size = list(map(float, size))
    xmin = (bbox[0] - bbox[2] / 2.) * size[1]
    ymin = (bbox[1] - bbox[3] / 2.) * size[0]
    xmax = (bbox[0] + bbox[2] / 2.) * size[1]
    ymax = (bbox[1] + bbox[3] / 2.) * size[0]
    box = [xmin, ymin, xmax, ymax]
    return list(map(int, box))


def parseXmlFilse(image_path, anno_path, save_path):
    global images_nums, category_nums, bbox_nums
    assert os.path.exists(image_path), "ERROR {} does not exist".format(image_path)
    assert os.path.exists(anno_path), "ERROR {} does not exist".format(anno_path)
    # Warning: an existing output folder is deleted and recreated
    if os.path.exists(save_path):
        shutil.rmtree(save_path)
    os.makedirs(save_path)

    # classes.txt lists one class name per line; the line index is the YOLO class ID
    category_set = []
    with open(os.path.join(anno_path, 'classes.txt'), 'r') as f:
        for i in f.readlines():
            category_set.append(i.strip())
    category_nums = len(category_set)
    category_id = dict((k, v) for k, v in enumerate(category_set))

    images = [os.path.join(image_path, i) for i in os.listdir(image_path)]
    files = [os.path.join(anno_path, i) for i in os.listdir(anno_path)]
    # Map each image's base name (without extension) to its index in images
    images_index = dict((os.path.splitext(os.path.basename(v))[0], k) for k, v in enumerate(images))
    images_nums = len(images)

    for file in tqdm(files):
        if os.path.splitext(file)[-1] != '.txt' or 'classes' in os.path.basename(file):
            continue
        stem = os.path.splitext(os.path.basename(file))[0]
        if stem not in images_index:
            continue
        index = images_index[stem]
        img = cv2.imread(images[index])
        if img is None:
            # Skip files that OpenCV cannot read as images
            continue
        shape = img.shape
        filename = os.path.basename(images[index])
        objects = []
        with open(file, 'r') as fid:
            for i in fid.readlines():
                i = i.strip().split()
                if not i:
                    # Ignore blank lines
                    continue
                category = int(i[0])
                category_name = category_id[category]
                bbox = xywhn2xyxy((i[1], i[2], i[3], i[4]), shape)
                obj = [category_name, bbox]
                objects.append(obj)
        bbox_nums += len(objects)
        save_anno_to_xml(filename, shape, objects, save_path)


if __name__ == '__main__':
    """
    Script Description:
        This script is used to convert annotation file .txt in yolo format to annotation file .xml in voc format
    Parameter description:
        anno_path: path of the folder that contains the txt annotation files and classes.txt.
        save_path: the folder where the xml files will be output.
        image_path: path of the image folder.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('-ap', '--anno-path', type=str, default='./data/labels/yolo', help='yolo txt path')
    parser.add_argument('-s', '--save-path', type=str, default='./data/convert/voc', help='xml save path')
    parser.add_argument('--image-path', default='./data/images')
    opt = parser.parse_args()
    if len(sys.argv) > 1:
        print(opt)
        parseXmlFilse(**vars(opt))
        print("image nums: {}".format(images_nums))
        print("category nums: {}".format(category_nums))
        print("bbox nums: {}".format(bbox_nums))
    else:
        anno_path = './data/labels/yolo'
        save_path = './data/convert/voc1'
        image_path = './data/images'
        parseXmlFilse(image_path, anno_path, save_path)
        print("image nums: {}".format(images_nums))
        print("category nums: {}".format(category_nums))
        print("bbox nums: {}".format(bbox_nums))
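The core of the YOLO conversion is the coordinate arithmetic: YOLO stores each box as a center point plus width and height, all normalized to the range 0 to 1, while Pascal VOC stores pixel distances to the box edges. The standalone sketch below reproduces that arithmetic without OpenCV or lxml so it can be checked by hand. The function name `yolo_to_voc_box` and the example image size are chosen for illustration.

```python
def yolo_to_voc_box(x_center, y_center, box_w, box_h, image_width, image_height):
    """Convert one normalized YOLO box to [xmin, ymin, xmax, ymax] in pixels."""
    xmin = (x_center - box_w / 2) * image_width
    ymin = (y_center - box_h / 2) * image_height
    xmax = (x_center + box_w / 2) * image_width
    ymax = (y_center + box_h / 2) * image_height
    # int() truncates toward zero, matching the script above.
    return [int(v) for v in (xmin, ymin, xmax, ymax)]


# A YOLO label line has the form: <class_id> <x_center> <y_center> <width> <height>
line = "0 0.5 0.5 0.25 0.5"
class_id, *values = line.split()
box = yolo_to_voc_box(*map(float, values), image_width=3024, image_height=4032)
print(class_id, box)  # 0 [1134, 1008, 1890, 3024]
```

For a 3024 x 4032 image, a box centered in the image that covers a quarter of the width and half of the height spans x = 1134 to 1890 and y = 1008 to 3024.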
Convert from COCO to Pascal VOC
Use the following Python script to convert annotations from COCO to Pascal VOC. The script requires the pycocotools, lxml, and tqdm packages:
import argparse
import os
import shutil
import sys

from lxml import etree, objectify
from pycocotools.coco import COCO
from tqdm import tqdm


# index name and id
def catid2name(coco):
    classes = dict()
    for cat in coco.dataset['categories']:
        classes[cat['id']] = cat['name']
    return classes


# write the label to xml
def save_anno_to_xml(filename, size, objs, save_path):
    E = objectify.ElementMaker(annotate=False)
    anno_tree = E.annotation(
        E.folder("DATA"),
        E.filename(filename),
        E.source(
            E.database("The VOC Database"),
            E.annotation("PASCAL VOC"),
            E.image("flickr")
        ),
        E.size(
            E.width(size['width']),
            E.height(size['height']),
            E.depth(size['depth'])
        ),
        E.segmented(0)
    )
    for obj in objs:
        E2 = objectify.ElementMaker(annotate=False)
        anno_tree2 = E2.object(
            E.name(obj[0]),
            E.pose("Unspecified"),
            E.truncated(0),
            E.difficult(0),
            E.bndbox(
                E.xmin(obj[1]),
                E.ymin(obj[2]),
                E.xmax(obj[3]),
                E.ymax(obj[4])
            )
        )
        anno_tree.append(anno_tree2)
    # splitext handles extensions of any length (for example, .jpeg);
    # basename drops any folder prefix stored in the COCO file_name
    xml_name = os.path.splitext(os.path.basename(filename))[0] + ".xml"
    anno_path = os.path.join(save_path, xml_name)
    etree.ElementTree(anno_tree).write(anno_path, pretty_print=True)


def load_coco(anno_file, xml_save_path):
    # Warning: an existing output folder is deleted and recreated
    if os.path.exists(xml_save_path):
        shutil.rmtree(xml_save_path)
    os.makedirs(xml_save_path)

    coco = COCO(anno_file)
    classes = catid2name(coco)
    imgIds = coco.getImgIds()
    for imgId in tqdm(imgIds):
        size = {}
        img = coco.loadImgs(imgId)[0]
        filename = img['file_name']
        width = img['width']
        height = img['height']
        size['width'] = width
        size['height'] = height
        # COCO does not store the channel count; 3 (color) is assumed
        size['depth'] = 3
        annIds = coco.getAnnIds(imgIds=img['id'], iscrowd=None)
        anns = coco.loadAnns(annIds)
        objs = []
        for ann in anns:
            object_name = classes[ann['category_id']]
            # bbox:[x,y,w,h]
            bbox = list(map(int, ann['bbox']))
            xmin = bbox[0]
            ymin = bbox[1]
            xmax = bbox[0] + bbox[2]
            ymax = bbox[1] + bbox[3]
            obj = [object_name, xmin, ymin, xmax, ymax]
            objs.append(obj)
        save_anno_to_xml(filename, size, objs, xml_save_path)


def parseJsonFile(data_dir, xmls_save_path):
    assert os.path.exists(data_dir), "data dir:{} does not exist".format(data_dir)
    if os.path.isdir(data_dir):
        # Assumes the folder contains instances_train2017.json and instances_val2017.json
        data_types = ['train2017', 'val2017']
        for data_type in data_types:
            ann_file = os.path.join(data_dir, 'instances_{}.json'.format(data_type))
            # Write each split to its own subfolder of the output path
            split_save_path = os.path.join(xmls_save_path, data_type)
            load_coco(ann_file, split_save_path)
    elif os.path.isfile(data_dir):
        anno_file = data_dir
        load_coco(anno_file, xmls_save_path)


if __name__ == '__main__':
    """
    Script Description:
        This script is used to convert json files in coco format to xml files in voc format
    Parameter Description:
        data_dir: path of the json file
        xml_save_path: path of the xml output.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--data-dir', type=str, default='./data/labels/coco/train.json', help='json path')
    parser.add_argument('-s', '--save-path', type=str, default='./data/convert/voc', help='xml save path')
    opt = parser.parse_args()
    print(opt)
    if len(sys.argv) > 1:
        parseJsonFile(opt.data_dir, opt.save_path)
    else:
        data_dir = './data/labels/coco/train.json'
        xml_save_path = './data/convert/voc'
        parseJsonFile(data_dir=data_dir, xmls_save_path=xml_save_path)
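The key step in the COCO conversion is turning each COCO `bbox`, stored as `[x, y, width, height]` with `x` and `y` at the top-left corner, into the four edge distances that Pascal VOC uses. The sketch below shows that step on a small in-memory dictionary, so it runs without pycocotools. The function name `coco_to_voc_boxes` and the sample data are made up for illustration.

```python
def coco_to_voc_boxes(coco):
    """Group COCO annotations by image file name.

    Returns {file_name: [(class_name, xmin, ymin, xmax, ymax), ...]}.
    """
    class_names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    file_names = {img["id"]: img["file_name"] for img in coco["images"]}
    boxes = {name: [] for name in file_names.values()}
    for ann in coco["annotations"]:
        # COCO bbox is [x, y, width, height]; truncate to whole pixels first.
        x, y, w, h = (int(v) for v in ann["bbox"])
        boxes[file_names[ann["image_id"]]].append(
            (class_names[ann["category_id"]], x, y, x + w, y + h)
        )
    return boxes


tiny_dataset = {
    "images": [{"id": 1, "file_name": "IMG_6397.jpg", "width": 3024, "height": 4032}],
    "categories": [{"id": 7, "name": "Diamond"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 7, "bbox": [652.0, 2808.0, 274.0, 238.0]}
    ],
}
print(coco_to_voc_boxes(tiny_dataset))
# {'IMG_6397.jpg': [('Diamond', 652, 2808, 926, 3046)]}
```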
Upload Images and Their XML Files
Before uploading labeled images, ensure that they are not in a zipped file. LandingLens doesn't support uploading zipped files.
Once you've located your image files and XML files and ensured that each image and its XML file have the same file name, follow the instructions below to upload the Pascal VOC files to LandingLens:
- Open the Object Detection project you want to upload images to.
- Open the Upload pop-up window:
  - If you haven't uploaded any images to the project yet, scroll down to the bottom of the page and click Click Here.
  - If you've already uploaded images, click the Upload icon.
- Drag and drop the images and Pascal VOC XML files into the project. You can also select the files from a directory. The image and XML file must have the same file name (with different extensions).
- A preview of your images appears. If an image has a correctly formatted XML file, LandingLens captions it as Pascal Voc.
- To confirm your upload, click the Upload button.
- The pre-labeled images display in your project.