Object Detection YAML References
  • 21 Dec 2022

Object Detection training jobs use the keys listed in the table below.

YAML Key    Description
dataset     The dataset that the job depends on
train       Model training hyperparameters
model       Object Detection model and parameters
loss        Loss function to use
eval        Parameters for evaluation

Dataset

The following parameters can be used to load datasets.

train_split_key

  • Type: String
  • Default: train
  • Choices: train/test/dev
  • Description: The name given to the data split that will be used for training.

test_split_key

  • Type: String
  • Default: dev
  • Choices: train/test/dev
  • Description: The name given to the data split that will be used for testing.

val_split_key

  • Type: String
  • Default: dev
  • Choices: train/test/dev
  • Description: The name given to the data split that will be used for validation.
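Taken together, the split keys above might appear in a job's YAML as follows (a minimal sketch; the nesting under a top-level dataset key is an assumption based on the key table at the top of this article, and the split names are picked from the documented choices):

```yaml
dataset:
  train_split_key: train   # split used for training
  test_split_key: test     # split used for testing
  val_split_key: dev       # split used for validation
```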

Train

The following parameters allow you to define the model training hyperparameters.

batch_size

  • Type: Integer
  • Default: 8
  • Value Range: 1 <= batch_size
  • Description: Number of examples/images in each training batch. The minimum batch size is 1.

epoch

  • Type: Integer
  • Default: 200
  • Value Range: 1 <= epoch <= 1000
  • Description: The number of epochs to train the model. If early stopping is enabled, the actual number of training epochs may be smaller.

learning_rate

  • Type: Float
  • Default: 0.0001
  • Value Range: 0 < learning_rate < 1.0
  • Description: The learning rate used to update the weights.

validation_run_freq

  • Type: Integer
  • Default: 1
  • Value Range: 1 <= validation_run_freq
  • Description: Number of epochs between validation runs.

early_stop

  • Type: Object
  • Description: Configuration for stopping training when a monitored metric has stopped improving.
  • Properties:
    • min_delta
      • Type: Float
      • Default: 0.01
      • Description: Minimum change in the monitored quantity to qualify as an improvement.
    • min_epochs
      • Type: Integer
      • Default: 40
      • Description: Number of epochs with no improvement after which training will be stopped.

auto_tuning

  • Type: Object
  • Description: Configuration for automated hyperparameter tuning.
  • Properties:
    • anchors
      • Type: Boolean
      • Default: True
      • Description: Whether to use automated anchor-parameter tuning with train and valid datasets.
    • class_weights
      • Type: Boolean
      • Default: True
      • Description: Whether to use automated class weight tuning with train and valid datasets.
    • class_weights_method
      • Type: Integer
      • Default: 0
      • Choices: 0, 1, 2
      • Description: Which method to use to tune class weights.
    • debug
      • Type: Boolean
      • Default: False
      • Description: Whether to turn on debug mode for the auto-tuning process.
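Taken together, a train section combining the hyperparameters above might look like the following sketch. The values shown are the documented defaults, and the nesting under a top-level train key is an assumption based on the key table at the top of this article:

```yaml
train:
  batch_size: 8            # >= 1
  epoch: 200               # 1-1000; early stopping may end training sooner
  learning_rate: 0.0001    # 0 < learning_rate < 1.0
  validation_run_freq: 1   # run validation every epoch
  early_stop:
    min_delta: 0.01
    min_epochs: 40
  auto_tuning:
    anchors: true
    class_weights: true
    class_weights_method: 0
    debug: false
```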

Model

The following parameters allow you to specify which Object Detection model to use.

avi

  • Type: Object
  • Description: Specifies the Python library from which the model implementation comes.
  • Properties:
    • RetinaOD
      • Type: Object
      • Description: The architecture of the RetinaNet model.
      • Properties:
        • backbone
          • Type: String
          • Default: ResNet34
          • Choices:
            • ResNet18
            • ResNet34
            • ResNet50
          • Description: The backbone to use in RetinaNet.
        • backbone_weights
          • Type: String | Null
          • Default: imagenet
          • Choices:
            • null
            • imagenet
          • Description: Whether to load ImageNet pre-trained weights into the backbone (imagenet) or use randomly initialized weights (null).
        • output_depth
          • Type: Integer
          • Value Range: 1 <= output_depth
          • Description: The number of output classes: the number of defect classes plus the OK class.
        • input_shape
          • Type: Array
          • Description: Shape of the input images.
          • Items:
            • item_1
              • Default: 400
              • Value Range: 1 <= item_1 <= 1500
              • Description: Height of the image
            • item_2
              • Default: 400
              • Value Range: 1 <= item_2 <= 1500
              • Description: Width of the image
            • item_3
              • Default: 3
              • Value Range: 1 <= item_3 <= 4
              • Description: Number of channels in the image (1 for grayscale, 2 for grayscale + alpha, 3 for RGB, 4 for RGB + alpha).
        • nms_threshold
          • Type: Float
          • Default: 0.1
          • Value Range: 0 <= nms_threshold <= 1
          • Description: Non-maximum suppression threshold.
        • score_threshold
          • Type: Float
          • Default: 0.3
          • Value Range: 0 <= score_threshold <= 0.99
          • Multiple of: 0.01
          • Description: Minimum score required to make an object prediction.
        • class_specific_filter
          • Type: Boolean
          • Default: False
          • Description: Whether to perform score-based filtering and NMS filtering per class.
        • anchor_sizes
          • Type: Array
          • Description: Sizes of the anchor boxes, in pixels, one per feature pyramid level.
          • Minimum Items: 1
          • Default: [ 32, 64, 128, 256, 512 ]
          • Items:
            • Type: Integer
            • Minimum: 1
        • anchor_strides
          • Type: Array
          • Description: Strides between anchor centers, in pixels, one per feature pyramid level.
          • Minimum Items: 1
          • Default: [ 8, 16, 32, 64, 128 ]
          • Items:
            • Type: Integer
            • Minimum: 1
        • anchor_ratios
          • Type: Array
          • Description: Aspect ratios of the anchor boxes.
          • Minimum Items: 1
          • Default: [ 0.25, 0.5, 1, 4 ]
          • Items:
            • Type: Float
            • Minimum: 0.0
        • anchor_scales
          • Type: Array
          • Description: Scaling factors applied to each anchor size.
          • Minimum Items: 1
          • Default: [ 0.5, 1, 1.5, 2 ]
          • Items:
            • Type: Float
            • Minimum: 0.0
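Putting the model parameters together, a model section might look like the following sketch. Values are the documented defaults where the article gives one; output_depth is an illustrative value for a dataset with two defect classes plus the OK class:

```yaml
model:
  avi:
    RetinaOD:
      backbone: ResNet34
      backbone_weights: imagenet   # or null for random initialization
      output_depth: 3              # 2 defect classes + OK class
      input_shape: [400, 400, 3]   # height, width, channels
      nms_threshold: 0.1
      score_threshold: 0.3
      class_specific_filter: false
      anchor_sizes: [32, 64, 128, 256, 512]
      anchor_strides: [8, 16, 32, 64, 128]
      anchor_ratios: [0.25, 0.5, 1, 4]
      anchor_scales: [0.5, 1, 1.5, 2]
```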

Loss

The following parameters allow you to specify the loss functions to use for the classification and regression heads of the model.

regression

  • Type: Object
  • Description: Loss function to measure the distance between the predicted and the target box.
  • Properties:
    • RetinaNetSmoothL1
      • Type: Object
      • Description: The Smooth L1 loss.
      • Properties:
        • sigma
          • Type: Float
          • Default: 3.0
          • Description: Controls the point at which the loss transitions from its squared (L2) region to its linear (L1) region.

classification

  • Type: Object
  • Description: Loss function to measure the misclassification error between the predicted and the target class.
  • Properties:
    • RetinaNetFocal
      • Type: Object
      • Description: The Focal loss.
      • Properties:
        • gamma
          • Type: Integer
          • Default: 2
          • Value Range: 1 <= gamma
          • Description: Focusing parameter that gives more weight to hard, misclassified examples.
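A loss section specifying both heads might look like this sketch (defaults shown; the nesting under a top-level loss key is an assumption based on the key table at the top of this article):

```yaml
loss:
  regression:
    RetinaNetSmoothL1:
      sigma: 3.0
  classification:
    RetinaNetFocal:
      gamma: 2
```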

Eval

The following parameters can be used in the evaluation phase. To see the list of supported post-processing transformations, please refer to the Transform YAML References.

postprocessing

  • Type: Object
  • Description: The configuration for post-processing steps.
  • Properties:
    • output_type
      • Type: String
      • Default: object-detection
      • Choices:
        • classification
        • object-detection
      • Description: The kind of output after post-processing.
    • iou_threshold
      • Type: Float
      • Default: 0.5
      • Value Range: 0 <= iou_threshold <= 1
      • Description: The IoU threshold to determine True Positive and False Positive.
    • transforms
      • Type: Array | Null
      • Items:
        • Type: Post-processing transforms
        • Description: The transformations to apply in post-processing steps to both ground truth and prediction.
    • gt_transforms
      • Type: Array | Null
      • Items:
        • Type: Post-processing transforms
        • Description: The transformations to apply in post-processing steps to ground truth only.
    • pred_transforms
      • Type: Array | Null
      • Items:
        • Type: Post-processing transforms
        • Description: The transformations to apply in post-processing steps to prediction only.
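An eval section using the post-processing keys above might look like this sketch (defaults shown; a null transform list means no extra transformations are applied):

```yaml
eval:
  postprocessing:
    output_type: object-detection
    iou_threshold: 0.5      # IoU cutoff for True Positive vs. False Positive
    transforms: null        # applied to both ground truth and prediction
    gt_transforms: null     # applied to ground truth only
    pred_transforms: null   # applied to prediction only
```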
