Object Detection YAML References
  • 21 Dec 2022

Object Detection training jobs use the keys listed in the table below.

YAML Key    Description
dataset     The dataset that the job depends on
train       Model training hyperparameters
model       Object Detection model and parameters
loss        Loss function to use
eval        Parameters for evaluation

Dataset

The following parameters can be used to load datasets.

train_split_key

  • Type: String
  • Default: train
  • Choices: train/test/dev
  • Description: The name given to the data split that will be used for training.

test_split_key

  • Type: String
  • Default: dev
  • Choices: train/test/dev
  • Description: The name given to the data split that will be used for testing.

val_split_key

  • Type: String
  • Default: dev
  • Choices: train/test/dev
  • Description: The name given to the data split that will be used for validation.
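Taken together, the split keys above might appear in a job's YAML as follows (a minimal sketch; the nesting under a top-level dataset key is an assumption based on the key table at the top of this article, and the split names are picked from the documented choices):

```yaml
dataset:
  train_split_key: train   # split used for training
  test_split_key: test     # split used for testing
  val_split_key: dev       # split used for validation
```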

Train

The following parameters allow you to define the model training hyperparameters.

batch_size

  • Type: Integer
  • Default: 8
  • Value Range: 1 <= batch_size
  • Description: Number of examples/images in each training batch. The minimum batch size is 1.

epoch

  • Type: Integer
  • Default: 200
  • Value Range: 1 <= epoch <= 1000
  • Description: The number of epochs to train the model. If early stopping is enabled, the actual number of training epochs may be smaller.

learning_rate

  • Type: Float
  • Default: 0.0001
  • Value Range: 0 < learning_rate < 1.0
  • Description: The learning rate used to update the weights.

validation_run_freq

  • Type: Integer
  • Default: 1
  • Value Range: 1 <= validation_run_freq
  • Description: Number of epochs between validation runs.

early_stop

  • Type: Object
  • Description: Configuration for stopping training when a monitored metric has stopped improving.
  • Properties:
    • min_delta
      • Type: Float
      • Default: 0.01
      • Description: Minimum change in the monitored quantity to qualify as an improvement.
    • min_epochs
      • Type: Integer
      • Default: 40
      • Description: Number of epochs with no improvement after which training will be stopped.

auto_tuning

  • Type: Object
  • Description: Configuration for automated hyperparameter tuning.
  • Properties:
    • anchors
      • Type: Boolean
      • Default: True
      • Description: Whether to use automated anchor-parameter tuning with train and valid datasets.
    • class_weights
      • Type: Boolean
      • Default: True
      • Description: Whether to use automated class weight tuning with train and valid datasets.
    • class_weights_method
      • Type: Integer
      • Default: 0
      • Choices: 0, 1, 2
      • Description: Which method to use to tune class weights.
    • debug
      • Type: Boolean
      • Default: False
      • Description: Whether to turn on debug mode for the auto-tuning process.
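Taken together, a train section combining the hyperparameters above might look like the following sketch. The values shown are the documented defaults, and the nesting under a top-level train key is an assumption based on the key table at the top of this article:

```yaml
train:
  batch_size: 8            # >= 1
  epoch: 200               # 1-1000; early stopping may end training sooner
  learning_rate: 0.0001    # 0 < learning_rate < 1.0
  validation_run_freq: 1   # run validation every epoch
  early_stop:
    min_delta: 0.01
    min_epochs: 40
  auto_tuning:
    anchors: true
    class_weights: true
    class_weights_method: 0
    debug: false
```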

Model

The following parameters allow you to specify which Object Detection model to use.

avi

  • Type: Object
  • Description: Specifies the Python library from which the model implementation comes.
  • Properties:
    • RetinaOD
      • Type: Object
      • Description: The architecture of the RetinaNet model.
      • Properties:
        • backbone
          • Type: String
          • Default: ResNet34
          • Choices:
            • ResNet18
            • ResNet34
            • ResNet50
          • Description: The backbone to use in RetinaNet.
        • backbone_weights
          • Type: String | Null
          • Default: imagenet
          • Choices:
            • null
            • imagenet
          • Description: Whether to load ImageNet pre-trained weights into the backbone (imagenet) or use randomly initialized weights (null).
        • output_depth
          • Type: Integer
          • Value Range: 1 <= output_depth
          • Description: The number of output classes: the number of defect classes plus the OK class.
        • input_shape
          • Type: Array
          • Description: Shape of the input images.
          • Items:
            • item_1
              • Default: 400
              • Value Range: 1 <= item_1 <= 1500
              • Description: Height of the image
            • item_2
              • Default: 400
              • Value Range: 1 <= item_2 <= 1500
              • Description: Width of the image
            • item_3
              • Default: 3
              • Value Range: 1 <= item_3 <= 4
              • Description: Number of channels in the image (1 for grayscale, 2 for grayscale + alpha, 3 for RGB, 4 for RGB + alpha).
        • nms_threshold
          • Type: Float
          • Default: 0.1
          • Value Range: 0 <= nms_threshold <= 1
          • Description: Non-maximum suppression threshold.
        • score_threshold
          • Type: Float
          • Default: 0.3
          • Value Range: 0 <= score_threshold <= 0.99
          • Multiple of: 0.01
          • Description: Minimum score required to make an object prediction.
        • class_specific_filter
          • Type: Boolean
          • Default: False
          • Description: Whether to perform score-based filtering and NMS filtering per class.
        • anchor_sizes
          • Type: Array
          • Description: Sizes of the anchor boxes, in pixels, one per feature pyramid level.
          • Minimum Items: 1
          • Default: [ 32, 64, 128, 256, 512 ]
          • Items:
            • Type: Integer
            • Minimum: 1
        • anchor_strides
          • Type: Array
          • Description: Strides between anchor centers, in pixels, one per feature pyramid level.
          • Minimum Items: 1
          • Default: [ 8, 16, 32, 64, 128 ]
          • Items:
            • Type: Integer
            • Minimum: 1
        • anchor_ratios
          • Type: Array
          • Description: Aspect ratios of the anchor boxes.
          • Minimum Items: 1
          • Default: [ 0.25, 0.5, 1, 4 ]
          • Items:
            • Type: Float
            • Minimum: 0.0
        • anchor_scales
          • Type: Array
          • Description: Scaling factors applied to each anchor size.
          • Minimum Items: 1
          • Default: [ 0.5, 1, 1.5, 2 ]
          • Items:
            • Type: Float
            • Minimum: 0.0
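Putting the model parameters together, a model section might look like the following sketch. Values are the documented defaults where the article gives one; output_depth is an illustrative value for a dataset with two defect classes plus the OK class:

```yaml
model:
  avi:
    RetinaOD:
      backbone: ResNet34
      backbone_weights: imagenet   # or null for random initialization
      output_depth: 3              # 2 defect classes + OK class
      input_shape: [400, 400, 3]   # height, width, channels
      nms_threshold: 0.1
      score_threshold: 0.3
      class_specific_filter: false
      anchor_sizes: [32, 64, 128, 256, 512]
      anchor_strides: [8, 16, 32, 64, 128]
      anchor_ratios: [0.25, 0.5, 1, 4]
      anchor_scales: [0.5, 1, 1.5, 2]
```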

Loss

The following parameters allow you to specify the loss functions to use for the classification and regression heads of the model.

regression

  • Type: Object
  • Description: Loss function to measure the distance between the predicted and the target box.
  • Properties:
    • RetinaNetSmoothL1
      • Type: Object
      • Description: The Smooth L1 loss.
      • Properties:
        • sigma
          • Type: Float
          • Default: 3.0
          • Description: Controls the point at which the loss transitions from its squared (L2) region to its linear (L1) region.

classification

  • Type: Object
  • Description: Loss function to measure the misclassification error between the predicted and the target class.
  • Properties:
    • RetinaNetFocal
      • Type: Object
      • Description: The Focal loss.
      • Properties:
        • gamma
          • Type: Integer
          • Default: 2
          • Value Range: 1 <= gamma
          • Description: Focusing parameter that gives more weight to hard, misclassified examples.
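A loss section specifying both heads might look like this sketch (defaults shown; the nesting under a top-level loss key is an assumption based on the key table at the top of this article):

```yaml
loss:
  regression:
    RetinaNetSmoothL1:
      sigma: 3.0
  classification:
    RetinaNetFocal:
      gamma: 2
```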

Eval

The following parameters can be used in the evaluation phase. To see the list of supported post-processing transformations, please refer to the Transform YAML References.

postprocessing

  • Type: Object
  • Description: The configuration for post-processing steps.
  • Properties:
    • output_type
      • Type: String
      • Default: object-detection
      • Choices:
        • classification
        • object-detection
      • Description: The kind of output after post-processing.
    • iou_threshold
      • Type: Float
      • Default: 0.5
      • Value Range: 0 <= iou_threshold <= 1
      • Description: The IoU threshold to determine True Positive and False Positive.
    • transforms
      • Type: Array | Null
      • Items:
        • Type: Post-processing transforms
        • Description: The transformations to apply in post-processing steps to both ground truth and prediction.
    • gt_transforms
      • Type: Array | Null
      • Items:
        • Type: Post-processing transforms
        • Description: The transformations to apply in post-processing steps to ground truth only.
    • pred_transforms
      • Type: Array | Null
      • Items:
        • Type: Post-processing transforms
        • Description: The transformations to apply in post-processing steps to prediction only.
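An eval section using the post-processing keys above might look like this sketch (defaults shown; a null transform list means no extra transformations are applied):

```yaml
eval:
  postprocessing:
    output_type: object-detection
    iou_threshold: 0.5      # IoU cutoff for True Positive vs. False Positive
    transforms: null        # applied to both ground truth and prediction
    gt_transforms: null     # applied to ground truth only
    pred_transforms: null   # applied to prediction only
```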
