Embed-Optimized Models

LandingLens supports building and deploying embed-optimized models, which are lightweight, efficient computer vision models designed to run directly on embedded devices.

To create an embed-optimized model, choose one of the following model architectures when running Custom Training:

  • For Object Detection: ODEmbedded-[23M]
  • For Classification: ConvNextEmbedded-[16M]

Using these architecture types generates a TensorFlow Lite (TFLite) model ready for embedded deployment. After downloading an embed-optimized model from LandingLens, you can interact with it using TFLite libraries and APIs.

Note:
LandingLens doesn't support embed-optimized models for Segmentation, Anomaly Detection, or Visual Prompting projects.
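
After export, you work with the model through the standard TFLite interpreter API. The following is a minimal sketch, assuming the tflite-runtime package and a placeholder file name (landing_model.tflite); tf.lite.Interpreter from full TensorFlow works the same way.

```python
# Minimal sketch: load an embed-optimized TFLite model and inspect its tensors.
# "landing_model.tflite" is a placeholder; use your downloaded model file.
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="landing_model.tflite")
interpreter.allocate_tensors()

# Each entry describes a tensor: shape, dtype (int8 here), and quantization params.
for detail in interpreter.get_input_details():
    print("input:", detail["shape"], detail["dtype"], detail["quantization"])
for detail in interpreter.get_output_details():
    print("output:", detail["shape"], detail["dtype"], detail["quantization"])
```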

When to Use Embed-Optimized Models

Embed-optimized models are designed for scenarios where you're deploying to devices with limited compute power, restricted memory, or no network connection. These models are a good fit if your application needs:

  • Faster inference times on constrained hardware
  • Smaller model sizes that fit within embedded memory limits
  • Offline capabilities for real-time inference without a network connection
  • Deployment to mobile devices, IoT endpoints, robots, or edge cameras

Note:
If you don't need to run your model on embedded hardware, use our other model architectures, which are optimized for cloud and general-purpose edge deployment.

Example Use Cases

There are many scenarios where running computer vision models directly on embedded devices offers real advantages—such as lower latency, offline operation, and reduced hardware costs. Here are a few real-world use cases for LandingLens embed-optimized models:

  • Utility drones for aerial inspection: A utility company uses drones to detect damaged power lines after a storm. The LandingLens embed-optimized model runs directly on the drone’s onboard chipset, allowing real-time analysis without needing a cloud connection.
  • Smartphone app for plant classification: A mobile app uses a LandingLens embed-optimized model to identify plant species from photos taken by users. Running the model directly on the phone means users get fast predictions, even when offline.
  • Manufacturing line quality control: A factory installs embedded cameras on an assembly line. The cameras have LandingLens embed-optimized models that detect defects. The model's fast inference helps catch issues early, without needing an external server or latency-prone connection.
  • Smart city info kiosks: Public kiosks in a smart city use LandingLens embed-optimized models to classify visitor behavior (like approaching the kiosk) to trigger helpful content or assistance. The model runs efficiently on low-power hardware inside each kiosk.
  • Autonomous delivery robots: A robotics company deploys LandingLens embed-optimized models in delivery bots to detect pedestrians and navigate around obstacles. Lightweight, local inference helps the bot respond quickly in real-world environments.

Deployment Support

The LandingLens embed-optimized models are built for flexible deployment to embedded systems that support TensorFlow Lite (TFLite) inference.

LandingLens provides the model architecture and TFLite export format, allowing you to integrate the model into your own applications or pipelines. Because every embedded deployment is different, you’ll need to customize your solution based on your device, hardware accelerators, and software stack. 

We recommend thoroughly testing your setup before deploying to production.

LandingAI Edge Applications

While embed-optimized models are intended for custom embedded deployments, you can use LandingEdge and LandingAI Docker to test your models without writing custom code. These tools are a convenient way to validate performance and output before integrating with your production environment.

Supported versions:

  • LandingEdge 2.11.79 and later
  • LandingAI Docker 2.11.79 and later

LandingAI Cloud Deployment

The LandingLens embed-optimized models are not designed for cloud deployment. Their architecture is tuned for fast inference on constrained devices, prioritizing low latency over high accuracy. 

For cloud use cases, we recommend using other LandingLens model architectures.

Model Information

For the embed-optimized models, both the model input and output are quantized to int8. The models are fully quantized with per-tensor quantization, and training applies static quantization, which means that both weights and activations are quantized.

The models are calibrated using Quantization Aware Training (QAT).
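
Concretely, each int8 tensor maps to real values through a per-tensor scale and zero point, which the TFLite interpreter exposes in the tensor details. Below is a rough sketch of converting between float and int8; the interpreter variable is assumed to be a loaded TFLite Interpreter, as in the earlier example.

```python
import numpy as np

# Per-tensor quantization params: real_value = scale * (int8_value - zero_point)
input_detail = interpreter.get_input_details()[0]
scale, zero_point = input_detail["quantization"]

def quantize(x_float, scale, zero_point):
    # Invert the mapping above and clamp to the int8 range.
    return np.clip(np.round(x_float / scale) + zero_point, -128, 127).astype(np.int8)

def dequantize(x_int8, scale, zero_point):
    return (x_int8.astype(np.float32) - zero_point) * scale
```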

The following table describes key data points for the LandingLens embed-optimized models.

| Description | Object Detection | Classification |
| --- | --- | --- |
| Architecture name in LandingLens | ODEmbedded-[23M] | ConvNextEmbedded-[16M] |
| Trainable parameters | 23M | 16M |
| Input shape | BHWC (1, height, width, channel) | BHWC (1, height, width, channel) |
| Model format | TensorFlow Lite (TFLite) | TensorFlow Lite (TFLite) |
| Model input | (1, height, width, channel), int8 | (1, height, width, channel), int8 |
| Model output | scores: (1, num_detections), int8; bboxes: (1, num_detections, 4), int8, in (x_min, y_min, x_max, y_max) format; labels: (1, num_detections), int64 (some devices move this tensor to the CPU to generate it) | scores: (1, num_detections), int8 |

Quickstart: Train and Deploy Embed-Optimized Models

The process for creating an embed-optimized model is similar to creating other models in LandingLens. This quickstart highlights the key steps in the process. For detailed instructions, we’ve included links to the relevant parts of the documentation.

  1. Create an Object Detection or Classification project.
  2. Upload and label your images. 
  3. Train a Custom Model using the settings listed in the Training Settings table. 
  4. Activate the project.
  5. Download the model. The downloaded model is a ZIP file that has the prefix bundle_.
  6. Use TFLite libraries and APIs to write a script that runs the downloaded model (see the sketch after these steps).
  7. Load the model and script to the embedded device. (Do not unzip the downloaded model.)
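
As an illustration of step 6, here is a hedged end-to-end sketch for an Object Detection model. It assumes the int8 input/output layout from the Model Information table; the model path, image path, and [0, 1] input normalization are assumptions you should verify against your own model.

```python
import numpy as np
from PIL import Image
from tflite_runtime.interpreter import Interpreter

MODEL_PATH = "landing_model.tflite"  # placeholder paths
IMAGE_PATH = "frame.png"

interpreter = Interpreter(model_path=MODEL_PATH)
interpreter.allocate_tensors()
inp = interpreter.get_input_details()[0]
_, height, width, _ = inp["shape"]  # BHWC

# Preprocess: resize, scale to [0, 1] (assumed), then quantize to int8.
image = np.asarray(
    Image.open(IMAGE_PATH).convert("RGB").resize((width, height)), np.float32
) / 255.0
scale, zero_point = inp["quantization"]
int8_image = np.clip(np.round(image / scale) + zero_point, -128, 127).astype(np.int8)

interpreter.set_tensor(inp["index"], int8_image[np.newaxis, ...])  # add batch dim
interpreter.invoke()

# Postprocess: dequantize the int8 outputs (scores, bboxes); labels are int64.
outputs = {}
for out in interpreter.get_output_details():
    data = interpreter.get_tensor(out["index"])
    if data.dtype == np.int8:
        s, zp = out["quantization"]
        data = (data.astype(np.float32) - zp) * s
    outputs[out["name"]] = data
print({name: arr.shape for name, arr in outputs.items()})
```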

Training Settings

Embed-optimized models are created with Custom Training. Use these settings when configuring your Custom Training job. 

| Setting | Object Detection | Classification |
| --- | --- | --- |
| Model size | ODEmbedded-[23M] | ConvNextEmbedded-[16M] |
| Minimum epochs | 300 epochs | 100 epochs |
| Minimum dimensions (for resize or rescale) | 4,096px (for example: 64x64px) | 4,096px (for example: 64x64px) |
| Maximum dimensions (for resize or rescale) | 541,696px (for example: 736x736px) | 1,478,656px (for example: 1216x1216px) |

The dimension limits are expressed as total pixel counts (width x height), so 64x64px equals 4,096px.

Recommended Number of Images

We recommend using at least 100 labeled images for training embed-optimized models. The model might not train correctly with fewer images. 

Epochs

Embed-optimized models typically require more training epochs than other model types to achieve good performance. Using too few epochs may prevent the model from converging or learning effectively.

We recommend the following minimums:

  • Object Detection: At least 300 epochs
  • Classification: At least 100 epochs

Learning Loss Spikes During Training

In some cases, the loss curve might spike during training and model convergence might degrade. This results in a low-performing model, such as one with an F1 score of 30%. If you notice loss curve spikes or low model performance, try training the model again.

Low Confidence Scores (Object Detection)

After training an embed-optimized Object Detection model, the highest confidence score for a prediction might only be about 0.5. This low confidence score is due to a quantization limitation, and does not mean that the model performs poorly.

Adjust the confidence threshold to find the optimal balance between false positives and false negatives.
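
For example, here is a short sketch of sweeping the threshold over dequantized scores (continuing from the inference sketch in the quickstart; the variable names and example thresholds are illustrative):

```python
def filter_detections(scores, bboxes, threshold):
    # Keep detections at or above the confidence threshold (batch size is 1).
    keep = scores[0] >= threshold
    return scores[0][keep], bboxes[0][keep]

# Because quantized scores may top out around 0.5, try lower thresholds too.
for t in (0.2, 0.3, 0.4):
    kept_scores, _ = filter_detections(outputs["scores"], outputs["bboxes"], t)
    print(f"threshold={t}: {kept_scores.size} detections")
```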

Sample Scripts: Run Inference with Embed-Optimized Models

Sample scripts for running inference with embed-optimized models are available in our Python library.

