Landing AI Docker
Article Summary

Landing AI has launched a new way to deploy models using Docker, enabling DevOps users to integrate model inference programmatically.

The Docker approach is headless, which means that it has no user interface. This allows you to manage inference programmatically and scale deployments quickly.

After you deploy your model to an endpoint using our Docker approach, use the LandingLens Python library to run inference on the model.

Here’s a quick summary of the benefits that the Docker deployment offers:

  • Deployable in a private cloud environment or on any operating system
  • Total API control (including remote access) for inference
  • Can manage deployments programmatically
  • No limit on the number of inferences per minute
  • Can be used with or without a GPU
  • All communication with the Landing AI server is via an HTTPS connection
Note:
Deploying LandingLens models with Docker is intended for developers who are familiar with Docker and programming languages. If that's not you, that's okay! In that case, we recommend that you explore our other deployment options.

Requirements

Ensure that you meet all the requirements before installing and running the Landing AI Deploy Docker image:

Required Applications

Install Docker (Docker Desktop on macOS and Windows, or Docker Engine on Linux) in order to install and run the Landing AI Deploy Docker image.

System Support and Requirements

Supported Operating Systems: Linux, macOS, Windows

Memory:
  • Minimum: 8 GB
  • Recommended: 16 GB or more

Supported Processors and Architectures: x86-64, ARM64

x86-64 includes, but is not limited to:
  • Intel Core i5 (2013 Haswell or later)
  • AMD Ryzen (2017 Zen or later)

ARM64 includes, but is not limited to:
  • AWS Graviton2
  • Apple silicon (M1, M2)¹

Internet Connection: Required to download the Docker image and run inference

Note:
1. We have not yet tested the Apple M3 line of chips, but based on what we know about this architecture, we presume it is supported.

NVIDIA Jetson Support

NVIDIA Jetson devices are supported but not required. If you choose to use an NVIDIA Jetson device:

  • Minimum: NVIDIA Jetson Orin Nano (or better/similar)
  • Recommended: NVIDIA Jetson Orin AGX (or better/similar)
Notes:
  • The Landing AI Deploy Docker image for devices on NVIDIA Jetpack 4.x does not support Object Detection models.
  • Docker might require you to run commands or download libraries specific to your NVIDIA device and driver version. We recommend reading the NVIDIA documentation to understand requirements and compatibility.

GPU Support

Using a GPU is optional. If you plan on using a GPU, we recommend using an NVIDIA GPU with at least 8 GB of RAM. For example:

  • GeForce GTX 1070 or newer
  • Quadro P4000 or newer
  • Tesla P4 or newer

The following are the minimum required driver versions for NVIDIA GPUs:

  • Windows: 452.39 or later
  • Linux: 450.80.02 or later
Note:
  • For information about NVIDIA Jetson devices, see the NVIDIA Jetson Support section above.
  • Docker might require you to run commands or download libraries specific to your NVIDIA device and driver version. We recommend reading the Docker and NVIDIA documentation to understand requirements and compatibility.

Install the Landing AI Image

To install the Landing AI Docker image, you will pull the image from Amazon Elastic Container Registry (Amazon ECR), a library of containerized applications. The image size is about 6 GB. You can see all of the Landing AI "Deploy" images here.

You only need to install (pull) the image once. After you've downloaded the image, you're ready to launch an instance of it.

To pull the image, run this command:

docker pull public.ecr.aws/landing-ai/deploy:latest
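After the pull completes, you can confirm the image is available locally with Docker's standard image listing. This is an optional sanity check; the repository name matches the pull command above:

docker images public.ecr.aws/landing-ai/deploy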

Install the Image on Jetson Devices

We provide separate images for users on NVIDIA Jetson devices. To pull the image you need, run the command based on your JetPack version.

For NVIDIA Jetson devices with JetPack 4.x.x:

Go to the Landing AI repository in Amazon ECR and download the most recent version with the -jp4 suffix.

docker pull public.ecr.aws/landing-ai/deploy:VERSION-jp4

For NVIDIA Jetson devices with JetPack 5.x.x:

Go to the Landing AI repository in Amazon ECR and download the most recent version with the -jp5 suffix.

docker pull public.ecr.aws/landing-ai/deploy:VERSION-jp5

Docker Deployment Quick-Start Guide

To run inference on a model after installing the Landing AI Docker image, go through this checklist:

  1. Ensure you have a model in LandingLens that you want to run inference on. (New to LandingLens? Learn how to train a model here.)
  2. Locate the Model ID for the model you want to run inference on.
  3. Locate your LandingLens API key.
  4. Locate your license key.
  5. Launch a container with the model.
  6. Run inference with the LandingLens Python library.

Locate Your Model ID

The Model ID is included in the command to deploy the model via Docker. The Model ID tells the application which model to download from LandingLens.

To locate your Model ID:

  1. Open the project with the model you want to deploy.
  2. Click the Deploy tab.
  3. Click Deployable Models.
  4. Click the Copy Model ID icon for the model you want to deploy via Docker. (Don't see your model listed? See the next section.) The Model ID is copied to your clipboard.

Don't See Your Model Listed in Deployable Models? Create a Deployable Model

Before December 2023, only models that had been deployed via Cloud Deployment or had deployable models generated were available in the Deployable Models table.

Starting in early December 2023, all trained models display in the Deployable Models table. Each model is listed with its best-performing Confidence Score. If you want to use a different Confidence Score for a model, or if you don't see your model in the Deployable Models table, you can either deploy it via Cloud Deployment or use the following procedure to generate a deployable model:

  1. Open the project with the model you want to deploy.
  2. Click Models to open the Models panel.
  3. Select the model you want to deploy from the Model drop-down menu.
  4. Use the Confidence Threshold slider to select the Confidence Score you want the model to have.
  5. Click Deploy.
  6. Select Self-Hosted Deployments.
  7. Click Generate Deployable Model.
    LandingLens creates a deployable model and opens to the Deployable Models table on the Deploy page.
Note:
Models from Visual Prompting and the Classic Flow use a separate deployment workflow and don’t display in the Deployable Models table.

Locate Your LandingLens API Key

LandingLens uses API keys to authenticate access to the system. You will use an API Key in the command to download your computer vision model from LandingLens.

LandingLens API keys are managed on the API Keys page. To learn how to locate your LandingLens API key, go to Retrieve API Key.

"New" API Credentials Only Use the API Key

Locate Your License Key

When you purchase the Docker deployment solution from Landing AI, your representative will give you a unique license key. You will use this license key to launch an instance of the Docker deployment solution.

Note:
Docker licenses are different from LandingEdge licenses. They can’t be used interchangeably.

Get a License Key for a Free Trial

If you haven’t purchased the Docker deployment solution from Landing AI, you can run a 14-day free trial! 

You can generate the license key for the free trial directly from the Docker solution by running the trial-license command. The response will contain the license key for the free trial.

To run the trial-license command:

docker run --rm public.ecr.aws/landing-ai/deploy:latest trial-license --apikey API_KEY

This outputs:

[2023-10-18 20:00:04.907 +00:00] [INF] Retrieving trial license
[2023-10-18 20:00:04.974 +00:00] [INF] [Licensing] Requesting trial license
[2023-10-18 20:00:07.181 +00:00] [INF] [Licensing] New license provided: [LK-AAAAAA-BBBBBB-CCCCCC-DDDDDD-EEEEEE-XX]
[2023-10-18 20:00:09.414 +00:00] [WRN] [Licensing] License will expire soon!
[2023-10-18 20:00:09.414 +00:00] [INF] [Licensing] License will expire in approximately [13, 23, 59] (days, hours, minutes)

Save your free-trial license key, which is enclosed in square brackets in this line:

[2023-10-18 20:00:07.181 +00:00] [INF] [Licensing] New license provided: [LK-AAAAAA-BBBBBB-CCCCCC-DDDDDD-EEEEEE-XX]

Launch a Container with the Model

When you're ready to use Docker to deploy a model, launch a container by running the run-model-id command. 

By default, the inference endpoint port is 8000. In the code snippet below, the port flag (-p) sets the ports for the host and the container. The port on the left is the host port, and the port on the right is the container port.

To run the run-model-id command:

docker run -p 8000:8000 --rm -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest run-model-id --apikey API_KEY --model MODEL_ID
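Once the container logs show that the model is ready, a quick way to verify the endpoint is reachable is to call the /status endpoint (described in the Web APIs in Swagger section below), which always returns 200:

curl http://localhost:8000/status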

Run Inference with the LandingLens Python Library

Once the container with the model is ready, you can run inference with the model by using the LandingLens Python library.

If your client code (in other words, your Python code) is on the same machine you’re running inference on, you can also use “localhost” instead of the IP address. 

The following example shows how to run inference against an instance hosted at 192.168.1.12 on port 8000. For more examples, see our Python library.

from landingai.predict import EdgePredictor
import PIL.Image

predictor = EdgePredictor(host="192.168.1.12", port=8000)
img = PIL.Image.open("/Users/username/Downloads/test.png")
predictions = predictor.predict(img)

Power Inference with GPUs

By default, the Docker deployment doesn’t use any GPUs in your system. However, you can configure the deployment to use some or all of your GPUs. See the list of supported GPUs here.

In this section, we’ll give you the basics about how to enable GPUs to power inference. However, Docker might require you to run commands or download libraries specific to your NVIDIA device and driver version. We recommend reading the Docker and NVIDIA documentation to understand requirements and compatibility.

To use all GPUs in your system to power inference, include the --gpus all flag as part of the docker command. For example:

docker run --gpus all -p 8000:8000 --rm -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest run-model-id --apikey API_KEY --model MODEL_ID

To use only specific GPUs in your system to power inference, include the --gpus flag and enter the indices of the GPUs you want to use as a comma-separated list. For example, in the snippet below, the deployment will use the GPUs with index 0 and index 2:

docker run --rm --gpus '"device=0,2"' -p 8000:8000 -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest run-model-id --apikey API_KEY --model MODEL_ID
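If you're not sure which index maps to which GPU, the nvidia-smi utility that ships with the NVIDIA driver lists each GPU with its index:

nvidia-smi -L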

Deploy with Kubernetes

You can deploy the Landing AI Deploy image with Kubernetes and monitor the system with our status APIs.

Here is an example Kubernetes Pod YAML deployment file for the Landing AI Deploy image:

apiVersion: v1
kind: Pod
metadata:
  name: landingai-deploy
  namespace: default
  labels:
    app.kubernetes.io/name: landingai-deploy
    app: landingai-deploy
spec:
  containers:
  - name: landingai-deploy
    image: public.ecr.aws/landing-ai/deploy:latest
    args: ["run-model-id", "-m", "YOUR_MODEL_ID"]
    imagePullPolicy: IfNotPresent
    env:
      - name: LANDING_API_KEY
        value: "YOUR_API_KEY"
      - name: LANDING_LICENSE_KEY
        value: "YOUR_LICENSE_KEY"
    ports:
      - containerPort: 8000
    resources:
      requests:
        memory: 2G
        cpu: 2
      limits:
        memory: 4G
        cpu: 5
    startupProbe:
      httpGet:
        port: 8000
        path: /status
    livenessProbe:
      httpGet:
        port: 8000
        path: /live
      initialDelaySeconds: 5
      periodSeconds: 5
    readinessProbe:
      httpGet:
        port: 8000
        path: /ready
      initialDelaySeconds: 5
      periodSeconds: 5
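Assuming you save the manifest above as landingai-deploy.yaml (the filename here is just an example), you can create the Pod and check that its probes pass with standard kubectl commands:

kubectl apply -f landingai-deploy.yaml
kubectl get pod landingai-deploy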

Web APIs in Swagger

To access the web APIs for the Landing AI Docker deployment solution in Swagger, go to http://localhost:[port], where [port] is the port you’re using to communicate with the Dockerized application.

Use the /status, /ready, and /live endpoints to monitor the status when using Kubernetes or another orchestration system.

Use the /images endpoint to run inference.

You can use these web APIs to programmatically start or monitor inference.


/status

The /status endpoint always returns 200.

/ready

The /ready endpoint returns 200 if the license is valid and the model is loaded. Otherwise, it returns 503.

/live

The /live endpoint returns 200 if the license is valid and the model is loading. Otherwise, it returns 503.
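For scripted monitoring outside of an orchestration system, you can poll these endpoints and read only the HTTP status code. For example, with curl (this sketch assumes the container is published on localhost port 8000):

curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/ready   # 200 when the license is valid and the model is loaded
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8000/live    # 200 while the container is live; otherwise 503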

/images

Use the /images endpoint to run inference. Results are returned in JSON.

Note:
The /api/v1/images endpoint is provided for compatibility with an older version of the Landing AI Docker image. It is supported, but the JSON results are formatted differently.

Commands and Flags

Use the following commands to manage the Landing AI Docker deployment:

run-model-id

Run the run-model-id command to deploy a model.

By default, the inference endpoint port is 8000. In the code snippet below, the port flag (-p) sets the ports for the host and the container. The port on the left is the host port, and the port on the right is the container port.

docker run -p 8000:8000 --rm -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest run-model-id --apikey API_KEY --model MODEL_ID

When you run the command, the Deploy application downloads your model from LandingLens. When it’s done, this message displays:

 [INF] Model loading status: [Ready]

Examples

Deploy the model when you have only an API Key (and not an API Key and API Secret): 

run-model-id --apikey API_KEY --model MODEL_ID

Deploy the model when you have an API Key and an API Secret:

run-model-id --apikey API_KEY --apisecret API_SECRET --model MODEL_ID

Deploy the model. When you send images for inference, save those images and the predictions to the corresponding project in LandingLens: 

run-model-id --apikey API_KEY --model MODEL_ID --upload

Deploy the model and set a device name (the device name displays on the Deploy page in LandingLens):

run-model-id --apikey API_KEY --model MODEL_ID --name "my edge device" --upload
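If host port 8000 is already taken, you can remap the host side of the port flag without changing the container port. For example, to expose the deployment on host port 9000 (an arbitrary choice for illustration):

docker run -p 9000:8000 --rm -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest run-model-id --apikey API_KEY --model MODEL_ID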

Flags

-k, --apikey
Required. Set an API Key for your LandingLens organization. Can also be set through the 'LANDING_API_KEY' environment variable.
Note: Up until June 30, 2023, LandingLens required both an API Key and API Secret to run inference. As of that date, LandingLens only generates API Keys (and not API Key and API Secret pairs). For more information, go to API Keys.

-s, --apisecret
If you’re using a “legacy” API Key and API Secret pair, set the API Secret. Can also be set through the 'LANDING_API_SECRET' environment variable.

-m, --model
Required. Set the Model ID of the model you want to load. To locate your Model ID, go to Locate Your Model ID. Can also be set through the 'MODEL_ID' environment variable.

-p, --port
The port number to use for communicating with the deployed model via API. Can also be set through the 'PORT' environment variable. Default: 8000.

-e, --external
Allow external hosts to access the API. Can also be set through the 'ALLOW_EXTERNAL' environment variable. If running in a container, the default is true; otherwise, the default is false.

-u, --upload
When you send images for inference, save those images and the predictions to the corresponding project in LandingLens. Can also be set through the 'UPLOAD_RESULTS' environment variable. Default: false.

-g, --gpus
Select the GPUs you want to use to run inference. Include a space-separated list of the GPU indices. If you select multiple GPUs, the system balances the load between the processors. Can also be set through the 'GPUS' environment variable. Default: use all available GPUs.

-n, --name
When you deploy a model, the name of the device displays on the Deploy page in LandingLens. Use this flag to set the device name. Can also be set through the 'DEVICE_NAME' environment variable. If unspecified, the default is 'LE-{hostname}'.

--help
Display more information about the command.

--version
Display version information.
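Because each flag above can also be set through an environment variable, an equivalent invocation can pass everything with -e instead of command-line flags. For example (a sketch using the environment variable names documented above):

docker run -p 8000:8000 --rm -e LANDING_LICENSE_KEY=LICENSE_KEY -e LANDING_API_KEY=API_KEY -e MODEL_ID=MODEL_ID public.ecr.aws/landing-ai/deploy:latest run-model-id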

trial-license

Run the trial-license command to generate a license key for a 14-day free trial of the Docker deployment solution:

docker run --rm public.ecr.aws/landing-ai/deploy:latest trial-license --apikey API_KEY

This outputs:

[2023-10-18 20:00:04.907 +00:00] [INF] Retrieving trial license
[2023-10-18 20:00:04.974 +00:00] [INF] [Licensing] Requesting trial license
[2023-10-18 20:00:07.181 +00:00] [INF] [Licensing] New license provided: [LK-AAAAAA-BBBBBB-CCCCCC-DDDDDD-EEEEEE-XX]
[2023-10-18 20:00:09.414 +00:00] [WRN] [Licensing] License will expire soon!
[2023-10-18 20:00:09.414 +00:00] [INF] [Licensing] License will expire in approximately [13, 23, 59] (days, hours, minutes)

Save your free-trial license key, which is enclosed in square brackets in this line:

[2023-10-18 20:00:07.181 +00:00] [INF] [Licensing] New license provided: [LK-AAAAAA-BBBBBB-CCCCCC-DDDDDD-EEEEEE-XX]

Examples

Generate a license key when you have only an API Key (and not an API Key and API Secret): 

trial-license --apikey API_KEY

Generate a license key when you have an API Key and an API Secret:

trial-license --apikey API_KEY --apisecret API_SECRET

Flags

-k, --apikey
Required. Set an API Key for your LandingLens organization. Can also be set through the 'LANDING_API_KEY' environment variable.
Note: Up until June 30, 2023, LandingLens required both an API Key and API Secret to run inference. As of that date, LandingLens only generates API Keys (and not API Key and API Secret pairs). For more information, go to API Keys.

-s, --apisecret
If you’re using a “legacy” API Key and API Secret pair, set the API Secret. Can also be set through the 'LANDING_API_SECRET' environment variable.

--help
Display more information about the command.

--version
Display version information.

license

Run the license command to return information about the license key. The license command doesn’t have any flags.
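The invocation follows the same pattern as the other commands; for example, a run might look like the following (this sketch assumes the license key is supplied through the LANDING_LICENSE_KEY environment variable, as it is for run-model-id):

docker run --rm -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest license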

Example response:

[2023-10-06 12:10:27.523 -07:00] [WRN] [Licensing] License will expire soon!
[2023-10-06 12:10:27.524 -07:00] [INF] [Licensing] License will expire in 11:21:29
{
  "valid": true,
  "licenseType": "freetrial",
  "expirationDate": "2023-10-18T16:40:13+00:00",
  "orgId": 1234,
  "registrationCode": "...."
}
[2023-10-06 12:10:27.879 -07:00] [INF] Exiting

run-local-model

Note:
This command is only available to specific Enterprise users. For more information, contact your Landing AI representative.

Run the run-local-model command to deploy a model when your device isn’t connected to the Internet. This requires you to first download the model bundle, which is a feature that is only available to specific Enterprise users. For more information, contact your Landing AI representative.

Examples

Deploy the model bundle:

run-local-model --model models\model-bundle.zip

Deploy the model. When you send images for inference, save those images and the predictions to the corresponding project in LandingLens:

run-local-model --model models\model-bundle.zip --upload

Deploy the model and set a device name (the device name displays on the Deploy page in LandingLens):

run-local-model --model models\model-bundle.zip --name "my edge device" --upload
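When run-local-model runs inside Docker, the container needs access to the bundle on the host's file system, so a typical invocation also mounts the bundle's directory into the container. The host path and /models mount point below are illustrative:

docker run -p 8000:8000 --rm -e LANDING_LICENSE_KEY=LICENSE_KEY -v /path/to/models:/models public.ecr.aws/landing-ai/deploy:latest run-local-model --model /models/model-bundle.zip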

Flags

-m, --model
Required. The model bundle to load. This will be a zip file. Can also be set through the 'MODEL_PATH' environment variable.

-p, --port
The port number to use for communicating with the deployed model via API. Can also be set through the 'PORT' environment variable. Default: 8000.

-e, --external
Allow external hosts to access the API. Can also be set through the 'ALLOW_EXTERNAL' environment variable. If running in a container, the default is true; otherwise, the default is false.

-u, --upload
When you send images for inference, save those images and the predictions to the corresponding project in LandingLens. Can also be set through the 'UPLOAD_RESULTS' environment variable. Default: false.

-g, --gpus
Select the GPUs you want to use to run inference. Include a space-separated list of the GPU indices. If you select multiple GPUs, the system balances the load between the processors. Can also be set through the 'GPUS' environment variable. Default: use all available GPUs.

-n, --name
When you deploy a model, the name of the device displays on the Deploy page in LandingLens. Use this flag to set the device name. Can also be set through the 'DEVICE_NAME' environment variable. If unspecified, the default is 'LE-{hostname}'.

--help
Display more information about the command.

--version
Display version information.

Troubleshooting: User Declined Directory Sharing

Scenario

You receive the following error message when running the run-local-model command:

Error response from daemon: user declined directory sharing [directory path].

Solution

This error indicates that the Landing AI Deploy application does not have permission to access the directory you included in the command. Add the directory to the File Sharing list in Docker Desktop. The way to do this depends on your operating system; see the instructions for Mac, Windows, or Linux.

