Landing AI Docker
Updated on 04 Dec 2023
Landing AI has launched a new way to deploy models using Docker, enabling DevOps users to integrate model inference programmatically.
The Docker approach is headless, which means that it has no user interface. This allows you to manage inference programmatically and scale deployments quickly.
After you deploy your model to an endpoint using our Docker approach, use the Landing AI Python SDK to run inference on the model.
Here’s a quick summary of benefits that the Docker deployment offers:
- Deployable in a private cloud environment or on any operating system
- Total API control (including remote access) for inference
- Can manage deployments programmatically
- Not limited to a set number of inferences per minute
- Can be used with or without a GPU
Requirements
Ensure that you meet all the requirements before installing and running the Landing AI Deploy Docker image:
- Required applications
- System requirements
- GPU support
Required Applications
Install the following applications before installing and running the Landing AI Deploy Docker image:
- Python
- Docker Engine: Install the correct version for your operating system. If you plan to use an Amazon instance, we recommend using an Ubuntu AMI.
- Landing AI Python Library: You will use this library's API to run inference.
System Support and Requirements
Item | Requirement |
---|---|
Supported Operating Systems | Linux, macOS, Windows |
Memory | |
Supported Processors and Architecture | x86-64, ARM64 |
Internet Connection | Required to download the Docker image and run inference |
NVIDIA Jetson Support
NVIDIA Jetson devices are supported but not required. If you choose to use an NVIDIA Jetson device:
- Minimum: NVIDIA Jetson Orin Nano (or better/similar)
- Recommended: NVIDIA Jetson Orin AGX (or better/similar)
GPU Support
A GPU is not required. If you want to use one or more GPUs to power inference, see Power Inference with GPUs later in this article.
Install the Landing AI Image
To install the Landing AI Docker image, you will pull the image from Amazon Elastic Container Registry (Amazon ECR), a library of containerized applications. The image size is about 6GB. You can see all of the Landing AI "Deploy" images here.
You only need to install (pull) the image once. After you've downloaded the image, you're ready to launch an instance of it.
To pull the image, run this command:
docker pull public.ecr.aws/landing-ai/deploy:latest
Install the Image on Jetson Devices
We provide separate images for users on NVIDIA Jetson devices. To pull the image you need, run the command based on your JetPack version.
For NVIDIA Jetson devices with JetPack 4.x.x:
Go to the Landing AI repository in Amazon ECR and find the most recent version with the -jp4 suffix. Then pull it, replacing VERSION with that version tag:
docker pull public.ecr.aws/landing-ai/deploy:VERSION-jp4
For NVIDIA Jetson devices with JetPack 5.x.x:
Go to the Landing AI repository in Amazon ECR and find the most recent version with the -jp5 suffix. Then pull it, replacing VERSION with that version tag:
docker pull public.ecr.aws/landing-ai/deploy:VERSION-jp5
Docker Deployment Quick-Start Guide
To run inference on a model after installing the Landing AI Docker image, go through this checklist:
- Ensure you have a model in LandingLens that you want to run inference on. (New to LandingLens? Learn how to train a model here.)
- Locate the Model ID for the model you want to run inference on.
- Locate your LandingLens API key.
- Locate your license key.
- Launch a container with the model.
- Run inference with the Landing AI Python SDK.
Locate Your Model ID
The Model ID is included in the command to deploy the model via Docker. The Model ID tells the application which model to download from LandingLens.
To locate your Model ID:
- Open the project with the model you want to deploy.
- Click the Deploy tab.
- Click Deployable Models.
- Click the Copy Model ID icon for the model you want to deploy via Docker. (Don't see your model listed? Go here.)
Copy the Model ID
The Model ID is copied to your clipboard.
Don't See Your Model Listed in Deployable Models? Create a Deployable Model
Before December 2023, only models that had been deployed via Cloud Deployment or had deployable models generated were available in the Deployable Models table.
Starting in early December 2023, all trained models display in the Deployable Models table. Each model is listed with its best Confidence Score. If you want to use a different Confidence Score for a model, or if you don't see your model in the Deployable Models table, you can either deploy it via Cloud Deployment or use the following procedure to generate a deployable model:
- Open the project with the model you want to deploy.
- Click Models to open the Models panel.
Open the Models Panel
- Select the model you want to deploy from the Model drop-down menu.
- Use the Confidence Threshold slider to select the Confidence Score you want the model to have.
- Click Deploy.
Deploy the Model after Selecting a Confidence Score
- Select Self-Hosted Deployments.
- Click Generate Deployable Model. LandingLens creates a deployable model and opens the Deployable Models table on the Deploy page.
Generate a Deployable Model
Locate Your LandingLens API Key
LandingLens uses API keys to authenticate access to the system. You will use an API Key in the command to download your computer vision model from LandingLens.
LandingLens API keys are managed on the API Keys page. To learn how to locate your LandingLens API key, go to Retrieve API Key.

Locate Your License Key
When you purchase the Docker deployment solution from Landing AI, your representative will give you a unique license key. You will use this license to launch an instance of the Docker deployment solution.
Get a License Key for a Free Trial
If you haven’t purchased the Docker deployment solution from Landing AI, you can run a 14-day free trial!
You can generate the license key for the free trial directly from the Docker solution by running the trial-license command. The response will contain the license key for the free trial.
To run the trial-license command:
docker run --rm public.ecr.aws/landing-ai/deploy:latest trial-license --apikey API_KEY
This outputs:
[2023-10-18 20:00:04.907 +00:00] [INF] Retrieving trial license
[2023-10-18 20:00:04.974 +00:00] [INF] [Licensing] Requesting trial license
[2023-10-18 20:00:07.181 +00:00] [INF] [Licensing] New license provided: [LK-AAAAAA-BBBBBB-CCCCCC-DDDDDD-EEEEEE-XX]
[2023-10-18 20:00:09.414 +00:00] [WRN] [Licensing] License will expire soon!
[2023-10-18 20:00:09.414 +00:00] [INF] [Licensing] License will expire in approximately [13, 23, 59] (days, hours, minutes)
Save your free-trial license key, which is enclosed in square brackets in this line:
[2023-10-18 20:00:07.181 +00:00] [INF] [Licensing] New license provided: [LK-AAAAAA-BBBBBB-CCCCCC-DDDDDD-EEEEEE-XX]
Launch a Container with the Model
When you're ready to use Docker to deploy a model, launch a container by running the run-model-id command.
By default, the inference endpoint port is 8000. In the code snippet below, the port flag (-p) sets the ports for the host and the container. The port on the left is the host port, and the port on the right is the container port. For example, -p 9000:8000 exposes the container's port 8000 on host port 9000.
To run the run-model-id command:
docker run -p 8000:8000 --rm -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest run-model-id --apikey API_KEY --model MODEL_ID
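When the container starts, the Deploy application downloads your model from LandingLens and logs [INF] Model loading status: [Ready] when it finishes. As a minimal sketch (assuming the container runs on the same machine and uses the default port 8000 mapped above), you can also confirm this programmatically by calling the container's /ready endpoint, which is described in Web APIs in Swagger below:
import requests

# Minimal sketch: check whether the deployed model has finished loading.
# Assumes the container is reachable at localhost:8000 (the default mapping above).
resp = requests.get("http://localhost:8000/ready", timeout=5)
if resp.status_code == 200:
    print("Model is loaded and ready for inference.")
else:
    # /ready returns 503 while the model is still loading or if the license is invalid.
    print(f"Not ready yet (HTTP {resp.status_code}).")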
Run Inference with the Landing AI Python SDK
Once the container with the model is ready, you can run inference with the model by using the Landing AI Python SDK.
If your client code (in other words, your Python code) is on the same machine you’re running inference on, you can also use “localhost” instead of the IP address.
The following example shows how to run inference against an instance hosted at 192.168.1.12 on port 8000. For more examples, see our Python library.
from landingai.predict import EdgePredictor
import PIL.Image

# Connect to the Docker deployment that is serving the model
predictor = EdgePredictor(host="192.168.1.12", port=8000)
# Load an image and run inference; predict() returns the model's predictions
img = PIL.Image.open("/Users/username/Downloads/test.png")
predictions = predictor.predict(img)
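Continuing the example above, predict() returns a list of prediction objects. The following is a minimal sketch for inspecting them; the label_name and score attributes are assumptions about the SDK's prediction types and may differ by project type, so the code falls back to printing the whole object:
# Inspect the returned predictions (continuation of the example above).
for p in predictions:
    print(p)  # the object's repr always works
    label = getattr(p, "label_name", None)  # assumed attribute name
    score = getattr(p, "score", None)       # assumed attribute name
    if label is not None and score is not None:
        print(f"Predicted '{label}' with confidence {score}")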
Power Inference with GPUs
By default, the Docker deployment doesn’t use any GPUs in your system. However, you can configure the deployment to use some or all of your GPUs. See the list of supported GPUs here.
In this section, we’ll give you the basics about how to enable GPUs to power inference. However, we recommend familiarizing yourself with Docker’s approach to GPU management in their Docker Engine documentation.
To use all GPUs in your system to power inference, include the --gpus all flag as part of the docker command. For example:
docker run --gpus all -p 8000:8000 --rm -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest run-model-id --apikey API_KEY --model MODEL_ID
To use only specific GPUs in your system to power inference, include the --gpus flag and enter the indices of the GPUs you want to use as a comma-separated list. For example, in the snippet below, the deployment uses the GPUs with index 0 and index 2:
docker run --rm --gpus '"device=0,2"' -p 8000:8000 -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest run-model-id --apikey API_KEY --model MODEL_ID
Deploy with Kubernetes
You can deploy the Landing AI Deploy image with Kubernetes and monitor the system with our status APIs.
Here is an example Kubernetes Pod YAML deployment file for the Landing AI Deploy image. The startup, liveness, and readiness probes in this example use the /status, /live, and /ready endpoints described in Web APIs in Swagger below:
apiVersion: v1
kind: Pod
metadata:
  name: landingai-deploy
  namespace: default
  labels:
    app.kubernetes.io/name: landingai-deploy
    app: landingai-deploy
spec:
  containers:
    - name: landingai-deploy
      image: public.ecr.aws/landing-ai/deploy:latest
      args: ["run-model-id", "-m", "YOUR_MODEL_ID"]
      imagePullPolicy: IfNotPresent
      env:
        - name: LANDING_API_KEY
          value: "YOUR_API_KEY"
        - name: LANDING_LICENSE_KEY
          value: "YOUR_LICENSE_KEY"
      ports:
        - containerPort: 8000
      resources:
        requests:
          memory: 2G
          cpu: 2
        limits:
          memory: 4G
          cpu: 5
      startupProbe:
        httpGet:
          port: 8000
          path: /status
      livenessProbe:
        httpGet:
          port: 8000
          path: /live
        initialDelaySeconds: 5
        periodSeconds: 5
      readinessProbe:
        httpGet:
          port: 8000
          path: /ready
        initialDelaySeconds: 5
        periodSeconds: 5
Web APIs in Swagger
To access the web APIs for the Landing AI Docker deployment solution in Swagger, go to http://localhost:[port], where [port] is the port you’re using to communicate with the Dockerized application.
Use the /status, /ready, and /live endpoints to monitor the status when using Kubernetes or another orchestration system.
Use the /images endpoint to run inference.
You can use these web APIs to programmatically start or monitor inference.
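For example, here is a minimal monitoring sketch in Python (assuming the container is reachable at localhost:8000) that calls the endpoints described below:
import time
import requests

BASE = "http://localhost:8000"

def check(path):
    # Return the HTTP status code for one of the monitoring endpoints.
    return requests.get(BASE + path, timeout=5).status_code

print("/status:", check("/status"))  # always returns 200 once the web server is up
print("/live:", check("/live"))      # 200 while the license is valid and the model is loading

# Poll /ready every 5 seconds until the model is fully loaded (200).
while check("/ready") != 200:
    time.sleep(5)
print("Model is ready for inference.")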

/status
The /status endpoint always returns 200.
/ready
The /ready endpoint returns 200 if the license is valid and the model is loaded. Otherwise, it returns 503.
/live
The /live endpoint returns 200 if the license is valid and the model is loading. Otherwise, it returns 503.
/images
Use the /images endpoint to run inference. Results are returned in JSON.
The /api/v1/images endpoint is provided for compatibility with an older version of the Landing AI Docker image. It is supported, but its JSON results are formatted differently.
Commands and Flags
Use the following commands to manage the Landing AI Docker deployment:
run-model-id
Run the run-model-id command to deploy a model.
By default, the inference endpoint port is 8000. In the code snippet below, the port flag (-p) sets the ports for the host and the container. The port on the left is the host port, and the port on the right is the container port.
docker run -p 8000:8000 --rm -e LANDING_LICENSE_KEY=LICENSE_KEY public.ecr.aws/landing-ai/deploy:latest run-model-id --apikey API_KEY --model MODEL_ID
When you run the command, the Deploy application downloads your model from LandingLens. When it’s done, this message displays:
[INF] Model loading status: [Ready]
Examples
Deploy the model when you have only an API Key (and not an API Key and API Secret):
run-model-id --apikey API_KEY --model MODEL_ID
Deploy the model when you have an API Key and an API Secret:
run-model-id --apikey API_KEY --apisecret API_SECRET --model MODEL_ID
Deploy the model. When you send images for inference, save those images and the predictions to the corresponding project in LandingLens:
run-model-id --apikey API_KEY --model MODEL_ID --upload
Deploy the model. When you deploy a model, the name of the device displays on the Deploy page in LandingLens. Set a device name:
run-model-id --apikey API_KEY --model MODEL_ID --name "my edge device" --upload
Flags
Flag | Description |
---|---|
-k, --apikey | Required. Set an API Key for your LandingLens organization. Can also be set through an environment variable. Note: Up until June 30, 2023, LandingLens required both an API Key and an API Secret to run inference. As of that date, LandingLens only generates API Keys (not API Key and API Secret pairs). For more information, go to API Keys. |
-s, --apisecret | If you’re using a “legacy” API Key and API Secret pair, set the API Secret. Can also be set through an environment variable. Note: Up until June 30, 2023, LandingLens required both an API Key and an API Secret to run inference. As of that date, LandingLens only generates API Keys (not API Key and API Secret pairs). For more information, go to API Keys. |
-m, --model | Required. Set the Model ID of the model you want to load. To locate your Model ID, go to Locate Your Model ID. Can also be set through the MODEL_ID environment variable. |
-p, --port | The port number to use for communicating with the deployed model via API. Can also be set through an environment variable. Default: 8000. |
-e, --external | Allow external hosts to access the API. Can also be set through an environment variable. The default depends on whether the application is running in a container. |
-u, --upload | When you send images for inference, save those images and the predictions to the corresponding project in LandingLens. Can also be set through an environment variable. |
-g, --gpus | Select the GPUs you want to use to run inference. Include a space-separated list of the GPU indices. If you select multiple GPUs, the system will balance the load between the processors. Can also be set through an environment variable. Default: use all available GPUs. |
-n, --name | When you deploy a model, the name of the device displays on the Deploy page in LandingLens. Use this flag to set the device name. Can also be set through an environment variable. If unspecified, a default device name is used. |
--help | Display more information about the command. |
--version | Display version information. |
trial-license
Run the trial-license command to generate a license key for a 14-day free trial of the Docker deployment solution:
docker run --rm public.ecr.aws/landing-ai/deploy:latest trial-license --apikey API_KEY
This outputs:
[2023-10-18 20:00:04.907 +00:00] [INF] Retrieving trial license
[2023-10-18 20:00:04.974 +00:00] [INF] [Licensing] Requesting trial license
[2023-10-18 20:00:07.181 +00:00] [INF] [Licensing] New license provided: [LK-AAAAAA-BBBBBB-CCCCCC-DDDDDD-EEEEEE-XX]
[2023-10-18 20:00:09.414 +00:00] [WRN] [Licensing] License will expire soon!
[2023-10-18 20:00:09.414 +00:00] [INF] [Licensing] License will expire in approximately [13, 23, 59] (days, hours, minutes)
Save your free-trial license key, which is enclosed in square brackets in this line:
[2023-10-18 20:00:07.181 +00:00] [INF] [Licensing] New license provided: [LK-AAAAAA-BBBBBB-CCCCCC-DDDDDD-EEEEEE-XX]
Examples
Generate a license key when you have only an API Key (and not an API Key and API Secret):
trial-license --apikey API_KEY
Generate a license key when you have an API Key and an API Secret:
trial-license --apikey API_KEY --apisecret API_SECRET
Flags
Flag | Description |
---|---|
-k, --apikey | Required. Set an API Key for your LandingLens organization. Can also be set through an environment variable. Note: Up until June 30, 2023, LandingLens required both an API Key and an API Secret to run inference. As of that date, LandingLens only generates API Keys (not API Key and API Secret pairs). For more information, go to API Keys. |
-s, --apisecret | If you’re using a “legacy” API Key and API Secret pair, set the API Secret. Can also be set through an environment variable. Note: Up until June 30, 2023, LandingLens required both an API Key and an API Secret to run inference. As of that date, LandingLens only generates API Keys (not API Key and API Secret pairs). For more information, go to API Keys. |
--help | Display more information about the command. |
--version | Display version information. |
license
Run the license command to return information about the license key. The license command doesn’t have any flags.
Example response:
[2023-10-06 12:10:27.523 -07:00] [WRN] [Licensing] License will expire soon!
[2023-10-06 12:10:27.524 -07:00] [INF] [Licensing] License will expire in 11:21:29
{
  "valid": true,
  "licenseType": "freetrial",
  "expirationDate": "2023-10-18T16:40:13+00:00",
  "orgId": 1234,
  "registrationCode": "...."
}
[2023-10-06 12:10:27.879 -07:00] [INF] Exiting
run-local-model
Run the run-local-model command to deploy a model when your device isn’t connected to the Internet. This requires you to first download the model bundle, which is a feature that is only available to specific Enterprise users. For more information, contact your Landing AI representative.
Examples
Deploy the model bundle:
run-local-model --model models\model-bundle.zip
Deploy the model. When you send images for inference, save those images and the predictions to the corresponding project in LandingLens:
run-local-model --model models\model-bundle.zip --upload
Deploy the model. When you deploy a model, the name of the device displays on the Deploy page in LandingLens. Set a device name:
run-local-model --model models\model-bundle.zip --name "my edge device" --upload
Flags
Flag | Description |
---|---|
-m, --model | Required. The model bundle to load. This is a zip file. Can also be set through the MODEL_PATH environment variable. |
-p, --port | The port number to use for communicating with the deployed model via API. Can also be set through an environment variable. Default: 8000. |
-e, --external | Allow external hosts to access the API. Can also be set through an environment variable. The default depends on whether the application is running in a container. |
-u, --upload | When you send images for inference, save those images and the predictions to the corresponding project in LandingLens. Can also be set through an environment variable. |
-g, --gpus | Select the GPUs you want to use to run inference. Include a space-separated list of the GPU indices. If you select multiple GPUs, the system will balance the load between the processors. Can also be set through an environment variable. Default: use all available GPUs. |
-n, --name | When you deploy a model, the name of the device displays on the Deploy page in LandingLens. Use this flag to set the device name. Can also be set through an environment variable. If unspecified, a default device name is used. |
--help | Display more information about the command. |
--version | Display version information. |
Troubleshooting: User Declined Directory Sharing
Scenario
You receive the following error message when running the run-local-model command:
Error response from daemon: user declined directory sharing [directory path].
Solution
This error indicates that the Landing AI Deploy application does not have permission to access the directory you included in the command. Add the directory to the File Sharing list in Docker Desktop. The way to do this depends on your operating system; see the instructions for Mac, Windows, or Linux.