Cloud Deployment
  • 12 Mar 2024
  • 7 Minutes to read


Article Summary

The quickest and easiest deployment method is Cloud Deployment. At a high level, Cloud Deployment is the deployment tool built directly into LandingLens. You can set up the deployment in seconds, upload images, and see the predictions right away in LandingLens. 

Now let's take a closer look at the Cloud Deployment process. First, you create an endpoint, which is a virtual device. The model you select for the deployment will run on that device. Next, you upload images to the endpoint, and the model runs inferences. The results then display in the LandingLens interface.

You can send images to your model for inference by running API calls or using Mobile Inference.

Note:
The only deployment option for Visual Prompting is Cloud Deployment.
Note:
You can run inference up to 40 times per minute per endpoint. If you exceed that limit, the API returns a 429 Too Many Requests response status code. We recommend implementing an error handling or retry function in your application. If you have questions about inference limits, please contact your Landing AI representative or sales@landing.ai.
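A simple way to handle the 429 response is to wrap your request in a retry helper with exponential backoff. The sketch below is framework-agnostic and hypothetical (the `send_request` callable and backoff values are placeholders, not part of the LandingLens API):

```python
import time

def call_with_retry(send_request, max_retries=3, backoff_seconds=2.0):
    """Call send_request() and retry with exponential backoff on HTTP 429.

    send_request is any zero-argument function that returns an object with a
    .status_code attribute (for example, a requests.Response).
    """
    for attempt in range(max_retries + 1):
        response = send_request()
        if response.status_code != 429:
            return response
        if attempt < max_retries:
            # Wait longer after each rate-limited attempt: 2s, 4s, 8s, ...
            time.sleep(backoff_seconds * (2 ** attempt))
    return response
```

You would pass in a function that performs the actual inference request; the helper returns the first non-429 response, or the last response if every attempt was rate limited.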

Deploy Models to Endpoints

You can set up multiple endpoints, which lets you run different models. After an endpoint is created, it can't be renamed or deleted. The model used for an endpoint also can't be changed. 

To deploy a model via Cloud Deployment:

  1. Open the project.
  2. Click Models.
  3. Click the Deploy or + button in the Deployment column for the model you want to deploy.
    Deploy a Model from the Models Page
  4. Select an existing endpoint (if applicable) or click New Endpoint and name it.
  5. Click Deploy.
    Select and Name an Endpoint
  6. LandingLens deploys the model to the endpoint and opens the Deploy page. You can now use this endpoint to run inference.
    The Endpoint Opens on the Deploy Page and Is Ready to Be Used
Note:
You can create endpoints on the Deploy page, but you can't deploy models to those endpoints on that page.

Test the Inference Process (Predict)

After you set up an endpoint, you can test the inference process using the Predict tool. To do this, click Predict and select a few images directly through the user interface; the model then detects what it was trained to look for. 

You can review the results to check if your model runs correctly. If the results aren't as precise as you'd like, you can go back to the Build tab and fine-tune your model. Or, if the results look good, you can deploy your model on a larger scale through the API or LandingEdge.

Since Predict is only for testing, the results aren't saved to your project. 

To test the inference process:

  1. Open the project.
  2. Click Deploy.
  3. Select the endpoint you want to use.
  4. Click Predict.
    Predict
  5. Upload images. 
  6. LandingLens runs the model and shows you the results. 

JSON Output

When you run inference through the Predict tool, you have the option to see the coordinates of the bounding boxes as JSON. To do this, run inference and click JSON Output.

The code includes the following information:

  • mediaId: A unique identifier that LandingLens assigns to each image.  
  • defectId: A unique identifier that LandingLens assigns to each detected area. 
  • score: The Confidence Score, which indicates how confident the AI is that its prediction is correct for a specific detected area.
  • coordinates: The x and y coordinates of the edges of a label's bounding box. For more information, go to Coordinates.
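The fields above can be read with any JSON parser. This sketch uses a hypothetical response body shaped like the Predict tool's JSON Output (the values are taken from the example later in this article):

```python
import json

# Hypothetical response body, shaped like the Predict tool's JSON Output.
response_body = """
{
  "mediaId": 8706335,
  "label": [
    {
      "defectId": 52479,
      "score": 0.99,
      "coordinates": {"xmin": 787, "ymin": 1078, "xmax": 1229, "ymax": 1594}
    }
  ]
}
"""

result = json.loads(response_body)
print("mediaId:", result["mediaId"])
# "label" holds one entry per detected area, so iterate over it.
for detection in result["label"]:
    print("defectId:", detection["defectId"])
    print("score:", detection["score"])
    print("coordinates:", detection["coordinates"])
```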

Coordinates

The JSON output includes one section for each bounding box (detected area) the model adds. Each section includes the coordinates of the pixels for each corner of the bounding box. The origin for the coordinates is the top left corner of the image. 

For example, let's say that this is the JSON output for an image:

JSON
{
  "mediaId": 8706335,
  "label": [
    {
      "defectId": 52479,
      "score": 0.99,
      "coordinates": {
        "xmin": 787,
        "ymin": 1078,
        "xmax": 1229,
        "ymax": 1594
      }
    }
  ]
}

The coordinates correspond to the points described in the following table and image.

Name | Description                                                                          | Coordinate (in pixels)
xmin | The distance from the left edge of the image to the left side of the bounding box.   | 787
ymin | The distance from the top of the image to the top side of the bounding box.          | 1078
xmax | The distance from the left edge of the image to the right side of the bounding box.  | 1229
ymax | The distance from the top of the image to the bottom side of the bounding box.       | 1594
The Coordinates Correspond to the Edges of the Bounding Box
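Because all four values are measured from the same origin (the top-left corner of the image), the size of a bounding box follows directly from its coordinates. A minimal sketch using the example values above:

```python
# Bounding box from the example JSON output above.
coordinates = {"xmin": 787, "ymin": 1078, "xmax": 1229, "ymax": 1594}

# Width and height in pixels follow directly from the corner coordinates.
width = coordinates["xmax"] - coordinates["xmin"]   # 1229 - 787
height = coordinates["ymax"] - coordinates["ymin"]  # 1594 - 1078

print(f"Bounding box: {width} x {height} pixels")
```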

Run Inference Using API

Note:
You can run inference up to 40 times per minute per endpoint. If you exceed that limit, the API returns a 429 Too Many Requests response status code. We recommend implementing an error handling or retry function in your application. If you have questions about inference limits, please contact your Landing AI representative or sales@landing.ai.

After you create an endpoint, you can run inference by making an API call. This method saves the images to your project, so that you can review the results and even save the images to your dataset. 

To run inference using an API call:

  1. Open the project.
  2. Click Deploy.
  3. Select the endpoint you want to use.
  4. Click Copy to save the API call to your clipboard. (You can also click the Actions icon, select View API Command, and copy the call from the pop-up window that opens.)
    Copy the API Command
  5. Paste the API call into your API platform.
  6. Update the apikey and apisecret parameters with your API Key and API Secret. To generate these credentials, go to API Key and API Secret.
  7. Update the file parameter with the image file names you want to run inference on. For example, after updating the parameters, your API call could look like this:
    Shell
    curl --location --request POST 'https://predict.app.landing.ai/inference/v1/predict?endpoint_id=12a3bc4d-ef56-7ghi-89jk-123lm4n456789' \
      --header 'Content-Type: multipart/form-data' \
      --header 'apikey: dv9i0feqgoymhobrc9a6wr1yeo83s2i' \
      --header 'apisecret: xi25sgrdcy83m6d0xicgdzw75dz4yhoa1ad0vzipd07p0mc999iaawabjcp0bm' \
      --form 'file=@"manufacturing_qa_123.jpeg"'
  8. Run the API call.
  9. In LandingLens, open the project to the Deploy page (if it's already open, you can refresh the page). The results display in the Historical Data tab.
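If you prefer not to use curl, the same request can be built in code. The sketch below constructs the equivalent POST request with Python's standard library; the endpoint ID is copied from the example above, and the credential values are placeholders you must replace with your own API Key and API Secret:

```python
import uuid
import urllib.request

# Placeholder values -- substitute your real endpoint ID, API Key, and API Secret.
ENDPOINT_ID = "12a3bc4d-ef56-7ghi-89jk-123lm4n456789"
API_KEY = "YOUR_API_KEY"
API_SECRET = "YOUR_API_SECRET"

def build_predict_request(image_name, image_bytes):
    """Build (but do not send) the same POST request the curl command makes."""
    boundary = uuid.uuid4().hex
    # Encode the image as a multipart/form-data part named "file".
    body = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{image_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode() + image_bytes + f"\r\n--{boundary}--\r\n".encode()

    url = f"https://predict.app.landing.ai/inference/v1/predict?endpoint_id={ENDPOINT_ID}"
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", f"multipart/form-data; boundary={boundary}")
    request.add_header("apikey", API_KEY)
    request.add_header("apisecret", API_SECRET)
    return request

request = build_predict_request("manufacturing_qa_123.jpeg", b"<image bytes>")
# urllib.request.urlopen(request) would send it and return the JSON response.
```

Calling `urllib.request.urlopen(request)` sends the request; the response body is the JSON output described earlier in this article.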

Example: Run the API Call from Postman

Note:
The procedure in this section is based on the user interface of Postman at the time of this article’s publication. However, this interface may change over time. Refer to the Postman documentation for the current information about how to use this application.

There are several ways to run the API call to run inferences. If you're not familiar with APIs, or you just want to test out the API call, you can run the API call from Postman. Postman is a third-party application that, among other things, lets you run API calls for free. Landing AI isn't affiliated with Postman, and we recommend you do your own research to determine if this tool is right for you. 

The following example is intended only to show you one of the ways to run the deployment API.

Set Up Postman

  1. Sign up for a free version of Postman. 
  2. Select the Browser version (web app) and download the Postman Agent. For more information, check out the Postman installation instructions.
  3. After the agent is installed, go to this folder on your computer: /Users/username/Postman Agent.
  4. Rename the Postman Agent folder to Postman.
    Note:
    Renaming the folder is a workaround to an issue in Postman. By default, Postman can only access files in this directory: ~/Postman/files. However, installing the Postman Agent creates this directory: ~/Postman Agent/files. If you don't rename the folder, Postman can't access the files that you want to send to LandingLens for inference.
  5. Add the files that you want to run inference on to this folder: /Users/username/Postman/files.

Run the API Call

  1. Open the Postman web app in your browser. 
  2. Go to My Workspace.
  3. Click Import in the left side bar.
    Import the API Call
  4. Click Raw text.
  5. Paste your API call in the text field. You can leave the file=@"YOUR IMAGE" parameter as-is, because you will select the image files later.
    Paste in the API Call
  6. Click Continue.
  7. Click Import.
    Import
  8. Postman fills out the parameters in the user interface based on the API call you submitted. You can view this information in the Post, Params, Headers, and Body sections.
    View the Filled-Out Sections
  9. Click Body.
  10. Click Select Values.
    Select the Images to Run Inference On
  11. Go to the /Users/username/Postman/files directory and select the files you want to run inference on. 
  12. The files display in the Value column.
  13. Click Send.
    Send the API Call
  14. The response displays at the bottom of the page.
    View the Response in Postman
  15. In LandingLens, open the project to the Deploy page (if it's already open, you can refresh the page). The results display in the Historical Data section.
    View the Image and Data in LandingLens

Deploy Directly from a Model

You can create an endpoint directly from the Build page after you train a model. To do this:

  1. Click Deploy in the Models panel.
    Deploy the Model
  2. Keep the Cloud Deployment option selected.
  3. Select an existing endpoint or click New Endpoint to create a new one.
  4. Click Deploy.
    Name the Model and Select an Endpoint
  5. If you're creating a new endpoint, enter a brief, descriptive name for the endpoint and click Create.
    Name the Endpoint
  6. LandingLens opens the Deploy page and shows the endpoint.
