Pipelines
Build a deployment pipeline to process data and deploy a model
This feature is coming soon.

Overview

The pipeline builder is the tool for configuring deployment pipelines. Here you can create a new pipeline, view your existing pipelines, configure each pipeline's inputs and outputs, set its processing schedule, add image transformation and prediction blocks, publish and run a pipeline, and stop an active pipeline.

Create a Pipeline

Pipelines can be created and managed in the Pipelines section of the platform.
In order to use a model as a pipeline block, a model version must already be successfully trained with SmartML.
  1. Navigate to the "Pipelines" tab on the left sidebar.
  2. Click the "Create New Pipeline" button.
  3. Enter a name for your pipeline. The name must consist of only alphanumeric characters, must start with a letter and end with a letter or number, and must not exceed 64 characters in length (a validation sketch follows after this list).
  4. Select your Pipeline Type, either Batch or Streaming. Read more about Pipeline Types.
  5. Select your Pipeline Input Type, images or video. Read more about Pipeline Input Types.
  6. Add one or more input sources. These are the sources from which data is pulled as input for the pipeline. For Batch deployments, set a processing schedule that controls when new data is pulled from your input sources and run through your pipeline.
  7. Configure the pipeline blocks in your deployment process.
  8. Add one or more output destinations. These are the locations to which your pipeline output will be written.
  9. Publish your pipeline.
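
As a rough illustration of the naming rule in step 3, the sketch below checks a candidate name against that constraint. The regular expression and the helper name are our own interpretation of the stated rule, not an official validator.

```python
import re

# Interpretation of the stated rule: alphanumeric characters only,
# starts with a letter, ends with a letter or digit, at most 64 characters.
NAME_PATTERN = re.compile(r"^[A-Za-z]$|^[A-Za-z][A-Za-z0-9]{0,62}[A-Za-z0-9]$")

def is_valid_pipeline_name(name: str) -> bool:
    """Return True if `name` satisfies the documented naming constraint."""
    return len(name) <= 64 and bool(NAME_PATTERN.match(name))

print(is_valid_pipeline_name("FenceDetection01"))  # True
print(is_valid_pipeline_name("1-bad-name"))        # False: starts with a digit, contains hyphens
```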

Pipeline Types

Plainsight supports two different pipeline types: batch and streaming. Each type serves a different purpose, so be sure to select the pipeline type that best fits your needs.

Batch

A batch pipeline processes data in scheduled or ad hoc batches. It is well suited for use cases such as medical image interpretation, recorded video/image capture, satellite imagery, and one-off deployment testing. Batch pipelines accept image and video input through a cloud storage bucket. They can output data to a cloud storage bucket or publish data to a Google Pub/Sub topic.
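
As a hedged sketch of what the bucket side of a batch pipeline can look like, the snippet below uploads a still image to a hypothetical input bucket and lists what a run has written to a hypothetical output bucket, using the google-cloud-storage client. The bucket names and object prefixes are placeholders, not values the platform defines.

```python
from google.cloud import storage

# Placeholder bucket names; substitute the buckets configured as your
# pipeline's input source and output destination.
INPUT_BUCKET = "my-pipeline-input"
OUTPUT_BUCKET = "my-pipeline-output"

client = storage.Client()

# Upload an image so the next scheduled (or manual) batch run can pick it up.
input_bucket = client.bucket(INPUT_BUCKET)
input_bucket.blob("images/frame_0001.jpg").upload_from_filename("frame_0001.jpg")

# After a run completes, inspect what the pipeline wrote to its output bucket.
output_bucket = client.bucket(OUTPUT_BUCKET)
for blob in output_bucket.list_blobs(prefix="results/"):
    print(blob.name, blob.size)
```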

Streaming

A streaming pipeline provides always-on processing with low latency. This is needed for use cases such as AutoLabel functionality and API integrations with a mobile app. Streaming pipelines accept image input through sources such as a Google Pub/Sub topic or a static endpoint.
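
To illustrate the Pub/Sub input path, the sketch below publishes an image payload to a topic with the google-cloud-pubsub client. The project ID, topic name, and message attribute are placeholders, and the exact payload encoding the platform expects is not specified here, so treat the message format as an assumption.

```python
from google.cloud import pubsub_v1

# Placeholder identifiers; use the topic configured as your pipeline's input source.
PROJECT_ID = "my-gcp-project"
TOPIC_ID = "my-pipeline-input-topic"

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)

# Read a still image and publish its bytes; the attribute is illustrative only.
with open("frame_0001.jpg", "rb") as f:
    image_bytes = f.read()

future = publisher.publish(topic_path, data=image_bytes, source="camera-01")
print("Published message ID:", future.result())
```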

Pipeline Input Types

Depending on the pipeline type selected, you can choose from image or video input. Mixed input types are not yet supported.

Image Input

This input type is designed for pipelines that will process still images. This type can be selected for batch or streaming pipelines.

Video Input

This input type is designed for pipelines that will run on video files. This type can be selected for batch pipelines only.
Videos will be split into individual frames for processing. Enter the desired Frame Rate in frames per second. This value defaults to 1 fps.
Note: A lower frame rate (fps) uses fewer compute resources, but may reduce the accuracy of some models or processors, such as object tracking. For use cases involving moderate speeds such as running or walking, we find 8 fps provides good results.
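
The frame-rate setting can be pictured as sampling frames from the video at a fixed interval. The sketch below does that client-side with OpenCV purely to illustrate the trade-off; it is not how the platform splits video, and the file name and target fps are placeholders.

```python
import cv2

VIDEO_PATH = "walkthrough.mp4"  # placeholder video file
TARGET_FPS = 8                  # e.g. 8 fps for moderate-speed motion

cap = cv2.VideoCapture(VIDEO_PATH)
native_fps = cap.get(cv2.CAP_PROP_FPS) or TARGET_FPS
step = max(1, round(native_fps / TARGET_FPS))  # keep every Nth frame

kept = 0
index = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if index % step == 0:
        cv2.imwrite(f"frame_{kept:05d}.jpg", frame)
        kept += 1
    index += 1
cap.release()
print(f"Sampled {kept} frames at roughly {TARGET_FPS} fps")
```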
Read more about connecting inputs to your pipeline in the pipeline inputs documentation.

Publishing a Pipeline

Once you have configured at least one input, one pipeline block, and one output, you can publish your pipeline.
  1. Navigate to your pipeline's "Summary" tab.
  2. Click "Publish Pipeline".
  • Batch pipelines scheduled to run "On-Publish" will begin running within several minutes.
  • Batch pipelines with a scheduled start time will begin running at the time indicated by their processing schedule.
  • Streaming pipelines may take anywhere from 20 minutes to about an hour, possibly longer, to be ready to process data.
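
Because streaming pipelines can take a while to provision, client code that depends on one often waits for it to reach the Live state (see Pipeline States below) before sending data. The sketch below shows one way to do that; get_pipeline_state() is a hypothetical helper standing in for however you check status (for example, by reading the Pipelines table), not a documented API.

```python
import time

def get_pipeline_state(pipeline_name: str) -> str:
    """Hypothetical helper: return the pipeline's current state string,
    e.g. "Provisioning Resources", "Live", or "Failed"."""
    raise NotImplementedError("replace with your own status check")

def wait_until_live(pipeline_name: str, timeout_s: int = 90 * 60, poll_s: int = 60) -> None:
    """Poll until the streaming pipeline reports Live, or give up after timeout_s."""
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        state = get_pipeline_state(pipeline_name)
        if state == "Live":
            return
        if state == "Failed":
            raise RuntimeError(f"{pipeline_name} failed while starting")
        time.sleep(poll_s)
    raise TimeoutError(f"{pipeline_name} was not Live within {timeout_s} seconds")
```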

Running a Batch Pipeline

Batch pipelines can be set to run "On-Publish", which runs them once after publishing; after that initial run, they can be run manually at any time.
To run a batch pipeline manually:
  1. Navigate to the pipeline's "Summary" tab.
  2. Click "Run Pipeline Now".

Stopping a Pipeline

Batch pipelines that run on-publish are inactive once they have completed their run.
To stop a scheduled batch pipeline or a streaming pipeline:
  1. Navigate to the pipeline's "Summary" tab.
  2. Click "Stop Now".

Pipeline States

The Pipelines table lets you manage each of your pipelines and displays their current statuses.

Batch Pipeline States

  • Draft = pipeline has never been published
  • Inactive = pipeline has been deployed and paused
  • Scheduled = pipeline is scheduled and the latest job is not active
  • Processing = latest job is active
  • Error = latest job failed

Streaming Pipeline States

  • Draft = pipeline has never been published
  • NotStarted = job has not started
  • Starting = pipeline is starting
  • Provisioning Resources = pipeline is being provisioned
  • Live = pipeline is running
  • Stopping = pipeline is stopping
  • Inactive = pipeline has been deployed and paused
  • Failed = latest job failed