Comment on page
Pipelines
Build a deployment pipeline to process data and deploy a model
Plainsight Pipelines is a guided interface that streamlines the setup, processing, and management of visual data workflows, making it easier than ever to operationalize computer vision models, create custom data transformations, and prediction pipelines at scale. Pipeline features include:
- Streaming & Batch Inputs: Support for Streaming (always on) and batch (scheduled or one-off) pipeline ingestion of visual data via cloud buckets, Google Pub/Sub, and/or API endpoints.
- Modular Pipeline Blocks: Allow for sequential processing of visual data. Setup automated processing augmentations like image resize, cropping, and tiling that help you manage and transform your visual data pipeline at scale.
- Pre-built and Custom Model Blocks: Bring your custom-trained Plainsight model into your pipeline or use our off-the-shelf Object Detection model that can detect over 80+ classes/object types.

Pipelines can be created and managed in the Pipelines section of the platform.
In order to use a model as a pipeline block, a model version must already be successfully trained with SmartML.

- 1.Navigate to the "Pipelines" tab on the left sidebar.
- 2.Click the "Create New Pipeline" button.
- 3.Enter a name for your pipeline. Name must consist of only alphanumeric characters, starting with a letter and ending with a letter or number, and must not exceed 64 characters in length.
- 4.
- 5.
- 6.Add 1 or more input sources. These are the sources where data will be pulled from to be used as input for the pipeline. For Batch deployments, set a processing schedule for when new data is pulled from your input sources and run through your pipeline.
- 7.
- 8.Add 1 or more output destinations. These are the locations where your pipeline output will be written to.
- 9.
Plainsight supports two different pipeline types: batch or streaming. Each type serves a different purpose, so be sure to select the pipeline type that best fits your needs.

Select the appropriate pipeline for your project.
A batch pipeline allows you to process data in either scheduled or ad hoc batches. It is used for medical image interpretation, recorded video/image capture, satellite imagery, and one-off deployment testing. Batch pipelines accept image and video input through a cloud storage bucket. They can output data to a cloud storage bucket or publish data to a Google Pub/Sub topic.
A streaming pipeline has always-on processing capabilities with low latency. This is needed for AutoLabel functionality, API integrations with a mobile app, etc. Streaming pipelines accept image input through sources a Google Pub/Sub topic or a static endpoint.
Depending on the pipeline type selected, you can choose from image or video input. Mixed input types are not yet supported.

Select image input for pipelines that will process only still images.
This input type is designed for pipelines that will process still images. This type can be selected for batch or streaming pipelines.

Select video input for pipelines that will process video files.
This input type is designed for pipelines that will run on video files. This type can be selected for batch pipelines only.
Videos will be split into individual frames for processing. Enter the desired Frame Rate in frames per second. This value defaults to 1 fps.
Note: Lower frames per second (fps) will use less compute resources, but may impact the accuracy of some models or processors such as object tracking. For use cases involving moderate speeds such as running or walking, we find 8fps provides good results.
Read more about connecting inputs to your pipeline:
Once you have configured at least one input, one pipeline block, and one output, you can publish your pipeline.
- 1.Navigate to your pipeline's "Summary" tab.
- 2.Click "Publish Pipeline"

Click here to publish your pipeline.
- Batch pipelines scheduled to run "On-Publish" will begin running within several minutes.
- Batch pipelines with a scheduled start time will begin running at the time indicated by their processing schedule.
- Streaming pipelines may take anywhere from 20 minutes to about an hour, possibly longer, to be ready to process data.

Streaming pipelines can take some time to provision and be ready to process data.
The Pipelines list allows you to manage our pipelines and see their current statuses.

- Draft = pipeline has never been published
- Inactive = pipeline has been deployed and paused
- Scheduled = pipeline is scheduled and the latest job is not active
- Processing = latest job is active
- Error = latest job failed
- Draft = deployment has never been published
- Starting = pipeline is starting
- Provisioning Resources - pipeline is being provisioned
- Live = pipeline is running
- Stopping = pipeline is stopping
- Inactive = pipeline has been deployed and paused
- Failed = failed job
If the pipeline has been published:
- 1.Click on the pipeline in the Pipelines list, or navigate to the pipeline's "Summary" tab.
- 2.Click the "Edit Pipeline" button at the top right. This will create a draft and allow you to edit a new version your pipeline.

Click "Edit Pipeline" from the "Summary" tab to make changes to your pipeline.
3. When you are finished, navigate back to the "Summary" tab and click "Publish Pipeline" to save your changes. You will be prompted to publish a new version of your pipeline.
Publishing a new version will process all data from cloud bucket inputs, including data that was previously processed by another pipeline version. It will also overwrite any output data from previous versions that is present in the pipeline's output buckets. If you want to retain existing output data, specify a new output path (folder) for this version.
4. Click "Continue to Publish" to proceed, or "Cancel" to continue editing your draft.
If a pipeline is still in draft state/has not yet been published:
- 1.Click on the pipeline in the Pipelines list. The Pipeline will already be in draft mode.
- 2.Make any changes you need to.
- 3.When you are finished, click "Publish Pipeline"
Pipeline drafts can be discarded. To clear pipeline drafts:
- 1.
- 2.Click "Discard Draft" to delete the current draft.
Drafts of unpublished pipelines can be cleared from the Pipelines list by clicking "Delete" under the Actions column.

Click Delete to discard the unpublished pipeline.
Batch pipelines can be set to run "On-Publish" which allows them to be run once after publishing, and then manually anytime after the initial run.
To run a batch pipeline manually:
- 1.Navigate to the pipeline's "Summary" tab
- 2.Click "Run Pipeline Now"

Click to run this batch pipeline manually.
Batch pipelines that run on-publish are inactive once they have completed their run.
Batch pipelines on a schedule or streaming pipelines can be stopped by:
- 1.Navigate to the pipeline's "Summary" tab
- 2.Click "Stop Now".

Click to stop the pipeline and prevent further processing.
or through the Pipelines list by clicking "Stop" under Actions:

3. A confirmation prompt will be presented informing you what will happen when a pipeline is stopped. Click "Stop Pipeline" to continue canceling this job.
When stopping a pipeline, any resources that have been started will be stopped.
Batch: Any assets in the middle of processing will be completed before the pipeline is stopped. Scheduling will also be stopped at the same time that the job is stopped. Subsequent runs will reprocess all assets from the stopped batch.
Streaming: Any processing in progress will be stopped and partially processed assets will not be output.
Last modified 1yr ago