Export Labels
Export labeled datasets to use in ML projects

Exporting Labels

  1. Go to the "Versions" tab.
  2. Lock your dataset. If there are any unapproved labels, you will be prompted to auto-approve all.
  3. Select the desired format for export.
  4. Select or de-select the desired label types to export, if applicable.
  5. Click "Export Now".
You will be notified by email and in-app notification when your export is ready for download.

Supported Data Export Formats

Plainsight supports several popular formats for exporting labeled datasets.
  • Plainsight's own JSON format, used to import project data into another Plainsight account or to train models outside of Plainsight.
  • Create ML Classifier - Apple's machine learning model creation and training framework. Only Classification label types can be exported in this format.
  • Create ML Object Detection - Apple's machine learning model creation and training framework. Only Rectangle (Bounding box) label types can be exported to this format.
  • COCO - a large-scale object detection, segmentation, and captioning dataset. All label types supported by HyperLabel can be exported to this format.
  • YOLO - a real-time object detection algorithm. Only Rectangles and Polygons can be exported to this format.
  • Pascal VOC - the “Pattern Analysis, Statistical Modeling and Computational Learning Visual Object Classes” format, originally defined for the PASCAL VOC object detection challenge. Rectangle and Polygon (converted to Rectangle) label types are supported.
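To illustrate one of these formats, here is a small sketch of reading a YOLO-style label file. This is a generic example of the YOLO convention, not Plainsight-specific code: each line holds a class ID followed by a bounding box (center x, center y, width, height) normalized to the image dimensions.

```python
# Sketch: convert one YOLO annotation line to a pixel-space bounding box.
# YOLO lines look like: "<class_id> <x_center> <y_center> <width> <height>"
# with all coordinates normalized to [0, 1] relative to the image size.

def parse_yolo_line(line, img_w, img_h):
    """Return (class_id, (x_min, y_min, x_max, y_max)) in pixel coordinates."""
    class_id, xc, yc, w, h = line.split()
    # Scale normalized values back to pixels.
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    # YOLO stores the box center; convert to corner coordinates.
    x_min, y_min = xc - w / 2, yc - h / 2
    return int(class_id), (x_min, y_min, x_min + w, y_min + h)

# A box centered in a 640x480 image, covering a quarter of its width
# and half of its height:
print(parse_yolo_line("0 0.5 0.5 0.25 0.5", 640, 480))
# → (0, (240.0, 120.0, 400.0, 360.0))
```

The same idea applies in reverse when preparing YOLO labels from pixel-space boxes: normalize by image size and store the center rather than the corner.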

Split Datasets

Plainsight allows you to utilize dataset splits when exporting your labels.

Why split datasets?

When training a computer vision or deep learning model, it's common practice to use 3 separate datasets: train, validation, & test.
Train datasets are composed of the data that's actually used to train a model. This is the data you want the model to see and learn from.
Validation datasets are used to see how your model is doing while training. Usually after a certain number of epochs (data cycles), the model is run on the validation dataset and returns an accuracy/loss score. Since it's never seen this data before, seeing accuracy going up and loss going down is a good indicator that your model is learning correct patterns and will generalize well to new data it's never seen. You can adjust hyperparameters based on the output of the model on this data. A hyperparameter is a parameter whose value is used to control the learning process.
Test datasets are a holdout dataset that should only be used once you have completely trained a model and want to verify that it works on data it's never seen. This is different from the validation dataset because you should not tweak hyperparameters to try and fit the model to the test data.
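The three-way split described above can be sketched in a few lines of Python. This is a generic illustration (the fractions and the fixed seed are arbitrary choices, not Plainsight defaults); Plainsight applies the split for you at export time.

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle items and split into train/validation/test lists.

    Whatever remains after the train and validation fractions
    (here 15%) becomes the held-out test set. A fixed seed makes
    the split reproducible across runs.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

train_set, val_set, test_set = split_dataset(range(100))
print(len(train_set), len(val_set), len(test_set))
# → 70 15 15
```

Note that every item lands in exactly one of the three sets; the test set is never touched during training or hyperparameter tuning.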