ChatTag

The ChatTag Filter uses OpenAI's ChatGPT Vision API to automatically analyze and annotate images across diverse domains. It can detect objects, classify content, and generate bounding boxes with confidence scores, integrating seamlessly with OpenFilter pipelines.

This document is automatically published to production documentation on every production release.


✨ Key Features

  • AI-powered image annotation using OpenAI's ChatGPT Vision API
  • Multi-domain support for any image classification task (food, pets, medical, industrial, etc.)
  • Configurable prompts for different annotation requirements
  • Bounding box detection with precise normalized coordinates
  • Confidence scoring for each annotation with quality validation
  • Cost optimization through smart image resizing and quality settings
  • Multiple output formats including binary classification, COCO detection, and JSONL datasets
  • Environment-variable-based configuration for runtime overrides
  • Web visualization interface for real-time monitoring and debugging

🚀 Usage Modes

1. Standard Annotation Mode

Given a frame, the filter performs AI analysis and returns structured annotations. The output is stored in:

frame.data['meta']['chatgpt_annotator']

Each annotation includes:

{
  "annotations": {
    "avocado": {
      "present": true,
      "confidence": 0.95,
      "bbox": [0.2, 0.3, 0.4, 0.5]
    }
  },
  "usage": {
    "input_tokens": 26088,
    "output_tokens": 107,
    "total_tokens": 26195
  },
  "processing_time": 2.34
}
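To illustrate how downstream code might consume this structure, here is a minimal sketch that filters annotations by confidence and scales the normalized bounding boxes to pixel coordinates. The helper name `extract_detections` and the `[x, y, width, height]` interpretation of `bbox` are assumptions for illustration; verify the box convention against your prompt and schema before relying on it.

```python
def extract_detections(meta, frame_w, frame_h, min_conf=0.9):
    """Collect confident positive annotations, scaling bboxes to pixels.

    Assumes bbox is [x, y, width, height] in normalized 0-1 coordinates;
    check the filter's actual output against your schema.
    """
    detections = []
    for label, ann in meta.get("annotations", {}).items():
        if not ann.get("present") or ann.get("confidence", 0.0) < min_conf:
            continue
        bbox = ann.get("bbox")
        pixel_bbox = None
        if bbox is not None:
            x, y, w, h = bbox
            pixel_bbox = [x * frame_w, y * frame_h, w * frame_w, h * frame_h]
        detections.append(
            {"label": label, "confidence": ann["confidence"], "bbox": pixel_bbox}
        )
    return detections

# Example using the annotation structure shown above, on a 640x480 frame
meta = {
    "annotations": {
        "avocado": {"present": True, "confidence": 0.95, "bbox": [0.2, 0.3, 0.4, 0.5]}
    }
}
print(extract_detections(meta, frame_w=640, frame_h=480))
```

In a pipeline, `meta` would come from `frame.data['meta']['chatgpt_annotator']` rather than a literal dict.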

2. Dataset Generation Mode

If enabled via save_frames, the filter generates multiple dataset formats:

  • Binary Classification: binary_datasets/item_name_labels.json
  • COCO Detection: detection_datasets/annotations.json
  • JSONL Results: labels.jsonl
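As a quick sanity check on a generated run, the JSONL results can be scanned line by line. The sketch below assumes each line of labels.jsonl is a JSON object with an "annotations" mapping shaped like the per-frame output shown earlier; the helper name is hypothetical.

```python
import json

def summarize_jsonl(path, min_conf=0.9):
    """Count records that contain at least one confident positive label.

    Assumes one JSON object per line with an "annotations" mapping like
    the per-frame output above; adapt the field names to your schema.
    """
    positives = 0
    total = 0
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            total += 1
            record = json.loads(line)
            anns = record.get("annotations", {})
            if any(a.get("present") and a.get("confidence", 0.0) >= min_conf
                   for a in anns.values()):
                positives += 1
    return positives, total
```

This gives a rough positive/total ratio, which is useful for spotting an obviously unbalanced or empty run before training on it.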

3. No-ops Testing Mode

If no_ops=true, the filter skips API calls and returns default annotations, allowing pipelines to be tested without incurring API costs.


⚙️ Configuration Options

| Field | Type | Description |
| --- | --- | --- |
| chatgpt_api_key | str | OpenAI API key (required) |
| prompt | str | Path to prompt file (required) |
| output_schema | dict | Expected output format schema |
| chatgpt_model | str | Model to use (default: gpt-4o-mini) |
| confidence_threshold | float | Minimum confidence for a positive classification (default: 0.9) |
| max_image_size | int | Maximum image size for cost optimization (default: 0 = keep original) |
| save_frames | bool | Whether to save results and generate datasets |
| output_dir | str | Output directory for saved results |
| no_ops | bool | Skip API calls for testing (default: false) |

All fields are configurable via code or environment variables prefixed with FILTER_. Example: FILTER_CHATGPT_API_KEY=sk-your-key-here
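For instance, several of the fields from the table can be overridden at launch time from the shell. The variable names below follow the FILTER_ prefix convention shown above; confirm the exact mapping for each field against the filter's configuration loader.

```shell
# Override configuration at runtime without touching code
export FILTER_CHATGPT_API_KEY="sk-your-key-here"
export FILTER_CHATGPT_MODEL="gpt-4o-mini"
export FILTER_CONFIDENCE_THRESHOLD="0.8"
export FILTER_NO_OPS="true"   # skip API calls while testing
```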


🧪 Example Configurations

Basic annotation

FilterChatgptAnnotatorConfig(
    chatgpt_api_key="sk-your-key-here",
    prompt="./prompts/food_prompt.txt",
    output_schema={
        "avocado": {"present": False, "confidence": 0.0, "bbox": None},
        "lettuce": {"present": False, "confidence": 0.0, "bbox": None}
    }
)

With dataset generation

FilterChatgptAnnotatorConfig(
    chatgpt_api_key="sk-your-key-here",
    prompt="./prompts/food_prompt.txt",
    save_frames=True,
    output_dir="./output_frames",
    confidence_threshold=0.8,
    max_image_size=512
)

🧠 Output Behavior

  • frame.data['meta']['chatgpt_annotator'] includes annotation results with confidence scores
  • If save_frames = True, datasets are generated in multiple formats (binary, COCO, JSONL)
  • If no_ops = True, default annotations are used without API calls
  • Automatic quality validation and error handling for malformed responses

🧩 Integration

Use the filter directly:

from filter_chatgpt_annotator.filter import FilterChatgptAnnotator
FilterChatgptAnnotator.run()

Or as part of a multi-stage pipeline:

Filter.run_multi([
    (VideoIn, {...}),
    (FilterChatgptAnnotator, {...}),
    (Webvis, {...})
])

🧼 Notes

  • Supports both classification and object detection tasks based on output schema
  • Automatically generates balanced datasets for training
  • Cost optimization through configurable image resizing and quality settings
  • Built-in web interface available at http://localhost:8000 for monitoring
  • Comprehensive error handling and quality validation

For detailed guidance, revisit the Configuration Options and Example Configurations sections above, which walk through the same workflows step by step.