-
Notifications
You must be signed in to change notification settings - Fork 679
Add negative_prompt_field support to SAM2 models #6520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Enables fine-grained segmentation control by excluding unwanted regions from SAM2 segmentation masks using negative prompts. Features: - Add negative_prompt_field parameter to both Image and Video SAM2 models - Convert negative boxes to 4 corner points with label=0 (SAM2 convention) - Support negative keypoints with label=0 for video model - Handle tensor shape mismatches when batching multiple boxes with negatives - Fully backward compatible (negative_prompt_field is optional) Implementation: - Image model: Override predict_all() and modify _forward_pass_boxes() - Video model: Extend _get_field(), _forward_pass_boxes(), and _forward_pass_points() - Process boxes individually when both multiple positives and negatives exist - Fall back to batch processing when possible for performance Testing: - Add comprehensive unit tests (8 tests covering field extraction, prompt logic, backward compatibility, and detection type support) - Add docstring examples demonstrating usage for both models This enables users to refine segmentation by providing both positive prompts (regions to segment) and negative prompts (regions to exclude).
WalkthroughAdded negative prompt support to SegmentAnything2 models for both images and videos. Introduced state storage for negative prompts, updated field extraction to return prompt/negative-prompt pairs, and enhanced forward pass methods to augment predictions with negative points and labels during inference. Changes
Sequence DiagramsequenceDiagram
participant User
participant ImageModel as SegmentAnything2<br/>ImageModel
participant SAM2Predictor as SAM2<br/>Predictor
User->>ImageModel: predict_all(imgs, samples)
ImageModel->>ImageModel: _get_field() → (prompt_field,<br/>negative_prompt_field)
ImageModel->>ImageModel: Load current prompts<br/>& negative prompts
ImageModel->>SAM2Predictor: _forward_pass_boxes(boxes,<br/>masks, labels)
alt Negative prompts exist
ImageModel->>ImageModel: Convert negative boxes<br/>to points (label=0)
ImageModel->>SAM2Predictor: add negative points/labels
end
SAM2Predictor-->>ImageModel: masks, scores, detections
ImageModel-->>User: predictions with augmented data
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
fiftyone/utils/sam2.py (1)
222-222: Consider addingstrict=Truetozip()for safety.The
zip()call could benefit from an explicitstrict=Trueparameter to ensureimgsandself._curr_promptshave matching lengths, which would catch potential bugs early.Apply this diff:
- for idx, (img, detections) in enumerate(zip(imgs, self._curr_prompts)): + for idx, (img, detections) in enumerate(zip(imgs, self._curr_prompts, strict=True)):tests/unittests/utils/test_sam2.py (1)
11-11: Consider using the standard import pattern.The import could follow the more conventional pattern recommended by Python style guides.
Apply this diff:
-import unittest.mock as mock +from unittest import mock
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
fiftyone/utils/sam2.py(10 hunks)tests/unittests/utils/test_sam2.py(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T19:53:08.432Z
Learnt from: AdonaiVera
Repo: voxel51/fiftyone PR: 6337
File: docs/source/tutorials/dinov3.ipynb:121-124
Timestamp: 2025-09-10T19:53:08.432Z
Learning: The `foz.load_zoo_dataset()` function supports loading datasets from external sources including GitHub URLs (e.g., "https://github.com/voxel51/coco-2017"), which is different from loading built-in zoo datasets by name (e.g., "coco-2017"). GitHub URL loading allows access to custom datasets not available in the standard FiftyOne Dataset Zoo.
Applied to files:
fiftyone/utils/sam2.py
📚 Learning: 2025-10-16T17:24:11.588Z
Learnt from: brimoor
Repo: voxel51/fiftyone PR: 6373
File: docs/source/dataset_zoo/datasets/ucf101.rst:32-0
Timestamp: 2025-10-16T17:24:11.588Z
Learning: In FiftyOne documentation, dataset zoo classes follow the pattern `fiftyone.zoo.datasets.base.{DatasetName}Dataset` and should be referenced using `:class:` directives (e.g., `:class:`UCF101Dataset <fiftyone.zoo.datasets.base.UCF101Dataset>``). This is a valid and established documentation pattern.
Applied to files:
fiftyone/utils/sam2.py
🧬 Code graph analysis (2)
tests/unittests/utils/test_sam2.py (3)
fiftyone/utils/sam2.py (2)
SegmentAnything2VideoModel(312-685)_get_field(427-455)fiftyone/utils/sam.py (1)
_to_abs_boxes(396-409)fiftyone/core/labels.py (3)
Detections(632-709)Keypoint(1059-1116)Keypoints(1119-1128)
fiftyone/utils/sam2.py (2)
fiftyone/utils/sam.py (9)
predict_all(147-160)_get_field(162-171)_parse_samples(173-177)_forward_pass_boxes(237-291)_load_predictor(234-235)_to_abs_boxes(396-409)_get_prompts(198-210)_get_prompt_type(179-196)_to_sam_points(384-393)fiftyone/utils/torch.py (1)
InstanceSegmenterOutputProcessor(1424-1513)
🪛 Pylint (4.0.2)
tests/unittests/utils/test_sam2.py
[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check
(E0015)
[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.
(R0022)
[refactor] 11-11: Use 'from unittest import mock' instead
(R0402)
🪛 Ruff (0.14.2)
tests/unittests/utils/test_sam2.py
97-97: Local variable neg_points is assigned to but never used
Remove assignment to unused variable neg_points
(F841)
fiftyone/utils/sam2.py
222-222: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
450-452: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: test-windows / test-python (windows-latest, 3.12)
- GitHub Check: test-windows / test-python (windows-latest, 3.10)
- GitHub Check: test-windows / test-python (windows-latest, 3.11)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
- GitHub Check: test-windows / test-python (windows-latest, 3.9)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.12)
- GitHub Check: build / build
- GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
- GitHub Check: test / test-app
- GitHub Check: build / changes
- GitHub Check: lint / eslint
- GitHub Check: e2e / test-e2e
- GitHub Check: build
🔇 Additional comments (12)
fiftyone/utils/sam2.py (8)
125-143: LGTM! Clear documentation for the new feature.The negative prompt example follows the established docstring pattern and clearly demonstrates the intended usage.
168-168: LGTM! Proper state initialization.The
_curr_negative_promptsstate variable is correctly initialized toNonefor optional negative prompt support.
189-214: LGTM! Well-structured method for handling optional negative prompts.The method correctly:
- Extracts and parses positive prompts
- Conditionally extracts and parses negative prompts
- Maintains backward compatibility when
negative_prompt_fieldis not provided- Handles the "frames." prefix for compatibility
250-291: Excellent handling of batching constraints with negative prompts.The implementation correctly addresses SAM2's tensor shape constraints:
- Lines 256-266: Negative boxes are converted to 4 corner points with
label=0following SAM2 conventions- Lines 268-281: When multiple positive boxes exist with negative prompts, boxes are processed individually to avoid shape mismatches
- Lines 282-291: Batch processing is preserved when possible (single box or no negative prompts) for performance
The logic is sound and the performance trade-off is well-balanced.
412-424: LGTM! Clean integration of negative prompt handling.The method correctly unpacks the tuple from
_get_field()and conditionally sets_curr_negative_prompts, maintaining symmetry with positive prompt handling.
427-455: LGTM! Proper validation for video model field requirements.The method correctly:
- Returns a tuple
(prompt_field, negative_prompt_field)for both fields- Validates that
negative_prompt_fieldhas the required "frames." prefix for video models- Maintains backward compatibility when
negative_prompt_fieldis not provided
533-553: LGTM! Correct per-frame negative prompt application.The implementation properly:
- Processes negative detections for each frame where they exist
- Converts negative boxes to 4 corner points with
label=0per SAM2 conventions- Associates negative prompts with the same
obj_idto refine that specific object's mask
626-638: LGTM! Clean negative keypoint integration.The implementation correctly:
- Extracts negative keypoints per frame
- Assigns
label=0to all negative keypoints (line 636)- Concatenates negative points/labels with positive ones before inference
tests/unittests/utils/test_sam2.py (4)
17-66: LGTM! Comprehensive field extraction tests.The test class thoroughly covers:
- Extraction with "frames." prefix (correctly strips it)
- Backward compatibility without
negative_prompt_field- Error handling for missing "frames." prefix
These tests validate the core field parsing logic effectively.
72-101: LGTM! Thorough validation of negative prompt conversion logic.The tests correctly verify:
- Negative boxes → 4 corner points with
label=0(lines 72-92)- Correct absolute coordinate conversion
- Negative keypoints →
label=0(lines 94-101)Note: The static analysis warning about
neg_pointsat line 97 is a false positive—the variable is used in assertions at lines 86-92.
107-120: LGTM! Essential backward compatibility verification.The test ensures that existing code continues to work when
negative_prompt_fieldis not provided, verifying:
- The model initializes successfully
_curr_negative_promptsattribute exists- Default value is
None
126-146: LGTM! Good validation of supported negative prompt types.The tests verify that both
DetectionsandKeypointscan be used as negative prompts, checking:
- Correct structure instantiation
- Field values and labels
- List lengths and access patterns
This provides adequate coverage for the supported types.
What changes are proposed in this pull request?
This PR adds support for negative prompts to SAM2 image and video models, enabling fine-grained segmentation control by excluding unwanted regions from segmentation masks.
Key changes:
Implementation details:
How is this patch tested? If it is not, please explain why.
This patch includes comprehensive testing:
Unit tests (8 tests total):
Manual verification:
Release Notes
SAM2 models now support negative prompts for refined segmentation control. Users can specify a negative_prompt_field parameter when applying SAM2 models to exclude unwanted regions from segmentation masks. This enables more precise segmentation by providing both positive prompts (regions to segment) and negative prompts (regions to exclude). Works with both image and video SAM2 models using Detections or Keypoints.
What areas of FiftyOne does this PR affect?
fiftyonePython library changesSummary by CodeRabbit
Release Notes
New Features
Tests