Skip to content

Conversation

@CharlesCNorton
Copy link
Member

@CharlesCNorton CharlesCNorton commented Oct 31, 2025

What changes are proposed in this pull request?

This PR adds support for negative prompts to SAM2 image and video models, enabling fine-grained segmentation control by excluding unwanted regions from segmentation masks.

Key changes:

  • Add negative_prompt_field parameter to both SegmentAnything2ImageModel and SegmentAnything2VideoModel
  • Convert negative bounding boxes to 4 corner points with label=0 (SAM2's negative prompt convention)
  • Support negative keypoints with label=0 for video model
  • Handle tensor shape mismatches when batching multiple boxes with negative prompts
  • Fully backward compatible - negative_prompt_field is optional

Implementation details:

  • Image model: Override predict_all() and modify _forward_pass_boxes() to extract and process negative prompts
  • Video model: Extend _get_field(), _forward_pass_boxes(), and _forward_pass_points() to handle negative prompts per frame
  • Process boxes individually when both multiple positive and negative prompts exist to avoid tensor shape issues
  • Fall back to batch processing when possible for optimal performance

How is this patch tested? If it is not, please explain why.

This patch includes comprehensive testing:

Unit tests (8 tests total):

  • Field extraction with and without frames. prefix
  • Negative box to corner point conversion logic
  • Negative keypoint label assignment
  • Backward compatibility without negative prompts
  • Support for both Detections and Keypoints as negative prompt types

Manual verification:

  • Tested on misc segmentation images in CL and FO app, for ex. dog/cat segmentation showing 6,414 pixels (2.7%) successfully excluded when using negative prompts
  • Verified docstring examples work correctly for both image and video models
  • Confirmed backward compatibility with existing code

Release Notes

SAM2 models now support negative prompts for refined segmentation control. Users can specify a negative_prompt_field parameter when applying SAM2 models to exclude unwanted regions from segmentation masks. This enables more precise segmentation by providing both positive prompts (regions to segment) and negative prompts (regions to exclude). Works with both image and video SAM2 models using Detections or Keypoints.

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for negative prompts in SegmentAnything2 models for images and videos. Users can now provide negative examples to refine segmentation predictions.
  • Tests

    • Added comprehensive unit tests for negative prompt handling and field extraction.
Enables fine-grained segmentation control by excluding unwanted regions
from SAM2 segmentation masks using negative prompts.

Features:
- Add negative_prompt_field parameter to both Image and Video SAM2 models
- Convert negative boxes to 4 corner points with label=0 (SAM2 convention)
- Support negative keypoints with label=0 for video model
- Handle tensor shape mismatches when batching multiple boxes with negatives
- Fully backward compatible (negative_prompt_field is optional)

Implementation:
- Image model: Override predict_all() and modify _forward_pass_boxes()
- Video model: Extend _get_field(), _forward_pass_boxes(), and _forward_pass_points()
- Process boxes individually when both multiple positives and negatives exist
- Fall back to batch processing when possible for performance

Testing:
- Add comprehensive unit tests (8 tests covering field extraction, prompt logic,
  backward compatibility, and detection type support)
- Add docstring examples demonstrating usage for both models

This enables users to refine segmentation by providing both positive prompts
(regions to segment) and negative prompts (regions to exclude).
@CharlesCNorton CharlesCNorton requested a review from a team as a code owner October 31, 2025 17:42
@CharlesCNorton CharlesCNorton self-assigned this Oct 31, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 31, 2025

Walkthrough

Added negative prompt support to SegmentAnything2 models for both images and videos. Introduced state storage for negative prompts, updated field extraction to return prompt/negative-prompt pairs, and enhanced forward pass methods to augment predictions with negative points and labels during inference.

Changes

Cohort / File(s) Summary
Core Negative Prompt Support
fiftyone/utils/sam2.py
Added _curr_negative_prompts state storage to both SegmentAnything2ImageModel and SegmentAnything2VideoModel. Modified _get_field() methods to return tuples of (prompt_field, negative_prompt_field) with validation. Enhanced _forward_pass_boxes() and _forward_pass_points() to generate and apply negative points/labels from negative prompts. Added predict_all() method to image model for parsing and processing samples. Updated constructors to initialize negative prompt state.
Unit Tests
tests/unittests/utils/test_sam2.py
New test module covering field extraction logic for negative prompts, negative prompt conversion (boxes to points with label 0, keypoint label validation), backward compatibility without negative prompts, and detection type support for both Detections and Keypoints as negative prompt sources.

Sequence Diagram

sequenceDiagram
    participant User
    participant ImageModel as SegmentAnything2<br/>ImageModel
    participant SAM2Predictor as SAM2<br/>Predictor
    
    User->>ImageModel: predict_all(imgs, samples)
    ImageModel->>ImageModel: _get_field() → (prompt_field,<br/>negative_prompt_field)
    ImageModel->>ImageModel: Load current prompts<br/>& negative prompts
    ImageModel->>SAM2Predictor: _forward_pass_boxes(boxes,<br/>masks, labels)
    
    alt Negative prompts exist
        ImageModel->>ImageModel: Convert negative boxes<br/>to points (label=0)
        ImageModel->>SAM2Predictor: add negative points/labels
    end
    
    SAM2Predictor-->>ImageModel: masks, scores, detections
    ImageModel-->>User: predictions with augmented data
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Logic validation: Verify correct conversion of negative boxes to corner points with label 0 and negative keypoint label handling across image and video models
  • State management: Confirm _curr_negative_prompts initialization and population flow in both model constructors and predict methods
  • Method signature changes: Validate _get_field() tuple return type propagation and frame-field validation for negative prompt fields
  • Test coverage: Review mock setup and assertion completeness for negative prompt conversion and backward compatibility scenarios

Poem

🐰 With negative prompts now in SAM2's sight,
We whisper what not to see—such delight!
Points and boxes transformed with label zero care,
Masks refined by what's absent, beyond compare.
FiftyOne hops forward—precision takes flight! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title "Add negative_prompt_field support to SAM2 models" is directly and clearly related to the main change in the changeset. The title accurately captures the primary contribution—introducing negative_prompt_field support to both SegmentAnything2ImageModel and SegmentAnything2VideoModel—which is the core objective described in the PR. The title is concise, specific, and sufficiently descriptive for someone scanning commit history to understand the primary change without needing to read the full description.
Description Check ✅ Passed The pull request description follows the required template structure comprehensively. All required sections are present: "What changes are proposed" is filled with detailed implementation information, "How is this patch tested" includes substantive details about eight unit tests and manual verification, and the "Release Notes" section properly indicates this is a user-facing change with a two-sentence description and correctly marks "Core" as the affected area. The description provides clear, well-organized content with both high-level summaries and implementation details, demonstrating thorough documentation of the change.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/sam2-negative-prompts

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
fiftyone/utils/sam2.py (1)

222-222: Consider adding strict=True to zip() for safety.

The zip() call could benefit from an explicit strict=True parameter to ensure imgs and self._curr_prompts have matching lengths, which would catch potential bugs early.

Apply this diff:

-        for idx, (img, detections) in enumerate(zip(imgs, self._curr_prompts)):
+        for idx, (img, detections) in enumerate(zip(imgs, self._curr_prompts, strict=True)):
tests/unittests/utils/test_sam2.py (1)

11-11: Consider using the standard import pattern.

The import could follow the more conventional pattern recommended by Python style guides.

Apply this diff:

-import unittest.mock as mock
+from unittest import mock
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 41a3353 and 1980569.

📒 Files selected for processing (2)
  • fiftyone/utils/sam2.py (10 hunks)
  • tests/unittests/utils/test_sam2.py (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-10T19:53:08.432Z
Learnt from: AdonaiVera
Repo: voxel51/fiftyone PR: 6337
File: docs/source/tutorials/dinov3.ipynb:121-124
Timestamp: 2025-09-10T19:53:08.432Z
Learning: The `foz.load_zoo_dataset()` function supports loading datasets from external sources including GitHub URLs (e.g., "https://github.com/voxel51/coco-2017"), which is different from loading built-in zoo datasets by name (e.g., "coco-2017"). GitHub URL loading allows access to custom datasets not available in the standard FiftyOne Dataset Zoo.

Applied to files:

  • fiftyone/utils/sam2.py
📚 Learning: 2025-10-16T17:24:11.588Z
Learnt from: brimoor
Repo: voxel51/fiftyone PR: 6373
File: docs/source/dataset_zoo/datasets/ucf101.rst:32-0
Timestamp: 2025-10-16T17:24:11.588Z
Learning: In FiftyOne documentation, dataset zoo classes follow the pattern `fiftyone.zoo.datasets.base.{DatasetName}Dataset` and should be referenced using `:class:` directives (e.g., `:class:`UCF101Dataset <fiftyone.zoo.datasets.base.UCF101Dataset>``). This is a valid and established documentation pattern.

Applied to files:

  • fiftyone/utils/sam2.py
🧬 Code graph analysis (2)
tests/unittests/utils/test_sam2.py (3)
fiftyone/utils/sam2.py (2)
  • SegmentAnything2VideoModel (312-685)
  • _get_field (427-455)
fiftyone/utils/sam.py (1)
  • _to_abs_boxes (396-409)
fiftyone/core/labels.py (3)
  • Detections (632-709)
  • Keypoint (1059-1116)
  • Keypoints (1119-1128)
fiftyone/utils/sam2.py (2)
fiftyone/utils/sam.py (9)
  • predict_all (147-160)
  • _get_field (162-171)
  • _parse_samples (173-177)
  • _forward_pass_boxes (237-291)
  • _load_predictor (234-235)
  • _to_abs_boxes (396-409)
  • _get_prompts (198-210)
  • _get_prompt_type (179-196)
  • _to_sam_points (384-393)
fiftyone/utils/torch.py (1)
  • InstanceSegmenterOutputProcessor (1424-1513)
🪛 Pylint (4.0.2)
tests/unittests/utils/test_sam2.py

[error] 1-1: Unrecognized option found: optimize-ast, files-output, function-name-hint, variable-name-hint, const-name-hint, attr-name-hint, argument-name-hint, class-attribute-name-hint, inlinevar-name-hint, class-name-hint, module-name-hint, method-name-hint, no-space-check

(E0015)


[refactor] 1-1: Useless option value for '--disable', 'bad-continuation' was removed from pylint, see pylint-dev/pylint#3571.

(R0022)


[refactor] 11-11: Use 'from unittest import mock' instead

(R0402)

🪛 Ruff (0.14.2)
tests/unittests/utils/test_sam2.py

97-97: Local variable neg_points is assigned to but never used

Remove assignment to unused variable neg_points

(F841)

fiftyone/utils/sam2.py

222-222: zip() without an explicit strict= parameter

Add explicit value for parameter strict=

(B905)


450-452: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
  • GitHub Check: test-windows / test-python (windows-latest, 3.12)
  • GitHub Check: test-windows / test-python (windows-latest, 3.10)
  • GitHub Check: test-windows / test-python (windows-latest, 3.11)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
  • GitHub Check: test-windows / test-python (windows-latest, 3.9)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.12)
  • GitHub Check: build / build
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
  • GitHub Check: test / test-app
  • GitHub Check: build / changes
  • GitHub Check: lint / eslint
  • GitHub Check: e2e / test-e2e
  • GitHub Check: build
🔇 Additional comments (12)
fiftyone/utils/sam2.py (8)

125-143: LGTM! Clear documentation for the new feature.

The negative prompt example follows the established docstring pattern and clearly demonstrates the intended usage.


168-168: LGTM! Proper state initialization.

The _curr_negative_prompts state variable is correctly initialized to None for optional negative prompt support.


189-214: LGTM! Well-structured method for handling optional negative prompts.

The method correctly:

  • Extracts and parses positive prompts
  • Conditionally extracts and parses negative prompts
  • Maintains backward compatibility when negative_prompt_field is not provided
  • Handles the "frames." prefix for compatibility

250-291: Excellent handling of batching constraints with negative prompts.

The implementation correctly addresses SAM2's tensor shape constraints:

  • Lines 256-266: Negative boxes are converted to 4 corner points with label=0 following SAM2 conventions
  • Lines 268-281: When multiple positive boxes exist with negative prompts, boxes are processed individually to avoid shape mismatches
  • Lines 282-291: Batch processing is preserved when possible (single box or no negative prompts) for performance

The logic is sound and the performance trade-off is well-balanced.


412-424: LGTM! Clean integration of negative prompt handling.

The method correctly unpacks the tuple from _get_field() and conditionally sets _curr_negative_prompts, maintaining symmetry with positive prompt handling.


427-455: LGTM! Proper validation for video model field requirements.

The method correctly:

  • Returns a tuple (prompt_field, negative_prompt_field) for both fields
  • Validates that negative_prompt_field has the required "frames." prefix for video models
  • Maintains backward compatibility when negative_prompt_field is not provided

533-553: LGTM! Correct per-frame negative prompt application.

The implementation properly:

  • Processes negative detections for each frame where they exist
  • Converts negative boxes to 4 corner points with label=0 per SAM2 conventions
  • Associates negative prompts with the same obj_id to refine that specific object's mask

626-638: LGTM! Clean negative keypoint integration.

The implementation correctly:

  • Extracts negative keypoints per frame
  • Assigns label=0 to all negative keypoints (line 636)
  • Concatenates negative points/labels with positive ones before inference
tests/unittests/utils/test_sam2.py (4)

17-66: LGTM! Comprehensive field extraction tests.

The test class thoroughly covers:

  • Extraction with "frames." prefix (correctly strips it)
  • Backward compatibility without negative_prompt_field
  • Error handling for missing "frames." prefix

These tests validate the core field parsing logic effectively.


72-101: LGTM! Thorough validation of negative prompt conversion logic.

The tests correctly verify:

  • Negative boxes → 4 corner points with label=0 (lines 72-92)
  • Correct absolute coordinate conversion
  • Negative keypoints → label=0 (lines 94-101)

Note: The static analysis warning about neg_points at line 97 is a false positive—the variable is used in assertions at lines 86-92.


107-120: LGTM! Essential backward compatibility verification.

The test ensures that existing code continues to work when negative_prompt_field is not provided, verifying:

  • The model initializes successfully
  • _curr_negative_prompts attribute exists
  • Default value is None

126-146: LGTM! Good validation of supported negative prompt types.

The tests verify that both Detections and Keypoints can be used as negative prompts, checking:

  • Correct structure instantiation
  • Field values and labels
  • List lengths and access patterns

This provides adequate coverage for the supported types.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

2 participants