feat: Add local embedding provider for semantic similarity #2409

Open
harsh21234i wants to merge 1 commit into Giskard-AI:main from harsh21234i:semantic-similarity-embeddings

@harsh21234i

Closes #2373

Summary

Adds a built-in local embedding provider for SemanticSimilarity so users
can run embedding-based checks without relying on OpenAI-compatible APIs.

What changed

  • added SentenceTransformerEmbedding in
    giskard.checks.utils.embeddings
  • registered the provider as a BaseEmbeddingModel implementation
  • added optional dependency support:
    • local-embeddings = ["sentence-transformers>=2.0,<4"]
  • documented local embedding usage and custom embedding providers
  • updated set_default_embedding_model() docs
  • added tests for:
    • provider registration / serialization
    • valid similarity scoring with a mocked sentence-transformers backend
    • helpful import error when the optional dependency is missing
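The "helpful import error" item above can be illustrated with a lazy-import helper. This is a minimal sketch, not the code from this PR: the function name `load_sentence_transformers` is hypothetical, and only the error message is taken from the review excerpt further down. The idea is to defer the import until the provider is actually used, then re-raise with installation instructions.

```python
import importlib


def load_sentence_transformers():
    """Import sentence-transformers lazily (hypothetical helper name).

    Deferring the import keeps the base package usable without the
    optional extra; the error only appears when the provider is used.
    """
    try:
        return importlib.import_module("sentence_transformers")
    except ImportError as exc:
        # Chain the original error so the root cause stays visible.
        raise ImportError(
            "SentenceTransformerEmbedding requires the optional dependency "
            "'sentence-transformers'. Install it with "
            "'pip install \"giskard-checks[local-embeddings]\"'."
        ) from exc
```

A test can simulate the missing dependency by setting `sys.modules["sentence_transformers"] = None`, which makes the import fail even when the package is installed.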

Usage

from giskard.checks import SemanticSimilarity, set_default_embedding_model
from giskard.checks.utils.embeddings import SentenceTransformerEmbedding

set_default_embedding_model(
    SentenceTransformerEmbedding(model_name="all-MiniLM-L6-v2")
)

check = SemanticSimilarity(reference_text="Hello world", threshold=0.8)

Install the optional dependency with:

pip install "giskard-checks[local-embeddings]"
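For intuition about what `threshold=0.8` compares against, here is a rough sketch of a thresholded similarity score. It assumes the check scores texts by cosine similarity between embedding vectors, which this PR description does not confirm; `cosine_similarity` and `passes_threshold` are hypothetical names, not part of giskard-checks.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def passes_threshold(candidate, reference, threshold=0.8):
    """Assumed pass rule: similarity at or above the threshold."""
    return cosine_similarity(candidate, reference) >= threshold
```

In a real run, the vectors would come from the embedding model rather than being written by hand.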

Validation

Ran:

uv run --all-packages --group dev pytest libs/giskard-checks/tests/builtin/test_semantic_similarity.py libs/giskard-checks/tests/utils/test_embeddings.py

Result:

- 28 passed

@gemini-code-assist (Bot, Contributor) left a comment

Code Review

This pull request introduces support for local embeddings in the giskard-checks library by integrating the sentence-transformers package. Key changes include the implementation of the SentenceTransformerEmbedding provider, updates to the settings API for configuring default embedding models, and the addition of corresponding documentation and tests. Feedback suggests improving error handling by using a custom exception instead of a generic ImportError when the optional dependency is missing, which would allow callers to handle the error programmatically.

Comment on lines +44 to +48
raise ImportError(
    "SentenceTransformerEmbedding requires the optional dependency "
    "'sentence-transformers'. Install it with "
    "'pip install \"giskard-checks[local-embeddings]\"'."
) from exc

Priority: medium

The error message for the missing dependency is helpful, but a generic ImportError is hard to tell apart from other import failures. Raising a custom exception, or a more specific error type, would let callers handle the missing-dependency case programmatically.
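One way to act on this suggestion without breaking existing `except ImportError` handlers is to subclass `ImportError`. The sketch below is an illustration only: the class name `MissingOptionalDependencyError` and its `extra` attribute are hypothetical and do not appear in this PR.

```python
class MissingOptionalDependencyError(ImportError):
    """Raised when an optional extra is not installed (hypothetical name)."""

    def __init__(self, package: str, extra: str):
        self.extra = extra
        # ImportError accepts `name` as a keyword and exposes it as `.name`.
        super().__init__(
            f"'{package}' is required for this feature. Install it with "
            f"'pip install \"giskard-checks[{extra}]\"'.",
            name=package,
        )


try:
    raise MissingOptionalDependencyError("sentence-transformers", "local-embeddings")
except MissingOptionalDependencyError as err:
    # Callers can branch on the specific type and read structured fields.
    missing_extra = err.extra
```

Because the class derives from `ImportError`, code that already catches `ImportError` keeps working, while new callers can catch the narrower type.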
