feat(checks): add JsonValid built-in check with optional JSON Schema validation #2397

Open
abhigyan631 wants to merge 2 commits into Giskard-AI:main from
abhigyan631:feat/json-valid-check
Conversation

@abhigyan631

Description

Resolves #2362

Summary

Adds a new built-in check JsonValid that validates whether LLM output is
well-formed JSON, with optional JSON Schema validation. This is critical for
testing structured output use cases such as tool calling, data extraction
pipelines, and function calling responses.

Implementation Details

  • Created libs/giskard-checks/src/giskard/checks/builtin/json_valid.py
    following the established built-in check pattern.
  • Uses a two-layer validation approach:
    • Layer 1: json.loads() (stdlib, zero extra dependencies) validates
      syntax correctness. Fails fast with a descriptive parse error message.
    • Layer 2: jsonschema.validate() (optional) validates the parsed
      object against a user-provided JSON Schema. Only runs when json_schema
      is provided.
  • Supports JSONPath extraction from traces via text_key (same pattern as
    StringMatching), so it integrates naturally into existing Giskard
    evaluation pipelines.
  • Renamed the schema field to json_schema to avoid shadowing Pydantic's
    reserved schema attribute on the base Check class.
  • Exported at both the builtin and top-level giskard.checks namespace.
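
For readers who want to see the two layers in isolation, here is a minimal standalone sketch. The function name validate_json_text and its (ok, message) return shape are assumptions made for illustration; the PR's actual check returns CheckResult objects and reads its input from a trace.

```python
import json
from typing import Any, Optional, Tuple


def validate_json_text(
    text: str, json_schema: Optional[dict] = None
) -> Tuple[bool, str]:
    """Hypothetical helper mirroring the two layers; not the PR's actual code."""
    # Layer 1: syntax check using only the standard library.
    try:
        parsed: Any = json.loads(text)
    except json.JSONDecodeError as e:
        return False, f"Output is not valid JSON: {e}"

    # Layer 2: runs only when a schema is supplied.
    if json_schema is not None:
        try:
            import jsonschema  # imported lazily so layer 1 needs no extras
        except ImportError:
            return False, "Schema validation requires the 'jsonschema' package."
        try:
            jsonschema.validate(instance=parsed, schema=json_schema)
        except jsonschema.exceptions.ValidationError as e:
            return False, f"JSON does not match schema: {e.message}"

    return True, "Output is valid JSON."
```

Calling validate_json_text('{"name": "Alice"}') exercises only layer 1, so it should work in an environment without jsonschema installed.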

Example Usage

from giskard.checks import JsonValid, Scenario

# Basic JSON validation — no extra dependencies needed
scenario = (
    Scenario(name="structured_output")
    .interact(inputs="Return a JSON object", outputs='{"name": "Alice", "age": 30}')
    .check(JsonValid())
)

# With JSON Schema validation
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}
scenario = (
    Scenario(name="schema_validation")
    .interact(inputs="Return user data", outputs='{"name": "Alice", "age": 30}')
    .check(JsonValid(json_schema=schema))
)

Testing

Added 6 tests in tests/builtin/test_json_valid.py:

✅ Valid JSON passes
✅ Invalid JSON fails (with descriptive parse error in details)
✅ JSON matching a schema passes
✅ JSON not matching a schema fails
✅ Text extracted from trace via JSONPath works correctly
✅ Missing key in trace returns a failure

All 6 tests pass locally.
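
As a rough illustration of what a "descriptive parse error" can contain, the sketch below collects the position attributes that the standard library attaches to json.JSONDecodeError. The helper name parse_error_details and the dictionary keys are assumptions; the details payload produced by the PR may look different.

```python
import json


def parse_error_details(text: str) -> dict:
    """Return position info for a JSON syntax error, or {} if the text parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as e:
        # msg, lineno, colno and pos are documented JSONDecodeError attributes.
        return {"error": e.msg, "line": e.lineno, "column": e.colno, "pos": e.pos}
    return {}
```

The exact wording of the error message varies between Python versions, so a test is probably safer asserting on the presence of these fields than on the message text.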

Related Issue
Resolves #2362

Type of Change

📚 Examples / docs / tutorials / dependencies update
🔧 Bug fix (non-breaking change which fixes an issue)
🥂 Improvement (non-breaking change which improves an existing feature)
🚀 New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to change)
🔐 Security fix

Checklist
I've read the CODE_OF_CONDUCT.md document.
I've read the CONTRIBUTING.md guide.
I've written tests for all new methods and classes that I created.
I've written the docstring in NumPy format for all the methods and classes that I created or modified.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new JsonValid check that validates if a string is valid JSON and optionally checks it against a provided JSON schema. The feedback focuses on handling jsonschema as an optional dependency to prevent ImportErrors, improving error handling for invalid schemas, and correcting PEP 8 formatting issues such as extra spaces and leftover developer comments.

Comment on lines +5 to +6
import json
import jsonschema

high

The jsonschema library is imported at the top level. Since this is intended to be an optional dependency (as noted in the PR description), this import will cause an ImportError for users who do not have the package installed, even if they only intend to use the basic JSON validation. Move the import inside the run method to ensure the check remains usable without extra dependencies.

Suggested change
- import json
- import jsonschema
+ import json
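
One alternative to a function-local import, sketched here as an assumption rather than something proposed in this review, is a guarded module-level import. The module then loads even when jsonschema is missing, and the syntax-only path stays usable. The helper names are hypothetical.

```python
import json

try:
    import jsonschema  # optional extra
except ImportError:  # package not installed
    jsonschema = None


def schema_validation_available() -> bool:
    """True when the optional jsonschema package could be imported."""
    return jsonschema is not None


def is_well_formed_json(text: str) -> bool:
    """Syntax-only check that works with or without jsonschema installed."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True
```

Either approach avoids the ImportError at import time. The function-local import keeps the dependency out of module scope entirely, while the guarded import makes availability easy to query up front.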
Comment on lines +48 to +52
    return CheckResult.failure(message=f"No text found at key {self.text_key}", details=details)

# handle wrong type
if not isinstance(text, str):
    return CheckResult.failure(message=f"Value for text key '{self.text_key}' is not a string, got {type(text)}.", details=details)

medium

Fix the double spaces before the details argument in the CheckResult.failure calls to adhere to PEP 8 spacing conventions for function arguments.

Suggested change
-     return CheckResult.failure(message=f"No text found at key {self.text_key}", details=details)
- # handle wrong type
- if not isinstance(text, str):
-     return CheckResult.failure(message=f"Value for text key '{self.text_key}' is not a string, got {type(text)}.", details=details)
+ if isinstance(text, NoMatch):
+     return CheckResult.failure(message=f"No text found at key {self.text_key}", details=details)
+ # handle wrong type
+ if not isinstance(text, str):
+     return CheckResult.failure(message=f"Value for text key '{self.text_key}' is not a string, got {type(text)}.", details=details)
References
  1. PEP 8 recommends using a single space to separate arguments in a function call. (link)
    parsed = json.loads(text)
except json.JSONDecodeError as e:
    return CheckResult.failure(
        message= f"Output is not valid JSON: {e}", # hint: show the parse error e

medium

Remove the leftover developer hint comment and fix the unnecessary space after message= to adhere to PEP 8 standards for keyword arguments.

Suggested change
- message= f"Output is not valid JSON: {e}", # hint: show the parse error e
+ message=f"Output is not valid JSON: {e}",
References
  1. PEP 8 states that spaces should not be used around the '=' sign when used to indicate a keyword argument. (link)
Comment on lines +63 to +70
if self.json_schema is not None:
    try:
        jsonschema.validate(instance=parsed, schema=self.json_schema)
    except jsonschema.exceptions.ValidationError as e:
        return CheckResult.failure(
            message=f"JSON does not match schema: {e.message}",
            details={**details, "error": e.message, "parsed": parsed},
        )

medium

Handle jsonschema as an optional dependency by importing it locally. Additionally, catch jsonschema.exceptions.SchemaError to gracefully handle cases where the provided json_schema is itself invalid, preventing the check from crashing.

        if self.json_schema is not None:
            try:
                import jsonschema
            except ImportError:
                return CheckResult.failure(
                    message="JSON schema validation requires the 'jsonschema' package. Please install it with 'pip install jsonschema'.",
                    details=details,
                )

            try:
                jsonschema.validate(instance=parsed, schema=self.json_schema)
            except jsonschema.exceptions.ValidationError as e:
                return CheckResult.failure(
                    message=f"JSON does not match schema: {e.message}",
                    details={**details, "error": e.message, "parsed": parsed},
                )
            except jsonschema.exceptions.SchemaError as e:
                return CheckResult.failure(
                    message=f"The provided JSON schema is invalid: {e.message}",
                    details={**details, "error": e.message},
                )
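
A possible complement to catching SchemaError at run time is to check the schema once, before any output is validated. The sketch below uses jsonschema's validator_for and check_schema for that. The helper name schema_problem is hypothetical and is not part of this PR or of the suggestion above.

```python
from typing import Optional


def schema_problem(json_schema: dict) -> Optional[str]:
    """Return a message if the schema cannot be used, or None if it looks valid."""
    try:
        import jsonschema
        from jsonschema.validators import validator_for
    except ImportError:
        return "The 'jsonschema' package is not installed."

    # Pick the validator class matching the schema's "$schema" keyword
    # (jsonschema falls back to its default draft when the keyword is absent).
    validator_cls = validator_for(json_schema)
    try:
        validator_cls.check_schema(json_schema)
    except jsonschema.exceptions.SchemaError as e:
        return f"The provided JSON schema is invalid: {e.message}"
    return None
```

Running such a check when the check object is constructed would surface a malformed schema earlier than waiting for the first run, though whether that fits the Check lifecycle here is an open design question.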