feat(checks): add JsonValid built-in check with optional JSON Schema validation #2397
abhigyan631 wants to merge 2 commits into Giskard-AI:main
Code Review
This pull request introduces a new JsonValid check that validates if a string is valid JSON and optionally checks it against a provided JSON schema. The feedback focuses on handling jsonschema as an optional dependency to prevent ImportErrors, improving error handling for invalid schemas, and correcting PEP 8 formatting issues such as extra spaces and leftover developer comments.
import json
import jsonschema
The jsonschema library is imported at the top level. Since this is intended to be an optional dependency (as noted in the PR description), this import will cause an ImportError for users who do not have the package installed, even if they only intend to use the basic JSON validation. Move the import inside the run method to ensure the check remains usable without extra dependencies.
Suggested change:
- import json
- import jsonschema
+ import json
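A minimal sketch of the deferred-import pattern described above, assuming two hypothetical helper functions (`parse_json` and `validate_schema`, neither of which is in the PR). Only the schema path touches `jsonschema`, so importing the module and using the syntax-only path should work without the optional package installed.

```python
import json


def parse_json(text: str):
    """Syntax-only validation; needs nothing beyond the standard library."""
    return json.loads(text)


def validate_schema(parsed, schema: dict) -> None:
    """Schema validation; jsonschema is imported only when this is called."""
    try:
        import jsonschema  # deferred import: optional dependency
    except ImportError as exc:
        raise RuntimeError(
            "Schema validation requires the 'jsonschema' package "
            "(pip install jsonschema)."
        ) from exc
    jsonschema.validate(instance=parsed, schema=schema)


# The stdlib-only path works whether or not jsonschema is installed.
parsed = parse_json('{"name": "giskard"}')
print(parsed)
```

In the check itself the failure would presumably be reported through `CheckResult.failure` rather than by raising, as in the suggestion further down in this review.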
    return CheckResult.failure(message=f"No text found at key {self.text_key}",  details=details)

# handle wrong type
if not isinstance(text, str):
    return CheckResult.failure(message=f"Value for text key '{self.text_key}' is not a string, got {type(text)}.",  details=details)
Fix the double spaces before the details argument in the CheckResult.failure calls to adhere to PEP 8 spacing conventions for function arguments.
Suggested change:
-     return CheckResult.failure(message=f"No text found at key {self.text_key}",  details=details)
- # handle wrong type
- if not isinstance(text, str):
-     return CheckResult.failure(message=f"Value for text key '{self.text_key}' is not a string, got {type(text)}.",  details=details)
+ if isinstance(text, NoMatch):
+     return CheckResult.failure(message=f"No text found at key {self.text_key}", details=details)
+ # handle wrong type
+ if not isinstance(text, str):
+     return CheckResult.failure(message=f"Value for text key '{self.text_key}' is not a string, got {type(text)}.", details=details)
References
- PEP 8 recommends using a single space to separate arguments in a function call.
    parsed = json.loads(text)
except json.JSONDecodeError as e:
    return CheckResult.failure(
        message= f"Output is not valid JSON: {e}", # hint: show the parse error e
Remove the leftover developer hint comment and fix the unnecessary space after message= to adhere to PEP 8 standards for keyword arguments.
Suggested change:
-         message= f"Output is not valid JSON: {e}", # hint: show the parse error e
+         message=f"Output is not valid JSON: {e}",
References
- PEP 8 states that spaces should not be used around the '=' sign when used to indicate a keyword argument.
if self.json_schema is not None:
    try:
        jsonschema.validate(instance=parsed, schema=self.json_schema)
    except jsonschema.exceptions.ValidationError as e:
        return CheckResult.failure(
            message=f"JSON does not match schema: {e.message}",
            details={**details, "error": e.message, "parsed": parsed},
        )
Handle jsonschema as an optional dependency by importing it locally. Additionally, catch jsonschema.exceptions.SchemaError to gracefully handle cases where the provided json_schema is itself invalid, preventing the check from crashing.
if self.json_schema is not None:
    try:
        import jsonschema
    except ImportError:
        return CheckResult.failure(
            message="JSON schema validation requires the 'jsonschema' package. Please install it with 'pip install jsonschema'.",
            details=details,
        )
    try:
        jsonschema.validate(instance=parsed, schema=self.json_schema)
    except jsonschema.exceptions.ValidationError as e:
        return CheckResult.failure(
            message=f"JSON does not match schema: {e.message}",
            details={**details, "error": e.message, "parsed": parsed},
        )
    except jsonschema.exceptions.SchemaError as e:
        return CheckResult.failure(
            message=f"The provided JSON schema is invalid: {e.message}",
            details={**details, "error": e.message},
        )
Description
Resolves #2362
Summary
Adds a new built-in check JsonValid that validates whether LLM output is well-formed JSON, with optional JSON Schema validation. This is critical for testing structured output use cases such as tool calling, data extraction pipelines, and function calling responses.
Implementation Details
- Adds libs/giskard-checks/src/giskard/checks/builtin/json_valid.py, following the established built-in check pattern.
- json.loads() (stdlib, zero extra dependencies): validates syntax correctness. Fails fast with a descriptive parse error message.
- jsonschema.validate() (optional): validates the parsed object against a user-provided JSON Schema. Only runs when json_schema is provided.
- The text to validate is selected via text_key (same pattern as StringMatching), so the check integrates naturally into existing Giskard evaluation pipelines.
- The schema parameter is named json_schema to avoid shadowing Pydantic's reserved schema attribute on the base Check class.
- JsonValid is exported from the builtin and top-level giskard.checks namespace.
Example Usage
Testing
Added 6 tests in tests/builtin/test_json_valid.py:
✅ Valid JSON passes
✅ Invalid JSON fails (with descriptive parse error in details)
✅ JSON matching a schema passes
✅ JSON not matching a schema fails
✅ Text extracted from trace via JSONPath works correctly
✅ Missing key in trace returns a failure
All 6 tests pass locally.
Related Issue
Resolves #2362
Type of Change
📚 Examples / docs / tutorials / dependencies update
🔧 Bug fix (non-breaking change which fixes an issue)
🥂 Improvement (non-breaking change which improves an existing feature)
🚀 New feature (non-breaking change which adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to change)
🔐 Security fix
Checklist
I've read the CODE_OF_CONDUCT.md document.
I've read the CONTRIBUTING.md guide.
I've written tests for all new methods and classes that I created.
I've written the docstring in NumPy format for all the methods and classes that I created or modified.