Skip to content

Commit ed93351

Browse files
dmadisettiCopilot
andauthored
feat: syntax error check rule as a base example (#6384)
## Summary Example: Syntax Error Check Rule The new syntax error rule (MB005) demonstrates the ideal pattern: - Small footprint: Minimal, focused implementation - Complete docstring: Structured sections with clear examples - Proper metadata: Follows MB### pattern with BREAKING severity - Auto-integration: Automatically appears in generated docs This rule serves as the perfect reference implementation for future contributors. See development_docs/adding_lint_rules.md for the complete guide on adding new lint rules, including: - Required docstring structure - Metadata requirements - Testing patterns - Code assignment guidelines --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent 8d078a1 commit ed93351

9 files changed

Lines changed: 627 additions & 0 deletions

File tree

Lines changed: 393 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,393 @@
1+
# Adding Lint Rules to marimo
2+
3+
This guide explains how to add new lint rules to marimo's linting system.
4+
5+
## Overview
6+
7+
marimo's lint system helps users write better, more reliable notebooks by detecting various issues that could prevent notebooks from running correctly. The system is organized around three severity levels:
8+
9+
- **Breaking (MB)**: Errors that prevent notebook execution
10+
- **Runtime (MR)**: Issues that may cause runtime problems
11+
- **Formatting (MF)**: Style and formatting issues
12+
13+
## Rule Code Assignment
14+
15+
Rule codes follow a specific pattern: `M[severity][number]`
16+
17+
### Current Code Ranges
18+
19+
- **MB001-MB099**: Breaking rules
20+
- **MR001-MR099**: Runtime rules
21+
- **MF001-MF099**: Formatting rules
22+
23+
### Assigning New Codes
24+
25+
When adding a new rule:
26+
27+
1. **Determine severity**: Choose Breaking, Runtime, or Formatting based on impact
28+
2. **Find next available code**: Check existing rules in the appropriate category
29+
3. **Use sequential numbering**: MB005, MB006, etc.
30+
31+
**Example assignments**:
32+
- MB001: unparsable-cells
33+
- MB002: multiple-definitions
34+
- MB003: cycle-dependencies
35+
- MB004: setup-cell-dependencies
36+
- MF001: general-formatting
37+
- MF002: parse-stdout
38+
- MF003: parse-stderr
39+
40+
## Step-by-Step Implementation
41+
42+
### 1. Create the Rule Class
43+
44+
Create your rule in the appropriate directory:
45+
- Breaking rules: `marimo/_lint/rules/breaking/`
46+
- Runtime rules: `marimo/_lint/rules/runtime/`
47+
- Formatting rules: `marimo/_lint/rules/formatting/`
48+
49+
**Template for a new rule**:
50+
51+
```python
52+
# Copyright 2025 Marimo. All rights reserved.
53+
from __future__ import annotations
54+
55+
from typing import TYPE_CHECKING
56+
57+
from marimo._lint.diagnostic import Diagnostic, Severity
58+
from marimo._lint.rules.base import LintRule
59+
60+
if TYPE_CHECKING:
61+
from marimo._lint.context import RuleContext
62+
63+
64+
class YourNewRule(LintRule):
65+
"""MB005: Brief description of what this rule checks.
66+
67+
Detailed explanation of what this rule does and why it's important.
68+
This should explain the technical details of how the rule works.
69+
70+
## What it does
71+
72+
Clear, concise explanation of what the rule detects.
73+
74+
## Why is this bad?
75+
76+
Explanation of why this issue is problematic:
77+
- Impact on notebook execution
78+
- Potential for bugs or confusion
79+
- Effect on reproducibility
80+
81+
## Examples
82+
83+
**Problematic:**
84+
```python
85+
# Example of code that violates this rule
86+
bad_code = "example"
87+
```
88+
89+
**Solution:**
90+
```python
91+
# Example of how to fix the violation
92+
good_code = "example"
93+
```
94+
95+
## References
96+
97+
- [Understanding Errors](https://docs.marimo.io/guides/understanding_errors/)
98+
- [Relevant Guide](https://docs.marimo.io/guides/...)
99+
"""
100+
101+
code = "MB005" # Your assigned code
102+
name = "your-rule-name" # Kebab-case name
103+
description = "Brief description for rule listings"
104+
severity = Severity.BREAKING # Or RUNTIME/FORMATTING
105+
fixable = False # True if rule can auto-fix issues
106+
107+
async def check(self, ctx: RuleContext) -> None:
108+
"""Implement your rule logic here."""
109+
# Iterate through notebook cells
110+
for cell in ctx.notebook.cells:
111+
# Your detection logic here
112+
if self._detect_violation(cell):
113+
diagnostic = Diagnostic(
114+
message="Description of the specific violation",
115+
line=cell.lineno,
116+
column=cell.col_offset + 1,
117+
code=self.code,
118+
name=self.name,
119+
severity=self.severity,
120+
fixable=self.fixable,
121+
)
122+
await ctx.add_diagnostic(diagnostic)
123+
124+
def _detect_violation(self, cell) -> bool:
125+
"""Helper method for detection logic."""
126+
# Implement your specific detection logic
127+
return False
128+
```
129+
130+
### 2. Register the Rule
131+
132+
Add your rule to the appropriate `__init__.py` file:
133+
134+
**For Breaking rules** (`marimo/_lint/rules/breaking/__init__.py`):
135+
136+
```python
137+
from marimo._lint.rules.breaking.your_file import YourNewRule
138+
139+
BREAKING_RULE_CODES: dict[str, type[LintRule]] = {
140+
"MB001": UnparsableRule,
141+
"MB002": MultipleDefinitionsRule,
142+
"MB003": CycleDependenciesRule,
143+
"MB004": SetupCellDependenciesRule,
144+
"MB005": YourNewRule, # Add your rule here
145+
}
146+
147+
__all__ = [
148+
# ... existing rules ...
149+
"YourNewRule", # Add to exports
150+
"BREAKING_RULE_CODES",
151+
]
152+
```
153+
154+
### 3. Create Test Files
155+
156+
#### a) Create a test notebook file
157+
158+
Create `tests/_lint/test_files/your_rule_name.py`:
159+
160+
```python
161+
import marimo
162+
163+
__generated_with = "0.15.2"
164+
app = marimo.App()
165+
166+
167+
@app.cell
168+
def _():
169+
# Code that should trigger your rule
170+
problematic_code = "example"
171+
return
172+
173+
174+
@app.cell
175+
def _():
176+
# Additional test cases
177+
return
178+
179+
180+
if __name__ == "__main__":
181+
app.run()
182+
```
183+
184+
#### b) Add snapshot test
185+
186+
Add to `tests/_lint/test_runtime_errors_snapshot.py`:
187+
188+
```python
189+
def test_your_rule_snapshot():
190+
"""Test snapshot for your new rule."""
191+
file = "tests/_lint/test_files/your_rule_name.py"
192+
with open(file) as f:
193+
code = f.read()
194+
195+
notebook = parse_notebook(code, filepath=file)
196+
errors = lint_notebook(notebook)
197+
198+
# Format errors for snapshot
199+
error_output = []
200+
for error in errors:
201+
error_output.append(error.format())
202+
203+
snapshot("your_rule_name_errors.txt", "\n".join(error_output))
204+
```
205+
206+
#### c) Add unit tests (optional but recommended)
207+
208+
For more rigorous testing, create `tests/_lint/test_your_rule.py`:
209+
210+
```python
211+
import pytest
212+
from marimo._ast.parse import parse_notebook
213+
from marimo._lint.context import LintContext
214+
from marimo._lint.rules.breaking import YourNewRule
215+
216+
217+
class TestYourNewRule:
218+
"""Test cases for YourNewRule."""
219+
220+
async def test_detects_violation(self):
221+
"""Test that the rule detects violations correctly."""
222+
code = """import marimo
223+
app = marimo.App()
224+
225+
@app.cell
226+
def _():
227+
# Code that should trigger the rule
228+
return
229+
"""
230+
notebook = parse_notebook(code)
231+
ctx = LintContext(notebook)
232+
233+
rule = YourNewRule()
234+
await rule.check(ctx)
235+
236+
diagnostics = await ctx.get_diagnostics()
237+
assert len(diagnostics) > 0
238+
assert diagnostics[0].code == "MB005"
239+
assert diagnostics[0].severity == Severity.BREAKING
240+
241+
async def test_no_false_positives(self):
242+
"""Test that the rule doesn't trigger on valid code."""
243+
code = """import marimo
244+
app = marimo.App()
245+
246+
@app.cell
247+
def _():
248+
# Valid code that should not trigger the rule
249+
return
250+
"""
251+
notebook = parse_notebook(code)
252+
ctx = LintContext(notebook)
253+
254+
rule = YourNewRule()
255+
await rule.check(ctx)
256+
257+
diagnostics = await ctx.get_diagnostics()
258+
assert len(diagnostics) == 0
259+
```
260+
261+
### 4. Generate Documentation
262+
263+
The documentation is automatically generated from your rule's docstring. Run:
264+
265+
```bash
266+
uv run scripts/generate_lint_docs.py
267+
```
268+
269+
This will create:
270+
- `docs/guides/lint_rules/rules/your_rule_name.md`
271+
- Updated `docs/guides/lint_rules/index.md`
272+
273+
### 5. Run Tests
274+
275+
```bash
276+
# Run lint tests
277+
uv run hatch run test:test tests/_lint
278+
279+
# Run your specific test
280+
uv run hatch run test:test tests/_lint/test_your_rule.py
281+
282+
# Update snapshots if needed
283+
uv run hatch run test:test tests/_lint/test_runtime_errors_snapshot.py --snapshot-update
284+
```
285+
286+
## Rule Implementation Guidelines
287+
288+
### Detection Logic
289+
290+
- **Prefer AST analysis** over string matching when possible
291+
- **Use context information** from `RuleContext` (notebook, graph, etc.)
292+
- **Be specific** in error messages - help users understand the exact issue
293+
- **Consider edge cases** - test with various code patterns
294+
295+
### Error Messages
296+
297+
- **Be descriptive**: Explain what the issue is
298+
- **Be actionable**: Suggest how to fix it
299+
- **Be consistent**: Follow patterns from existing rules
300+
301+
Good: `"Variable 'x' is defined in multiple cells"`
302+
Bad: `"Multiple definition error"`
303+
304+
### Performance
305+
306+
- **Avoid expensive operations** in the hot path
307+
- **Cache results** when checking multiple cells (add to context if needed)
308+
- **Early return** when possible
309+
310+
### Fixability
311+
312+
TODO: Right now, the only fixable errors are via re-serialization.
313+
314+
## Common Patterns
315+
316+
### Checking All Cells
317+
318+
```python
319+
async def check(self, ctx: RuleContext) -> None:
320+
for cell in ctx.notebook.cells:
321+
if self._check_cell(cell):
322+
# Create diagnostic
323+
```
324+
325+
### Using the Dependency Graph
326+
327+
```python
328+
async def check(self, ctx: RuleContext) -> None:
329+
graph = ctx.get_graph()
330+
for cell_id, cell_data in graph.cells.items():
331+
# Analyze dependencies
332+
```
333+
334+
## Testing Best Practices
335+
336+
### Snapshot Tests
337+
338+
- **Include in snapshot tests** for regression protection
339+
- **Use realistic examples** that demonstrate the rule clearly
340+
- **Test edge cases** in separate unit tests
341+
342+
### Unit Tests
343+
344+
- **Test positive cases** (rule triggers correctly)
345+
- **Test negative cases** (no false positives)
346+
- **Test edge cases** (empty cells, syntax errors, etc.)
347+
- **Test multiple violations** in one notebook
348+
349+
### Test File Structure
350+
351+
```
352+
tests/_lint/
353+
├── test_files/ # Test notebooks
354+
│ └── your_rule_name.py
355+
├── snapshots/ # Expected outputs
356+
│ └── your_rule_name_errors.txt
357+
├── test_your_rule.py # Unit tests
358+
└── test_runtime_errors_snapshot.py # Snapshot tests
359+
```
360+
361+
## Documentation Requirements
362+
363+
Your rule's docstring should include:
364+
365+
1. **Rule code and brief description** in the first line
366+
2. **## What it does** - Technical explanation
367+
3. **## Why is this bad?** - Impact explanation
368+
4. **## Examples** - Code samples (problematic and fixed)
369+
5. **## References** - Links to relevant documentation
370+
371+
The documentation system will automatically:
372+
- Generate individual rule pages
373+
- Update the main rules index
374+
- Create proper navigation links
375+
- Use human-readable filenames
376+
377+
## Review Checklist
378+
379+
Before submitting your rule:
380+
381+
- [ ] Rule code follows numbering convention
382+
- [ ] Rule is registered in appropriate `__init__.py`
383+
- [ ] Comprehensive docstring with all required sections
384+
- [ ] Unit tests cover positive and negative cases
385+
- [ ] Snapshot test included
386+
- [ ] Documentation generates correctly
387+
- [ ] All lint tests pass
388+
- [ ] Error messages are clear and actionable
389+
- [ ] Performance is reasonable for large notebooks
390+
391+
## Example: Simple Rule Implementation
392+
393+
Here's a complete example of a simple rule that checks for syntax errors: https://github.com/marimo-team/marimo/pull/6384

0 commit comments

Comments
 (0)