|
| 1 | +# Adding Lint Rules to marimo |
| 2 | + |
| 3 | +This guide explains how to add new lint rules to marimo's linting system. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +marimo's lint system helps users write better, more reliable notebooks by detecting various issues that could prevent notebooks from running correctly. The system is organized around three severity levels: |
| 8 | + |
| 9 | +- **Breaking (MB)**: Errors that prevent notebook execution |
| 10 | +- **Runtime (MR)**: Issues that may cause runtime problems |
| 11 | +- **Formatting (MF)**: Style and formatting issues |
| 12 | + |
| 13 | +## Rule Code Assignment |
| 14 | + |
| 15 | +Rule codes follow a specific pattern: `M[severity][number]` |
| 16 | + |
| 17 | +### Current Code Ranges |
| 18 | + |
| 19 | +- **MB001-MB099**: Breaking rules |
| 20 | +- **MR001-MR099**: Runtime rules |
| 21 | +- **MF001-MF099**: Formatting rules |
| 22 | + |
| 23 | +### Assigning New Codes |
| 24 | + |
| 25 | +When adding a new rule: |
| 26 | + |
| 27 | +1. **Determine severity**: Choose Breaking, Runtime, or Formatting based on impact |
| 28 | +2. **Find next available code**: Check existing rules in the appropriate category |
| 29 | +3. **Use sequential numbering**: MB005, MB006, etc. |
| 30 | + |
| 31 | +**Example assignments**: |
| 32 | +- MB001: unparsable-cells |
| 33 | +- MB002: multiple-definitions |
| 34 | +- MB003: cycle-dependencies |
| 35 | +- MB004: setup-cell-dependencies |
| 36 | +- MF001: general-formatting |
| 37 | +- MF002: parse-stdout |
| 38 | +- MF003: parse-stderr |
| 39 | + |
| 40 | +## Step-by-Step Implementation |
| 41 | + |
| 42 | +### 1. Create the Rule Class |
| 43 | + |
| 44 | +Create your rule in the appropriate directory: |
| 45 | +- Breaking rules: `marimo/_lint/rules/breaking/` |
| 46 | +- Runtime rules: `marimo/_lint/rules/runtime/` |
| 47 | +- Formatting rules: `marimo/_lint/rules/formatting/` |
| 48 | + |
| 49 | +**Template for a new rule**: |
| 50 | + |
| 51 | +```python |
| 52 | +# Copyright 2025 Marimo. All rights reserved. |
| 53 | +from __future__ import annotations |
| 54 | + |
| 55 | +from typing import TYPE_CHECKING |
| 56 | + |
| 57 | +from marimo._lint.diagnostic import Diagnostic, Severity |
| 58 | +from marimo._lint.rules.base import LintRule |
| 59 | + |
| 60 | +if TYPE_CHECKING: |
| 61 | + from marimo._lint.context import RuleContext |
| 62 | + |
| 63 | + |
| 64 | +class YourNewRule(LintRule): |
| 65 | + """MB005: Brief description of what this rule checks. |
| 66 | +
|
| 67 | + Detailed explanation of what this rule does and why it's important. |
| 68 | + This should explain the technical details of how the rule works. |
| 69 | +
|
| 70 | + ## What it does |
| 71 | +
|
| 72 | + Clear, concise explanation of what the rule detects. |
| 73 | +
|
| 74 | + ## Why is this bad? |
| 75 | +
|
| 76 | + Explanation of why this issue is problematic: |
| 77 | + - Impact on notebook execution |
| 78 | + - Potential for bugs or confusion |
| 79 | + - Effect on reproducibility |
| 80 | +
|
| 81 | + ## Examples |
| 82 | +
|
| 83 | + **Problematic:** |
| 84 | + ```python |
| 85 | + # Example of code that violates this rule |
| 86 | + bad_code = "example" |
| 87 | + ``` |
| 88 | +
|
| 89 | + **Solution:** |
| 90 | + ```python |
| 91 | + # Example of how to fix the violation |
| 92 | + good_code = "example" |
| 93 | + ``` |
| 94 | +
|
| 95 | + ## References |
| 96 | +
|
| 97 | + - [Understanding Errors](https://docs.marimo.io/guides/understanding_errors/) |
| 98 | + - [Relevant Guide](https://docs.marimo.io/guides/...) |
| 99 | + """ |
| 100 | + |
| 101 | + code = "MB005" # Your assigned code |
| 102 | + name = "your-rule-name" # Kebab-case name |
| 103 | + description = "Brief description for rule listings" |
| 104 | + severity = Severity.BREAKING # Or RUNTIME/FORMATTING |
| 105 | + fixable = False # True if rule can auto-fix issues |
| 106 | + |
| 107 | + async def check(self, ctx: RuleContext) -> None: |
| 108 | + """Implement your rule logic here.""" |
| 109 | + # Iterate through notebook cells |
| 110 | + for cell in ctx.notebook.cells: |
| 111 | + # Your detection logic here |
| 112 | + if self._detect_violation(cell): |
| 113 | + diagnostic = Diagnostic( |
| 114 | + message="Description of the specific violation", |
| 115 | + line=cell.lineno, |
| 116 | + column=cell.col_offset + 1, |
| 117 | + code=self.code, |
| 118 | + name=self.name, |
| 119 | + severity=self.severity, |
| 120 | + fixable=self.fixable, |
| 121 | + ) |
| 122 | + await ctx.add_diagnostic(diagnostic) |
| 123 | + |
| 124 | + def _detect_violation(self, cell) -> bool: |
| 125 | + """Helper method for detection logic.""" |
| 126 | + # Implement your specific detection logic |
| 127 | + return False |
| 128 | +``` |
| 129 | + |
| 130 | +### 2. Register the Rule |
| 131 | + |
| 132 | +Add your rule to the appropriate `__init__.py` file: |
| 133 | + |
| 134 | +**For Breaking rules** (`marimo/_lint/rules/breaking/__init__.py`): |
| 135 | + |
| 136 | +```python |
| 137 | +from marimo._lint.rules.breaking.your_file import YourNewRule |
| 138 | + |
| 139 | +BREAKING_RULE_CODES: dict[str, type[LintRule]] = { |
| 140 | + "MB001": UnparsableRule, |
| 141 | + "MB002": MultipleDefinitionsRule, |
| 142 | + "MB003": CycleDependenciesRule, |
| 143 | + "MB004": SetupCellDependenciesRule, |
| 144 | + "MB005": YourNewRule, # Add your rule here |
| 145 | +} |
| 146 | + |
| 147 | +__all__ = [ |
| 148 | + # ... existing rules ... |
| 149 | + "YourNewRule", # Add to exports |
| 150 | + "BREAKING_RULE_CODES", |
| 151 | +] |
| 152 | +``` |
| 153 | + |
| 154 | +### 3. Create Test Files |
| 155 | + |
| 156 | +#### a) Create a test notebook file |
| 157 | + |
| 158 | +Create `tests/_lint/test_files/your_rule_name.py`: |
| 159 | + |
| 160 | +```python |
| 161 | +import marimo |
| 162 | + |
| 163 | +__generated_with = "0.15.2" |
| 164 | +app = marimo.App() |
| 165 | + |
| 166 | + |
| 167 | +@app.cell |
| 168 | +def _(): |
| 169 | + # Code that should trigger your rule |
| 170 | + problematic_code = "example" |
| 171 | + return |
| 172 | + |
| 173 | + |
| 174 | +@app.cell |
| 175 | +def _(): |
| 176 | + # Additional test cases |
| 177 | + return |
| 178 | + |
| 179 | + |
| 180 | +if __name__ == "__main__": |
| 181 | + app.run() |
| 182 | +``` |
| 183 | + |
| 184 | +#### b) Add snapshot test |
| 185 | + |
| 186 | +Add to `tests/_lint/test_runtime_errors_snapshot.py`: |
| 187 | + |
| 188 | +```python |
| 189 | +def test_your_rule_snapshot(): |
| 190 | + """Test snapshot for your new rule.""" |
| 191 | + file = "tests/_lint/test_files/your_rule_name.py" |
| 192 | + with open(file) as f: |
| 193 | + code = f.read() |
| 194 | + |
| 195 | + notebook = parse_notebook(code, filepath=file) |
| 196 | + errors = lint_notebook(notebook) |
| 197 | + |
| 198 | + # Format errors for snapshot |
| 199 | + error_output = [] |
| 200 | + for error in errors: |
| 201 | + error_output.append(error.format()) |
| 202 | + |
| 203 | + snapshot("your_rule_name_errors.txt", "\n".join(error_output)) |
| 204 | +``` |
| 205 | + |
| 206 | +#### c) Add unit tests (optional but recommended) |
| 207 | + |
| 208 | +For more rigorous testing, create `tests/_lint/test_your_rule.py`: |
| 209 | + |
| 210 | +```python |
| 211 | +import pytest |
| 212 | +from marimo._ast.parse import parse_notebook |
| 213 | +from marimo._lint.context import LintContext |
| 214 | +from marimo._lint.rules.breaking import YourNewRule |
| 215 | + |
| 216 | + |
| 217 | +class TestYourNewRule: |
| 218 | + """Test cases for YourNewRule.""" |
| 219 | + |
| 220 | + async def test_detects_violation(self): |
| 221 | + """Test that the rule detects violations correctly.""" |
| 222 | + code = """import marimo |
| 223 | +app = marimo.App() |
| 224 | +
|
| 225 | +@app.cell |
| 226 | +def _(): |
| 227 | + # Code that should trigger the rule |
| 228 | + return |
| 229 | +""" |
| 230 | + notebook = parse_notebook(code) |
| 231 | + ctx = LintContext(notebook) |
| 232 | + |
| 233 | + rule = YourNewRule() |
| 234 | + await rule.check(ctx) |
| 235 | + |
| 236 | + diagnostics = await ctx.get_diagnostics() |
| 237 | + assert len(diagnostics) > 0 |
| 238 | + assert diagnostics[0].code == "MB005" |
| 239 | + assert diagnostics[0].severity == Severity.BREAKING |
| 240 | + |
| 241 | + async def test_no_false_positives(self): |
| 242 | + """Test that the rule doesn't trigger on valid code.""" |
| 243 | + code = """import marimo |
| 244 | +app = marimo.App() |
| 245 | +
|
| 246 | +@app.cell |
| 247 | +def _(): |
| 248 | + # Valid code that should not trigger the rule |
| 249 | + return |
| 250 | +""" |
| 251 | + notebook = parse_notebook(code) |
| 252 | + ctx = LintContext(notebook) |
| 253 | + |
| 254 | + rule = YourNewRule() |
| 255 | + await rule.check(ctx) |
| 256 | + |
| 257 | + diagnostics = await ctx.get_diagnostics() |
| 258 | + assert len(diagnostics) == 0 |
| 259 | +``` |
| 260 | + |
| 261 | +### 4. Generate Documentation |
| 262 | + |
| 263 | +The documentation is automatically generated from your rule's docstring. Run: |
| 264 | + |
| 265 | +```bash |
| 266 | +uv run scripts/generate_lint_docs.py |
| 267 | +``` |
| 268 | + |
| 269 | +This will create: |
| 270 | +- `docs/guides/lint_rules/rules/your_rule_name.md` |
| 271 | +- Updated `docs/guides/lint_rules/index.md` |
| 272 | + |
| 273 | +### 5. Run Tests |
| 274 | + |
| 275 | +```bash |
| 276 | +# Run lint tests |
| 277 | +uv run hatch run test:test tests/_lint |
| 278 | + |
| 279 | +# Run your specific test |
| 280 | +uv run hatch run test:test tests/_lint/test_your_rule.py |
| 281 | + |
| 282 | +# Update snapshots if needed |
| 283 | +uv run hatch run test:test tests/_lint/test_runtime_errors_snapshot.py --snapshot-update |
| 284 | +``` |
| 285 | + |
| 286 | +## Rule Implementation Guidelines |
| 287 | + |
| 288 | +### Detection Logic |
| 289 | + |
| 290 | +- **Prefer AST analysis** over string matching when possible |
| 291 | +- **Use context information** from `RuleContext` (notebook, graph, etc.) |
| 292 | +- **Be specific** in error messages - help users understand the exact issue |
| 293 | +- **Consider edge cases** - test with various code patterns |
| 294 | + |
| 295 | +### Error Messages |
| 296 | + |
| 297 | +- **Be descriptive**: Explain what the issue is |
| 298 | +- **Be actionable**: Suggest how to fix it |
| 299 | +- **Be consistent**: Follow patterns from existing rules |
| 300 | + |
| 301 | +Good: `"Variable 'x' is defined in multiple cells"` |
| 302 | +Bad: `"Multiple definition error"` |
| 303 | + |
| 304 | +### Performance |
| 305 | + |
| 306 | +- **Avoid expensive operations** in the hot path |
| 307 | +- **Cache results** when checking multiple cells (add to context if needed) |
| 308 | +- **Early return** when possible |
| 309 | + |
| 310 | +### Fixability |
| 311 | + |
| 312 | +TODO: Right now, the only fixable errors are via re-serialization. |
| 313 | + |
| 314 | +## Common Patterns |
| 315 | + |
| 316 | +### Checking All Cells |
| 317 | + |
| 318 | +```python |
| 319 | +async def check(self, ctx: RuleContext) -> None: |
| 320 | + for cell in ctx.notebook.cells: |
| 321 | + if self._check_cell(cell): |
| 322 | + # Create diagnostic |
| 323 | +``` |
| 324 | + |
| 325 | +### Using the Dependency Graph |
| 326 | + |
| 327 | +```python |
| 328 | +async def check(self, ctx: RuleContext) -> None: |
| 329 | + graph = ctx.get_graph() |
| 330 | + for cell_id, cell_data in graph.cells.items(): |
| 331 | + # Analyze dependencies |
| 332 | +``` |
| 333 | + |
| 334 | +## Testing Best Practices |
| 335 | + |
| 336 | +### Snapshot Tests |
| 337 | + |
| 338 | +- **Include in snapshot tests** for regression protection |
| 339 | +- **Use realistic examples** that demonstrate the rule clearly |
| 340 | +- **Test edge cases** in separate unit tests |
| 341 | + |
| 342 | +### Unit Tests |
| 343 | + |
| 344 | +- **Test positive cases** (rule triggers correctly) |
| 345 | +- **Test negative cases** (no false positives) |
| 346 | +- **Test edge cases** (empty cells, syntax errors, etc.) |
| 347 | +- **Test multiple violations** in one notebook |
| 348 | + |
| 349 | +### Test File Structure |
| 350 | + |
| 351 | +``` |
| 352 | +tests/_lint/ |
| 353 | +├── test_files/ # Test notebooks |
| 354 | +│ └── your_rule_name.py |
| 355 | +├── snapshots/ # Expected outputs |
| 356 | +│ └── your_rule_name_errors.txt |
| 357 | +├── test_your_rule.py # Unit tests |
| 358 | +└── test_runtime_errors_snapshot.py # Snapshot tests |
| 359 | +``` |
| 360 | + |
| 361 | +## Documentation Requirements |
| 362 | + |
| 363 | +Your rule's docstring should include: |
| 364 | + |
| 365 | +1. **Rule code and brief description** in the first line |
| 366 | +2. **## What it does** - Technical explanation |
| 367 | +3. **## Why is this bad?** - Impact explanation |
| 368 | +4. **## Examples** - Code samples (problematic and fixed) |
| 369 | +5. **## References** - Links to relevant documentation |
| 370 | + |
| 371 | +The documentation system will automatically: |
| 372 | +- Generate individual rule pages |
| 373 | +- Update the main rules index |
| 374 | +- Create proper navigation links |
| 375 | +- Use human-readable filenames |
| 376 | + |
| 377 | +## Review Checklist |
| 378 | + |
| 379 | +Before submitting your rule: |
| 380 | + |
| 381 | +- [ ] Rule code follows numbering convention |
| 382 | +- [ ] Rule is registered in appropriate `__init__.py` |
| 383 | +- [ ] Comprehensive docstring with all required sections |
| 384 | +- [ ] Unit tests cover positive and negative cases |
| 385 | +- [ ] Snapshot test included |
| 386 | +- [ ] Documentation generates correctly |
| 387 | +- [ ] All lint tests pass |
| 388 | +- [ ] Error messages are clear and actionable |
| 389 | +- [ ] Performance is reasonable for large notebooks |
| 390 | + |
| 391 | +## Example: Simple Rule Implementation |
| 392 | + |
| 393 | +Here's a complete example of a simple rule that checks for syntax errors: https://github.com/marimo-team/marimo/pull/6384 |
0 commit comments