AI-powered secret detection for code commits using transformers and regex.
NeuroLeaks combines traditional regex scanning with CodeBERT, a transformer model pretrained on source code, to detect secrets such as API keys, passwords, tokens, and credentials, even when they're obfuscated or renamed.
| Feature | Regex Only | ML Only | NeuroLeaks ✅ |
|---|---|---|---|
| Fast detection | ✅ | ❌ | ✅ |
| Understands code context | ❌ | ✅ | ✅ |
| Finds obfuscated secrets | ❌ | ✅ | ✅ |
| Works in CI / locally | ✅ | ✅ | ✅ |
| Pre-commit compatible | ✅ | ✅ | ✅ |
git clone https://github.com/YOUR_USERNAME/neuroleaks.git
cd neuroleaks
It's recommended to do this inside a virtual environment.
pip install -r requirements.txt
pip install pre-commit
Add NeuroLeaks to your project's .pre-commit-config.yaml:
repos:
  - repo: https://github.com/markorskip/neuroleaks
    rev: main  # or use a specific tag like v0.1.0
    hooks:
      - id: neuroleaks
From the root of your project:
pre-commit install
pre-commit run --all-files
You’ll get output like:
NeuroLeaks Secret Detection......................................❌
- [REGEX] Line 3: 'api_key = "AIzaSy123..."' matched api key pattern
- [ML] Line 9: 'token = ghp_abcd1234...' (risk score: 0.91)
NeuroLeaks includes a test suite to validate its ML and regex behavior.
pytest tests/
- Regex Engine: Scans added lines for common secret patterns (e.g., AWS keys, GitHub tokens, passwords).
- Transformer Model: Uses `microsoft/codebert-base` to assess whether a line is likely to contain a secret, even if it's obfuscated or renamed.
- Hybrid Detection: Lines that the regex engine misses can still be flagged by the model. Both methods run on staged diffs via `git diff --cached`.
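The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not NeuroLeaks's actual implementation: the `PATTERNS` table, the `Finding` type, the `added_lines` and `scan` helpers, the `score_fn` callback, and the `0.8` threshold are all hypothetical names and values chosen for the example. The model is represented by a plain callable so the sketch stays independent of how CodeBERT is loaded or fine-tuned.

```python
import re
import subprocess
from typing import Callable, Iterator, List, NamedTuple, Optional, Tuple

# Illustrative patterns only (assumption): the project's real list is not shown here.
PATTERNS = {
    "aws access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "api key": re.compile(
        r"(?i)\b(api[_-]?key|password|secret|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]"
    ),
}

# Matches a unified-diff hunk header and captures the new-file start line.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


class Finding(NamedTuple):
    source: str   # "REGEX" or "ML"
    path: str
    line_no: int
    text: str
    detail: str


def staged_diff() -> str:
    """Return the staged diff (what `git diff --cached` prints)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def added_lines(diff: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (path, new_line_number, text) for every added line in a unified diff."""
    path, line_no, in_hunk = "", 0, False
    for raw in diff.splitlines():
        if raw.startswith("diff --git"):
            in_hunk = False                      # a new file section begins
        elif not in_hunk and raw.startswith("+++ "):
            path = raw[4:]
            if path.startswith("b/"):
                path = path[2:]
        elif HUNK_RE.match(raw):
            line_no = int(HUNK_RE.match(raw).group(1))
            in_hunk = True
        elif in_hunk and raw.startswith("+"):
            yield path, line_no, raw[1:]
            line_no += 1
        elif in_hunk and raw.startswith(" "):
            line_no += 1                         # context line advances the counter


def scan(
    diff: str,
    score_fn: Optional[Callable[[str], float]] = None,
    threshold: float = 0.8,                      # hypothetical cut-off
) -> List[Finding]:
    """Regex first; lines regex misses are passed to the model scorer, if any."""
    findings = []
    for path, line_no, text in added_lines(diff):
        hit = next((name for name, rx in PATTERNS.items() if rx.search(text)), None)
        if hit is not None:
            findings.append(
                Finding("REGEX", path, line_no, text.strip(), f"matched {hit} pattern")
            )
        elif score_fn is not None:
            score = score_fn(text)
            if score >= threshold:
                findings.append(
                    Finding("ML", path, line_no, text.strip(), f"risk score: {score:.2f}")
                )
    return findings
```

In a hook, something like `scan(staged_diff(), score_fn=my_model_score)` would produce the findings, where `my_model_score` is a placeholder for whatever function wraps the CodeBERT classifier and returns a probability between 0 and 1. Running regex first keeps the common case fast, since the model is only consulted for lines that no pattern matched.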
MIT License
Want to display that your repo is checked by NeuroLeaks? Add this badge:

- Docker support
- Fine-tuned CodeBERT on real-world leaked secrets
- GitHub Action for CI integration
- Web dashboard to review and manage leaks
Open an issue or start a discussion. Contributions welcome!