This is a web crawler built with Scrapy. Starting from a specified URL, it crawls the site and extracts emails, links, external files, JavaScript files, form fields, images, videos, audio, and comments.
- Custom Offsite Middleware to handle URLs with ports.
- Modular extraction functions for better readability and maintainability.
- Detailed logging and error handling.
- Outputs results to zeroxer.json.
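
To illustrate why URLs with ports need special handling in an offsite check, here is a minimal standalone sketch. It is not the middleware from zeroxplorer.py: the function names `host_without_port` and `is_offsite` are hypothetical, and it uses only the Python standard library rather than Scrapy's middleware classes. The idea is to compare hostnames with the port stripped, so that http://abcd.com:8080/page is still treated as part of abcd.com.

```python
from urllib.parse import urlparse


def host_without_port(url):
    """Return the lowercase hostname of a URL, ignoring any port (hypothetical helper)."""
    # urlparse(...).hostname drops the ":port" suffix and lowercases the host.
    return urlparse(url).hostname or ""


def is_offsite(url, allowed_domains):
    """Return True if the URL's host is neither an allowed domain nor a subdomain of one."""
    host = host_without_port(url)
    return not any(
        host == domain or host.endswith("." + domain)
        for domain in allowed_domains
    )


# Derive the allowed domain from a starting URL that includes a port.
allowed = [host_without_port("http://abcd.com:8080/start")]

print(is_offsite("http://abcd.com:8080/page", allowed))  # False: same host, port ignored
print(is_offsite("http://sub.abcd.com/page", allowed))   # False: subdomain of abcd.com
print(is_offsite("http://example.org/", allowed))        # True: different host
```

Comparing against the raw network location instead (which would be "abcd.com:8080" here) is the kind of mismatch that can cause every same-site link to be filtered out as offsite.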
Run the crawler with `python3 zeroxplorer.py http://abcd.com`, replacing http://abcd.com with your desired starting URL.
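
As a rough illustration of what modular extraction functions can look like, the sketch below keeps one small function per data type and combines them into a single dictionary that could be serialized as JSON. Everything here is an assumption for illustration: the function names, the regular expressions, and the dictionary keys are hypothetical, the real layout of zeroxer.json is not documented above, and the sketch uses plain regular expressions from the standard library instead of Scrapy selectors.

```python
import json
import re

# Deliberately simple patterns; real-world HTML may need a proper parser.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
SCRIPT_SRC_RE = re.compile(r"<script[^>]*\bsrc=[\"']([^\"']+)[\"']", re.IGNORECASE)


def extract_emails(html):
    """Return the unique email addresses found in the page, sorted."""
    return sorted(set(EMAIL_RE.findall(html)))


def extract_comments(html):
    """Return the text of each HTML comment, with surrounding whitespace removed."""
    return [comment.strip() for comment in COMMENT_RE.findall(html)]


def extract_js_files(html):
    """Return the src attribute of each <script> tag that has one."""
    return SCRIPT_SRC_RE.findall(html)


def extract_all(html):
    """Run every extractor and collect the results under hypothetical keys."""
    return {
        "emails": extract_emails(html),
        "comments": extract_comments(html),
        "js_files": extract_js_files(html),
    }


sample = (
    '<!-- contact page -->'
    '<a href="mailto:admin@abcd.com">Mail</a>'
    '<script src="/static/app.js"></script>'
)
print(json.dumps(extract_all(sample), indent=2))
```

Keeping each extractor separate means a new data type can be added by writing one more function and adding one key to `extract_all`, without touching the others.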