MainRoute-Core/WebSite-Cloner

WebSite Cloner

 __      __   _    ___ _ _          ___ _
 \ \    / /__| |__/ __(_) |_ ___   / __| |___ _ _  ___ _ _
  \ \/\/ / -_) '_ \__ \ |  _/ -_) | (__| / _ \ ' \/ -_) '_|
   \_/\_/\___|_.__/___/_|\__\___|  \___|_\___/_||_\___|_|
                                by MainRoute Core

A lightweight, cross-platform CLI utility written in Python to clone websites locally [1]. The script features dynamic terminal color coding, recursive crawling of internal links, offline file path rewriting, and automated asset acquisition (including external stylesheets, scripts, fonts, and images) [1].
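To illustrate the offline path rewriting mentioned above, here is a minimal sketch of how a fetched URL might be mapped to a relative path on disk. The function name `local_path_for`, its parameters, and the `_external/<host>/` layout are assumptions made for illustration; the script's actual mapping may differ.

```python
from urllib.parse import urlsplit


def local_path_for(url: str, site_host: str, external_dir: str = "_external") -> str:
    """Map an absolute URL to a relative path inside the output folder.

    Hypothetical helper: the layout below is an assumption, not the
    script's documented behaviour. Query strings are ignored here.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    # Directory-style URLs are stored as an index page.
    if path.endswith("/"):
        path += "index.html"
    rel = path.lstrip("/")
    # Assets from other hosts go under the external folder, grouped by host.
    if parts.netloc and parts.netloc != site_host:
        rel = f"{external_dir}/{parts.netloc}/{rel}"
    return rel
```

With such a mapping, links inside a saved page could be rewritten to point at the returned relative paths so the clone works without a network connection.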


Features

  • Cross-Platform Color Coding: Renders stylized terminal output on Windows, macOS, and Linux without requiring external color libraries [1].
  • Deep Asset Fetching: Recursively processes and downloads page resources, scripts, stylesheets, responsive image srcset structures, CSS backgrounds, and nested sub-assets like web fonts [1].
  • Auto-Detect CDN & Source Providers: Automatically discovers external asset sources (such as Google Fonts, Cloudflare, unpkg, etc.) on the fly and dynamically whitelists them for seamless local asset storage [1].
  • Interactive Configurations: Standard terminal prompts let you configure target URLs, local folder destinations, external asset directories, and copyright ownership in real time [1].
  • Automated Copyright Integration: Scans downloaded pages for outdated or empty footers, updating them to a customized copyright notice formatted for the year 2026 [1]. If no footer exists, a default, well-styled fallback footer is appended [1].
  • Output Theme Highlights:
    • Light Blue: Main application ASCII logo [1].
    • Green: Successful downloads and local page mappings [1].
    • Red: Connection errors or script-level execution issues [1].
    • Cyan: Ongoing processes, whitelisting operations, and conflict resolution details [1].
    • Golden: Automatic highlighting for digits, counts, and years [1].
    • White: Standard prompt descriptions and instruction guides [1].
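The color theme above can be approximated with plain ANSI escape sequences, which need no third-party library. The sketch below is an assumption about how such highlighting could work, not the script's own code: the names `highlight_digits`, `GREEN`, and `GOLD` and the specific color codes are hypothetical.

```python
import re

# ANSI escape sequences (hypothetical choices for this sketch).
RESET = "\033[0m"
GREEN = "\033[92m"        # bright green for success messages
GOLD = "\033[38;5;220m"   # 256-color approximation of gold


def highlight_digits(message: str, base: str = GREEN) -> str:
    """Color the message with `base`, rendering every digit run in gold."""
    # After each digit run, switch back to the base color instead of
    # resetting, so the rest of the message keeps its color.
    body = re.sub(r"\d+", lambda m: f"{GOLD}{m.group(0)}{base}", message)
    return f"{base}{body}{RESET}"


if __name__ == "__main__":
    print(highlight_digits("Saved 12 pages and 48 assets"))
```

Older Windows consoles may not interpret these sequences unless ANSI processing is enabled; how the script handles that case is not described here.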

Prerequisites

  1. Python 3.x installed and configured in your system path [1].

Usage

1. Initial Setup & Launch

To start using the cloner, follow these steps:

Download

https://gitfolderdownloader.github.io/?=https://github.com/MainRoute/WebSite-Cloner

Clone

git clone https://github.com/MainRoute/WebSite-Cloner WebClone

Navigate to Folder: If you downloaded a zip archive, extract it first. Then change into the project directory:

cd WebClone

Run the Script: Start the launcher script corresponding to your operating system:

  • Windows (Command Prompt): Double-click Run.bat / Run.cmd or execute:

    Run.bat
  • Windows (PowerShell): Run:

    PowerShell -ExecutionPolicy Bypass -File .\Run.ps1
  • Linux / macOS (Terminal Script): Grant execution permissions, then run:

    chmod +x Run.sh
    ./Run.sh
  • macOS (Finder double-click): Grant execution permissions once, then double-click Run.command in Finder:

    chmod +x Run.command

2. Setting Up Cloning Parameters

On launch, the script prompts you for each setting in turn:

  1. Target URL: Enter the web address you wish to clone (e.g., https://example.com) [1].
  2. Output Directory Path: Enter the folder where the cloned pages should be saved (defaults to cloned_site if left empty) [1].
  3. Folder for External Sources: Specify the name of the folder where whitelisted external assets and CDN elements will reside (defaults to _external) [1].
  4. Whitelisted External Domains: Optionally define specific domains to fetch from (though the auto-detection engine will dynamically handle standard CDNs on its own) [1].
  5. Copyright Holder: Define your name or agency name to be injected into the footers (defaults to MainRoute Core) [1].
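The prompt sequence above can be sketched as a small helper that falls back to a default when the answer is empty. The function names `ask` and `collect_settings`, the dictionary keys, and the comma-separated domain format are assumptions for illustration; only the defaults (cloned_site, _external, MainRoute Core) come from the list above.

```python
def ask(label: str, default: str = "", reader=input) -> str:
    """Prompt for one value; an empty answer yields the default."""
    suffix = f" [{default}]" if default else ""
    answer = reader(f"{label}{suffix}: ").strip()
    return answer or default


def collect_settings(reader=input) -> dict:
    """Ask for the five settings in order (hypothetical structure)."""
    target_url = ask("Target URL", reader=reader)
    output_dir = ask("Output directory", "cloned_site", reader)
    external_dir = ask("Folder for external sources", "_external", reader)
    # Assumed input format: a comma-separated list, possibly empty.
    raw_domains = ask("Whitelisted external domains", reader=reader)
    holder = ask("Copyright holder", "MainRoute Core", reader)
    return {
        "target_url": target_url,
        "output_dir": output_dir,
        "external_dir": external_dir,
        "extra_domains": [d.strip() for d in raw_domains.split(",") if d.strip()],
        "copyright_holder": holder,
    }
```

Passing the reader as a parameter keeps the helper testable without a real terminal; calling `collect_settings()` with no arguments uses the built-in `input`.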

3. Review & Run

Before the download begins, the console displays a summary of the selected settings [1]. Press Enter to confirm the configuration and start crawling, or press Ctrl+C to cancel [1].
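A confirmation step of this kind can be sketched as follows. This is an illustrative assumption, not the script's implementation: `confirm_and_run`, its parameters, and the summary layout are hypothetical names chosen for the example.

```python
def confirm_and_run(settings: dict, run, reader=input, out=print) -> bool:
    """Show a summary, then wait for Enter; Ctrl+C cancels.

    `run` is whatever callable starts the crawl (hypothetical here).
    Returns True if the crawl was started, False if it was cancelled.
    """
    out("Summary:")
    for key, value in settings.items():
        out(f"  {key}: {value}")
    try:
        reader("Press Enter to start, or Ctrl+C to cancel: ")
    except KeyboardInterrupt:
        # Ctrl+C at the prompt raises KeyboardInterrupt in Python.
        out("\nCancelled.")
        return False
    run(settings)
    return True
```

Catching `KeyboardInterrupt` only around the prompt means a Ctrl+C during the crawl itself would still propagate normally.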
