@davidfirst (Member) commented Sep 12, 2025

Node Modules Optimization for BVM Distributions

This PR implements a cleanup script to significantly reduce the size of node_modules in BVM (Bit Version Manager) distributions by removing unnecessary files while maintaining full CLI functionality.

Key Features

  • PNPM overrides: Consolidates duplicate packages (16.9MB savings)
  • Monaco Editor cleanup: Removes dev/, esm/, min-maps/ folders, keeps production-ready min/ folder (~64MB savings)
  • Source map removal: Configurable removal of .map files (~101MB savings)
  • TypeScript definitions cleanup: Automatically removes @types directory (~17MB savings)
  • Duplicate module format removal: Removes ESM builds when packages ship both formats (~15.7MB savings)
    • Directory-based duplicates (esm/cjs folders): ~7.6MB
    • File-based duplicates (.mjs/.js pairs): ~8.1MB
  • Source directory removal: Optional removal of redundant source directories (~10.7MB savings with --remove-source flag)
  • UI dependency removal: Optional aggressive optimization that removes UI-only packages (~8MB savings)
  • Safety flags: --dry-run, --keep-teambit-maps, --remove-ui-deps, --remove-esm, --remove-source, --verbose
  • Cross-platform consistency: Uses Node.js fs.statSync() for logical file size calculation
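
As a rough illustration of the logical size calculation mentioned in the last bullet, the following is a minimal sketch, not the script's actual code. The names getDirectorySize and formatBytes match the shared utilities listed in this PR's commits, but their bodies here are assumptions, as are the choice to skip symlinks and to treat 1 MB as 1024 * 1024 bytes.

```javascript
const fs = require('fs');
const path = require('path');

// Sum the logical size (stat.size) of every regular file under `dir`.
// Assumption: symlinks are skipped so linked packages are not counted twice.
function getDirectorySize(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isSymbolicLink()) continue;
    if (entry.isDirectory()) {
      total += getDirectorySize(fullPath);
    } else if (entry.isFile()) {
      // stat.size is the logical byte length, not the allocated disk blocks,
      // so the result should not depend on the platform's block size.
      total += fs.statSync(fullPath).size;
    }
  }
  return total;
}

// Assumption: 1 MB is treated as 1024 * 1024 bytes.
function formatBytes(bytes) {
  return `${(bytes / (1024 * 1024)).toFixed(1)} MB`;
}
```

Measuring stat.size rather than disk usage is what keeps the numbers comparable between Linux, macOS, and Windows, where a tool like du reports allocated blocks.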

Results (macOS, September 2025)

Baseline progression:

  • v1.12.126 baseline: 786.9 MB
  • v1.12.128 (removed Prompt/Winston): 725.4 MB (-61.5 MB)
  • v1.12.134 (removed memoizee): 722.8 MB (-2.6 MB)
  • v1.12.152 (updated eslint-linter, removed duplicate TypeScript): 711.3 MB (-11.5 MB)
  • With PNPM overrides: 694.4 MB (-16.9 MB additional)

Cleanup script results (from 694.4 MB baseline):

  • Default mode: 507.8 MB (186.6 MB saved, 26.9% reduction)
  • With --keep-teambit-maps: 534.1 MB (160.3 MB saved, 23.1% reduction)
  • With --remove-esm: 483.9 MB (210.5 MB saved, 30.3% reduction)
  • With --remove-source: 488.9 MB (205.5 MB saved, 29.6% reduction)
  • With --remove-ui-deps: 491.7 MB (202.7 MB saved, 29.2% reduction)

Total optimization potential: 786.9 MB → 465.3 MB (321.6 MB saved, 40.9% reduction)
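
The saved sizes and percentages above follow directly from the before and after sizes. A small sketch for reproducing them is shown below; the helper name reduction is hypothetical and is not part of the cleanup script.

```javascript
// Hypothetical helper: derive the saved size and percentage from two measurements in MB.
function reduction(beforeMB, afterMB) {
  const savedMB = beforeMB - afterMB;
  return {
    savedMB: Number(savedMB.toFixed(1)),
    percent: Number(((savedMB / beforeMB) * 100).toFixed(1)),
  };
}

console.log(reduction(694.4, 507.8)); // default mode: { savedMB: 186.6, percent: 26.9 }
console.log(reduction(786.9, 465.3)); // total potential: { savedMB: 321.6, percent: 40.9 }
```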

CI/CD Integration

  • Integrated into CircleCI bundle jobs for all platforms (Linux, macOS, Windows)
  • Parallel e2e test simulation jobs validate cleaned installations
  • Reusable optimize_node_modules command for consistent cross-platform deployment

Safety

  • Pre-built UI bundles ensure bit start works without UI dependencies
  • @teambit source maps preserved with --keep-teambit-maps flag for debugging
  • @types removal is safe as workspaces install their own TypeScript definitions locally
  • All CLI commands (bit status, bit compile, etc.) function normally after cleanup

Documentation

See docs/node-modules-optimization.md for detailed analysis and implementation notes.

- Add cleanup script that reduces node_modules by ~130MB
- Remove duplicate monaco-editor builds (dev, esm, min-maps)
- Remove source map files for production builds
- Integrate cleanup into CircleCI bundle jobs
- Create reusable optimize_node_modules command in CI

This optimization reduces BVM installation size from 1.1GB to ~970MB (11.5% reduction)
- Fixed buffer overflow issue when using execSync with find command
- Replaced with recursive filesystem traversal
- Now removes ALL source maps including those in nested node_modules
- Increased removal from 4,956 files (52MB) to 14,697 files (124MB)
- Total savings increased from 130MB to 205MB
- Added --keep-teambit-maps option to preserve @teambit source maps
- Allows debugging Bit's code while still optimizing bundle size
- With flag: removes 7,714 files (159MB saved), final size ~970MB
- Without flag: removes 14,697 files (205MB saved), final size ~924MB
- Updated help text with new flag and savings breakdown
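
A minimal sketch of the recursive traversal described in the two commits above, assuming a plain directory walk in place of execSync with find. The function name removeSourceMaps, its option names, and the rule used to decide that a map belongs to @teambit are all assumptions for illustration, not the script's actual implementation.

```javascript
const fs = require('fs');
const path = require('path');

// Walk `rootDir` recursively and delete every *.map file, including those in
// nested node_modules. Returns the number of files and bytes affected.
function removeSourceMaps(rootDir, { keepTeambitMaps = false, dryRun = false } = {}) {
  const stats = { files: 0, bytes: 0 };
  const walk = (dir) => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isSymbolicLink()) continue; // assumption: do not follow links
      if (entry.isDirectory()) {
        walk(fullPath);
        continue;
      }
      if (!entry.isFile() || !entry.name.endsWith('.map')) continue;
      // Assumption: a map counts as a @teambit map when any path segment is "@teambit".
      const isTeambit = fullPath.split(path.sep).includes('@teambit');
      if (keepTeambitMaps && isTeambit) continue;
      stats.files += 1;
      stats.bytes += fs.statSync(fullPath).size;
      if (!dryRun) fs.unlinkSync(fullPath);
    }
  };
  walk(rootDir);
  return stats;
}
```

Walking the tree in-process avoids buffering the full output of a child process, which is the kind of limit an execSync call can run into on a large node_modules tree.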
- Remove date-fns locale files except English (~21MB saved)
- Remove TypeScript locale directories except English (~4MB saved)
- Improve dry-run estimation accuracy using du measurements
- Update documentation with final results: 222MB total savings (19.7%)

Major space savings come from removing unnecessary locale files that are
rarely used in CLI environments.
- Create e2e_test_bundle_simulation job to test cleanup script safety
- Simulates exact bundle process: pnpm add @teambit/bit with hoisting + overrides
- Applies cleanup script and tests full e2e suite with cleaned installation
- Uses bit-cleaned binary to avoid conflicts with repo binary
- Only runs on cleanup-related branches (optimize-node-modules*, cleanup-script*, *cleanup*script*)
- Follows CircleCI patterns: user-space linking, no sudo required
- Provides confidence that cleanup won't break BVM functionality

This ensures thorough testing of cleanup script before production deployment.
…d_artifacts

- Fix store_test_results_and_artifacts command usage in e2e_test_bundle_simulation
- Use correct parameter names: test_results_path, artifacts_path, logs_path
- Remove invalid 'path' parameter that caused CircleCI validation error

Config now validates successfully with 'circleci config validate'.
Issues fixed:
- bit-cleaned command not found: use source  in each step
- 25x redundant setup: split into setup_bundle_simulation (once) + e2e_test_bundle_simulation (25x parallel)

Changes:
- setup_bundle_simulation: does bundle + cleanup once, persists to workspace
- e2e_test_bundle_simulation: attaches workspace, links binary, runs tests
- Both jobs only run on cleanup-related branches
- Updated documentation to reflect optimized approach

This eliminates ~24x waste of expensive pnpm install + cleanup while maintaining full test coverage.
The cleanup script is located at ../bit/scripts/cleanup-node-modules.js
relative to the bit-bundle-test directory, not at ../scripts/,
because the workspace contains the bit/ directory from previous jobs.
- Use npx bit for verification (finds binary correctly)
- Add debugging to see actual directory structure
- Use find command to locate bit binary dynamically in test job
- The binary might be at node_modules/@teambit/bit/bin/bit instead of node_modules/.bin/bit
The bundled bit-cleaned binary doesn't need configuration commands like:
- bit config set analytics_reporting false
- bit config set package-manager.cache

These commands were causing 'Cannot find module @teambit/mdx/mdx.aspect' errors
because the bundled bit was trying to load workspace aspects.

The bundled bit should work without configuration for e2e testing.
…ircleCI config

- Add install_bit_bundle command for Linux/macOS installations
- Add install_bit_bundle_windows command for Windows-specific setup
- Simplify bundle simulation verification to only check version
- Remove redundant installation verification step
The --bin_bit parameter expects a binary name that can be found in PATH,
not a file path. Reverted to the correct approach that creates a symlink
in PATH with the expected binary name.
…imization

- Add --remove-ui-deps flag to cleanup script for additional 45MB savings
- Removes UI-only packages (react-syntax-highlighter, date-fns, d3-*, etc)
- Removes nested node_modules in @teambit/react and @teambit/ui
- Clear warning that this breaks 'bit start --dev' mode
- Update CircleCI bundle simulation to test with --remove-ui-deps flag
- Document that packages are removed because UI is pre-bundled in artifacts
This script helps identify duplicate packages that could be consolidated:
- Found 67MB of potential savings from duplicates in BVM installation
- PostCSS has 68 copies (21MB potential savings)
- TypeScript has 2 copies with different versions (5.5.3 and 5.9.2)
- Helps identify version consolidation opportunities
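
A rough sketch of how such a duplicate scan could work, assuming a hoisted node_modules layout with real directories. findPackageInstances is the name of a shared utility listed in this PR, but the body below is an assumption; findDuplicates is a hypothetical wrapper. Symlinked packages are not followed in this sketch.

```javascript
const fs = require('fs');
const path = require('path');

// Collect every installed copy of every package, including scoped packages
// (@teambit/..., @types/...) and copies inside nested node_modules.
function findPackageInstances(nodeModulesDir, found = new Map()) {
  if (!fs.existsSync(nodeModulesDir)) return found;
  for (const entry of fs.readdirSync(nodeModulesDir, { withFileTypes: true })) {
    // Skip files, symlinks, and dot-directories such as .bin or .pnpm.
    if (!entry.isDirectory() || entry.name.startsWith('.')) continue;
    const entryPath = path.join(nodeModulesDir, entry.name);
    // A scope directory holds its packages one level down.
    const packageDirs = entry.name.startsWith('@')
      ? fs.readdirSync(entryPath, { withFileTypes: true })
          .filter((e) => e.isDirectory())
          .map((e) => path.join(entryPath, e.name))
      : [entryPath];
    for (const pkgDir of packageDirs) {
      const manifest = path.join(pkgDir, 'package.json');
      if (fs.existsSync(manifest)) {
        const { name, version } = JSON.parse(fs.readFileSync(manifest, 'utf8'));
        if (name) {
          if (!found.has(name)) found.set(name, []);
          found.get(name).push({ version, dir: pkgDir });
        }
      }
      // Recurse into this package's own node_modules, at any depth.
      findPackageInstances(path.join(pkgDir, 'node_modules'), found);
    }
  }
  return found;
}

// Hypothetical wrapper: report only packages that are installed more than once.
function findDuplicates(nodeModulesDir) {
  return [...findPackageInstances(nodeModulesDir)]
    .filter(([, copies]) => copies.length > 1)
    .map(([name, copies]) => ({
      name,
      copies: copies.length,
      versions: [...new Set(copies.map((c) => c.version))],
    }));
}
```

Sorting the result by the combined size of the extra copies would give a list of consolidation targets like the one summarized above.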
Added overrides for the packages with highest duplication impact:
- postcss: ^8.4.19 (68 copies, potential 21MB savings)
- tslib: ^2.6.2 (19 copies, multiple versions)

This should significantly reduce duplicate packages in bundle
installations while maintaining compatibility.
- Add overrides for ajv@6, source-map@0, readable-stream@2
- Update postcss and tslib keys to use version-specific format
- Total tested savings: 21MB in fresh installations
- Update baseline to v1.12.128 (725.4 MB after Prompt removal)
- Add version history showing 61.5MB savings from Prompt/Winston removal
- Update cleanup script results with exact measurements from macOS testing
- Show detailed breakdown of space savings per optimization
- Total potential: 786.9 MB → 524.3 MB (33.4% reduction)
…rmat cleanup

- Add flags to remove duplicate ESM or CJS builds when packages ship both
- Detect and remove esm/cjs folders in dist/, lib/, or package root
- Example: @modelcontextprotocol/sdk saves 4.5MB by removing dist/esm
- Testing shows ~5MB savings from 17 packages with duplicate builds
- Recommended: Use --remove-esm since Bit currently uses CommonJS
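
A minimal sketch of the directory-based detection described above, assuming a package is a candidate only when sibling esm and cjs folders both exist under dist/, lib/, or the package root. The function name findDuplicateFormatDirs and the BUILD_ROOTS constant are hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// Locations where packages commonly place their per-format builds.
const BUILD_ROOTS = ['.', 'dist', 'lib'];

// Return the directories that could be removed for one package.
// `remove` is 'esm' or 'cjs'; the other format is the one that stays.
function findDuplicateFormatDirs(packageDir, remove = 'esm') {
  const isDir = (p) => fs.existsSync(p) && fs.statSync(p).isDirectory();
  return BUILD_ROOTS
    .map((root) => path.join(packageDir, root))
    // Only treat a build as duplicated when BOTH formats are present side by side.
    .filter((root) => isDir(path.join(root, 'esm')) && isDir(path.join(root, 'cjs')))
    .map((root) => path.join(root, remove));
}
```

Requiring both folders is the safety condition in this sketch: an ESM-only package is never touched, so removing esm/ can only affect packages that still ship a CommonJS build.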
- Script was missing packages in scoped directories like @teambit, @types
- Added special handling for @ prefixed directories at root level
- Now correctly finds nested duplicates like typescript in @teambit/defender.eslint-linter
- TypeScript now correctly shows as #1 optimization target (20.85MB savings)
- Create shared package-utils.js with common scanning logic
- Fix cleanup script to find packages in scoped directories (@teambit, @types, etc)
- TypeScript cleanup now finds all instances including nested ones
- Additional 3.7MB savings from cleaning multiple TypeScript instances
- Updated totals: 528.5MB final size (177.8MB saved, 25.2% reduction)
- Duplicate module format cleanup now handles nested packages correctly
- Remove duplicate code between cleanup and find-duplicate scripts
- Shared utilities: findPackageInstances, getDirectorySize, formatBytes
- Add cleanupTypeDefinitions() function to automatically remove node_modules/@types directory (~17MB saved)
- Update documentation with improved optimization results: 786.9MB → 503.6MB (36.0% reduction)
- Document @types removal safety: workspaces install local @types, CLI doesn't need global types
- Update help text to reflect @types cleanup in the optimization breakdown
- Remove depth === 0 restriction for scoped package handling
- Now properly detects packages like @sinclair/typebox in nested node_modules
- Adds 'build' pattern to catch packages using build/esm and build/cjs structure
- Update documentation: duplicate format removal now saves ~7.6MB (was ~5-10MB)
- Add @sinclair/typebox as example in documentation
- Recursively scan package subdirectories for .mjs/.js file pairs
- Remove .mjs files when corresponding .js files exist (ESM duplicates)
- Catches packages like Prettier that ship both formats as individual files
- Update documentation: duplicate format removal now saves ~15.7MB total
  - Directory-based duplicates (esm/cjs folders): ~7.6MB
  - File-based duplicates (.mjs/.js pairs): ~8.1MB
- Total optimization potential now 38.0% reduction (299MB saved)
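
The file-based detection above can be sketched as follows. This is an illustration under assumptions, not the script's code: the function name findMjsDuplicates is hypothetical, and nested node_modules are skipped here on the assumption that each nested package is scanned on its own.

```javascript
const fs = require('fs');
const path = require('path');

// Return every .mjs file in a package that has a .js file with the same
// base name next to it, i.e. the ESM half of a dual-format pair.
function findMjsDuplicates(packageDir) {
  const duplicates = [];
  const walk = (dir) => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        // Assumption: nested packages are handled by a separate scan.
        if (entry.name !== 'node_modules') walk(fullPath);
      } else if (entry.isFile() && entry.name.endsWith('.mjs')) {
        const cjsTwin = fullPath.slice(0, -'.mjs'.length) + '.js';
        // Only flag the .mjs file when its .js counterpart exists.
        if (fs.existsSync(cjsTwin)) duplicates.push(fullPath);
      }
    }
  };
  walk(packageDir);
  return duplicates;
}
```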
- Add new --remove-source flag to safely remove source directories when compiled versions exist
- Removes src/ directories when dist/, lib/, min/, v3/, v4/ or other compiled builds exist
- Removes dist/ when min/ exists (keeps smaller production-ready build)
- Tested savings: ~10.7MB from packages like zod/src/ (2.2MB) and moment/src/ + moment/dist/ (1.9MB)
- Update documentation with new flag and source removal analysis
- Total optimization potential now 39.3% reduction (309.7MB saved)
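
A minimal sketch of the two --remove-source rules listed above, under stated assumptions: the function name findRedundantSourceDirs and the COMPILED_DIRS list are hypothetical, and the sketch does not check whether package.json entry points reference src/ or dist/, which a real implementation would probably need to do.

```javascript
const fs = require('fs');
const path = require('path');

// Directory names treated as compiled output (taken from the commit message above).
const COMPILED_DIRS = ['dist', 'lib', 'min', 'v3', 'v4'];

// Return the directories of one package that look redundant.
function findRedundantSourceDirs(packageDir) {
  const isDir = (name) => {
    const p = path.join(packageDir, name);
    return fs.existsSync(p) && fs.statSync(p).isDirectory();
  };
  const redundant = [];
  // Rule 1: src/ is redundant when any compiled build exists.
  if (isDir('src') && COMPILED_DIRS.some((name) => isDir(name))) {
    redundant.push(path.join(packageDir, 'src'));
  }
  // Rule 2: dist/ is redundant when a min/ build exists.
  if (isDir('dist') && isDir('min')) {
    redundant.push(path.join(packageDir, 'dist'));
  }
  return redundant;
}
```

For a package laid out like the moment example above (src/, dist/, and min/), both rules apply and the sketch returns src/ and dist/, while a package that only ships src/ is left alone.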
- Remove depth === 0 restriction for scoped package handling
- Now properly detects @teambit packages at any nesting level
- Fixes issue where nested packages like @teambit/pkg.entities.registry/node_modules/@teambit/legacy.scope were not detected
- Makes --teambit-only flag work correctly for finding all @teambit duplicates
- Keep only postcss@8 and ajv@6 overrides (16.9MB savings)
- Remove tslib@2, source-map@0, readable-stream@2 overrides (minimal impact)
- Update documentation to reflect adjusted baseline (708.5MB) and savings
- Update final optimization potential to 39.1% reduction (307.5MB saved)
- Focus on fewer, more impactful overrides for easier maintenance
…ents

- Update baseline from v1.12.128 to v1.12.134 (removed memoizee, -2.6 MB)
- New baseline: 705.9 MB (was 708.5 MB)
- Remove confusing disk usage measurements section (redundant with detailed breakdown)
- Update all final sizes in results table to reflect new baseline
- Total optimization potential now 39.4% reduction (310.1 MB saved)
- Update baseline from v1.12.134 to v1.12.152 (-11.5 MB)
- v1.12.152 includes eslint-linter update that removed duplicate TypeScript copy
- New baseline: 694.4 MB (was 705.9 MB)
- Update all cleanup script results to reflect new baseline
- Default mode: 507.8 MB (186.6 MB saved, 26.9%)
- With --keep-teambit-maps: 534.1 MB (160.3 MB saved, 23.1%)
- Total optimization potential now 40.9% reduction (321.6 MB saved)
- Consolidate Key Insights with Key Findings into concise bullet points
- Shorten Testing Methodology and UI Dependencies Analysis sections
- Remove redundant examples and verbose explanations
- Make document more scannable and focused on key information
@davidfirst davidfirst enabled auto-merge (squash) October 10, 2025 00:10
@davidfirst davidfirst merged commit 86655ae into master Oct 10, 2025
12 of 13 checks passed
@davidfirst davidfirst deleted the optimize-node-modules-bvm branch October 10, 2025 00:46
3 participants