Improve HomotopySweep adaptive step control: success expansion, quality-gated growth, trust-monitored secant predictor #967

Draft
ChrisRackauckas-Claude wants to merge 4 commits into SciML:master from ChrisRackauckas-Claude:homotopy-sweep-adaptive-control

@ChrisRackauckas-Claude

Contributor

Note

This PR should be ignored until reviewed by @ChrisRackauckas.

Follow-up to #962 (cf. this comment): improves the adaptive algorithm in HomotopySweep.

What was there before

The adaptive logic was failure-only: a rejected step halved and the increment never grew back, so one hard region permanently degraded the resolution of the rest of the sweep. Warm starts were zeroth-order (reuse previous u).

What this adds

Three mechanisms from the predictor-corrector path-tracking literature, sized for a natural-parameter sweep whose corrector is a generic NonlinearSolve algorithm (no polynomial structure, no tangent/Jacobian access from the sweep itself):

  1. Success expansion (the classic success/failure heuristic, e.g. Timme, Mixed precision path tracking for polynomial homotopy continuation, Adv. Comput. Math. 47 (2021)): after expand_threshold consecutive accepted steps, the λ increment grows by expand_factor (default ×2), capped at max_step_factor times the span width.

  2. Quality-gated growth (a Deuflhard-style local error estimate): the known failure mode of the pure success/failure heuristic is expensive trial-and-error in hard regions (cf. Hao & Zheng 2020) — the step balloons on the easy stretch before a sharp turn, and each oversized rejection costs a full inner-solver iteration budget (measured: one rejection through the default polyalgorithm ≈ 600 residual evaluations). Expansion therefore additionally requires evidence of corrector headroom: either the secant prediction error is small relative to the recent solution movement (expand_quality), or the corrector converged within 2 iterations (which covers exponentially flattening path tails, where the relative error measure stays at a constant mediocre value while the absolute corrections become negligible).

  3. Trust-monitored secant predictor: warm starts extrapolate linearly through the last two accepted points, so the corrector starts on the path tangent. Right after a sharp turn a stale tangent predicts far off the path, so the secant's measured prediction error is compared against the trivial constant prediction each accepted step, with hysteresis: two consecutive good measurements to (re-)engage, any bad measurement or rejected step to disengage and fall back to the constant warm start. predictor = :constant restores the old warm-start behavior.
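The step-size logic in items 1 and 2 can be sketched roughly as follows. This is a language-neutral illustration in Python, not the PR's Julia code: the class and method names, the numeric defaults other than expand_factor = 2, the exact form of the gate comparison, and the streak reset after a growth event are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class StepController:
    """Hypothetical success/failure step controller with a quality gate."""
    step: float                    # current λ increment
    span: float                    # width of the λ span being swept
    expand_threshold: int = 2      # assumed value, not taken from the PR
    expand_factor: float = 2.0     # the PR's stated default (×2)
    max_step_factor: float = 0.25  # assumed value
    expand_quality: float = 0.1    # assumed value
    streak: int = 0                # consecutive accepted steps

    def on_reject(self) -> None:
        # Failure branch: halve the increment and restart the success count.
        self.step *= 0.5
        self.streak = 0

    def on_accept(self, pred_error: float, movement: float, corrector_iters: int) -> None:
        self.streak += 1
        if self.streak < self.expand_threshold:
            return
        # Quality gate: grow only with evidence of corrector headroom.
        gate_off = math.isinf(self.expand_quality)
        small_error = pred_error <= self.expand_quality * movement
        fast_corrector = corrector_iters <= 2
        if gate_off or small_error or fast_corrector:
            cap = self.max_step_factor * self.span
            self.step = min(self.step * self.expand_factor, cap)
            self.streak = 0  # assumption: restart the count after growing
```

In this sketch, setting expand_factor = 1 leaves the increment unchanged on success, and expand_quality = inf opens the gate unconditionally, mirroring the opt-outs described in the PR.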

The old semantics remain reachable through the new kwargs: expand_factor = 1 disables growth entirely, expand_quality = Inf disables the gate, predictor = :constant restores the zeroth-order warm start, and the adaptive = false fixed-step mode is untouched.
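The engage/disengage hysteresis of the trust-monitored secant predictor (item 3 above) can be sketched as a small state machine. Again this is a hedged Python illustration for a scalar unknown, not the PR's implementation: the class name, the method names, and the rule "good means the secant error is strictly smaller than the constant-prediction error" are assumptions; the PR only states that the two errors are compared.

```python
class TrustedSecant:
    """Hypothetical scalar secant predictor with trust hysteresis."""

    def __init__(self):
        self.points = []      # last two accepted (λ, u) pairs
        self.good = 0         # consecutive good measurements
        self.engaged = False  # whether warm starts use the secant

    def secant_guess(self, lam):
        # Linear extrapolation through the last two accepted points.
        if len(self.points) < 2:
            return None
        (l0, u0), (l1, u1) = self.points
        return u1 + (u1 - u0) * (lam - l1) / (l1 - l0)

    def warm_start(self, lam):
        # Requires at least one accepted point.
        guess = self.secant_guess(lam)
        if self.engaged and guess is not None:
            return guess
        return self.points[-1][1]  # constant warm start

    def accept(self, lam, u):
        # Measure the secant even while disengaged, so it can re-engage.
        guess = self.secant_guess(lam)
        if guess is not None:
            secant_err = abs(guess - u)
            constant_err = abs(self.points[-1][1] - u)
            if secant_err < constant_err:
                self.good += 1
                if self.good >= 2:  # two consecutive good measurements
                    self.engaged = True
            else:                   # any bad measurement disengages
                self.good = 0
                self.engaged = False
        self.points = (self.points + [(lam, u)])[-2:]

    def reject(self):
        # A rejected step also disengages the secant.
        self.good = 0
        self.engaged = False
```

On a straight-line path the secant engages after the fourth accepted point in this sketch (two points to form a secant, then two good measurements); a rejection or a flattened path drops it back to the constant warm start.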

Measured behavior (local runs, Julia 1.11)

Sharp-turn path u*(λ) = 2tanh(20(λ-1/2)) with corrector residual x + 2sin(x) (spurious merit-stationary traps for warm-start error > ~2.09), default polyalgorithm, maxiters = 100:

| controller | residual evals | rejected steps |
| --- | --- | --- |
| naive growth (no gate, no trust) | 1406 | 2 |
| + quality gate | 808 | 1 |
| + trust monitoring + hysteresis (this PR's default) | 98 | 0 |
| old behavior (no growth), secant | 142 | 0 |
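The ~2.09 trap threshold quoted for this benchmark can be checked by hand. Assuming x denotes the warm-start error and the corrector residual is g(x) = x + 2sin(x), the merit ½g(x)² is stationary wherever g'(x) = 1 + 2cos(x) = 0, and the first such point is x = 2π/3 ≈ 2.094. The sketch below uses plain gradient descent on the merit as a stand-in for a globalized solver (it is not the default polyalgorithm, and the function names and step size are illustrative choices) to show starts on either side of that hump ending up in different places.

```python
import math


def g(x):
    """Corrector residual from the benchmark, as a function of the error x."""
    return x + 2.0 * math.sin(x)


def dg(x):
    return 1.0 + 2.0 * math.cos(x)


def descend(x, eta=0.05, iters=2000):
    """Gradient descent on the merit 0.5 * g(x)**2 (illustrative stand-in)."""
    for _ in range(iters):
        x -= eta * g(x) * dg(x)
    return x


x_hump = 2.0 * math.pi / 3.0  # merit local maximum, about 2.094
x_trap = 4.0 * math.pi / 3.0  # merit local minimum with nonzero residual

print(f"hump at {x_hump:.4f}, trap at {x_trap:.4f}, g(trap) = {g(x_trap):.4f}")
print(f"start 2.0 -> {descend(2.0):.6f}")  # below the hump: reaches the root x = 0
print(f"start 2.2 -> {descend(2.2):.6f}")  # above the hump: stalls at the trap
```

A warm start whose error exceeds 2π/3 therefore sits on the far side of the merit hump, which is consistent with the oversized-step rejections in the ungated rows of the table.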

Smooth cubic path u*(λ) = 1 + λ: secant predictor 36 evals vs 64 constant; easy sweeps finish in 5 steps instead of 11 (growth doubling to the cap).

Tests

Four new test items (registered in test/runtests.jl):

  • item15: constructor defaults + validation for the new kwargs
  • item16: expansion takes strictly fewer steps; max_step_factor cap binds, including over an nsteps-derived initial step
  • item17: secant predictor spends strictly fewer residual evaluations than :constant on a linear path
  • item18: sharp-turn problem — sweep bisects through the turn, the increment regrows on the far shoulder (≥2× within the post-rejection accepted steps), and takes strictly fewer steps than expand_factor = 1

Local test results: all 18 homotopy sweep items pass (86 assertions), NonlinearSolveBase QA (Aqua + ExplicitImports) passes 14/14, Runic check clean. One pre-existing test (item7, aliasing) caught a real bug in an intermediate version of this PR — the θ measurement must not reuse the guess buffer the inner solver may have mutated in place — which is fixed and the test passes.

NonlinearSolveBase bumped 2.31.0 → 2.32.0.

Possible follow-ups (out of scope here)

  • Pseudo-arclength reparameterization for folds (the item6 fold test currently — correctly — reports failure; arclength continuation could follow the path around the turning point)
  • Hao & Zheng (2020)-style augmented systems tracking the minimum Jacobian eigenvalue near bifurcations
  • Higher-order (Padé / polynomial) predictors once more accepted points are retained

🤖 Generated with Claude Code

claude and others added 3 commits June 12, 2026 17:39
The adaptive logic was failure-only: the λ increment halved on a rejected
step and never recovered, so one hard region permanently degraded the
resolution of the rest of the sweep, and every step warm-started from the
previous solution unchanged.

This adds the classic predictor-corrector step controls from the
path-tracking literature (Timme 2021; Deuflhard-style error estimates),
sized for a natural-parameter sweep with a generic inner corrector:

- Success expansion: after `expand_threshold` consecutive accepted steps
  the increment grows by `expand_factor`, capped at `max_step_factor` of
  the span width.
- Quality-gated growth: expansion requires evidence the corrector has
  headroom — either the secant prediction error is small relative to the
  recent solution movement (`expand_quality`), or the corrector converged
  within 2 iterations. The gate keeps the step from ballooning right
  before a sharp turn, where an oversized step is rejected only after the
  inner solver exhausts its iterations (the costly trial-and-error mode
  of pure success/failure heuristics).
- Secant predictor: warm starts extrapolate linearly through the last two
  accepted points instead of reusing the previous solution. The secant is
  trust-monitored against the trivial constant prediction with hysteresis
  (two consecutive good measurements to engage, any rejection or bad
  measurement to disengage), so a stale tangent right after a sharp turn
  falls back to the constant warm start instead of predicting far off the
  path. `predictor = :constant` restores the old warm-start behavior.

On a benchmark path with a sharp turn (u* = 2tanh(20(λ-1/2)) with a
corrector basin of width ~2), the previous controller extended with naive
growth needed 1406 residual evaluations due to oversized-step rejections;
the gated controller needs 98 with zero rejections. On a smooth cubic
path the secant predictor reduces residual evaluations from 64 to 36, and
easy sweeps finish in 5 steps instead of 11.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
…ocstring

SciMLBase.HomotopyProblem's docstring is not included on any docs page, so
the @ref fails the docs build with a :cross_references error. Fixes SciML#968.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas-Claude

Contributor Author

Two follow-up commits since the initial push:

For context when reading this PR's CI: master is currently red for unrelated, pre-existing reasons — Runic format-check failures on six sublibrary test/runtests.jl files from #961 (#969), pre-existing :example_block docs errors (large_systems.md, snes_ex2.md), a stale DiffEqBase = "6.213.0" compat in test/trim/Project.toml, a long-standing Downgrade Sublibraries FastClosures lower-bound conflict, and self-hosted-runner infra flakes (gitconfig lock races, lost runners). The signal jobs for this PR are the Core test group (homotopy sweep items 1–18) and the NonlinearSolveBase sublibrary CI.

The root test env's [sources] table is ignored on Julia < 1.11, so the
base-env groups (Core, NoPre, ...) tested the registered sublibraries
instead of the PR branch code. This failed outright on lts CI here (the
new HomotopySweep kwargs do not exist in the registered NonlinearSolveBase
2.31.0) and silently tested stale code on lts for any sublibrary-only
change; it is also the failure mode behind the registry-lag Core lts
failure on master at 633d93a. The sublibrary and dep-adding group paths
already develop in-repo paths; do the same for the base-env path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
@ChrisRackauckas-Claude

Contributor Author

Pushed the commit "Develop in-repo sources for base-env test groups on Julia < 1.11", which fixes the only PR-attributable CI failure (tests / Core (julia lts) on both platforms).

Root cause: the root test env's [sources] table is ignored on Julia < 1.11, and unlike the sublibrary and dep-adding group paths, the base-env group path in test/runtests.jl never developed the in-repo paths. So the lts Core jobs resolved NonlinearSolveBase 2.31.0 from the registry and tested that instead of the PR branch — failing on this PR's new kwargs with a kwerr, and silently testing stale sublibrary code on lts for any PR before it. This is also the same mechanism behind the registry-lag Core (julia lts, macos) failure on master at 633d93a (the #962 merge ran CI before 2.31.0 had propagated to the pkg servers).

Verification: ran the full root GROUP=Core suite via Pkg.test() on a fresh Julia 1.10.11 resolve locally — all 49 testsets pass, including the previously-failing HomotopySweep step-control kwargs validation + defaults (18/18), exit 0.

Status of the remaining red checks on the previous run, none PR-caused:

@ChrisRackauckas-Claude

Contributor Author

Final CI status — all PR-relevant jobs green

Green (caused by / exercising this PR's code):

  • tests / Core: all 6 variants (julia 1 / lts / pre × ubuntu / macos), including the lts jobs that the _develop_inrepo_sources commit fixed. The homotopy sweep test items 1–18 run here.
  • sublibraries / lib/NonlinearSolveBase — Core (julia 1 / lts / pre) and QA (julia 1 / lts), all pass.
  • tests / Adjoint, Bounds, NoPre, QA, Verbosity, Wrappers, Downstream — all variants pass.
  • Downgrade / Downgrade Tests - Core, Spell Check, runic-suggestions, all other sublibrary jobs — pass.
  • OrdinaryDiffEq.jl Interface + Regression downstream — pass.

Red, none caused by this PR:

Net: every job that compiles or tests this PR's changes is green. Remaining reds are pre-existing (#968 fixed in-PR, #969/#970/#971 filed) or self-hosted-runner infra.

ChrisRackauckas-Claude pushed a commit to ChrisRackauckas-Claude/NonlinearSolve.jl that referenced this pull request Jun 13, 2026
The root test env's [sources] table is ignored on Julia < 1.11, so the
base-env Core group tested the registered NonlinearSolveBase (2.31.0)
instead of this branch's 2.32.0, failing on lts with
`UndefVarError: ArcLengthContinuation not defined`. Develop the in-repo
paths first, as the sublibrary and dep-adding group paths already do.

(Same fix as SciML#967; expect a trivial merge with whichever lands first.)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>