Skip to content

Add underdetermined Levenberg-Marquardt support (minimum-norm mode)#859

Open
ChrisRackauckas-Claude wants to merge 1 commit into
SciML:masterfrom
ChrisRackauckas-Claude:feature/underdetermined-lm
Open

Add underdetermined Levenberg-Marquardt support (minimum-norm mode)#859
ChrisRackauckas-Claude wants to merge 1 commit into
SciML:masterfrom
ChrisRackauckas-Claude:feature/underdetermined-lm

Conversation

@ChrisRackauckas-Claude

Copy link
Copy Markdown
Contributor

PR: Add underdetermined Levenberg-Marquardt support (minimum-norm mode)

Summary

This PR implements support for underdetermined nonlinear least-squares problems (n > m, more unknowns than equations) in the Levenberg-Marquardt algorithm, addressing issue #851.

Background

The current L-M implementation in DampedNewtonDescent assumes overdetermined systems (m ≥ n) and forms JᵀJ + λD. For underdetermined systems (n > m), this is inefficient because:

  • JᵀJ is n×n (large)
  • We want the minimum-norm solution among infinitely many solutions

Algorithm

For underdetermined systems, instead of solving (JᵀJ + λD)δ = -Jᵀf, we solve:

  1. (JJᵀ + λI) z = -f(x) — This is an m×m system (smaller when n > m)
  2. δ = Jᵀz — This gives the minimum-norm step

This approach:

  • Keeps the linear system small (m×m instead of n×n)
  • Finds the minimum-norm step that solves the linearized equations
  • Is numerically well-conditioned with appropriate λ

Changes

lib/NonlinearSolveBase/src/descent/damped_newton.jl

  • New mode: Added :minimum_norm to DampedNewtonDescentCache
  • New kwarg: min_norm_mode in DampedNewtonDescent:
    • :auto (default): Auto-detect and use minimum-norm for underdetermined systems
    • :minimum_norm: Force minimum-norm mode
    • :disabled: Never use minimum-norm mode
  • Auto-detection: Checks length(fu) < length(u) to detect underdetermined systems
  • New caches: JJᵀ_cache, z_cache, J_raw for minimum_norm computations
  • Helper functions: dampen_jacobian_minimum_norm!! and _extract_scalar_damping

lib/NonlinearSolveFirstOrder/test/underdetermined_tests.jl

New test file with comprehensive tests:

  • Linear underdetermined systems
  • Nonlinear underdetermined systems
  • In-place formulation
  • Overdetermined regression tests (ensure we didn't break anything)
  • Large underdetermined systems (50 unknowns, 10 equations)

Usage

using NonlinearSolve

# Underdetermined: 2 equations, 4 unknowns
function f(u, p)
    return [
        u[1] + u[2] - 1.0,
        u[3] + u[4] - 2.0,
    ]
end

u0 = zeros(4)
prob = NonlinearLeastSquaresProblem(f, u0)
sol = solve(prob, LevenbergMarquardt())
# Automatically uses minimum-norm mode

References

Checklist

  • Implementation in damped_newton.jl
  • Tests for underdetermined systems
  • Documentation updated in docstring
  • CI passes (needs testing)
This implements support for underdetermined nonlinear least-squares problems
(n > m, more unknowns than equations) in the Levenberg-Marquardt algorithm.

## Changes

### DampedNewtonDescent
- Added new  mode for underdetermined systems
- New  kwarg:  (default), ,
- Auto-detects underdetermined systems when

### Algorithm (minimum_norm mode)
Instead of the normal form (JᵀJ + λD)δ = -Jᵀf, we solve:
1. (JJᵀ + λI) z = -f(x)  [m×m system, smaller when n > m]
2. δ = Jᵀz              [minimum-norm step]

This finds the minimum-norm step that solves the linearized equations,
keeping the linear system small (m×m instead of n×n).

### Implementation Details
- Added `dampen_jacobian_minimum_norm!!` for scalar damping extraction
- Added caches: `JJᵀ_cache`, `z_cache`, `J_raw` for minimum_norm mode
- Extracts mean of diagonal damping matrix as scalar λ

### Tests
Added comprehensive tests for:
- Linear underdetermined systems
- Nonlinear underdetermined systems
- In-place formulation
- Overdetermined regression tests
- Large underdetermined systems (efficiency test)

Closes SciML#851

Reference: Chen et al. (2026), appendix B - Optica Express 34(4):5729
@stevengj

stevengj commented May 1, 2026

Copy link
Copy Markdown

Thanks Chris, I didn't realize that you had worked on this.

cc @aristeidis-karalis and @mochen4

@mochen4

mochen4 commented May 2, 2026

Copy link
Copy Markdown

Thanks! This looks great!

@stevengj

stevengj commented May 3, 2026

Copy link
Copy Markdown

A simple test case would be to put in a linear function $f(x) = Ax - b$ for a "wide" matrix $A$ and check that the result of minimizing $\Vert f(x) \Vert_2^2$ starting at $x=0$ matches the minimum-norm solution A \ b.

@aristeidis-karalis

Copy link
Copy Markdown

@ChrisRackauckas-Claude, it's awesome you implemented this!

Just a comment:
I could not figure out (from the code) what D is in your original (m>n) formulation. Usually, it's best if it's something that scales with , so that the result is independent from the scale of the problem. Some commonly used options are D = diag(JᵀJ) or D ~ ||J||²∙I (where ||J|| can be the Frobenius norm).

The same holds for the minimum-norm solution, it's often preferable if it is independent from the scale of f, J. Therefore, instead of only the λI option, it would be good to give the user the ability to choose λD also with, e.g., D = diag(JJᵀ) or D ~ ||J||²∙I. We used the latter, in particular D ~ ||J||²/m∙I

@stevengj

Comment on lines +413 to +428
# Helper to extract scalar damping from various damping representations
_extract_scalar_damping(D::Number) = D
function _extract_scalar_damping(D::Diagonal)
# Use the mean of diagonal elements as scalar λ
# This works well for L-M where DᵀD is typically uniform or slowly varying
return sum(D.diag) / length(D.diag)
end
function _extract_scalar_damping(D::AbstractMatrix)
# Extract diagonal and take mean
n = minimum(size(D))
s = zero(eltype(D))
for i in 1:n
s += D[i, i]
end
return s / n
end

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't all three of these methods be replaced by

_extract_scalar_damping(D) = tr(D) / size(D,1)

assuming D is always a square matrix as seems to be the case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

5 participants