Optimization and Control
See recent articles
Showing new listings for Monday, 1 December 2025
- [1] arXiv:2511.21871 [pdf, html, other]
-
Title: Bayesian Risk-averse Model Predictive Control with Consistency and Stability GuaranteesSubjects: Optimization and Control (math.OC)
Model Predictive Control (MPC) is a powerful framework for constrained control, but its performance and safety can be severely degraded when the prediction model is learned online and thus remains uncertain. In this work, we develop a Bayesian risk-averse MPC framework for stochastic, discrete-time, nonlinear systems that provides theoretical guarantees on the consistency of Bayesian learning and closed-loop stability. First, we study Bayesian learning under the conditionally independent state transitions induced by feedback control and establish explicit conditions for Bayesian consistency on an infinitely countable parameter space. Second, we introduce a general notion of risk-averse asymptotic stability (RAAS), defined via comparison function classes and independent of any specific coherent risk measure or convergence rate, and we derive a risk-averse Lyapunov stability theorem together with MPC-specific stability conditions. Third, building on these foundations, we design a practical Bayesian risk-averse MPC scheme that separates epistemic (parametric) and aleatoric (disturbance) uncertainty: additive disturbances are treated in a risk-neutral fashion, while parametric uncertainty is managed via dynamically shrinking ambiguity sets constructed from Bayesian credible intervals, approximated online using particle filtering. To enable real-time implementation, we propose both an optimal and a sub-optimal receding-horizon control policy, the latter obtained by warm-starting from the previous solution, and prove that asymptotic RAAS is recovered as the Bayesian estimator becomes consistent.
- [2] arXiv:2511.21917 [pdf, html, other]
-
Title: Generalization of Silver Stepsize Schedule to Stochastic OptimizationComments: 29 pages, 5 figuresSubjects: Optimization and Control (math.OC)
This work introduces a two-step stepsize schedule for stochastic gradient methods minimizing smooth strongly convex functions. We consider the setting where only stochastic gradient approximations, which are unbiased, of bounded variance, and supported on a finite set, are accessible. When the variance bound is relatively smaller than a ratio of the initial optimality gap, the proposed stepsize schedule achieves better convergence performance compared to the well-regarded constant stepsize {\alpha} = 2/(M+m), where m and M denote the strong convexity and gradient-Lipschitz parameters, respectively. Our stepsize schedule can be viewed as a generalization of the well-known two-step silver stepsize schedule in [J. M. Altschuler and P. A. Parrilo, Journal of the ACM, 72(2):1-38, 2025] from deterministic setting to stochastic optimization.
- [3] arXiv:2511.21980 [pdf, html, other]
-
Title: A stochastic maximum principle for singular mean-field regime-switching optimal controlComments: 23 pagesSubjects: Optimization and Control (math.OC)
In this paper, we investigate a mean-field singular stochastic optimal control problem for systems governed by mean-field regime-switching singular stochastic differential equations. The state process is assumed to depend on both a regular and a singular control, and the coefficient associated with the singular component is allowed to be regime dependent. We derive both necessary and sufficient singular stochastic maximum principles. Because the regular control domain is not assumed to be convex, we employ the spike variation technique and obtain the necessary maximum principle by introducing a second-order adjoint process. As an application, we use the main theoretical results to analyse an inter-bank borrowing and lending model with transaction costs.
- [4] arXiv:2511.22011 [pdf, html, other]
-
Title: A nonmonotone extrapolated proximal gradient-subgradient algorithm beyond global Lipschitz gradient continuityComments: 38 pages, comments welcomeSubjects: Optimization and Control (math.OC)
With the advancement of modern applications, an increasing number of composite optimization problems arise whose smooth component does not possess a globally Lipschitz continuous gradient. This setting prevents the direct use of the proximal gradient (PG) method and its variants, and has motivated a growing body of research on new PG-type methods and their convergence theory, in particular, global convergence analysis without imposing any explicit or implicit boundedness assumptions on the iterates. Until recently, the first complete analysis of this kind has been established for the PG method and its specific nonmonotone variants, which has since stimulated further exploration along this research direction. In this paper, we consider a general composite optimization model beyond the global Lipschitz gradient continuity setting. We propose a novel problem-parameter-free algorithm that incorporates a carefully designed nonmonotone line search to handle the non-global Lipschitz gradient continuity, together with an extrapolation step to achieve potential acceleration. Despite the added technical challenges introduced by combining extrapolation with nonmonotone line search, we establish a refined convergence analysis for the proposed algorithm under the Kurdyka-Ł ojasiewicz property, without requiring any boundedness assumptions on the iterates. This work thus further advances the theoretical understanding of PG-type methods in the non-global Lipschitz gradient continuity setting. Finally, we conduct numerical experiments to illustrate the effectiveness of our algorithm and highlight the advantages of integrating extrapolation with a nonmonotone line search.
- [5] arXiv:2511.22123 [pdf, html, other]
-
Title: Model Predictive Path Planning in Navier-Stokes Flow with POD-Based Reduced-Order ModelsSubjects: Optimization and Control (math.OC); Systems and Control (eess.SY)
We present a framework for optimal trajectory generation in flow-driven systems governed by the Navier-Stokes equations, combining a Proper Orthogonal Decomposition (POD) reduced0order model (ROM) with Model Predictive Control (MPC). The approach (i) approximates the velocity field from data via snapshot POD and orthogonal projection, (ii) derives a Galerkin-projected dynamical model in reduced coordinates, and (iii) employs MPC to plan control inputs that steer an agent through the predicted flow while satisfying state and actuation constraints. By leveraging reduced-order modeling, the method enables real-time control in high-dimensional flow environments. Simulations demonstrate accurate flow-field reconstruction and efficient trajectory generation within realistic wind environments.
- [6] arXiv:2511.22218 [pdf, other]
-
Title: A Two-Stage Stochastic Optimization Framework for Environmentally Sensitive Oil Spill Response Resource Allocation in the ArcticComments: 18 pages, 11 figures, 5 tables, 1 algorithmSubjects: Optimization and Control (math.OC)
The risk of oil spills in the Alaskan Arctic has become an urgent environmental and logistical concern as maritime traffic increases under climate driven sea ice retreat. Traditional deterministic response planning models fail to represent key uncertainties, including variable spill magnitudes, changing environmental sensitivity, and infrastructure limitations. This study develops a two-stage stochastic mixed integer linear programming framework that jointly optimizes the location of oil spill response stations and the allocation of heterogeneous resources across multiple probabilistic spill scenarios. The model integrates a weighted objective that combines spill volume, environmental sensitivity index (ESI), response time, and costs for station setup, deployment, and inter station transfer. Separate importance weights for coverage and cost, together with internal ecological weights, allow decision makers to balance ecological protection and operational efficiency. Data was compiled from Alaska Department of Environmental Conservation spill records and National Oceanic and Atmospheric Administration ESI layers and are converted into model ready scenarios through harmonization and sampling. The model is solved with the Gurobi optimizer, and sensitivity analysis is performed over 324 combinations of importance and ecological weights. Results show about a 35.45% percent improvement in response effectiveness over deterministic methods, as confirmed by the value of the stochastic solution, and reveal clear tradeoffs between cost and ecological coverage. The framework provides a data driven decision support tool for Arctic emergency planners that simultaneously accounts for uncertainty, environmental sensitivity, and realistic logistical constraints.
- [7] arXiv:2511.22251 [pdf, html, other]
-
Title: A Framework for Handling and Exploiting Symmetry in Benders' DecompositionComments: 20 pages, 2 figuresSubjects: Optimization and Control (math.OC)
Benders' decomposition (BD) is a framework for solving optimization problems by removing some variables and modeling their contribution to the original problem via so-called Benders cuts. While many advanced optimization techniques can be applied in a BD framework, one central technique has not been applied systematically in BD: symmetry handling. The main reason for this is that Benders cuts are not known explicitly but only generated via a separation oracle.
In this work, we close this gap by developing a theory of symmetry detection within the BD framework. To this end, we introduce a tailored family of graphs that capture the symmetry information of both the Benders master problem and the Benders oracles. Once symmetries of these graphs are known, which can be found by established techniques, classical symmetry handling approaches become available to accelerate BD. We complement these approaches by devising techniques for the separation and aggregation of symmetric Benders cuts by means of tailored separation routines and extended formulations. Both substantially reduce the number of executions of the separation oracles. In a numerical study, we show the effect of both symmetry handling and cut aggregation for bin packing and scheduling problems. - [8] arXiv:2511.22331 [pdf, html, other]
-
Title: On the Condition Number Dependency in Bilevel OptimizationSubjects: Optimization and Control (math.OC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Bilevel optimization minimizes an objective function, defined by an upper-level problem whose feasible region is the solution of a lower-level problem. We study the oracle complexity of finding an $\epsilon$-stationary point with first-order methods when the upper-level problem is nonconvex and the lower-level problem is strongly convex. Recent works (Ji et al., ICML 2021; Arbel and Mairal, ICLR 2022; Chen el al., JMLR 2025) achieve a $\tilde{\mathcal{O}}(\kappa^4 \epsilon^{-2})$ upper bound that is near-optimal in $\epsilon$. However, the optimal dependency on the condition number $\kappa$ is unknown. In this work, we establish a new $\Omega(\kappa^2 \epsilon^{-2})$ lower bound and $\tilde{\mathcal{O}}(\kappa^{7/2} \epsilon^{-2})$ upper bound for this problem, establishing the first provable gap between bilevel problems and minimax problems in this setup. Our lower bounds can be extended to various settings, including high-order smooth functions, stochastic oracles, and convex hyper-objectives: (1) For second-order and arbitrarily smooth problems, we show $\Omega(\kappa_y^{13/4} \epsilon^{-12/7})$ and $\Omega(\kappa^{17/10} \epsilon^{-8/5})$ lower bounds, respectively. (2) For convex-strongly-convex problems, we improve the previously best lower bound (Ji and Liang, JMLR 2022) from $\Omega(\kappa /\sqrt{\epsilon})$ to $\Omega(\kappa^{5/4} / \sqrt{\epsilon})$. (3) For smooth stochastic problems, we show an $\Omega(\kappa^4 \epsilon^{-4})$ lower bound.
- [9] arXiv:2511.22428 [pdf, html, other]
-
Title: A Variational Approach to Mean Field Type ControlSubjects: Optimization and Control (math.OC)
Variational methods have been used to study stochastic control for long, see Bensoussan (1982) and Bensoussan-Lions (1978) for the early works. More precisely, variational approaches apply to the study of Bellman equation as a parabolic quasi-linear equation, when the nonlinearity affects only the gradient of the solution, and the second order derivative term is linear and not degenerate. This corresponds to a stochastic control problem, where the state equation is a diffusion process. The primary objective of this article is to extend this approach to mean field control theory, as an alternative to the current approach, which considers a coupled system of Hamilton-Jacobi (HJ) and Fokker-Planck (FP) equations, since the introduction of the theory by Lasry-Lions (2007). The main novelty lies in that the equation studied here is the HJB equation, neither the HJ-FP system nor the master equation; and our results also provide another perspective for probabilistic approaches; see Chassagneux-Crisan-Delarue (2022), Bensoussan-Wong-Yam-Yuan (2024), Bensoussan-Tai-Yam (2025) and Bensoussan-Huang-Tang-Yam (2025) for instance. Within the scope of the PDE methods, the advantage of this article is to solve a larger class of mean field control problems, with moderate regularity; and this kind of variational methods fairly require few conditions on the regularity of the coefficients.
- [10] arXiv:2511.22579 [pdf, html, other]
-
Title: Consistent inverse optimal control for discrete-time nonlinear stochastic systemsSubjects: Optimization and Control (math.OC)
Inverse Optimal Control (IOC) seeks to recover an unknown cost from expert demonstrations, and it provides a systematic way of modeling experts' decision mechanisms while considering the prior information of the cost functions. Nevertheless, existing IOC methods have consistency issue with the estimator under noisy and nonlinear settings. In this paper, we consider a discrete-time nonlinear system with process noise, and it is controlled by an optimal policy that minimizes the expectation of a discounted cumulative cost function across an infinite time-horizon. In particular, the cost function takes the form of a linear combination of a priori known feature functions. In this setting, we first adopt Lasserre's reformulation of the forward problem with occupancy measure. Next, we propose the infinite dimensional IOC algorithm and further approximate it with Lagrange interpolating polynomials, which results in a convex, finite-dimensional sum-of-squares optimization. Moreover, the estimator is shown to be asymptotically and statistically consistent. Finally, we validate the theoretical results and illustrate the performance of our method with numerical experiments. In addition, the robustness and generalizability performance of the proposed IOC algorithm are also illustrated.
- [11] arXiv:2511.22613 [pdf, html, other]
-
Title: Variational analysis of determinantal varietiesComments: 71 pages, 6 figures, 2 tablesSubjects: Optimization and Control (math.OC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Determinantal varieties -- the sets of bounded-rank matrices or tensors -- have attracted growing interest in low-rank optimization. The tangent cone to low-rank sets is widely studied and underpins a range of geometric methods. The second-order geometry, which encodes curvature information, is more intricate. In this work, we develop a unified framework to derive explicit formulas for both first- and second-order tangent sets to various low-rank sets, including low-rank matrices, tensors, symmetric matrices, and positive semidefinite matrices. The framework also accommodates the intersection of a low-rank set and another set satisfying mild assumptions, thereby yielding a tangent intersection rule. Through the lens of tangent sets, we establish a necessary and sufficient condition under which a nonsmooth problem and its smooth parameterization share equivalent second-order stationary points. Moreover, we exploit tangent sets to characterize optimality conditions for low-rank optimization and prove that verifying second-order optimality is NP-hard. In a separate line of analysis, we investigate variational geometry of the graph of the normal cone to matrix varieties, deriving the explicit Bouligand tangent cone, Fréchet and Mordukhovich normal cones to the graph. These results are further applied to develop optimality conditions for low-rank bilevel programs.
- [12] arXiv:2511.22745 [pdf, html, other]
-
Title: A lasso-alternative to Dijkstra's algorithm for identifying short paths in networksComments: 25 pages, 7 figuresSubjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Social and Information Networks (cs.SI); Image and Video Processing (eess.IV)
We revisit the problem of finding the shortest path between two selected vertices of a graph and formulate this as an $\ell_1$-regularized regression -- Least Absolute Shrinkage and Selection Operator (lasso). We draw connections between a numerical implementation of this lasso-formulation, using the so-called LARS algorithm, and a more established algorithm known as the bi-directional Dijkstra. Appealing features of our formulation include the applicability of the Alternating Direction of Multiplier Method (ADMM) to the problem to identify short paths, and a relatively efficient update to topological changes.
- [13] arXiv:2511.22753 [pdf, html, other]
-
Title: On Minimax Optimal Dual Control for Fully Actuated SystemsComments: 5 pagesSubjects: Optimization and Control (math.OC)
A multi-variable adaptive controller is derived as the explicit solution to a minimax dynamic game. The minimizing player selects the control action as a function of past state measurements and inputs. The maximizing player selects disturbances and model parameters for the underlying linear time-invariant dynamics. This leads to a Bellman equation that can be solved explicitly for the case with unitary B-matrix known up to a sign and no input penalty. The minimizing policy is a dual controller that optimizes the tradeoff between exploration and exploitation.
- [14] arXiv:2511.22758 [pdf, html, other]
-
Title: Minimax Optimal Adaptive Control for Systems on ConesComments: 3 pages, preprint for the 2025 IEEE Conference on Decision and ControlSubjects: Optimization and Control (math.OC)
The theory of optimal control on positive cones has recently identified several new problem classes where the Bellman equation can be solved explicitly, in analogy with classical linear quadratic control. In this paper, the idea is extended to minimax adaptive control, yielding exact solutions to instances of the Bellman equation for dual control. In particular, this allows for optimization of the fundamental tradeoff between exploration and exploitation.
- [15] arXiv:2511.22807 [pdf, html, other]
-
Title: Deciding lower-boundedness of polynomialsComments: 19 pagesSubjects: Optimization and Control (math.OC)
This paper addresses the problem of deciding the lower-boundedness of an arbitrary real polynomial p in n variables.
- [16] arXiv:2511.22838 [pdf, html, other]
-
Title: Cutting Planes for Binarized Integer ProgramsComments: 14 pages, 7 tablesSubjects: Optimization and Control (math.OC)
We consider integer programming problems with bounded general-integer variables belonging to the general class of network flow problems. For those, we computationally investigate the effect on mixed-integer linear programming (MIP) solvers of the different ways of producing extended formulations that replace a bounded general integer variable by a linear combination of a set of auxiliary binary variables linked by additional linear constraints. We show that MILP solvers perform very differently depending on which extended formulations is used and we interpret that different performance through the lens of cutting planes generation. Finally, we discuss a simple family of mixed-integer rounding inequalities that especially benefit from the reformulation, and we show its benefit within different MIP solvers. This provides methodological and practical guidelines for the use of those extended formulations in MIP and, to the best of our knowledge, this is the first extensive computational analysis of the topic.
- [17] arXiv:2511.22916 [pdf, html, other]
-
Title: A Quadratically Convergent Alternating Projection Method for Nonconvex SetsComments: 25 pagesSubjects: Optimization and Control (math.OC)
In this paper, we consider the feasibility problem, which aims to find a feasible point for the constraint set $\{x \in \mathbb{R}^n: c(x) = 0\}$ over a possibly non-regular subset $\mathcal{X} \subset \mathbb{R}^n$. Under the constraint nondegeneracy condition, we propose a modified alternating projection method. In our proposed method, based on the concept of projective mapping for $\mathcal{X}$, we alternate a Newton step for finding an inexact solution within the limiting tangent cone of $\mathcal{X}$ and a projection to $\mathcal{X}$. Under mild conditions, we prove the local quadratic convergence of our proposed method. Preliminary numerical experiments demonstrate the high efficiency of our proposed alternating projection method.
- [18] arXiv:2511.23100 [pdf, html, other]
-
Title: On Rank Graduation Metrics for High Dimensional Ordinal DataSubjects: Optimization and Control (math.OC)
Evaluating the reliability of machine learning classifications remains a fundamental challenge in Artificial Intelligence (AI), particularly when the target variable is multidimensional. Classification variables can be expressed by means of a categorical scale which, at best, is ordinal. Because ordinal data lack a natural metric structure in their underlying space, most conventional distance measures aimed at assessing the accuracy of machine learning classifications cannot be directly or meaningfully applied. In this paper, we develop a mathematical framework for comparing ordinal data based on a family of Rank Graduation $(\mathrm{RGX}_p)$ \emph{metrics}. We demonstrate that these metrics can quantify the proportion of variability of the response explained by the predictions, in a similar manner as the predictive $R^2$ for continuous response variables. After establishing theoretical connections between the $\mathrm{RGX}_p$ family and other prominent metrics in AI, we conduct extensive experiments across diverse datasets and learning tasks to evaluate their empirical performance. The results underscore the versatility, interpretability, and robustness of the $\mathrm{RGX}_p$ metrics as a principled foundation for developing trustworthy and SAFE AI systems.
- [19] arXiv:2511.23175 [pdf, html, other]
-
Title: Minimizing risk measures with applications in network traffic engineeringSubjects: Optimization and Control (math.OC); Probability (math.PR)
This paper presents a novel two-stage optimization framework designed to model integrated quantile functions, which leads to the formulation of a bilinear optimization problem (P). A specific instance of this framework offers a new approach to minimizing the Value-at-risk (Var) and the Conditional Value-at-risk (CVar), thus providing a broader perspective on risk assessment and optimization. We investigate various convexification techniques to under- and over-estimate the optimal value of (P), resulting in new and tighter lower- and upper-convex estimators for the Var minimization problems. Furthermore, we explore the properties and implications of the bilinear optimization problem (P) in connection to the integrated quantile functions. Finally, to illustrate the practical applications of our approach, we present computational comparisons in the context of real-life network traffic engineering problems, demonstrating the effectiveness of our proposed framework.
- [20] arXiv:2511.23268 [pdf, html, other]
-
Title: Avoidance of non-strict saddle points by blow-upSubjects: Optimization and Control (math.OC); Dynamical Systems (math.DS)
It is an old idea to use gradient flows or time-discretized variants thereof as methods for solving minimization problems. In some applications, for example in machine learning contexts, it is important to know that for generic initial data, gradient flow trajectories do not get stuck at saddle points. There are classical results concerned with the nondegenerate situation. But if the Hessian of the objective function has a nontrivial kernel at the critical point, then these results are inconclusive. In this paper, we show how relevant information can be extracted by ``blowing up'' the objective function around the non-strict saddle point, i.e., by a suitable nonlinear rescaling that makes the higher order geometry visible. Then the center-stable manifold theorem of dynamical system theory can be applied.
- [21] arXiv:2511.23336 [pdf, html, other]
-
Title: Two energy methods for distributed port-Hamiltonian systems and their application to stability analysisSubjects: Optimization and Control (math.OC); Analysis of PDEs (math.AP); Functional Analysis (math.FA)
We develop two local energy methods for distributed parameter port-Hamiltonian (pH) systems on one-dimensional spatial domains. The methods are applied to derive a characterization of exponential stability directly in terms of the energy passing through the boundary over a given time horizon. The resulting condition is verified for a network of vibrating strings where existing sufficient conditions cannot be applied. Moreover, we use a local energy method to study the short-time behavior of pH systems with boundary damping which was recently studied in the context of hypocoercivity.
New submissions (showing 21 of 21 entries)
- [22] arXiv:2511.21962 (cross-list from eess.SY) [pdf, html, other]
-
Title: Communication-Aware Dissipative Control for Networks of Heterogeneous Nonlinear AgentsComments: Under review for IFAC 2026. 8 pages, 4 figures, 1 tableSubjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
Communication-aware control is essential to reduce costs and complexity in large-scale networks. However, it is challenging to simultaneously determine a sparse communication topology and achieve high performance and robustness. This work achieves all three objectives through dissipativity-based, sparsity-promoting controller synthesis. The approach identifies an optimal sparse structure using either weighted l1 penalties or alternating direction methods of multipliers (ADMM) with a cardinality term, and iteratively solves a convexified version of the NP hard structured optimal control problem. The proposed methods are demonstrated on heterogeneous networks with uncertain and unstable agents.
- [23] arXiv:2511.22104 (cross-list from cs.LG) [pdf, html, other]
-
Title: Representative Action Selection for Large Action Space: From Bandits to MDPsComments: Journal version of arXiv:2505.18269Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Probability (math.PR); Machine Learning (stat.ML)
We study the problem of selecting a small, representative action subset from an extremely large action space shared across a family of reinforcement learning (RL) environments -- a fundamental challenge in applications like inventory management and recommendation systems, where direct learning over the entire space is intractable. Our goal is to identify a fixed subset of actions that, for every environment in the family, contains a near-optimal action, thereby enabling efficient learning without exhaustively evaluating all actions.
This work extends our prior results for meta-bandits to the more general setting of Markov Decision Processes (MDPs). We prove that our existing algorithm achieves performance comparable to using the full action space. This theoretical guarantee is established under a relaxed, non-centered sub-Gaussian process model, which accommodates greater environmental heterogeneity. Consequently, our approach provides a computationally and sample-efficient solution for large-scale combinatorial decision-making under uncertainty. - [24] arXiv:2511.22804 (cross-list from math.AP) [pdf, html, other]
-
Title: Large $n$-limit of matrix control problems and non-commutative controlsComments: 43 pagesSubjects: Analysis of PDEs (math.AP); Operator Algebras (math.OA); Optimization and Control (math.OC); Probability (math.PR)
Building on the free-probability stochastic control framework introduced in arXiv:2502.17329, we connect optimal control problems for $n \times n$ random matrix ensembles with their infinite-dimensional, free-probability analogues. Under natural convexity hypotheses, we prove that the non-commutative value function captures the large-$n$ limit of the corresponding finite-matrix control problems. As an application, we give a new perspective on the Laplace principle for convex functionals in the theory of large deviations for random matrices.
- [25] arXiv:2511.22866 (cross-list from cs.LG) [pdf, html, other]
-
Title: ARM-Explainer -- Explaining and improving graph neural network predictions for the maximum clique problem using node features and association rule miningSubjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
Numerous graph neural network (GNN)-based algorithms have been proposed to solve graph-based combinatorial optimization problems (COPs), but methods to explain their predictions remain largely undeveloped. We introduce ARM-Explainer, a post-hoc, model-level explainer based on association rule mining, and demonstrate it on the predictions of the hybrid geometric scattering (HGS) GNN for the maximum clique problem (MCP), a canonical NP-hard graph-based COP. The eight most explanatory association rules discovered by ARM-Explainer achieve high median lift and confidence values of 2.42 and 0.49, respectively, on test instances from the TWITTER and BHOSLIB-DIMACS benchmark datasets. ARM-Explainer identifies the most important node features, together with their value ranges, that influence the GNN's predictions on these datasets. Furthermore, augmenting the GNN with informative node features substantially improves its performance on the MCP, increasing the median largest-found clique size by 22% (from 29.5 to 36) on large graphs from the BHOSLIB-DIMACS dataset.
- [26] arXiv:2511.23022 (cross-list from eess.SY) [pdf, html, other]
-
Title: Control Barrier Function for Unknown Systems: An Approximation-free ApproachSubjects: Systems and Control (eess.SY); Robotics (cs.RO); Optimization and Control (math.OC)
We study the prescribed-time reach-avoid (PT-RA) control problem for nonlinear systems with unknown dynamics operating in environments with moving obstacles. Unlike robust or learning based Control Barrier Function (CBF) methods, the proposed framework requires neither online model learning nor uncertainty bound estimation. A CBF-based Quadratic Program (CBF-QP) is solved on a simple virtual system to generate a safe reference satisfying PT-RA conditions with respect to time-varying, tightened obstacle and goal sets. The true system is confined to a Virtual Confinement Zone (VCZ) around this reference using an approximation-free feedback law. This construction guarantees real-time safety and prescribed-time target reachability under unknown dynamics and dynamic constraints without explicit model identification or offline precomputation. Simulation results illustrate reliable dynamic obstacle avoidance and timely convergence to the target set.
- [27] arXiv:2511.23295 (cross-list from q-fin.PM) [pdf, html, other]
-
Title: Signature approach for pricing and hedging path-dependent options with frictionsSubjects: Portfolio Management (q-fin.PM); Optimization and Control (math.OC); Mathematical Finance (q-fin.MF); Pricing of Securities (q-fin.PR)
We introduce a novel signature approach for pricing and hedging path-dependent options with instantaneous and permanent market impact under a mean-quadratic variation criterion. Leveraging the expressive power of signatures, we recast an inherently nonlinear and non-Markovian stochastic control problem into a tractable form, yielding hedging strategies in (possibly infinite) linear feedback form in the time-augmented signature of the control variables, with coefficients characterized by non-standard infinite-dimensional Riccati equations on the extended tensor algebra. Numerical experiments demonstrate the effectiveness of these signature-based strategies for pricing and hedging general path-dependent payoffs in the presence of frictions. In particular, market impact naturally smooths optimal trading strategies, making low-truncated signature approximations highly accurate and robust in frictional markets, contrary to the frictionless case.
- [28] arXiv:2511.23424 (cross-list from econ.TH) [pdf, html, other]
-
Title: Contracting with discretionary bonusesSubjects: Theoretical Economics (econ.TH); Optimization and Control (math.OC)
We study a continuous time contracting model in which a principal hires a risk averse agent to manage a project over a finite horizon and provides sequential payments whose timing is endogenously determined. The resulting nonzero-sum interaction between the principal and the agent is reformulated as a mixed control and stopping problem. Using numerical simulations, we investigate how factors such as the relative impatience of the parties and the number of bonus payments influence the principal's value and the structure of the optimal bonus payment scheme. A notable finding is that, in some contractual environments, the principal optimally offers a sign-on bonus to front-load incentives.
Cross submissions (showing 7 of 7 entries)
- [29] arXiv:2309.01781 (replaced) [pdf, other]
-
Title: Self-concordant smoothing in proximal quasi-Newton algorithms for large-scale convex composite optimizationSubjects: Optimization and Control (math.OC); Machine Learning (cs.LG)
We introduce a notion of self-concordant smoothing for minimizing the sum of two convex functions, one of which is smooth and the other nonsmooth. The key highlight is a natural property of the resulting problem's structure that yields a variable metric selection method and a step length rule especially suited to proximal quasi-Newton algorithms. Also, we efficiently handle specific structures promoted by the nonsmooth term, such as l1-regularization and group lasso penalties. A convergence analysis for the class of proximal quasi-Newton methods covered by our framework is presented. In particular, we obtain guarantees, under standard assumptions, for two algorithms: Prox-N-SCORE (a proximal Newton method) and Prox-GGN-SCORE (a proximal generalized Gauss-Newton method). The latter uses a low-rank approximation of the Hessian inverse, reducing most of the cost of matrix inversion and making it effective for overparameterized machine learning models. Numerical experiments on synthetic and real data demonstrate the efficiency of both algorithms against state-of-the-art approaches. A Julia implementation is publicly available at this https URL.
- [30] arXiv:2405.11068 (replaced) [pdf, other]
-
Title: New finite relaxation hierarchies for concavo-convex, disjoint bilinear programs, and facial disjunctionsSubjects: Optimization and Control (math.OC)
This paper introduces novel relaxation hierarchies for concavo-convex programs (CXP), a class of problems that includes disjoint bilinear programming (DBP) and concave minimization (CM) as special cases. At the core of these hierarchies is an algorithm based on double-description (DD) that computes the barycentric coordinates of a polyhedral cone as rational, non-negative functions representing multipliers associated with the cone's rays. These hierarchies combine geometric structure derived from barycentric coordinates with algebraic techniques via rational functions, achieving the convex hull in $m$ iterations, where $m$ is the number of inequalities that a subset of the variables must satisfy. Our framework offers the first unified approach to analyze and tighten relaxations from disjunctive programming (DP) and reformulation-linearization technique (RLT) for CXP. We also demonstrate that our methods extend to facial disjunctive programs (FDP), where solutions are constrained to lie on faces of a Cartesian product of polytopes, generalizing known hierarchies for 0-1 programs.
- [31] arXiv:2408.06243 (replaced) [pdf, html, other]
-
Title: Complexity of trust-region methods in the presence of unbounded Hessian approximationsComments: Accepted in Mathematical ProgrammingSubjects: Optimization and Control (math.OC)
We extend traditional complexity analyses of trust-region methods for unconstrained, possibly nonconvex, optimization. Whereas most complexity analyses assume uniform boundedness of the model Hessians, we work with potentially unbounded model Hessians. Boundedness is not guaranteed in practical implementations, in particular ones based on quasi-Newton updates such as PSB, BFGS and SR1. We examine two regimes of Hessian growth: one bounded by a power of the number of successful iterations, and one bounded by a power of the number of iterations. This allows us to formalize and address the intuition of Powell [IMA J. Numer. Ana. 30(1):289-301,2010], who studied convergence under a special case of our assumptions, but whose proof contained complexity arguments. Specifically, for \(0 \leq p < 1\), we establish sharp \(O([(1-p)\epsilon^{-2}]^{1/(1-p)})\) evaluation complexity to find an \(\epsilon\)-stationary point when model Hessians are \(O(|\mathcal{S}_{k-1}|^p)\), where \(|\mathcal{S}_{k-1}|\) is the number of iterations where the step was accepted, up to iteration \(k-1\). For \(p = 1\), which is the case studied by Powell, we establish a sharp \(O(\exp(c_1\epsilon^{-2}))\) evaluation complexity for a certain constant \(c_1 > 0\). This is far better than the double exponential bound that \citet{powell-2010} suspected, and is far worse than other bounds surmised elsewhere in the literature. We establish similar sharp bounds when model Hessians are \(O(k^p)\), where \(k\) is the iteration counter, for \(0 \leq p < 1\). When \(p = 1\), the complexity bound depends on the parameters of the family, but reduces to \(O((1 - \log(\epsilon))\exp(c_2\epsilon^{-2}))\) for a certain constant \(c_2 > 0\) for the special case of the standard trust-region method. As special cases, we derive novel complexity bounds for (strongly) convex objectives under the same growth assumptions.
- [32] arXiv:2409.19428 (replaced) [pdf, html, other]
-
Title: A Proximal Modified Quasi-Newton Method for Nonsmooth Regularized OptimizationComments: Accepted in SIAM Journal on OptimizationSubjects: Optimization and Control (math.OC); Machine Learning (cs.LG)
We develop R2N, a modified quasi-Newton method for minimizing the sum of a $\mathcal{C}^1$ function $f$ and a lower semi-continuous prox-bounded $h$. Both $f$ and $h$ may be nonconvex. At each iteration, our method computes a step by minimizing the sum of a quadratic model of $f$, a model of $h$, and an adaptive quadratic regularization term. A step may be computed by a variant of the proximal-gradient method. An advantage of R2N over trust-region (TR) methods is that proximal operators do not involve an extra TR indicator. We also develop the variant R2DH, in which the model Hessian is diagonal, which allows us to compute a step without relying on a subproblem solver when $h$ is separable. R2DH can be used as standalone solver, but also as subproblem solver inside R2N. We describe non-monotone variants of both R2N and R2DH. Global convergence of a first-order stationarity measure to zero holds without relying on local Lipschitz continuity of $\nabla f$, while allowing model Hessians to grow unbounded, an assumption particularly relevant to quasi-Newton models. Under Lipschitz-continuity of $\nabla f$, we establish a tight worst-case complexity bound of $O(1 / \epsilon^{2/(1 - p)})$ to bring said measure below $\epsilon > 0$, where $0 \leq p < 1$ controls the growth of model Hessians. The latter must not diverge faster than $|\mathcal{S}_k|^p$, where $\mathcal{S}_k$ is the set of successful iterations up to iteration $k$. When $p = 1$, we establish the tight exponential complexity bound $O(\exp(c \epsilon^{-2}))$ where $c > 0$ is a constant. We describe our Julia implementation and report numerical experience on a classic basis-pursuit problem, an image denoising problem, a minimum-rank matrix completion problem, a nonlinear support vector machine and an inverse nonlinear problem.
- [33] arXiv:2409.20275 (replaced) [pdf, html, other]
-
Title: On System Operators with Variation Bounding PropertiesSubjects: Optimization and Control (math.OC); Dynamical Systems (math.DS)
The property of linear discrete-time time-invariant system operators mapping inputs with at most $k-1$ sign changes to outputs with at most$k-1$ sign changes is investigated. We show that this property is tractable via the notion of $k$-sign consistency in case of the observability/controllability operator, which as such can also be used as a sufficient condition for the Hankel operator. Our results complement the mathematical literature by providing an algebraic characterization, independent of rank and dimension for variation bounding and diminishing matrices as well as by discussing their computational tractability. Based on these, we conduct our studies of variation bounding system operators beyond existing studies on order-preserving $k$-variation diminishment. Our findings are applied to the open problem of bounding the number of sign changes in a system's impulse response, which appears, e.g., when bounding the number of over- and undershoots in a step response or the number of bangs in bounded optimal control problems.
- [34] arXiv:2410.02188 (replaced) [pdf, html, other]
-
Title: Nonsmooth exact penalty methods for equality-constrained optimization: complexity and implementationComments: Accepted in SIAM Journal on OptimizationSubjects: Optimization and Control (math.OC)
Penalty methods are a well known class of algorithms for constrained optimization. They transform a constrained problem into a sequence of unconstrained \emph{penalized} problems in the hope that approximate solutions of the latter converge to a solution of the former. If Lagrange multipliers exist, exact penalty methods ensure that the penalty parameter only need increase a finite number of times, but are typically scorned in smooth optimization for the penalized problems are not smooth. This led researchers to consider the implementation of exact penalty methods inconvenient. Recent advances in proximal methods have led to increasingly efficient solvers for nonsmooth optimization. We study a general exact penalty algorithm and use it to show that the exact $\ell_2$-penalty method for equality-constrained optimization can, in fact, be implemented efficiently by solving the penalized problem using a proximal-type algorithm. We study the convergence of our algorithm and establish a worst-case complexity bound of $\mathcal{O}(\epsilon^{-2})$ to bring a stationarity measure below $\epsilon > 0$ under the Mangarasian-Fromowitz constraint qualification and Lipschitz continuity of the objective gradient and constraint Jacobian. While the Lipschitz continuity of the objective gradient is not required for convergence in view of recent works, it is used in our analysis to derive the complexity bound. In a degenerate scenario where the penalty parameter grows unbounded, the complexity becomes $\mathcal{O}(\epsilon^{-8})$, which is worse than another bound found in the literature. Finally, we report numerical experience on small-scale problems from a standard collection and compare our solver with an augmented-Lagrangian and an SQP method. Our preliminary implementation is superior to the augmented Lagrangian in terms of robustness and efficiency, and is competitive with the SQP method.
- [35] arXiv:2411.07661 (replaced) [pdf, html, other]
-
Title: A boosted second-order convex splitting algorithm based on gradient flowsSubjects: Optimization and Control (math.OC); Numerical Analysis (math.NA)
This paper introduces a preconditioned convex splitting algorithm enhanced by line search techniques for nonconvex optimization problems. The algorithm utilizes second-order backward differentiation formulas (BDF) for the implicit components and employs the Adams-Bashforth scheme for the nonlinear and explicit parts of the gradient flow in variational problems. The resulting scheme can be interpreted as a varying (or dynamic) difference-of-convex (DC) algorithm. It integrates the Armijo line search strategy to improve performance. The study also discusses classical preconditioners such as symmetric Gauss-Seidel and Jacobi within this context. The global convergence of the algorithm is established through the Kurdyka-Łojasiewicz properties, ensuring convergence under the preconditioned scheme. Numerical experiments demonstrate significantly higher efficiency than standard DC algorithms and other boosted algorithms.
- [36] arXiv:2411.11664 (replaced) [pdf, html, other]
-
Title: Randomized Block Coordinate DC ProgrammingComments: 25 Pages, 3 figureSubjects: Optimization and Control (math.OC); Statistics Theory (math.ST)
We introduce an extension of the Difference of Convex Algorithm (DCA) in the form of a randomized block coordinate approach for problems with separable structure. For $n$ coordinate-blocks and $k$ iterations, our main result proves a non-asymptotic convergence rate of $O(n/k)$ in expectation, with respect to a stationarity measure based on a Forward-Backward envelope. Furthermore, leveraging the connection between DCA and Expectation Maximization (EM), we propose a randomized block coordinate EM algorithm.
- [37] arXiv:2503.08921 (replaced) [pdf, other]
-
Title: Revisiting Frank-Wolfe for Structured Nonconvex OptimizationComments: 20 pages, 6 figuresSubjects: Optimization and Control (math.OC); Machine Learning (cs.LG)
We introduce a new projection-free (Frank-Wolfe) method for optimizing structured nonconvex functions that are expressed as a difference of two convex functions. This problem class subsumes smooth nonconvex minimization, positioning our method as a promising alternative to the classical Frank-Wolfe algorithm. DC decompositions are not unique; by carefully selecting a decomposition, we can better exploit the problem structure, improve computational efficiency, and adapt to the underlying problem geometry to find better local solutions. We prove that the proposed method achieves a first-order stationary point in $O(1/\epsilon^2)$ iterations, matching the complexity of the standard Frank-Wolfe algorithm for smooth nonconvex minimization in general. Specific decompositions can, for instance, yield a gradient-efficient variant that requires only $O(1/\epsilon)$ calls to the gradient oracle. Finally, we present numerical experiments demonstrating the effectiveness of the proposed method compared to other projection-free algorithms.
- [38] arXiv:2504.16899 (replaced) [pdf, html, other]
-
Title: Linear convergence of a one-cut conditional gradient method for total variation regularizationComments: 27 pages, 6 FiguresSubjects: Optimization and Control (math.OC); Numerical Analysis (math.NA)
We introduce a fully-corrective generalized conditional gradient method for convex minimization problems involving total variation regularization on multidimensional domains. It relies on alternatively updating an active set of subsets of the spatial domain and an iterate given by a conic combination of the associated characteristic functions. Different to previous approaches in the same spirit, the computation of a new candidate set only requires the solution of one prescribed mean curvature problem, instead of the resolution of a fractional minimization task analogous to finding a generalized Cheeger set. After discretization, the former can be realized by a single run of a graph cut algorithm, leading to a significant speedup in practice. We prove the global sublinear convergence of the resulting method, under mild assumptions, and its asymptotic linear convergence in a more restrictive two-dimensional setting which uses results of stability of surfaces of prescribed mean curvature under perturbations of the curvature. Finally, we numerically demonstrate this convergence behavior in some model PDE-constrained minimization problems.
- [39] arXiv:2506.18278 (replaced) [pdf, html, other]
-
Title: Finite-Time Minimax Bounds and an Optimal Lyapunov Policy in Queueing ControlSubjects: Optimization and Control (math.OC); Information Theory (cs.IT); Machine Learning (cs.LG)
We introduce an original minimax framework for finite-time performance analysis in queueing control and propose a surprisingly simple Lyapunov-based scheduling policy with superior finite-time performance. The framework quantitatively characterizes how the expected total queue length scales with key system parameters, including the capacity of the scheduling set and the variability of arrivals and departures across queues. To our knowledge, this provides the first firm foundation for evaluating and comparing scheduling policies in the finite-time regime, including nonstationary settings, and shows that the proposed policy can provably and empirically outperform classical MaxWeight in finite time. Within this framework, we establish three main sets of results. First, we derive minimax lower bounds on the expected total queue length for parallel-queue scheduling via a novel Brownian coupling argument. Second, we propose a new policy, LyapOpt, which minimizes the full quadratic Lyapunov drift-capturing both first- and second-order terms-and achieves optimal finite-time performance in heavy traffic while retaining classical stability guarantees. Third, we identify a key limitation of the classical MaxWeight policy, which optimizes only the first-order drift: its finite-time performance depends suboptimally on system parameters, leading to substantially larger backlogs in explicitly characterized settings. Together, these results delineate the scope and limitations of classical drift-based scheduling and motivate new queueing-control methods with rigorous finite-time guarantees.
- [40] arXiv:2506.18884 (replaced) [pdf, html, other]
-
Title: A synthetic approach to comparison principles for variational problems, with applications to optimal transportSubjects: Optimization and Control (math.OC)
We develop a synthetic, variational framework for deriving comparison principles in infinite-dimensional Banach spaces. Unlike traditional approaches that rely on the regularity of minimizers and Euler--Lagrange equations, our method exploits the order-theoretic structure of the energy. Central to our analysis is the notion of submodularity and its convex dual, substitutability, which we extend here to the infinite-dimensional setting. We prove a duality theorem establishing that a convex functional is submodular if and only if its conjugate is substitutable. We apply these results to problems in optimal transport, and derive comparison principles for Kantorovich potentials in standard, entropic, and unbalanced settings without requiring regularity assumptions on the cost or domain. Finally, we prove that general transport costs are substitutable, yielding comparison principles for JKO schemes driven by internal energies.
- [41] arXiv:2506.21373 (replaced) [pdf, html, other]
-
Title: Proximal aiming in weak KAM theory with nonsmooth LagrangianComments: 22 pagesSubjects: Optimization and Control (math.OC); Dynamical Systems (math.DS)
This work extends weak KAM theory to the case of a nonsmooth Lagrangian satisfying a superlinear growth condition. Using the solution of a weak KAM equation that is a stationary Hamilton-Jacobi equation and the proximal aiming method, we construct a family of discontinuous feedback strategies that are nearly optimal for every time interval. This result leads to an analogue of the weak KAM theorem. Additionally, as in classical weak KAM theory, we demonstrate that the effective Hamiltonian (Mañé critical value) can be determined by solving a linear programming problem in the class of probability measures.
- [42] arXiv:2507.07428 (replaced) [pdf, html, other]
-
Title: Relocated Fixed-Point Iterations with Applications to Variable Stepsize Resolvent SplittingComments: 29 pagesSubjects: Optimization and Control (math.OC)
In this work, we develop a convergence framework for iterative algorithms whose updates can be described by a one-parameter family of nonexpansive operators. Within the framework, each step involving one of the main algorithmic operators is followed by a second step which ''relocates'' fixed-points of the current operator to the next. As a consequence, our analysis does not require the family of nonexpansive operators to have a common fixed-point, as is common in the literature. Our analysis uses a parametric extension of the demiclosedness principle for nonexpansive operators. As an application of our convergence results, we develop a version of the graph-based extension of the Douglas--Rachford algorithm for finding a zero of the sum of $N\geq 2$ maximally monotone operators, which does not require the resolvent parameter to be constant across iterations.
- [43] arXiv:2511.01100 (replaced) [pdf, html, other]
-
Title: Ergodic Risk Sensitive Control of Diffusions under a General Structural HypothesisComments: The manuscript is 57 pages longSubjects: Optimization and Control (math.OC); Probability (math.PR)
We study the infinite-horizon average (ergodic) risk sensitive control problem for diffusion processes under a general structural hypothesis: there is a partition of state space into two subsets, where the controlled diffusion process satisfies a Foster-Lyapunov type drift condition in one subset, under any stationary Markov control, while the near-monotonicity condition is satisfied with the running cost function being inf-compact in its complement. Under these conditions, we completely characterize the optimal stationary Markov controls. To prove this, we consider an inf-compact perturbation to the running cost over the entire space such that the resulting ergodic risk sensitive control problem is well-defined and then use the corresponding existing results. The heart of the analysis lies in exploiting the variational formula of exponential functionals of Brownian motion and applying it to the objective exponential cost function of the controlled diffusion. This representation facilitates us to view the risk sensitive cost for any stationary Markov control as the optimal value of a control problem of an extended diffusion involving a new auxiliary control where the optimal criterion is to maximize the associated long-run average cost criterion that is a difference of the original running cost and an extra term that is quadratic in the auxiliary control. The main difficulty in using this approach lies in the fact that tightness of mean empirical measures of the extended diffusion is not a priori implied by the analogous tightness property of the original diffusion. We overcome this by establishing a priori estimates for the extended diffusion associated with the nearly optimal auxiliary controls.
- [44] arXiv:2511.10622 (replaced) [pdf, html, other]
-
Title: Verification of Sequential Convex Programming for Parametric Non-convex OptimizationSubjects: Optimization and Control (math.OC)
We introduce a verification framework to exactly verify the worst-case performance of sequential convex programming (SCP) algorithms for parametric non-convex optimization. The verification problem is formulated as an optimization problem that maximizes a performance metric (e.g., the suboptimality after a given number of iterations) over parameters constrained to be in a parameter set and iterate sequences consistent with the SCP update rules. Our framework is general, extending the notion of SCP to include both conventional variants such as trust-region, convex-concave, and prox-linear methods, and algorithms that combine convex subproblems with rounding steps, as in relaxing and rounding schemes. Unlike existing analyses that may only provide local guarantees under limited conditions, our framework delivers global worst-case guarantees--quantifying how well an SCP algorithm performs across all problem instances in the specified family. Applications in control, signal processing, and operations research demonstrate that our framework provides, for the first time, global worst-case guarantees for SCP algorithms in the parametric setting.
- [45] arXiv:2511.16420 (replaced) [pdf, other]
-
Title: A Fast Relax-and-Round Approach to Unit Commitment for Data Center Own GenerationComments: Limited to 5 pages and this format for IEEE PESGM conferenceSubjects: Optimization and Control (math.OC); Distributed, Parallel, and Cluster Computing (cs.DC); Numerical Analysis (math.NA)
The rapid growth of data centers increasingly requires data center operators to "bring own generation" to complement the available utility power plants to supply all or part of data center load. This practice sharply increases the number of generators on the bulk power system and shifts operational focus toward fuel costs rather than traditional startup and runtime constraints. Conventional mixed-integer unit commitment formulations are not well suited for systems with thousands of flexible, fast-cycling units. We propose a unit commitment formulation that relaxes binary commitment decisions by allowing generators to be fractionally on, enabling the use of algorithms for continuous solvers. We then use a rounding approach to get a feasible unit commitment. For a 276-unit system, solution time decreases from 10 hours to less than a second, with no accuracy degradation. Our approach scales with no issues to tens of thousands of generators, which allows solving problems on the scale of the major North America interconnections. The bulk of computation is parallel and GPU compatible, enabling further acceleration in future work.
- [46] arXiv:2511.18815 (replaced) [pdf, html, other]
-
Title: An Axiomatic Analysis of Distributionally Robust Optimization with $q$-Norm Ambiguity Sets for Probability SmoothingComments: 13 pagesSubjects: Optimization and Control (math.OC)
We analyze the axiomatic properties of a class of probability estimators derived from Distributionally Robust Optimization (DRO) with $q$-norm ambiguity sets ($q$-DRO), a principled approach to the zero-frequency problem. While classical estimators such as Laplace smoothing are characterized by strong linearity axioms like Ratio Preservation, we show that $q$-DRO provides a flexible alternative that satisfies other desirable properties. We first prove that for any $q \in [1, \infty]$, the $q$-DRO estimator satisfies the fundamental axioms of Positivity and Symmetry. For the case of $q \in (1, \infty)$, we then prove that it also satisfies Order Preservation. Our analysis of the optimality conditions also reveals that the $q$-DRO formulation is equivalent to the regularized empirical loss minimization.
- [47] arXiv:2511.19981 (replaced) [pdf, html, other]
-
Title: On the Fundamental Limit of Stochastic Gradient Identification Algorithm Under Non-Persistent ExcitationSubjects: Optimization and Control (math.OC)
Stochastic gradient methods are of fundamental importance in system identification and machine learning, enabling online parameter estimation for large-scale and data-streaming processes. The stochastic gradient algorithm stands as a classical identification method that has been extensively studied for decades. Under non-persistent excitation, the best known convergence result requires the condition number of the Fisher information matrix to satisfy $\kappa(\sum_{i=1}^n \varphi_i \varphi_i^\top) = O((\log r_n)^\alpha)$, where $r_n = 1 + \sum_{i=1}^n \|\varphi_i\|^2$, with strong consistency guaranteed for $\alpha \leq 1/3$ but known to fail for $\alpha > 1$. This paper establishes that strong consistency in fact holds for the entire range $0 \leq \alpha < 1$, achieved through a novel algebraic framework that yields substantially sharper matrix norm bounds. Our result nearly resolves the four-decade-old conjecture of Chen and Guo (1986), bridging the theoretical gap from $\alpha \leq 1/3$ to nearly the entire feasible range.
- [48] arXiv:2511.20207 (replaced) [pdf, other]
-
Title: Adaptive SGD with Line-Search and Polyak Stepsizes: Nonconvex Convergence and Accelerated RatesComments: Omission of necessary citations and creditSubjects: Optimization and Control (math.OC); Machine Learning (stat.ML)
We extend the convergence analysis of AdaSLS and AdaSPS in [Jiang and Stich, 2024] to the nonconvex setting, presenting a unified convergence analysis of stochastic gradient descent with adaptive Armijo line-search (AdaSLS) and Polyak stepsize (AdaSPS) for nonconvex optimization. Our contributions include: (1) an $\mathcal{O}(1/\sqrt{T})$ convergence rate for general nonconvex smooth functions, (2) an $\mathcal{O}(1/T)$ rate under quasar-convexity and interpolation, and (3) an $\mathcal{O}(1/T)$ rate under the strong growth condition for general nonconvex functions.
- [49] arXiv:2202.10875 (replaced) [pdf, html, other]
-
Title: Fast Gradient Methods for Data-Consistent Local Super-Resolution of Medical ImagesSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Optimization and Control (math.OC)
In this work, we propose a new paradigm of iterative model-based reconstruction algorithms for providing real-time solution for zooming-in and refining a region of interest in medical and clinical tomographic images. This algorithmic framework is tailored for a clinical need in medical imaging practice that after a reconstruction of the full tomographic image, the clinician may believe that some critical parts of the image are not clear enough, and may wish to see clearer these regions of interest. A naive approach (which is highly not recommended) would be to perform the global reconstruction of a higher resolution image, which has two major limitations: first, it is computationally inefficient, and second, the image regularization is still applied globally, which may over-smooth some local regions. Furthermore, if one wishes to fine-tune the regularization parameter for local parts, it would be computationally infeasible in practice for the case of using global reconstruction. Our new iterative approaches for such tasks are based on jointly utilizing the measurement information, efficient up-sampling/down-sampling across image spaces, and locally adjusted image prior for efficient and high-quality post-processing. The numerical results in low-dose X-ray CT image local zoom-in demonstrate the effectiveness of our approach.
- [50] arXiv:2209.05790 (replaced) [pdf, html, other]
-
Title: Globally optimal control of quantum dynamicsComments: Published version, 12 pages, 6 figuresJournal-ref: Phys. Rev. Research 7, 043202 (2025)Subjects: Quantum Physics (quant-ph); Optimization and Control (math.OC)
Optimization of constrained quantum control problems powers quantum technologies. This task becomes very difficult when these control problems are nonconvex and plagued with dense local extrema. For such problems current optimization methods must be repeated many times to find good solutions, each time requiring many simulations of the system. Here, we present quantum control via polynomial optimization (QCPOP), a method that eliminates this problem by directly finding globally optimal solutions. The resulting increase in speed, which can be a thousandfold or more, makes it possible to solve problems that were previously intractable. This remarkable advance is due to global optimization methods recently developed for polynomial functions. We demonstrate the power of this method by showing that it obtains an optimal solution in a single run for a problem in which local extrema are so dense that gradient methods require thousands of runs to reach a similar fidelity. Since QCPOP is able to find the global optimum for quantum control, we expect that it will not only enhance the utility of quantum control by making it much easier to find the necessary protocols, but also provide a key tool for understanding the precise limits of quantum technologies. Finally, we note that the ability to cast quantum control as polynomial optimization resolves an open question regarding the computability of exact solutions to quantum control problems.
- [51] arXiv:2311.13278 (replaced) [pdf, html, other]
-
Title: Randomisation with moral hazard: a path to existence of optimal contractsComments: 48 pagesSubjects: Probability (math.PR); General Economics (econ.GN); Optimization and Control (math.OC)
We study a generic principal-agent problem in continuous time on a finite time horizon. We introduce a framework in which the agent is allowed to employ measure-valued controls and characterise the continuation utility as a solution to a specific form of a backward stochastic differential equation driven by a martingale measure. We leverage this characterisation to prove that, under appropriate conditions, an optimal solution to the principal's problem exists, even when constraints on the contract are imposed. In doing so, we employ compactification techniques and, as a result, circumvent the typical challenge of showing well-posedness for a degenerate partial differential equation with potential boundary conditions, where regularity problems often arise.
- [52] arXiv:2411.18432 (replaced) [pdf, html, other]
-
Title: SPO-VCS: An End-to-End Smart Predict-then-Optimize Framework with Alternating Differentiation Method for Relocation Problems in Large-Scale Vehicle Crowd SensingComments: Accepted by Transportation Research Part E: Logistics and Transportation ReviewJournal-ref: Transportation Research Part E: Logistics and Transportation Review, 2026, 205: 104515Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
Ubiquitous mobile devices have catalyzed the development of vehicle crowd sensing (VCS). In particular, vehicle sensing systems show great potential in the flexible acquisition of spatio-temporal urban data through built-in sensors under diverse sensing scenarios. However, vehicle systems often exhibit biased coverage due to the heterogeneous nature of trip requests and routes. To achieve a high sensing coverage, a critical challenge lies in optimally relocating vehicles to minimize the divergence between vehicle distributions and target sensing distributions. Conventional approaches typically employ a two-stage predict-then-optimize (PTO) process: first predicting real-time vehicle distributions and subsequently generating an optimal relocation strategy based on the predictions. However, this approach can lead to suboptimal decision-making due to the propagation of errors from upstream prediction. To this end, we develop an end-to-end Smart Predict-then-Optimize (SPO) framework by integrating optimization into prediction within the deep learning architecture, and the entire framework is trained by minimizing the task-specific matching divergence rather than the upstream prediction error. Methodologically, we formulate the vehicle relocation problem by quadratic programming (QP) and incorporate a novel unrolling approach based on the Alternating Direction Method of Multipliers (ADMM) within the SPO framework to compute gradients of the QP layer, facilitating backpropagation and gradient-based optimization for end-to-end learning. The effectiveness of the proposed framework is validated by real-world taxi datasets in Hong Kong. Utilizing the alternating differentiation method, the general SPO framework presents a novel concept of addressing decision-making problems with uncertainty, demonstrating significant potential for advancing applications in intelligent transportation systems.
- [53] arXiv:2411.19258 (replaced) [pdf, other]
-
Title: L4acados: Learning-based models for acados, applied to Gaussian process-based predictive controlAmon Lahr, Joshua Näf, Kim P. Wabersich, Jonathan Frey, Pascal Siehl, Andrea Carron, Moritz Diehl, Melanie N. ZeilingerSubjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
Incorporating learning-based models, such as artificial neural networks or Gaussian processes, into model predictive control (MPC) strategies can significantly improve control performance and online adaptation capabilities for real-world applications. Still, enabling state-of-the-art implementations of learning-based models for MPC is complicated by the challenge of interfacing machine learning frameworks with real-time optimal control software. This work aims at filling this gap by incorporating external sensitivities in sequential quadratic programming solvers for nonlinear optimal control. To this end, we provide L4acados, a general framework for incorporating Python-based dynamics models in the real-time optimal control software acados. By computing external sensitivities via a user-defined Python module, L4acados enables the implementation of MPC controllers with learning-based residual models in acados, while supporting parallelization of sensitivity computations when preparing the quadratic subproblems. We demonstrate significant speed-ups and superior scaling properties of L4acados compared to available software using a neural-network-based control example. Last, we provide an efficient and modular real-time implementation of Gaussian process-based MPC using L4acados, which is applied to two hardware examples: autonomous miniature racing, as well as motion control of a full-scale autonomous vehicle for an ISO lane change maneuver.
- [54] arXiv:2502.05149 (replaced) [pdf, other]
-
Title: Nonlocal perimeters and variations: Extremality and decomposability for finite and infinite horizonsComments: 39 pages, 4 figures. V2: Accepted version, minor changesSubjects: Functional Analysis (math.FA); Analysis of PDEs (math.AP); Optimization and Control (math.OC)
We analyze the extremality and decomposability properties with respect to two types of nonlocal perimeters available in the literature, the Gagliardo perimeter based on the eponymous seminorms and the nonlocal distributional Caccioppoli perimeter, both with finite and infinite interaction ranges. A nonlocal notion of indecomposability associated to these perimeters is introduced, and we prove that in both cases it can be characterized solely in terms of the interaction range or horizon $\varepsilon$. Utilizing this, we show that it is possible to uniquely decompose a set into its $\varepsilon$-connected components, establishing a nonlocal analogue of the decomposition theorem of Ambrosio, Caselles, Masnou and Morel. Moreover, the extreme points of the balls induced by the Gagliardo and nonlocal total variation seminorm are identified, which naturally correspond to the two nonlocal perimeters. Surprisingly, while the extreme points in the former case are normalized indicator functions of $\varepsilon$-simple sets, akin to the classical TV-ball, in the latter case they are instead obtained from a nonlocal transformation applied to the extreme points of the TV-ball. Finally, we explore the nonlocal-to-local transition via a $\Gamma$-limit as $\varepsilon \rightarrow 0$ for both perimeters, recovering the classical Caccioppoli perimeter.
- [55] arXiv:2504.11903 (replaced) [pdf, html, other]
-
Title: FedCanon: Non-Convex Composite Federated Learning with Efficient Proximal Operation on Heterogeneous DataSubjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Optimization and Control (math.OC)
Composite federated learning offers a general framework for solving machine learning problems with additional regularization terms. However, existing methods often face significant limitations: many require clients to perform computationally expensive proximal operations, and their performance is frequently vulnerable to data heterogeneity. To overcome these challenges, we propose a novel composite federated learning algorithm called \textbf{FedCanon}, designed to solve the optimization problems comprising a possibly non-convex loss function and a weakly convex, potentially non-smooth regularization term. By decoupling proximal mappings from local updates, FedCanon requires only a single proximal evaluation on the server per iteration, thereby reducing the overall proximal computation cost. Concurrently, it integrates control variables into local updates to mitigate the client drift arising from data heterogeneity. The entire architecture avoids the complex subproblems of primal-dual alternatives. The theoretical analysis provides the first rigorous convergence guarantees for this proximal-skipping framework in the general non-convex setting. It establishes that FedCanon achieves a sublinear convergence rate, and a linear rate under the Polyak-Łojasiewicz condition, without the restrictive bounded heterogeneity assumption. Extensive experiments demonstrate that FedCanon outperforms the state-of-the-art methods in terms of both accuracy and computational efficiency, particularly under heterogeneous data distributions.
- [56] arXiv:2506.13150 (replaced) [pdf, html, other]
-
Title: Federated ADMM from Bayesian DualityComments: First two authors contributed equally. Code is at this https URLSubjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
We propose a new Bayesian approach to derive and extend the federated Alternating Direction Method of Multipliers (ADMM). We show that the solutions of variational-Bayesian objectives are associated with a duality structure that not only resembles ADMM but also extends it. For example, ADMM-like updates are recovered when the objective is optimized over the isotropic-Gaussian family, and new non-trivial extensions are obtained for other more flexible exponential families. Examples include a Newton-like variant that converges in one step on quadratics and an Adam-like variant called IVON-ADMM that has the same cost as Adam but yields up to 7% accuracy boosts in heterogeneous deep learning. Our work opens a new direction to use Bayes to extend ADMM and other primal-dual methods.
- [57] arXiv:2507.06764 (replaced) [pdf, html, other]
-
Title: Fast Equivariant Imaging: Acceleration for Unsupervised Learning via Augmented Lagrangian and Auxiliary PnP DenoisersComments: 17 pagesSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Optimization and Control (math.OC)
In this work, we propose Fast Equivariant Imaging (FEI), a novel unsupervised learning framework to rapidly and efficiently train deep imaging networks without ground-truth data. From the perspective of reformulating the Equivariant Imaging based optimization problem via the method of Lagrange multipliers and utilizing plug-and-play denoisers, this novel unsupervised scheme shows superior efficiency and performance compared to the vanilla Equivariant Imaging paradigm. In particular, our FEI schemes achieve an order-of-magnitude (10x) acceleration over standard EI on training U-Net for X-ray CT reconstruction and image inpainting, with improved generalization performance.
- [58] arXiv:2510.15097 (replaced) [pdf, html, other]
-
Title: Reduced order method based Anderson-type acceleration method for nonlinear least square problems and large scale ill-posed problemsSubjects: Numerical Analysis (math.NA); Optimization and Control (math.OC)
In this paper, we propose an acceleration framework for a class of iterative methods using the Reduced Order Method (ROM). Assuming that the underlying iterative scheme generates a rich basis for the solution space, we construct the next iterate by minimizing the equation error over the linear manifold spanned by this basis. The resulting optimal linear combination yields a more accurate approximation of the solution and significantly enhances convergence. In essence, the method can be seen as a history-based acceleration technique, akin to a delayed or memory-enhanced iterative scheme. This approach effectively remedies semi-ill-posed problems, enabling convergence where standard methods may fail, and also acts as a stabilizing and regularizing mechanism for the original iteration.
- [59] arXiv:2511.10293 (replaced) [pdf, html, other]
-
Title: Zeroes and Extrema of Functions via Random MeasuresComments: Function extrema and zeroes; Optimization; Poisson point process; Random counting measures; Riemann's zeta functionSubjects: Methodology (stat.ME); Optimization and Control (math.OC); Computation (stat.CO)
We present methods that provide all zeroes and extrema of a function that do not require differentiation. Using point process theory, we are able to describe the locations of zeroes or maxima, their number, as well as their distribution over a given window of observation. The algorithms in order to accomplish the theoretical development are also provided, and they are exemplified using many illustrative examples, for real and complex functions.
- [60] arXiv:2511.18551 (replaced) [pdf, other]
-
Title: Dissipativity and L2 Stability of Large-Scale Networks with Changing InterconnectionsComments: Under review for IFAC 2026. 6 pages, 2 figures, 1 tableSubjects: Systems and Control (eess.SY); Optimization and Control (math.OC)
In this paper, the L2 stability of switched networks is studied based on the QSR-dissipativity of each agent. While the integration of dissipativity with switched systems has received considerable attention, most previous studies have focused on passivity, internal stability, or feedback networks involving only two agents. This work makes two contributions: first, the relationship between switched QSR-dissipativity and L2 stability is established based on the properties of dissipativity parameters of switched systems; and second, conditions for L2 stability of networks consisting of QSR-dissipative agents with switching interconnection topologies are derived. Crucially, this shows that a common storage function will exist across all modes, avoiding the need to find one, which becomes computationally taxing for large networks with many possible configurations. Numerical examples demonstrate how this can facilitate stability analysis for networked systems under arbitrary switching of swarm drones.