Alternate Training of Shared and Task-Specific Parameters for Multi-Task Neural Networks

This paper introduces novel alternate training procedures for hard-parameter sharing Multi-Task Neural Networks (MTNNs). Traditional MTNN training faces challenges in managing conflicting loss gradients, often yielding sub-optimal performance. The proposed alternate training method updates shared and task-specific weights alternately, exploiting the multi-head architecture of the model. This approach reduces computational costs, enhances training regularization, and … Read more

An optimally fast objective-function-free minimization algorithm using random subspaces

Article Download View An optimally fast objective-function-free minimization algorithm using random subspaces

Inexact Newton methods with matrix approximation by sampling for nonlinear least-squares and systems

We develop and analyze stochastic inexact Gauss-Newton methods for nonlinear least-squares problems and inexact Newton methods for nonlinear systems of equations. Random models are formed using suitable sampling strategies for the matrices involved in the deterministic models. The analysis of the expected number of iterations needed in the worst case to achieve a desired level … Read more

Regularized methods via cubic subspace minimization for nonconvex optimization

\(\) The main computational cost per iteration of adaptive cubic regularization methods for solving large-scale nonconvex problems is the computation of the step \(s_k\), which requires an approximate minimizer of the cubic model. We propose a new approach in which this minimizer is sought in a low dimensional subspace that, in contrast to classical approaches, … Read more

Trust-region algorithms: probabilistic complexity and intrinsic noise with applications to subsampling techniques

A trust-region algorithm is presented for finding approximate minimizers of smooth unconstrained functions whose values and derivatives are subject to random noise. It is shown that, under suitable probabilistic assumptions, the new method finds (in expectation) an epsilon-approximate minimizer of arbitrary order q > 0 in at most O(epsilon^{-(q+1)}) inexact evaluations of the function and … Read more

A stochastic first-order trust-region method with inexact restoration for finite-sum minimization

We propose a stochastic first-order trust-region method with inexact function and gradient evaluations for solving finite-sum minimization problems. At each iteration, the function and the gradient are approximated by sampling. The sample size in gradient approximations is smaller than the sample size in function approximations and the latter is determined using a deterministic rule inspired … Read more

The Impact of Noise on Evaluation Complexity: The Deterministic Trust-Region Case

Intrinsic noise in objective function and derivatives evaluations may cause premature termination of optimization algorithms. Evaluation complexity bounds taking this situation into account are presented in the framework of a deterministic trust-region method. The results show that the presence of intrinsic noise may dominate these bounds, in contrast with what is known for methods in … Read more

High-order Evaluation Complexity of a Stochastic Adaptive Regularization Algorithm for Nonconvex Optimization Using Inexact Function Evaluations and Randomly Perturbed Derivatives

A stochastic adaptive regularization algorithm allowing random noise in derivatives and inexact function values is proposed for computing strong approximate minimizers of any order for inexpensively constrained smooth optimization problems. For an objective function with Lipschitz continuous p-th derivative in a convex neighbourhood of the feasible set and given an arbitrary optimality order q, it … Read more

A relaxed interior point method for low-rank semidefinite programming problems with applications to matrix completion

A new relaxed variant of interior point method for low-rank semidefinite programming problems is proposed in this paper. The method is a step outside of the usual interior point framework. In anticipation to converging to a low-rank primal solution, a special nearly low-rank form of all primal iterates is imposed. To accommodate such a (restrictive) … Read more