We study the approximation of general multiobjective optimization problems with the help of scalarizations. Existing results state that multiobjective minimization problems can be approximated well by norm-based scalarizations. However, for multiobjective maximization problems, only impossibility results are known so far. Countering this, we show that all multiobjective optimization problems can, in principle, be approximated equally well by scalarizations. In this context, we introduce a transformation theory for scalarizations that establishes the following: Suppose there exists a scalarization that yields an approximation of a certain quality for arbitrary instances of multiobjective optimization problems with a given decomposition specifying which objective functions are to be minimized / maximized. Then, for each other decomposition, our transformation yields another scalarization that yields the same approximation quality for arbitrary instances of problems with this other decomposition. In this sense, the existing results about the approximation via scalarizations for minimization problems carry over to any other objective decomposition -- in particular, to maximization problems -- when suitably adapting the employed scalarization. We further provide necessary and sufficient conditions on a scalarization such that its optimal solutions achieve a constant approximation quality. We give an upper bound on the best achievable approximation quality that applies to general scalarizations and is tight for the majority of norm-based scalarizations applied in the context of multiobjective optimization. As a consequence, none of these norm-based scalarizations can induce approximation sets for optimization problems with maximization objectives, which unifies and generalizes the existing impossibility results concerning the approximation of maximization problems.
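As a minimal illustration of what a scalarization does, the sketch below applies a weighted-sum scalarization to a small biobjective minimization instance and measures the approximation factor of its optimal solution. The weighted sum is one familiar norm-based choice, not necessarily the scalarization analyzed in the paper. The instance, the weights, and the helper names `weighted_sum` and `approximation_factor` are assumptions made for illustration.

```python
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def weighted_sum(point: Point, weights: Sequence[float]) -> float:
    """Weighted-sum scalarization: collapse an objective vector into one number."""
    return sum(w * v for w, v in zip(weights, point))


def approximation_factor(candidate: Point, images: Sequence[Point]) -> float:
    """Smallest alpha with candidate_i <= alpha * y_i for every image y and index i.

    Assumes minimization objectives with strictly positive values.
    """
    return max(max(c / y for c, y in zip(candidate, image)) for image in images)


# Hypothetical instance: images of three feasible solutions, both objectives minimized.
images = [(1.0, 10.0), (4.0, 4.0), (10.0, 1.0)]
weights = (1.0, 1.0)

# An optimal solution of the scalarized (single-objective) problem.
best = min(images, key=lambda p: weighted_sum(p, weights))

# How well this single solution approximates all other images.
alpha = approximation_factor(best, images)
print(best, alpha)
```

On this toy instance the scalarization selects the balanced point, which is within a factor of 4 of every other image in each objective.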

Related content

Scalar

Functional · Linear · Round · FAST · CASE

July 12, 2023

Self-adjusting Population Sizes for the $(1,\lambda)$-EA on Monotone Functions

Marc Kaufmann,Maxime Larcher,Johannes Lengler,Xun Zou

We study the $(1,\lambda)$-EA with mutation rate $c/n$ for $c\le 1$, where the population size is adaptively controlled with the $(1:s+1)$-success rule. Recently, Hevia Fajardo and Sudholt have shown that this setup with $c=1$ is efficient on OneMax for $s<1$, but inefficient if $s \ge 18$. Surprisingly, the hardest part is not close to the optimum, but rather at linear distance. We show that this behavior is not specific to OneMax. If $s$ is small, then the algorithm is efficient on all monotone functions, and if $s$ is large, then it needs superpolynomial time on all monotone functions. In the former case, for $c<1$ we show an $O(n)$ upper bound for the number of generations and $O(n\log n)$ for the number of function evaluations, and for $c=1$ we show $O(n\log n)$ generations and $O(n^2\log\log n)$ evaluations. We also show formally that optimization is always fast, regardless of $s$, if the algorithm starts in proximity of the optimum. All results also hold in a dynamic environment where the fitness function changes in each generation.
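The algorithm described above can be sketched as follows on OneMax. This is a minimal sketch under stated assumptions: the abstract does not give the update factor, so the factor `F`, the initial value of `lam`, the rounding of `lam` to an integer number of offspring, and all function names are illustrative choices. The update follows the usual reading of a $(1:s+1)$-success rule: divide by `F` after a success and multiply by `F ** (1 / s)` after a failure, so that one success balances `s` failures.

```python
import random


def update_lambda(lam: float, success: bool, F: float, s: float) -> float:
    """(1:s+1)-success rule: shrink lambda after a success, grow it otherwise."""
    if success:
        return max(1.0, lam / F)
    return lam * F ** (1.0 / s)


def onemax(x):
    """Number of one-bits in the bit string x."""
    return sum(x)


def self_adjusting_one_comma_lambda_ea(n, c=1.0, s=0.5, F=1.5,
                                       max_generations=100_000, seed=0):
    """Run the sketch on OneMax; return the final parent and generations used."""
    rng = random.Random(seed)
    parent = [rng.randint(0, 1) for _ in range(n)]
    lam = 1.0  # assumed initial population size
    for generation in range(max_generations):
        if onemax(parent) == n:
            return parent, generation
        # Create round(lam) offspring by flipping each bit with probability c/n.
        offspring = [
            [bit ^ (rng.random() < c / n) for bit in parent]
            for _ in range(max(1, round(lam)))
        ]
        best = max(offspring, key=onemax)
        success = onemax(best) > onemax(parent)
        lam = update_lambda(lam, success, F, s)
        # Comma selection: the best offspring replaces the parent even if worse.
        parent = best
    return parent, max_generations


final, generations = self_adjusting_one_comma_lambda_ea(n=20)
print(onemax(final), generations)
```

Because of comma selection, the parent's fitness can decrease when `lam` is small, which is why the choice of `s` matters for efficiency.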

Online · Scenario · Optimizer · INFORMS · motivation

July 12, 2023

Online Inventory Problems: Beyond the i.i.d. Setting with Online Convex Optimization

Massil Hihat,Stéphane Gaïffas,Guillaume Garrigos,Simon Bussy

We study multi-product inventory control problems where a manager makes sequential replenishment decisions based on partial historical information in order to minimize its cumulative losses. Our motivation is to consider general demands, losses and dynamics to go beyond standard models which usually rely on newsvendor-type losses, fixed dynamics, and unrealistic i.i.d. demand assumptions. We propose MaxCOSD, an online algorithm that has provable guarantees even for problems with non-i.i.d. demands and stateful dynamics, including for instance perishability. We consider what we call non-degeneracy assumptions on the demand process, and argue that they are necessary to allow learning.

Integration · Functional · Discretization · Approximation · Performer

July 11, 2023

Efficient computation of the sinc matrix function for the integration of second-order differential equations

Lidia Aceto,Fabio Durastante

This work deals with the numerical solution of systems of oscillatory second-order differential equations which often arise from the semi-discretization in space of partial differential equations. Since these differential equations exhibit (pronounced or highly) oscillatory behavior, standard numerical methods are known to perform poorly. Our approach consists in directly discretizing the problem by means of Gautschi-type integrators based on $\operatorname{sinc}$ matrix functions. The novelty contained here is that of using a suitable rational approximation formula for the $\operatorname{sinc}$ matrix function to apply a rational Krylov-like approximation method with suitable choices of poles. In particular, we discuss the application of the whole strategy to a finite element discretization of the wave equation.

Generalization error · Learning · Generalization theory · Statistic · Optimizer

July 10, 2023

Generalization Error of First-Order Methods for Statistical Learning with Generic Oracles

Kevin Scaman,Mathieu Even,Laurent Massoulié

from arxiv, 18 pages, 0 figures

In this paper, we provide a novel framework for the analysis of generalization error of first-order optimization algorithms for statistical learning when the gradient can only be accessed through partial observations given by an oracle. Our analysis relies on the regularity of the gradient w.r.t. the data samples, and allows us to derive near-matching upper and lower bounds for the generalization error of multiple learning problems, including supervised learning, transfer learning, robust learning, distributed learning and communication efficient learning using gradient quantization. These results hold for smooth and strongly-convex optimization problems, as well as smooth non-convex optimization problems satisfying a Polyak-Łojasiewicz assumption. In particular, our upper and lower bounds depend on a novel quantity that extends the notion of conditional standard deviation, and is a measure of the extent to which the gradient can be approximated by having access to the oracle. As a consequence, our analysis provides a precise meaning to the intuition that optimization of the statistical learning objective is as hard as the estimation of its gradient. Finally, we show that, in the case of standard supervised learning, mini-batch gradient descent with increasing batch sizes and a warm start can reach a generalization error that is optimal up to a multiplicative factor, thus motivating the use of this optimization scheme in practical applications.

Vector space · Networking · Neural Networks · Inference · Visual recognition system

July 10, 2023

Law of Large Numbers for Bayesian two-layer Neural Network trained with Variational Inference

Arnaud Descours,Tom Huix,Arnaud Guillin,Manon Michel,Éric Moulines,Boris Nectoux

We provide a rigorous analysis of training by variational inference (VI) of Bayesian neural networks in the two-layer and infinite-width case. We consider a regression problem with a regularized evidence lower bound (ELBO) which is decomposed into the expected log-likelihood of the data and the Kullback-Leibler (KL) divergence between the a priori distribution and the variational posterior. With an appropriate weighting of the KL, we prove a law of large numbers for three different training schemes: (i) the idealized case with exact estimation of a multiple Gaussian integral from the reparametrization trick, (ii) a minibatch scheme using Monte Carlo sampling, commonly known as Bayes by Backprop, and (iii) a new and computationally cheaper algorithm which we introduce as Minimal VI. An important result is that all methods converge to the same mean-field limit. Finally, we illustrate our results numerically and discuss the need for the derivation of a central limit theorem.

Linear · binary · Integer linear programming · Functional · Almost surely

July 8, 2023

The Binary Linearization Complexity of Pseudo-Boolean Functions

Matthias Walter

from arxiv, 11 pages, 1 table

We consider the problem of linearizing a pseudo-Boolean function $f : \{0,1\}^n \to \mathbb{R}$ by means of $k$ Boolean functions. Such a linearization yields an integer linear programming formulation with only $k$ auxiliary variables. This motivates the definition of the linearization complexity of $f$ as the minimum such $k$. Our theoretical contributions are the proof that random polynomials almost surely have a high linearization complexity and characterizations of its value in case we do or do not restrict the set of admissible Boolean functions. The practical relevance is shown by devising and evaluating integer linear programming models of two such linearizations for the low auto-correlation binary sequences problem. Still, many problems around this new concept remain open.
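To make the idea of a linearization with auxiliary variables concrete, the sketch below uses the standard linearization of a single product $x_1 x_2$ by one auxiliary binary variable $y$ with constraints $y \le x_1$, $y \le x_2$, $y \ge x_1 + x_2 - 1$. The function `f` is a hypothetical example, not one from the paper, and it needs only $k = 1$ auxiliary variable. The code checks by enumeration that the constraints force $y = x_1 x_2$ and that the linear expression agrees with `f` everywhere.

```python
from itertools import product


def feasible_aux_values(x1: int, x2: int):
    """Binary values y satisfying y <= x1, y <= x2 and y >= x1 + x2 - 1."""
    return [y for y in (0, 1) if y <= x1 and y <= x2 and y >= x1 + x2 - 1]


def f(x1: int, x2: int, x3: int) -> int:
    """Hypothetical pseudo-Boolean function with a single nonlinear monomial."""
    return 3 * x1 * x2 - 2 * x3


def f_linearized(x1: int, x2: int, x3: int) -> int:
    """Linear in (x, y), where y is the auxiliary variable standing for x1 * x2."""
    (y,) = feasible_aux_values(x1, x2)  # the constraints leave exactly one value
    return 3 * y - 2 * x3


# Compare both expressions on all 8 binary points.
all_match = all(f(*x) == f_linearized(*x) for x in product((0, 1), repeat=3))
print(all_match)
```

Here the auxiliary variable encodes the Boolean function $x_1 \wedge x_2$; other choices of Boolean functions can linearize the same $f$, which is what makes the minimum number $k$ a meaningful complexity measure.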

Branch · Linear · MoDELS · Generative method · Likelihood

July 8, 2023

Linear-time logics -- a coalgebraic perspective

Corina Cirstea

We describe a general approach to deriving linear-time logics for a wide variety of state-based, quantitative systems, by modelling the latter as coalgebras whose type incorporates both branching and linear behaviour. Concretely, we define logics whose syntax is determined by the type of linear behaviour, and whose domain of truth values is determined by the type of branching behaviour, and we provide two semantics for them: a step-wise semantics akin to that of standard coalgebraic logics, and a path-based semantics akin to that of standard linear-time logics. The former semantics is useful for model checking, whereas the latter is the more natural semantics, as it measures the extent to which qualitative properties hold along computation paths from a given state. Our main result is the equivalence of the two semantics. We also provide a semantic characterisation of a notion of logical distance induced by these logics. Instances of our logics support reasoning about the possibility, likelihood or minimal cost of exhibiting a given linear-time property.

Regularization term · Smoothing · Minimum · Optimizer · Global minimum

July 7, 2023

Smoothing the Edges: A General Framework for Smooth Optimization in Sparse Regularization using Hadamard Overparametrization

Chris Kolb,Christian L. Müller,Bernd Bischl,David Rügamer

This paper introduces a smooth method for (structured) sparsity in $\ell_q$ and $\ell_{p,q}$ regularized optimization problems. Optimization of these non-smooth and possibly non-convex problems typically relies on specialized procedures. In contrast, our general framework is compatible with prevalent first-order optimization methods like Stochastic Gradient Descent and accelerated variants without any required modifications. This is accomplished through a smooth optimization transfer, comprising an overparametrization of selected model parameters using Hadamard products and a change of penalties. In the overparametrized problem, smooth and convex $\ell_2$ regularization of the surrogate parameters induces non-smooth and non-convex $\ell_q$ or $\ell_{p,q}$ regularization in the original parametrization. We show that our approach yields not only matching global minima but also equivalent local minima. This is particularly useful in non-convex sparse regularization, where finding global minima is NP-hard and local minima are known to generalize well. We provide a comprehensive overview consolidating various literature strands on sparsity-inducing parametrizations and propose meaningful extensions to existing approaches. The feasibility of our approach is evaluated through numerical experiments, which demonstrate that its performance is on par with or surpasses commonly used implementations of convex and non-convex regularization methods.
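The simplest instance of this mechanism is the well-known identity $\min_{u v = \beta} (u^2 + v^2) = 2|\beta|$, which follows from the AM-GM inequality: a smooth $\ell_2$ penalty on the two Hadamard factors acts like a non-smooth $\ell_1$ penalty on their product. The sketch below checks the identity numerically for a scalar parameter. It covers only this $q = 1$ special case, not the paper's general framework, and the function name and grid search are assumptions made for illustration.

```python
import math


def min_l2_over_factorizations(beta: float, grid_size: int = 100_000) -> float:
    """Grid-search the minimum of u**2 + v**2 subject to u * v == beta.

    Assumes beta != 0 and |beta| <= 100 so that the minimizer lies in (0, 10].
    """
    best = math.inf
    for k in range(1, grid_size + 1):
        u = 10.0 * k / grid_size  # u ranges over (0, 10]
        v = beta / u              # enforce the constraint u * v == beta
        best = min(best, u * u + v * v)
    return best


# Compare the smooth surrogate penalty with the induced l1-type penalty 2 * |beta|.
for beta in (3.0, -0.5):
    print(beta, min_l2_over_factorizations(beta), 2 * abs(beta))
```

The two printed penalty values agree up to the grid resolution, which is why gradient-based methods applied to the factors `u` and `v` can target a sparsity-inducing penalty on `beta` without handling non-smoothness directly.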

Linear · Weight · Graph · Performer · Sparse

July 6, 2023

Efficient parallel implementation of the multiplicative weight update method for graph-based linear programs

Caleb Ju,Serif Yesil,Mengyuan Sun,Chandra Chekuri,Edgar Solomonik

from arxiv, Pre-print. 13 pages, comments welcome

Positive linear programs (LPs) model many graph and operations research problems. One can solve for a $(1+\epsilon)$-approximation for positive LPs, for any selected $\epsilon$, in polylogarithmic depth and near-linear work via variations of the multiplicative weight update (MWU) method. Despite extensive theoretical work on these algorithms through the decades, their empirical performance is not well understood. In this work, we implement and test an efficient parallel algorithm for solving positive LP relaxations, and apply it to graph problems such as densest subgraph, bipartite matching, vertex cover and dominating set. We accelerate the algorithm via a new step size search heuristic. Our implementation uses sparse linear algebra optimization techniques such as fusion of vector operations and use of sparse format. Furthermore, we devise an implicit representation for graph incidence constraints. We demonstrate the parallel scalability using OpenMP threading and MPI on the Stampede2 supercomputer. We compare this implementation with exact libraries and specialized libraries for the above problems in order to evaluate MWU's practical standing for both accuracy and performance among other methods. Our results show this implementation is faster than general purpose LP solvers (IBM CPLEX, Gurobi) in all of our experiments, and in some instances, outperforms state-of-the-art specialized parallel graph algorithms.

Global minimum · Optimizer · Minimum · Non-convex · Approximation

March 24, 2021

Why Do Local Methods Solve Nonconvex Problems?

Tengyu Ma

from arxiv, This is Chapter 21 of the book "Beyond the Worst-Case Analysis of Algorithms"

Non-convex optimization is ubiquitous in modern machine learning. Researchers devise non-convex objective functions and optimize them using off-the-shelf optimizers such as stochastic gradient descent and its variants, which leverage the local geometry and update iteratively. Even though optimizing non-convex functions is NP-hard in the worst case, the optimization quality in practice is often not an issue -- optimizers are largely believed to find approximate global minima. Researchers hypothesize a unified explanation for this intriguing phenomenon: most of the local minima of the practically-used objectives are approximately global minima. We rigorously formalize it for concrete instances of machine learning problems.