
We study nonlinear optimization problems with a stochastic objective and deterministic equality and inequality constraints, which arise in numerous applications including finance, manufacturing, power systems and, recently, deep neural networks. We propose an active-set stochastic sequential quadratic programming algorithm that uses a differentiable exact augmented Lagrangian as the merit function. The algorithm adaptively selects the penalty parameters of the augmented Lagrangian and performs a stochastic line search to decide the stepsize. Global convergence is established: for any initialization, the "liminf" of the KKT residuals converges to zero almost surely. Our algorithm and analysis further develop the prior work \cite{Na2021Adaptive} by allowing nonlinear inequality constraints. We demonstrate the performance of the algorithm on a subset of nonlinear problems collected in the CUTEst test set.
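
For orientation, the problem class and convergence statement described above can be written out as follows. This is a sketch in assumed notation: the symbols $f$, $F$, $\xi$, $c$, $g$, $\mu$, $\lambda$ and $R$ do not appear in the abstract, and the paper's own definition of the KKT residual may differ in form.

$$
\min_{x \in \mathbb{R}^d} \; f(x) = \mathbb{E}\left[F(x;\xi)\right]
\quad \text{s.t.} \quad c(x) = 0, \quad g(x) \le 0 .
$$

With the Lagrangian $\mathcal{L}(x,\mu,\lambda) = f(x) + \mu^\top c(x) + \lambda^\top g(x)$, one common KKT residual is

$$
R(x,\mu,\lambda) = \left\| \begin{pmatrix} \nabla_x \mathcal{L}(x,\mu,\lambda) \\ c(x) \\ \max\{g(x), -\lambda\} \end{pmatrix} \right\| ,
$$

where the maximum is taken componentwise, so that $\max\{g(x), -\lambda\} = 0$ encodes $g(x) \le 0$, $\lambda \ge 0$ and $\lambda_i g_i(x) = 0$. In this notation the stated guarantee reads $\liminf_{t \to \infty} R(x_t,\mu_t,\lambda_t) = 0$ almost surely.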

Related content

Quadratic programming

Unlabeled · MoDELS · Stationary point · CASE · Stationary

November 16, 2021

r-local sensing: Improved algorithm and applications

Ahmed Ali Abbasi,Abiy Tasissa,Shuchin Aeron

The unlabeled sensing problem is to solve a noisy linear system of equations under an unknown permutation of the measurements. We study a particular case of the problem where the permutations are restricted to be r-local, i.e., the permutation matrix is block diagonal with $r \times r$ blocks. Assuming a Gaussian measurement matrix, we argue that the r-local permutation model is more challenging than a recent sparse permutation model. We propose a proximal alternating minimization algorithm for the general unlabeled sensing problem that provably converges to a first-order stationary point. Applied to the r-local model, we show that the resulting algorithm is efficient. We validate the algorithm on synthetic and real datasets. We also formulate the 1-d unassigned distance geometry problem as an unlabeled sensing problem with a structured measurement matrix.
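
To make the r-local measurement model concrete, the following minimal sketch generates shuffled measurements in which each observation stays inside its own block of size r. The function names, the index-array representation of the permutation, and the noise model are assumptions made for illustration; this is not the authors' code and does not implement their recovery algorithm.

```python
import numpy as np


def r_local_permutation(n, r, rng):
    """Return an index array for a block-diagonal permutation with r x r blocks."""
    if n % r != 0:
        raise ValueError("n must be a multiple of r")
    perm = np.arange(n)
    for start in range(0, n, r):
        # Shuffle indices only within the block [start, start + r).
        perm[start:start + r] = start + rng.permutation(r)
    return perm


def r_local_measurements(A, x, r, noise_std, rng):
    """Simulate y = P A x + w with an unknown r-local permutation P (assumed model)."""
    n = A.shape[0]
    perm = r_local_permutation(n, r, rng)
    y = (A @ x)[perm] + noise_std * rng.standard_normal(n)
    return y, perm


rng = np.random.default_rng(0)
A = rng.standard_normal((12, 4))   # Gaussian measurement matrix
x = rng.standard_normal(4)         # unknown signal
y, perm = r_local_measurements(A, x, r=3, noise_std=0.0, rng=rng)
```

Because the permutation is stored as an index array, checking that `perm // r` equals `np.arange(n) // r` confirms that no measurement leaves its block.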

Importance sampling · Lipschitz · Sample · CASES · state-of-the-art

November 16, 2021

Stochastic Extragradient: General Analysis and Improved Rates

Eduard Gorbunov,Hugo Berard,Gauthier Gidel,Nicolas Loizou

from arxiv, 50 pages, 3 figures, 2 tables

The Stochastic Extragradient (SEG) method is one of the most popular algorithms for solving min-max optimization and variational inequality problems (VIP) appearing in various machine learning tasks. However, several important questions regarding the convergence properties of SEG are still open, including the sampling of stochastic gradients, mini-batching, convergence guarantees for the monotone finite-sum variational inequalities with possibly non-monotone terms, and others. To address these questions, in this paper, we develop a novel theoretical framework that allows us to analyze several variants of SEG in a unified manner. Besides standard setups, like Same-Sample SEG under Lipschitzness and monotonicity or Independent-Samples SEG under uniformly bounded variance, our approach allows us to analyze variants of SEG that were never explicitly considered in the literature before. Notably, we analyze SEG with arbitrary sampling, which includes importance sampling and various mini-batching strategies as special cases. Our rates for the new variants of SEG outperform the current state-of-the-art convergence guarantees and rely on less restrictive assumptions.
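
As a rough illustration of the Same-Sample variant mentioned above, the sketch below runs extragradient steps in which one sample is drawn per iteration and reused for both the extrapolation and the update. The toy problem (a randomly scaled bilinear game), the step size, and all function names are assumptions chosen for illustration, not the setup analyzed in the paper.

```python
import numpy as np


def bilinear_operator(z, scale):
    """Operator of the toy game min_x max_y scale * x * y (hypothetical example)."""
    x, y = z
    return scale * np.array([y, -x])


def same_sample_seg(operator, draw_sample, z0, step_size, num_steps, rng):
    """Same-sample stochastic extragradient with a constant step size."""
    z = np.asarray(z0, dtype=float)
    for _ in range(num_steps):
        sample = draw_sample(rng)                          # one sample per iteration
        z_half = z - step_size * operator(z, sample)       # extrapolation step
        z = z - step_size * operator(z_half, sample)       # update step, same sample
    return z


rng = np.random.default_rng(0)
z_final = same_sample_seg(
    bilinear_operator,
    lambda g: g.choice([0.5, 1.5]),   # random scale standing in for a finite-sum term
    z0=[1.0, 1.0],
    step_size=0.5,
    num_steps=500,
    rng=rng,
)
print(np.linalg.norm(z_final))
```

In this toy game every sampled operator vanishes at the origin, so the iterates contract toward the solution at each step; with additive noise one would expect convergence only to a neighborhood under a constant step size.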

Predictor/decision function · MoDELS · Model evaluation · Statistic · Sparsity

November 16, 2021

Split Regression Modeling

Anthony Christidis,Stefan Van Aelst,Ruben Zamar

In the statistical literature, sparse modeling is the standard approach to achieve improvements in prediction tasks and interpretability. Alternatively, in the seminal paper "Statistical Modeling: The Two Cultures," Breiman (2001) advocated for the adoption of algorithmic approaches that generate ensembles to achieve prediction accuracy superior to that of single-model methods, at the cost of interpretability. In a recent important and critical paper, Rudin (2019) argued that blackbox algorithmic approaches should be avoided for high-stakes decisions and that the tradeoff between accuracy and interpretability is a myth. In response to this recent change in philosophy, we generalize best subset selection (BSS) to best split selection (BSpS), a data-driven approach aimed at finding the optimal split of predictor variables among the models of an ensemble. The proposed methodology results in an ensemble of sparse and diverse models that provide possible mechanisms that explain the relationship between the predictors and the response. The high computational cost of BSpS motivates the need for computationally tractable ways to approximate the exhaustive search, and we benchmark one such recent proposal by Christidis et al. (2020) based on a multi-convex relaxation. Our objective with this article is to motivate research in this new and exciting field, which has great potential for data analysis tasks involving high-dimensional data.

Noise · Full · Discretization · AIM · Backward

November 16, 2021

Weak convergence rates for a full implicit scheme of stochastic Cahn-Hilliard equation with additive noise

Meng Cai,Siqing Gan,Yaozhong Hu

from arxiv, 25 pages

The aim of this study is to establish the weak convergence rate of a temporal and spatial discretization scheme for the stochastic Cahn-Hilliard equation with additive noise, where the spectral Galerkin method is used in space and the backward Euler scheme is used in time. The presence of the unbounded operator in front of the nonlinear term and the lack of the associated Kolmogorov equations make the error analysis much more challenging and demanding. To overcome these difficulties, we further exploit a novel approach proposed in [7] and combine it with Malliavin calculus to obtain an improved weak rate of convergence, in comparison with the corresponding strong convergence rates. The techniques used here are quite general and hence have the potential to be applied to other non-Markovian equations. As a byproduct, the rate of the strong error can also be easily obtained.

SPAA · INFOCOM · Constraint · Scenario · Node

November 16, 2021

Improved Bounds for Scheduling Flows under Endpoint Capacity Constraints

Searidang Pa,Rajmohan Rajaraman,David Stalfa

We study flow scheduling under node capacity constraints. We are given capacitated nodes and an online sequence of jobs, each with a release time and a demand to be routed between two nodes. A schedule specifies which jobs are routed in each step, guaranteeing that the total demand on a node in any step is at most its capacity. A key metric in this scenario is response time: the time between a job's release and its completion. Prior work shows that no un-augmented algorithm is competitive for average response time, and that a constant-factor competitive ratio is achievable with augmentation exceeding 2 (Dinitz-Moseley, Infocom 2020). For maximum response time, the best known result is a 2-competitive algorithm with augmentation 4 (Jahanjou et al., SPAA 2020). We improve these bounds under various response time objectives. We show that, without resource augmentation, the best competitive ratio for maximum response time is $\Omega(n)$, where $n$ is the number of nodes. Our Proportional Allocation algorithm uses $(1+\varepsilon)$ resource augmentation to achieve a $(1/\varepsilon)$-competitive ratio in the setting with general demands and capacities, and splittable jobs. Our Batch Decomposition algorithm is $2$-competitive (resp., optimal) for maximum response time using resource augmentation 2 (resp., 4) in the setting with unit demands and capacities, and unsplittable jobs. We also derive bounds for the simultaneous approximation of average and maximum response time metrics.

Optimizer · Robustness · LASSO regression · MoDELS · Value function

November 12, 2021

Sensitivity analysis of Wasserstein distributionally robust optimization problems

Daniel Bartl,Samuel Drapeau,Jan Obloj,Johannes Wiesel

from arxiv, Forthcoming in "Proceedings of the Royal Society A"

We consider the sensitivity of a generic stochastic optimization problem to model uncertainty. We take a non-parametric approach and capture model uncertainty using Wasserstein balls around the postulated model. We provide explicit formulae for the first-order correction to both the value function and the optimizer and further extend our results to optimization under linear constraints. We present applications to statistics, machine learning, mathematical finance and uncertainty quantification. In particular, we provide an explicit first-order approximation for square-root LASSO regression coefficients and deduce coefficient shrinkage compared to the ordinary least squares regression. We consider robustness of call option pricing and deduce a new Black-Scholes sensitivity, a non-parametric version of the so-called Vega. We also compute sensitivities of optimized certainty equivalents in finance and propose measures to quantify robustness of neural networks to adversarial examples.

Statistic · Optimizer · Random search · Inference · Fisher information matrix

November 11, 2021

Online Statistical Inference for Stochastic Optimization via Kiefer-Wolfowitz Methods

Xi Chen,Zehua Lai,He Li,Yichen Zhang

This paper investigates the problem of online statistical inference of model parameters in stochastic optimization problems via the Kiefer-Wolfowitz algorithm with random search directions. We first present the asymptotic distribution for the Polyak-Ruppert-averaging type Kiefer-Wolfowitz (AKW) estimators, whose asymptotic covariance matrices depend on the function-value query complexity and the distribution of search directions. The distributional result reflects the trade-off between statistical efficiency and function query complexity. We further analyze the choices of random search directions to minimize the asymptotic covariance matrix, and conclude that the optimal search direction depends on the optimality criteria with respect to different summary statistics of the Fisher information matrix. Based on the asymptotic distribution result, we conduct online statistical inference by providing two construction procedures of valid confidence intervals. We provide numerical experiments verifying our theoretical results and the practical effectiveness of the procedures.
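
The estimator described above can be sketched as follows: each iteration queries the sampled loss twice, forms a finite-difference gradient estimate along a random direction, and keeps a running Polyak-Ruppert average of the iterates. The toy loss, the Gaussian search directions, the step-size and spacing schedules, and all names are assumptions made for illustration; the paper's confidence-interval constructions are not shown.

```python
import numpy as np


def akw_estimate(loss, draw_sample, theta0, num_steps, rng,
                 step_scale=0.3, step_decay=0.6, fd_scale=0.1, fd_decay=0.25):
    """Averaged Kiefer-Wolfowitz sketch with two function queries per step."""
    theta = np.asarray(theta0, dtype=float)
    theta_bar = np.zeros_like(theta)
    for n in range(1, num_steps + 1):
        eta = step_scale * n ** (-step_decay)   # decaying step size (assumed schedule)
        h = fd_scale * n ** (-fd_decay)         # decaying finite-difference spacing
        v = rng.standard_normal(theta.size)     # random search direction
        sample = draw_sample(rng)               # same sample for both queries
        grad_est = (loss(theta + h * v, sample) - loss(theta, sample)) / h * v
        theta = theta - eta * grad_est
        theta_bar += (theta - theta_bar) / n    # running average of theta_1..theta_n
    return theta_bar


target = np.array([1.0, -2.0])                  # hypothetical true parameter


def toy_loss(theta, sample):
    """Quadratic loss with a random linear perturbation (noisy gradient)."""
    return 0.5 * np.sum((theta - target) ** 2) + sample @ theta


rng = np.random.default_rng(0)
estimate = akw_estimate(
    toy_loss,
    lambda g: 0.1 * g.standard_normal(2),
    theta0=np.zeros(2),
    num_steps=20000,
    rng=rng,
)
print(estimate)
```

Only function values are queried, never gradients, which is the setting in which the trade-off between query complexity and statistical efficiency arises.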

Optimizer · Reducible · Approximation · Controller · Principle

June 29, 2020

Differential Dynamic Programming Neural Optimizer

Guan-Horng Liu,Tianrong Chen,Evangelos A. Theodorou

The interpretation of Deep Neural Network (DNN) training as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we make an attempt along this line by reformulating the training procedure from the trajectory optimization perspective. We first show that the most widely used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that can accept batch optimization for training feedforward networks, while integrating naturally with the recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies which improve the convergence rate and reduce sensitivity to hyper-parameters over existing methods. We show that the algorithm is competitive against state-of-the-art first- and second-order methods. Our work opens up new avenues for principled algorithmic design built upon optimal control theory.

GANs · Identifiable · Optimizer · Attention mechanism · MoDELS

November 29, 2018

On the Implicit Assumptions of GANs

Ke Li,Jitendra Malik

from arxiv, 8 pages

Generative adversarial nets (GANs) have generated a lot of excitement. Despite their popularity, they exhibit a number of well-documented issues in practice, which apparently contradict theoretical guarantees. A number of enlightening papers have pointed out that these issues arise from unjustified assumptions that are commonly made, but the message seems to have been lost amid the optimism of recent years. We believe the identified problems deserve more attention, and highlight the implications on both the properties of GANs and the trajectory of research on probabilistic models. We recently proposed an alternative method that sidesteps these problems.

Vector space · Inference · Estimation/estimator · Networking · Learner

February 27, 2018

ADMM-based Networked Stochastic Variational Inference

Hamza Anwar,Quanyan Zhu

from arxiv, to be submitted for publishing

Owing to the recent advances in "Big Data" modeling and prediction tasks, variational Bayesian estimation has gained popularity due to its ability to provide exact solutions to approximate posteriors. One key technique for approximate inference is stochastic variational inference (SVI). SVI poses variational inference as a stochastic optimization problem and solves it iteratively using noisy gradient estimates. It aims to handle massive data for predictive and classification tasks by applying complex Bayesian models that have observed as well as latent variables. This paper aims to decentralize SVI, allowing parallel computation, secure learning and robustness benefits. We use the Alternating Direction Method of Multipliers (ADMM) in a top-down setting to develop a distributed SVI algorithm such that independent learners running inference algorithms only need to share the estimated model parameters instead of their private datasets. Our work extends the distributed SVI-ADMM algorithm that we first propose to an ADMM-based networked SVI algorithm in which the learners not only work distributively but also share information according to the rules of a graph by which they form a network. This kind of work lies under the umbrella of `deep learning over networks', and we verify our algorithm on a topic-modeling problem for a corpus of Wikipedia articles. We illustrate the results on the latent Dirichlet allocation (LDA) topic model in large document classification, compare performance with the centralized algorithm, and use numerical experiments to corroborate the analytical results.