We investigate a family of bilevel imaging learning problems where the lower-level instance corresponds to a convex variational model involving first- and second-order nonsmooth sparsity-based regularizers. By using geometric properties of the primal-dual reformulation of the lower-level problem and introducing suitable auxiliary variables, we are able to reformulate the original bilevel problems as Mathematical Programs with Complementarity Constraints (MPCC). For the latter, we prove tight constraint qualification conditions (MPCC-RCPLD and partial MPCC-LICQ) and derive Mordukhovich (M-) and Strong (S-) stationarity conditions. The stationarity systems for the MPCC also translate into stationarity conditions for the original formulation. Second-order sufficient optimality conditions are derived as well, together with a local uniqueness result for stationary points. The proposed reformulation may be extended to problems in function spaces, leading to MPCCs with constraints on the gradient of the state. The MPCC reformulation also enables the efficient use of available large-scale nonlinear programming solvers, as shown in a companion paper, where different imaging applications are studied.
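For reference, the generic form of an MPCC (standard in the literature, independent of the specific reformulation derived here) is
\[
\min_{x \in \mathbb{R}^n} \; f(x) \quad \text{s.t.} \quad g(x) \le 0, \quad h(x) = 0, \quad 0 \le G(x) \perp H(x) \ge 0,
\]
where $0 \le G(x) \perp H(x) \ge 0$ abbreviates $G(x) \ge 0$, $H(x) \ge 0$, and $G_i(x)\,H_i(x) = 0$ componentwise. The complementarity constraint causes standard constraint qualifications (such as MFCQ) to fail at every feasible point, which is why tailored conditions such as MPCC-RCPLD and MPCC-LICQ, together with the weaker M- and S-stationarity notions, are needed.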
This paper investigates Distributed Hypothesis Testing (DHT), in which a source $\mathbf{X}$ is encoded given that side information $\mathbf{Y}$ is available at the decoder only. Based on the received coded data, the receiver aims to decide between the two hypotheses $H_0$ and $H_1$ related to the joint distribution of $\mathbf{X}$ and $\mathbf{Y}$. While most existing contributions in the DHT literature rely on i.i.d. assumptions, this paper assumes more generic, non-i.i.d., non-stationary, and non-ergodic source models. It relies on information-spectrum tools to provide general formulas for the achievable Type-II error exponent under a constraint on the Type-I error. The achievability proof is based on a quantize-and-binning scheme. It is shown that with the quantize-and-binning approach, the error exponent boils down to a trade-off between a binning error and a decision error, as already observed for i.i.d. sources. The last part of the paper provides error exponents for particular source models, \emph{e.g.}, Gaussian, stationary, and ergodic models.
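To fix notation (this is the standard formulation of the testing problem, not necessarily the paper's exact definitions): for a blocklength-$n$ code with decision rule $\hat{H}$, the Type-I and Type-II error probabilities are
\[
\alpha_n = \Pr\{\hat{H} = H_1 \mid H_0\}, \qquad \beta_n = \Pr\{\hat{H} = H_0 \mid H_1\},
\]
and the quantity of interest is the Type-II error exponent $\liminf_{n \to \infty} -\tfrac{1}{n} \log \beta_n$ achievable subject to $\alpha_n \le \epsilon$.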
Many problems in robotics, such as estimating the state from noisy sensor data or aligning two LiDAR point clouds, can be posed and solved as least-squares problems. Unfortunately, vanilla nonminimal solvers for least-squares problems are notoriously sensitive to outliers. As such, various robust loss functions, such as pseudo-Huber, Cauchy, and Geman-McClure, have been proposed to reduce the sensitivity to outliers. Recently, these loss functions have been generalized into a single loss function that enables the best loss function to be found adaptively based on the distribution of the residuals. However, even with the generalized robust loss function, the problem remains nonconvex, so most nonminimal solvers can only find a local solution given a prior state estimate. The first contribution of this paper is to combine graduated nonconvexity (GNC) with the generalized robust loss function to solve least-squares problems without a prior state estimate and without the need to specify a loss function. Moreover, existing loss functions, including the generalized loss function, are based on Gaussian-like distributions. However, residuals are often defined as the squared norm of a multivariate error and distributed in a Chi-like fashion. The second contribution of this paper is to apply a norm-aware adaptive robust loss function within a GNC framework. This leads to additional robustness when compared with state-of-the-art methods. Simulations and experiments demonstrate that the proposed approach is more robust and yields faster convergence times compared to other GNC formulations.
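To illustrate the mechanism (a minimal sketch, assuming Barron's adaptive loss as the generalized robust loss and a plain IRLS inner solver for a linear least-squares problem; the annealing schedule, the scale `c`, and all function names are illustrative rather than the authors' implementation):

```python
import numpy as np

def barron_weight(r, alpha, c):
    """IRLS weight w(r) = rho'(r)/r for Barron's general robust loss
    rho(r) = (|a-2|/a) * (((r/c)^2 / |a-2| + 1)^(a/2) - 1), with the
    usual limits at a = 2 (quadratic) and a = 0 (Cauchy)."""
    z = (r / c) ** 2
    if np.isclose(alpha, 2.0):
        return np.ones_like(r) / c**2
    if np.isclose(alpha, 0.0):
        return 2.0 / (r**2 + 2.0 * c**2)
    b = abs(alpha - 2.0)
    return (z / b + 1.0) ** (alpha / 2.0 - 1.0) / c**2

def gnc_irls(A, y, c=1.0, alpha_target=-2.0, inner_iters=5):
    """Graduated nonconvexity: start from the convex case alpha = 2 and
    anneal toward the (nonconvex) target loss, re-solving a weighted
    least-squares problem at each stage. No prior estimate is needed."""
    theta = np.linalg.lstsq(A, y, rcond=None)[0]   # alpha = 2 solution
    for alpha in np.linspace(2.0, alpha_target, 9):
        for _ in range(inner_iters):
            w = barron_weight(A @ theta - y, alpha, c)
            theta = np.linalg.solve(A.T @ (w[:, None] * A), A.T @ (w * y))
    return theta
```

Here `alpha_target = -2` corresponds to the Geman-McClure limit of the generalized loss; annealing from the convex quadratic removes the need for an initial guess near the basin of attraction.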
Direct policy optimization in reinforcement learning is usually solved with policy-gradient algorithms, which optimize policy parameters via stochastic gradient ascent. This paper provides a new theoretical interpretation and justification of these algorithms. First, we formulate direct policy optimization in the optimization-by-continuation framework. The latter is a framework for optimizing nonconvex functions in which a sequence of surrogate objective functions, called continuations, is locally optimized. Second, we show that optimizing affine Gaussian policies and performing entropy regularization can be interpreted as implicitly optimizing deterministic policies by continuation. Based on these theoretical results, we argue that exploration in policy-gradient algorithms consists in computing a continuation of the return of the policy at hand, and that the variance of policies should be a history-dependent function adapted to avoid local extrema rather than to maximize the return of the policy.
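To make the continuation viewpoint concrete, here is a toy sketch (not the paper's RL setting): a one-dimensional nonconcave objective is smoothed by Gaussian noise of decreasing scale $\sigma$, mirroring how the variance of a Gaussian policy smooths the return of the underlying deterministic policy; all schedules and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def smoothed_grad(J, theta, sigma, n_samples=64):
    """Monte-Carlo gradient of the smoothed (continuation) objective
    J_sigma(theta) = E_{eps ~ N(0, I)}[J(theta + sigma * eps)],
    using the evolution-strategies estimator with a mean baseline."""
    eps = rng.standard_normal((n_samples, theta.size))
    values = np.array([J(theta + sigma * e) for e in eps])
    return ((values - values.mean())[:, None] * eps).mean(axis=0) / sigma

def ascend_by_continuation(J, theta0, sigmas=(1.0, 0.5, 0.2, 0.05),
                           steps=200, lr=0.05):
    """Locally ascend a sequence of surrogates with decreasing sigma."""
    theta = np.asarray(theta0, dtype=float)
    for sigma in sigmas:            # each sigma defines one continuation
        for _ in range(steps):
            theta = theta + lr * smoothed_grad(J, theta, sigma)
    return theta

# Toy objective: global maximum near theta = 2, narrow local bump near -1.
J = lambda th: float(-0.1 * (th[0] - 2.0) ** 2
                     + 0.5 * np.exp(-8.0 * (th[0] + 1.0) ** 2))
print(ascend_by_continuation(J, np.array([-1.0])))  # escapes toward ~2
```

Started at the local bump, plain gradient ascent stalls; with a large $\sigma$ the bump is averaged away and the iterate moves toward the global maximizer before $\sigma$ is reduced.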
In recent years, graph processing has become an essential class of workloads, with applications in a rapidly growing number of fields. Graph processing typically uses large input sets, often at multi-gigabyte scale, and data-dependent graph traversal methods that exhibit irregular memory access patterns. Recent work demonstrates that, due to the highly irregular memory access patterns of data-dependent graph traversals, state-of-the-art graph-processing workloads spend up to 80% of their total execution time waiting for memory accesses to be served by the DRAM. The vast disparity between Last-Level Cache (LLC) and main memory latencies is a problem that has been addressed for years in computer architecture, and cache replacement policies are one of the prevailing approaches to mitigating this performance gap between modern CPUs and DRAM. In this work, we characterize the challenges posed by graph-processing workloads and evaluate the most relevant cache replacement policies.
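As a deliberately minimal illustration of what a cache replacement policy is, here is a textbook LRU simulator driven by a synthetic random access stream standing in for a data-dependent traversal; the production policies evaluated in such studies (e.g., RRIP-family policies) are considerably more sophisticated.

```python
import random
from collections import OrderedDict

class LRUCache:
    """Minimal LRU replacement policy: evict the least-recently-used line."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()
        self.hits = self.misses = 0

    def access(self, addr):
        if addr in self.lines:
            self.lines.move_to_end(addr)        # refresh recency on a hit
            self.hits += 1
        else:
            self.misses += 1
            if len(self.lines) >= self.capacity:
                self.lines.popitem(last=False)  # evict the LRU line
            self.lines[addr] = True

# Irregular, data-dependent accesses (as in graph traversals) give
# recency little to exploit, so the miss rate stays high.
random.seed(0)
cache = LRUCache(capacity=1024)
for _ in range(100_000):
    cache.access(random.randrange(1_000_000))
print(f"miss rate: {cache.misses / (cache.hits + cache.misses):.2%}")
```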
Distributed computing is an emerging and efficient technique for supporting various intelligent services, such as large-scale machine learning. However, privacy leakage and random delays from straggling servers pose significant challenges. To address these issues, coded computing, a promising solution that combines coding theory with distributed computing, recovers computation tasks from the results of a subset of workers. In this paper, we propose the adaptive privacy-preserving coded computing (APCC) strategy, which can adaptively provide accurate or approximated results according to the form of the computation functions, so as to suit diverse types of computation tasks. We prove that APCC achieves complete data privacy preservation and demonstrate its optimality in terms of encoding rate, defined as the ratio between the computation loads of tasks before and after encoding. To further alleviate the straggling effect and reduce delay, we integrate hierarchical task partitioning and task cancellation into the coding design of APCC. The corresponding partitioning problems are formulated as mixed-integer nonlinear programming (MINLP) problems with the objective of minimizing task completion delay. We propose a low-complexity maximum value descent (MVD) algorithm to optimally solve these problems. Simulation results show that APCC can reduce task completion delay by at least 42.9% compared to other state-of-the-art benchmarks.
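To illustrate the coded-computing principle that APCC builds on, here is a minimal sketch of classical MDS-coded matrix-vector multiplication, in which any $k$ of $n$ worker results suffice to recover the product; APCC's adaptive accuracy, privacy masking, and hierarchical partitioning are not shown, and all names are illustrative.

```python
import numpy as np

def encode(A, n, k):
    """Split A into k row-blocks and form n coded blocks with a
    Vandermonde generator, so any k coded results suffice to decode."""
    blocks = np.array(np.split(A, k))                         # (k, m/k, d)
    G = np.vander(np.arange(1.0, n + 1), k, increasing=True)  # (n, k)
    return np.einsum('nk,krd->nrd', G, blocks), G

def decode(results, ids, G):
    """Recover the k uncoded block-products from any k worker results."""
    return np.linalg.inv(G[ids]) @ np.array(results)

m, d, n, k = 8, 5, 5, 4
A, x = np.random.randn(m, d), np.random.randn(d)
coded, G = encode(A, n, k)
# Each worker i computes coded[i] @ x; suppose worker 2 straggles.
done = [i for i in range(n) if i != 2][:k]
results = [coded[i] @ x for i in done]
Ax = decode(results, done, G).reshape(-1)
assert np.allclose(Ax, A @ x)   # exact recovery despite the straggler
```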
In this paper, we study a novel episodic risk-sensitive Reinforcement Learning (RL) problem, named Iterated CVaR RL, which aims to maximize the tail of the reward-to-go at each step, and focuses on tightly controlling the risk of getting into catastrophic situations at each stage. This formulation is applicable to real-world tasks that demand strong risk avoidance throughout the decision process, such as autonomous driving, clinical treatment planning, and robotics. We investigate two performance metrics under Iterated CVaR RL, i.e., Regret Minimization and Best Policy Identification. For both metrics, we design efficient algorithms ICVaR-RM and ICVaR-BPI, respectively, and provide nearly matching upper and lower bounds with respect to the number of episodes $K$. We also investigate an interesting limiting case of Iterated CVaR RL, called Worst Path RL, where the objective becomes to maximize the minimum possible cumulative reward. For Worst Path RL, we propose an efficient algorithm with constant upper and lower bounds. Finally, our techniques for bounding the change of CVaR due to the value function shift and decomposing the regret via a distorted visitation distribution are novel, and can find applications in other risk-sensitive RL problems.
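For concreteness, the following sketch shows an iterated CVaR Bellman backup on a known finite MDP, as we read the formulation; the paper's algorithms additionally handle estimation from samples and exploration, which are omitted here.

```python
import numpy as np

def cvar(values, probs, alpha):
    """CVaR_alpha of a discrete random variable: the expectation over
    the worst alpha-fraction of outcomes (lower tail)."""
    order = np.argsort(values)
    v, p = np.asarray(values, float)[order], np.asarray(probs, float)[order]
    capped = np.minimum(np.cumsum(p), alpha)   # tail mass, capped at alpha
    mass = np.diff(np.concatenate(([0.0], capped)))
    return float(v @ mass / alpha)

def iterated_cvar_backup(P, R, V, alpha):
    """One iterated-CVaR Bellman backup on a known finite MDP:
    Q(s, a) = R(s, a) + CVaR_alpha_{s' ~ P(.|s, a)}[V(s')]."""
    S, A = R.shape
    Q = np.array([[R[s, a] + cvar(V, P[s, a], alpha) for a in range(A)]
                  for s in range(S)])
    return Q.max(axis=1)

# Two states, two actions: action 1 has higher mean but a heavy lower tail.
P = np.array([[[1.0, 0.0], [0.7, 0.3]]] * 2)   # P[s, a, s']
R = np.array([[0.5, 1.0]] * 2)
print(iterated_cvar_backup(P, R, np.array([0.0, -10.0]), alpha=0.2))
# The backup picks the safe action: CVaR_0.2 heavily penalizes the tail.
```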
The Laplace-Beltrami problem on closed surfaces embedded in three dimensions arises in many areas of physics, including molecular dynamics (surface diffusion), electromagnetics (harmonic vector fields), and fluid dynamics (vesicle deformation). Using classical potential theory, the Laplace-Beltrami operator can be pre-/post-conditioned with an integral operator whose kernel is translation invariant, resulting in well-conditioned Fredholm integral equations of the second kind. These equations have the standard~$1/r$ kernel from potential theory, and therefore can be solved rapidly and accurately using a combination of fast multipole methods (FMMs) and high-order quadrature corrections. In this work we detail such a scheme, presenting two alternative integral formulations of the Laplace-Beltrami problem, each of whose solutions can be obtained via FMM acceleration. We then present several applications of the solvers, focusing on the computation of what are known as harmonic vector fields, which are relevant to many applications in electromagnetics. A battery of numerical results is presented for each application, detailing the performance of the solver in various geometries.
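The computational kernel is the familiar single-layer potential; below is a minimal sketch of the direct $O(N^2)$ sum that an FMM accelerates, assuming surface quadrature nodes and weights are given (near-singular quadrature corrections, which such solvers require, are not shown).

```python
import numpy as np

def single_layer(targets, sources, density, weights):
    """Direct O(N^2) evaluation of the Laplace single-layer potential
    u(x_i) = sum_j w_j * sigma_j / (4 pi |x_i - y_j|).
    An FMM evaluates the same sum to controllable accuracy in
    O(N) or O(N log N) time, which is what makes these second-kind
    formulations fast in practice."""
    diff = targets[:, None, :] - sources[None, :, :]
    r = np.linalg.norm(diff, axis=-1)
    return (density * weights / (4.0 * np.pi * r)).sum(axis=1)

# Sources on the unit sphere; targets slightly offset to avoid r = 0.
pts = np.random.randn(200, 3)
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
u = single_layer(1.1 * pts, pts, np.ones(200), np.full(200, 4 * np.pi / 200))
```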
Numerical methods for the optimal feedback control of high-dimensional dynamical systems typically suffer from the curse of dimensionality. In this work, we devise a mesh-free, data-based approximation method for the value function of optimal control problems, which partially mitigates the dimensionality problem. The method is based on a greedy Hermite kernel interpolation scheme and incorporates context knowledge through its structure. In particular, the value function surrogate is enforced to be zero at the target state, to be non-negative, and to be constructed as a correction of a linearized model. The algorithm is formulated in a matrix-free way, which circumvents the large-matrix problem of multivariate Hermite interpolation. For finite time horizons, we prove both convergence of the surrogate to the value function and convergence of the surrogate-controlled dynamical system to the optimally controlled one. Experiments support the effectiveness of the scheme, using, among others, a new academic model with scalable dimension and an explicitly given value function, which may also be useful to the community for validating other optimal control approaches.
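To show what kernel Hermite interpolation means in the simplest setting, here is a one-dimensional, dense-matrix sketch with a Gaussian kernel that matches both values and derivatives at the nodes; the paper's scheme is multivariate, greedy, and matrix-free, and additionally enforces the structural properties listed above, none of which are reproduced here.

```python
import numpy as np

def hermite_interpolant_1d(x, f, df, ell=0.5):
    """Kernel Hermite interpolation in 1D with a Gaussian kernel:
    matches values f and derivatives df at the nodes x via the
    block system [[k, dk/dy], [dk/dx, d2k/dxdy]]."""
    d = x[:, None] - x[None, :]
    k = np.exp(-d**2 / (2 * ell**2))
    ky = (d / ell**2) * k                    # d/dy k(x, y)
    kx = (-d / ell**2) * k                   # d/dx k(x, y)
    kxy = (1 / ell**2 - d**2 / ell**4) * k   # d2/dxdy k(x, y)
    K = np.block([[k, ky], [kx, kxy]])
    coef = np.linalg.solve(K, np.concatenate([f, df]))
    a, b = coef[:x.size], coef[x.size:]

    def s(t):
        dt = t - x
        kt = np.exp(-dt**2 / (2 * ell**2))
        return a @ kt + b @ ((dt / ell**2) * kt)
    return s

# Interpolate v(x) = x^2 from value and gradient data at a few nodes.
x = np.linspace(-1, 1, 5)
s = hermite_interpolant_1d(x, x**2, 2 * x)
print(s(0.3), 0.3**2)
```

The dense block system above is exactly the "large-matrix problem" that grows quadratically with dimension and node count, motivating the matrix-free greedy construction.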
Adam-type algorithms have become a preferred choice for optimisation in the deep learning setting; however, despite their success, their convergence is still not well understood. To this end, we introduce a unified framework for Adam-type algorithms, called UAdam. It is equipped with a general form of the second-order moment, which makes it possible to include Adam and its variants, such as NAdam, AMSGrad, AdaBound, AdaFom, and Adan, as special cases. This is supported by a rigorous convergence analysis of UAdam in the non-convex stochastic setting, showing that UAdam converges to a neighborhood of stationary points at a rate of $\mathcal{O}(1/T)$. Furthermore, the size of the neighborhood decreases as $\beta$ increases. Importantly, our analysis only requires the first-order momentum factor to be close enough to 1, without any restrictions on the second-order momentum factor. Our theoretical results also show that vanilla Adam can converge by selecting appropriate hyperparameters, which provides a theoretical guarantee for the analysis, applications, and further developments of the whole class of Adam-type algorithms.
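A minimal sketch of the template idea: keep the usual first-order momentum and make the second-order moment a pluggable rule. Recovering each named variant exactly requires the paper's general recursion; the two rules below (and the folding of AMSGrad's two states into one) are simplifications for illustration.

```python
import numpy as np

def uadam(grad_fn, theta0, second_moment, lr=1e-3, beta1=0.9, steps=1000):
    """Adam-type template: fixed EMA first moment, pluggable second moment.
    (Bias correction is omitted for brevity.)"""
    theta = np.asarray(theta0, dtype=float)
    m = np.zeros_like(theta)
    v = np.zeros_like(theta)
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = second_moment(v, g, t)
        theta = theta - lr * m / (np.sqrt(v) + 1e-8)
    return theta

adam = lambda v, g, t, b2=0.999: b2 * v + (1 - b2) * g**2
# AMSGrad-like monotone rule; true AMSGrad keeps v and max(v) separately.
amsgrad_like = lambda v, g, t, b2=0.999: np.maximum(v, b2 * v + (1 - b2) * g**2)

# Minimize ||theta||^2 from noisy gradients; the iterate approaches 0.
rng = np.random.default_rng(0)
grad = lambda th: 2 * th + 0.1 * rng.standard_normal(th.shape)
print(uadam(grad, np.ones(3), adam))
```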
Non-convex optimization is ubiquitous in modern machine learning. Researchers devise non-convex objective functions and optimize them using off-the-shelf optimizers such as stochastic gradient descent and its variants, which leverage the local geometry and update iteratively. Even though solving non-convex functions is NP-hard in the worst case, optimization quality in practice is often not an issue: optimizers are largely believed to find approximate global minima. Researchers hypothesize a unified explanation for this intriguing phenomenon: most of the local minima of the practically used objectives are approximately global minima. We rigorously formalize this hypothesis for concrete instances of machine learning problems.