With the commercial availability of mixed precision hardware, mixed precision GMRES-based iterative refinement schemes have emerged as popular approaches for solving sparse linear systems. Existing analyses of these approaches, however, are all based on using a full LU factorization to construct preconditioners for use within GMRES in each refinement step. In practical applications, inexact preconditioning techniques, such as incomplete LU or sparse approximate inverses, are often used for performance reasons. In this work, we investigate the use of sparse approximate inverse preconditioners within GMRES-based iterative refinement. We analyze the computation of sparse approximate inverses in finite precision and derive constraints under which the user-specified stopping criteria will be satisfied. We then analyze the behavior of and convergence constraints for a GMRES-based iterative refinement scheme that uses sparse approximate inverse preconditioning, which we call SPAI-GMRES-IR. Our numerical experiments confirm that in some cases, sparse approximate inverse preconditioning can have an advantage over using a full LU factorization.
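The refinement loop that such schemes build on can be sketched in a few lines. The example below is only an illustration under loud assumptions: it replaces the SPAI-preconditioned GMRES inner solver with a plain Gaussian elimination carried out in emulated single precision, and the helper names `f32`, `solve_low`, and `refine` are hypothetical. It shows the generic mixed precision pattern (low-precision solve, double-precision residual, correction), not the SPAI-GMRES-IR algorithm itself.

```python
import struct

def f32(v):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]

def solve_low(A, b):
    """Gaussian elimination with partial pivoting, every result rounded to float32.
    Stand-in for the low-precision inner solver (assumption, not GMRES)."""
    n = len(b)
    M = [[f32(v) for v in row] + [f32(rhs)] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            m = f32(M[i][k] / M[k][k])
            for j in range(k, n + 1):
                M[i][j] = f32(M[i][j] - f32(m * M[k][j]))
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n]
        for j in range(i + 1, n):
            s = f32(s - f32(M[i][j] * x[j]))
        x[i] = f32(s / M[i][i])
    return x

def refine(A, b, iters=5):
    """Iterative refinement: residuals in double, corrections from the low-precision solve.
    A real code would factor A once and reuse the factors; this sketch re-solves each time."""
    x = solve_low(A, b)
    for _ in range(iters):
        r = [bi - sum(a * xj for a, xj in zip(row, x)) for row, bi in zip(A, b)]
        d = solve_low(A, r)
        x = [xi + di for xi, di in zip(x, d)]
    return x

A = [[4.0, 1.0, 0.1], [1.0, 3.0, 0.2], [0.1, 0.2, 5.0]]
x_true = [0.1, -0.3, 0.7]
b = [sum(a * xj for a, xj in zip(row, x_true)) for row in A]

err_low = max(abs(xi - ti) for xi, ti in zip(solve_low(A, b), x_true))
err_refined = max(abs(xi - ti) for xi, ti in zip(refine(A, b), x_true))
```

On this small, well-conditioned system the refined error should fall far below what the single-precision solve alone can reach, which is the effect the convergence analysis makes precise for inexact preconditioners.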

Related content

Precision / Accuracy


Markov chain · Approximation · Graph · Mixing · Uniform sampling ·

April 20, 2022

Approximate Sampling and Counting of Graphs with Near-$P$-stable Degree Intervals

Péter L. Erdős,Tamás Róbert Mezei,István Miklós

from arxiv, 23 pages

The approximate uniform sampling of graph realizations with a given degree sequence is an everyday task in many social science, computer science, and engineering projects. One approach is to use Markov chains. The best currently available result about the well-studied switch Markov chain is that it is rapidly mixing on P-stable degree sequences (see DOI:10.1016/j.ejc.2021.103421). The switch Markov chain does not change the degree sequence. However, there are cases where degree intervals are specified rather than a single degree sequence. (A natural scenario where this problem arises is in hypothesis testing on social networks that are only partially observed.) In 2018, Rechner, Strowick, and Müller-Hannemann introduced the notion of the degree interval Markov chain, which uses three (separately well-studied) local operations (switch, hinge-flip, and toggle) and operates on degree sequence realizations where any two sequences under scrutiny have very small coordinate-wise distance. Recently, Amanatidis and Kleer published a beautiful paper (arXiv:2110.09068) showing that the degree interval Markov chain is rapidly mixing if the sequences come from a system of very thin intervals that are centered not far from a regular degree sequence. In this paper we substantially extend their result, showing that the degree interval Markov chain is rapidly mixing if the intervals are centered at P-stable degree sequences.
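The switch operation mentioned above is easy to state in code. The following is a minimal sketch of a lazy switch step on a simple undirected graph, assuming edges are stored as sorted vertex pairs; the function names `switch_step` and `degrees` are hypothetical, and the proposal distribution here is one plausible choice rather than the exact chain analyzed in the paper. The hinge-flip and toggle operations of the degree interval chain are not shown.

```python
import random
from collections import Counter

def norm(u, v):
    """Store each undirected edge as a sorted pair."""
    return (u, v) if u < v else (v, u)

def degrees(edges):
    """Degree of every vertex that appears in the edge set."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def switch_step(edges, rng):
    """Pick two edges {a,b}, {c,d} and try to replace them with {a,d}, {c,b}.
    The move is rejected (the chain stays put) if it would create a loop or a multi-edge."""
    (a, b), (c, d) = rng.sample(sorted(edges), 2)
    if rng.random() < 0.5:          # choose one of the two possible orientations
        c, d = d, c
    if len({a, b, c, d}) < 4:       # edges share a vertex: would create a loop
        return edges
    new1, new2 = norm(a, d), norm(c, b)
    if new1 in edges or new2 in edges:   # would create a multi-edge
        return edges
    return (edges - {norm(a, b), norm(c, d)}) | {new1, new2}

rng = random.Random(0)
start = {norm(i, (i + 1) % 6) for i in range(6)}   # a 6-cycle, every degree equals 2
graph = set(start)
for _ in range(1000):
    graph = switch_step(graph, rng)
```

Every accepted move removes one edge at each of the four vertices and adds one back, so the degree sequence is invariant, which is why the switch chain alone cannot move between different sequences inside a degree interval.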

Local minimum · Minimum · Saddle point · Minimizer · Nonconvex ·

April 20, 2022

Faster Perturbed Stochastic Gradient Methods for Finding Local Minima

Zixiang Chen,Dongruo Zhou,Quanquan Gu

from arxiv, 29 pages, 1 figure, 1 table. In ALT 2022

Escaping from saddle points and finding local minima is a central problem in nonconvex optimization. Perturbed gradient methods are perhaps the simplest approach to this problem. However, to find $(\epsilon, \sqrt{\epsilon})$-approximate local minima, the best existing stochastic gradient complexity for this type of algorithm is $\tilde O(\epsilon^{-3.5})$, which is not optimal. In this paper, we propose LENA (Last stEp shriNkAge), a faster perturbed stochastic gradient framework for finding local minima. We show that LENA with stochastic gradient estimators such as SARAH/SPIDER and STORM can find $(\epsilon, \epsilon_{H})$-approximate local minima within $\tilde O(\epsilon^{-3} + \epsilon_{H}^{-6})$ stochastic gradient evaluations (or $\tilde O(\epsilon^{-3})$ when $\epsilon_H = \sqrt{\epsilon}$). The core idea of our framework is a step-size shrinkage scheme to control the average movement of the iterates, which leads to faster convergence to the local minima.
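To illustrate why perturbation helps at a saddle point, here is a minimal sketch of generic perturbed gradient descent on a toy function. It is not LENA: it uses exact gradients, has no stochastic estimator and no step-size shrinkage, and the test function, parameter values, and the name `perturbed_gd` are all assumptions chosen for illustration.

```python
import math
import random

def grad(x, y):
    """Gradient of f(x, y) = 0.5*x**2 + 0.25*y**4 - 0.5*y**2.
    f has a saddle point at (0, 0) and local minima at (0, 1) and (0, -1)."""
    return x, y ** 3 - y

def perturbed_gd(x, y, step=0.1, grad_tol=1e-3, radius=1e-2, iters=500, seed=0):
    """Gradient descent that adds a small random perturbation whenever the gradient is tiny."""
    rng = random.Random(seed)
    for _ in range(iters):
        gx, gy = grad(x, y)
        if math.hypot(gx, gy) < grad_tol:
            # Near a stationary point: jitter the iterate so a saddle can be escaped.
            x += rng.uniform(-radius, radius)
            y += rng.uniform(-radius, radius)
            gx, gy = grad(x, y)
        x -= step * gx
        y -= step * gy
    return x, y

stuck = perturbed_gd(0.0, 0.0, radius=0.0)   # no perturbation: stays at the saddle
escaped = perturbed_gd(0.0, 0.0)             # with perturbation: reaches a minimum
```

Started exactly at the saddle, the unperturbed run never moves because the gradient is zero there, while the perturbed run drifts along the negative-curvature direction and settles near one of the two minima.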

Approximation · Inference · Graph · Estimation/Estimator · Mutually independent ·

April 19, 2022

Graph-based Approximate Message Passing Iterations

Cédric Gerbelot,Raphaël Berthier

from arxiv, 59 pages, 24 main, 35 appendix

Approximate message passing (AMP) algorithms have become an important element of high-dimensional statistical inference, mostly due to their adaptability and concentration properties, the state evolution (SE) equations. This is demonstrated by the growing number of new iterations proposed for increasingly complex problems, ranging from multi-layer inference to low-rank matrix estimation with elaborate priors. In this paper, we address the following questions: is there a structure underlying all AMP iterations that unifies them in a common framework? Can we use such a structure to give a modular proof of state evolution equations, adaptable to new AMP iterations without reproducing the full argument each time? We propose an answer to both questions, showing that AMP instances can be generically indexed by an oriented graph. This makes it possible to give a unified interpretation of these iterations, independent of the problem they solve, and a way of composing them arbitrarily. We then show that all AMP iterations indexed by such a graph admit rigorous SE equations, extending the reach of previous proofs, and proving a number of recent heuristic derivations of those equations. Our proof naturally includes non-separable functions and we show how existing refinements, such as spatial coupling or matrix-valued variables, can be combined with our framework.

Inference · Estimation/Estimator · Optimizer · Markov chain Monte Carlo · Markov chain ·

April 18, 2022

Reversible Gromov-Monge Sampler for Simulation-Based Inference

YoonHaeng Hur,Wenxuan Guo,Tengyuan Liang

from arxiv, 49 pages, 7 figures

This paper introduces a new simulation-based inference procedure to model and sample from multi-dimensional probability distributions given access to i.i.d. samples, circumventing the usual approaches of explicitly modeling the density function or designing Markov chain Monte Carlo. Motivated by the seminal work on distance and isomorphism between metric measure spaces, we propose a new notion called the Reversible Gromov-Monge (RGM) distance and study how RGM can be used to design new transform samplers to perform simulation-based inference. Our RGM sampler can also estimate optimal alignments between two heterogeneous metric measure spaces $(\mathcal{X}, \mu, c_{\mathcal{X}})$ and $(\mathcal{Y}, \nu, c_{\mathcal{Y}})$ from empirical data sets, with estimated maps that approximately push forward one measure $\mu$ to the other $\nu$, and vice versa. Analytic properties of the RGM distance are derived; statistical rate of convergence, representation, and optimization questions regarding the induced sampler are studied. Synthetic and real-world examples showcasing the effectiveness of the RGM sampler are also demonstrated.

Estimation/Estimator · SOTA · MoDELS · Better · Performer ·

April 18, 2022

Deep Equilibrium Optical Flow Estimation

Shaojie Bai,Zhengyang Geng,Yash Savani,J. Zico Kolter

from arxiv, CVPR 2022

Many recent state-of-the-art (SOTA) optical flow models use finite-step recurrent update operations to emulate traditional algorithms by encouraging iterative refinements toward a stable flow estimation. However, these RNNs impose large computation and memory overheads, and are not directly trained to model such stable estimation. They can converge poorly and thereby suffer from performance degradation. To combat these drawbacks, we propose deep equilibrium (DEQ) flow estimators, an approach that directly solves for the flow as the infinite-level fixed point of an implicit layer (using any black-box solver), and differentiates through this fixed point analytically (thus requiring $O(1)$ training memory). This implicit-depth approach is not predicated on any specific model, and thus can be applied to a wide range of SOTA flow estimation model designs. The use of these DEQ flow estimators allows us to compute the flow faster using, e.g., fixed-point reuse and inexact gradients, consumes $4\sim6\times$ less training memory than the recurrent counterpart, and achieves better results with the same computation budget. In addition, we propose a novel, sparse fixed-point correction scheme to stabilize our DEQ flow estimators, which addresses a longstanding challenge for DEQ models in general. We test our approach in various realistic settings and show that it improves SOTA methods on Sintel and KITTI datasets with substantially better computational and memory efficiency.

Bandit · PDE · Optimizer · Bayes risk · Normalized ·

April 18, 2022

Risk and optimal policies in bandit experiments

Karun Adusumilli

We provide a decision-theoretic analysis of bandit experiments. The setting corresponds to a dynamic programming problem, but solving this directly is typically infeasible. Working within the framework of diffusion asymptotics, we define suitable notions of asymptotic Bayes and minimax risk for bandit experiments. For normally distributed rewards, the minimal Bayes risk can be characterized as the solution to a nonlinear second-order partial differential equation (PDE). Using a limit of experiments approach, we show that this PDE characterization also holds asymptotically under both parametric and non-parametric distributions of the rewards. The approach further describes the state variables to which it is asymptotically sufficient to restrict attention, and therefore suggests a practical strategy for dimension reduction. The upshot is that we can approximate the dynamic programming problem defining the bandit experiment with a PDE which can be efficiently solved using sparse matrix routines. We derive the optimal Bayes and minimax policies from the numerical solutions to these equations. The proposed policies substantially dominate existing methods such as Thompson sampling. The framework also allows for substantial generalizations to the bandit problem such as time discounting and pure exploration motives.

Estimation/Estimator · FPG · PG · Estimation error · Value function ·

April 15, 2022

Optimal Estimation of Off-Policy Policy Gradient via Double Fitted Iteration

Chengzhuo Ni,Ruiqi Zhang,Xiang Ji,Xuezhou Zhang,Mengdi Wang

Policy gradient (PG) estimation becomes a challenge when we are not allowed to sample with the target policy but only have access to a dataset generated by some unknown behavior policy. Conventional methods for off-policy PG estimation often suffer from either significant bias or exponentially large variance. In this paper, we propose the double Fitted PG estimation (FPG) algorithm. FPG can work with an arbitrary policy parameterization, assuming access to a Bellman-complete value function class. In the case of linear value function approximation, we provide a tight finite-sample upper bound on the policy gradient estimation error that is governed by the amount of distribution mismatch measured in feature space. We also establish the asymptotic normality of the FPG estimation error with a precise covariance characterization, which is further shown to be statistically optimal with a matching Cramér-Rao lower bound. Empirically, we evaluate the performance of FPG on both policy gradient estimation and policy optimization, using either softmax tabular or ReLU policy networks. Under various metrics, our results show that FPG significantly outperforms existing off-policy PG estimation methods based on importance sampling and variance reduction techniques.

Performer · Diversity · Approximation · state-of-the-art · Learned ·

April 15, 2022

Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning

Bryon Tjanaka,Matthew C. Fontaine,Julian Togelius,Stefanos Nikolaidis

from arxiv, Published as a conference paper at the 2022 Genetic and Evolutionary Computation Conference (GECCO '22); Online article available at //dqd-rl.github.io

Consider the problem of training robustly capable agents. One approach is to generate a diverse collection of agent policies. Training can then be viewed as a quality diversity (QD) optimization problem, where we search for a collection of performant policies that are diverse with respect to quantified behavior. Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available. However, agent policies typically assume that the environment is not differentiable. To apply DQD algorithms to training agent policies, we must approximate gradients for performance and behavior. We propose two variants of the current state-of-the-art DQD algorithm that compute gradients via approximation methods common in reinforcement learning (RL). We evaluate our approach on four simulated locomotion tasks. One variant achieves results comparable to the current state-of-the-art in combining QD and RL, while the other performs comparably in two locomotion tasks. These results provide insight into the limitations of current DQD algorithms in domains where gradients must be approximated. Source code is available at //github.com/icaros-usc/dqd-rl

Optimizer · Performer · Learned · Deep Q-learning · Reinforcement learning ·

April 15, 2022

A Reinforcement Learning Approach to Parameter Selection for Distributed Optimal Power Flow

Sihan Zeng,Alyssa Kody,Youngdae Kim,Kibaek Kim,Daniel K. Molzahn

With the increasing penetration of distributed energy resources, distributed optimization algorithms have attracted significant attention for power systems applications due to their potential for superior scalability, privacy, and robustness to a single point-of-failure. The Alternating Direction Method of Multipliers (ADMM) is a popular distributed optimization algorithm; however, its convergence performance is highly dependent on the selection of penalty parameters, which are usually chosen heuristically. In this work, we use reinforcement learning (RL) to develop an adaptive penalty parameter selection policy for the AC optimal power flow (ACOPF) problem solved via ADMM with the goal of minimizing the number of iterations until convergence. We train our RL policy using deep Q-learning, and show that this policy can result in significantly accelerated convergence (up to a 59% reduction in the number of iterations compared to existing, curvature-informed penalty parameter selection methods). Furthermore, we show that our RL policy demonstrates promise for generalizability, performing well under unseen loading schemes as well as under unseen losses of lines and generators (up to a 50% reduction in iterations). This work thus provides a proof-of-concept for using RL for parameter selection in ADMM for power systems applications.
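The sensitivity of ADMM to its penalty parameter can be seen even on a toy problem. The sketch below is an assumption-laden illustration, not the ACOPF setting and with no reinforcement learning: it runs scaled-form ADMM on a two-variable consensus problem with a fixed penalty `rho` and counts iterations until the primal and dual residuals fall below a tolerance. The function name `admm_consensus` and all numeric values are hypothetical.

```python
def admm_consensus(a, b, rho, tol=1e-8, max_iters=10000):
    """Scaled-form ADMM for: minimize 0.5*(x-a)**2 + 0.5*(z-b)**2 subject to x == z.
    Returns (x, z, iterations used). The optimum is x = z = (a + b) / 2."""
    z, u = 0.0, 0.0
    for k in range(1, max_iters + 1):
        # x-update: minimize 0.5*(x-a)^2 + (rho/2)*(x - z + u)^2 in closed form
        x = (a + rho * (z - u)) / (1.0 + rho)
        z_prev = z
        # z-update: minimize 0.5*(z-b)^2 + (rho/2)*(x - z + u)^2 in closed form
        z = (b + rho * (x + u)) / (1.0 + rho)
        # scaled dual update
        u += x - z
        primal_residual = abs(x - z)
        dual_residual = rho * abs(z - z_prev)
        if primal_residual < tol and dual_residual < tol:
            return x, z, k
    return x, z, max_iters

x1, z1, iters_rho_1 = admm_consensus(1.0, 5.0, rho=1.0)
x2, z2, iters_rho_100 = admm_consensus(1.0, 5.0, rho=100.0)
```

Both runs reach the same consensus value, but the poorly scaled penalty needs far more iterations, which is the gap an adaptive selection policy tries to close.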

Task-oriented dialogue system · INTERACT · Learned · Topic · Scenario ·

April 7, 2022

Interacting with Non-Cooperative User: A New Paradigm for Proactive Dialogue Policy

Wenqiang Lei,Yao Zhang,Feifan Song,Hongru Liang,Jiaxin Mao,Jiancheng Lv,Zhenglu Yang,Tat-Seng Chua

from arxiv, Accepted to SIGIR 2022

A proactive dialogue system is able to lead the conversation to a goal topic and has promising potential in bargaining, persuasion, and negotiation. The current corpus-based learning approach limits its practical application in real-world scenarios. To this end, we advance the study of proactive dialogue policy to a more natural and challenging setting, i.e., interacting dynamically with users. Further, we call attention to non-cooperative user behavior -- the user talks about off-path topics when he/she is not satisfied with the previous topics introduced by the agent. We argue that the targets of reaching the goal topic quickly and maintaining high user satisfaction do not always converge, because the topics close to the goal and the topics the user prefers may not be the same. To address this issue, we propose a new solution named I-Pro that can learn a Proactive policy in the Interactive setting. Specifically, we learn the trade-off via a learned goal weight, which consists of four factors (dialogue turn, goal completion difficulty, user satisfaction estimation, and cooperative degree). The experimental results demonstrate that I-Pro significantly outperforms baselines in terms of effectiveness and interpretability.