
This review paper provides an introduction to Markov chains and their convergence rates, an important and interesting mathematical topic that also has important applications to the very widely used Markov chain Monte Carlo (MCMC) algorithms. We first discuss eigenvalue analysis for Markov chains on finite state spaces. Then, using the coupling construction, we prove two quantitative bounds based on minorization and drift conditions, and provide descriptive and intuitive examples to show how these theorems can be applied in practice. This paper is meant to provide a general overview of the subject and spark interest in new Markov chain research areas.
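As a toy illustration of the eigenvalue analysis for finite state spaces (a generic sketch, not code from the paper; the matrix `P` is a made-up example), the following computes the stationary distribution and the second-largest eigenvalue modulus of a small transition matrix, and checks empirically that the total-variation distance to stationarity decays at roughly that geometric rate.

```python
# Sketch: for a finite, ergodic chain, the distance to stationarity decays
# geometrically at a rate governed by the second-largest eigenvalue modulus.
import numpy as np

# A small 3-state transition matrix (rows sum to 1).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Stationary distribution: left eigenvector of P for eigenvalue 1.
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
pi /= pi.sum()

# Second-largest eigenvalue modulus (SLEM) controls the convergence rate.
slem = sorted(np.abs(eigvals))[-2]
print("stationary distribution:", pi)
print("SLEM:", slem)

# Empirical check: total-variation distance from a point mass after n steps.
mu = np.array([1.0, 0.0, 0.0])
for n in [1, 5, 10, 20]:
    dist = mu @ np.linalg.matrix_power(P, n)
    tv = 0.5 * np.abs(dist - pi).sum()
    print(f"n={n:2d}  TV={tv:.2e}  SLEM^n={slem**n:.2e}")
```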

Related content

A Markov chain, named after Andrey Markov (A. A. Markov, 1856-1922), is a discrete-event stochastic process in mathematics that has the Markov property: given the present knowledge or information, the past (the history of states preceding the present) is irrelevant for predicting the future (the states after the present). At each step of a Markov chain, the system may move from one state to another, or remain in its current state, according to a probability distribution. A change of state is called a transition, and the probabilities associated with the various state changes are called transition probabilities. A random walk is an example of a Markov chain: the state at each step is a vertex of a graph, and at each step the walk may move to any adjacent vertex, each with the same probability, regardless of the path taken so far.
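A minimal sketch of the random-walk example above (the graph and function names here are hypothetical): the next vertex depends only on the current one, never on the path taken so far.

```python
# Random walk on a graph as a Markov chain: the next state is drawn
# uniformly from the neighbors of the current state (Markov property).
import random

# Hypothetical undirected graph as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B"],
}

def random_walk(start, steps):
    """Walk `steps` transitions, moving to a uniformly chosen neighbor."""
    state = start
    path = [state]
    for _ in range(steps):
        state = random.choice(graph[state])  # depends only on current state
        path.append(state)
    return path

print(random_walk("A", 10))
```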

Model efficiency is a critical aspect of developing and deploying machine learning models. Inference time and latency directly affect the user experience, and some applications have hard requirements. In addition to inference costs, model training also has direct financial and environmental impacts. Although there are numerous well-established metrics (cost indicators) for measuring model efficiency, researchers and practitioners often assume that these metrics are correlated with one another and report only a few of them. In this paper, we thoroughly discuss common cost indicators, their advantages and disadvantages, and how they can contradict each other. We demonstrate how incomplete reporting of cost indicators can lead to partial conclusions and a blurred or incomplete picture of the practical considerations around different models. We further present suggestions for improving the reporting of efficiency metrics.
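As a hedged illustration of why a single cost indicator can mislead (generic code, not from the paper; both "models" are made up): the two linear stacks below can be ranked differently by parameter count and by wall-clock latency, depending on hardware and batch size.

```python
# Two cost indicators measured independently: parameter count and latency.
import time
import numpy as np

def latency(layers, x, repeats=50):
    """Median wall-clock time for one forward pass through `layers`."""
    times = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        h = x
        for W in layers:
            h = h @ W
        times.append(time.perf_counter() - t0)
    return sorted(times)[len(times) // 2]

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 512))

# Wide-but-shallow vs. narrow-but-deep "models" with different param counts.
wide = [rng.standard_normal((512, 4096)), rng.standard_normal((4096, 512))]
deep = [rng.standard_normal((512, 512)) for _ in range(8)]

for name, m in [("wide-shallow", wide), ("narrow-deep", deep)]:
    params = sum(W.size for W in m)
    print(f"{name}: {params/1e6:.1f}M params, {latency(m, x)*1e6:.0f} us/pass")
```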

The language of information theory is favored in both causal reasoning and machine learning frameworks. But is there a better language? In this study, we demonstrate the pitfalls of information-theoretic estimation using first-order statistics on (short) sequences for causal learning. We recommend compression-based approaches for causality testing, since these make far fewer assumptions about the data than information-theoretic measures and are more robust to finite-data-length effects. We conclude with a discussion of the challenges posed in modeling the effect of conditioning a process $X$ on another process $Y$ in causal machine learning. Specifically, conditioning can increase 'confusion', which can be difficult to model with classical information theory. A conscious causal agent creates new choices, decisions and meaning, which poses huge challenges for AI.
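As a toy sketch of the compression-based idea (not the authors' method; the MDL-style decision rule, the complexity proxy, and the data below are all illustrative assumptions): one can infer $X \to Y$ when $C(X) + C(Y|X) < C(Y) + C(X|Y)$, approximating complexities with a real compressor.

```python
# Toy direction test with zlib compressed length as a crude stand-in for
# algorithmic complexity, and the conditional approximated by concatenation:
# C(B|A) ~ C(A + B) - C(A).
import random
import zlib

def C(s: bytes) -> int:
    """Compressed length in bytes, a rough complexity proxy."""
    return len(zlib.compress(s, 9))

def direction_score(x: bytes, y: bytes) -> int:
    """Negative values suggest X -> Y, positive suggest Y -> X."""
    x_to_y = C(x) + (C(x + y) - C(x))  # C(X) + C(Y|X)
    y_to_x = C(y) + (C(y + x) - C(y))  # C(Y) + C(X|Y)
    return x_to_y - y_to_x

# Hypothetical data: y is a simple noisy function of x.
random.seed(0)
x = bytes(random.randrange(8) for _ in range(5000))
y = bytes((2 * b + random.randrange(2)) % 256 for b in x)
print("direction score:", direction_score(x, y))
```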

Reconstruction of signals from undersampled and noisy measurements is a topic of considerable interest. Sharpness conditions directly control the recovery performance of restart schemes for first-order methods without the need for restrictive assumptions such as strong convexity. However, they are challenging to apply in the presence of noise or approximate model classes (e.g., approximate sparsity). We provide a first-order method: Weighted, Accelerated and Restarted Primal-dual (WARPd), based on primal-dual iterations and a novel restart-reweight scheme. Under a generic approximate sharpness condition, WARPd achieves stable linear convergence to the desired vector. Many problems of interest fit into this framework. For example, we analyze sparse recovery in compressed sensing, low-rank matrix recovery, matrix completion, TV regularization, minimization of $\|Bx\|_{l^1}$ under constraints ($l^1$-analysis problems for general $B$), and mixed regularization problems. We show how several quantities controlling recovery performance also provide explicit approximate sharpness constants. Numerical experiments show that WARPd compares favorably with specialized state-of-the-art methods and is ideally suited for solving large-scale problems. We also present a noise-blind variant based on the Square-Root LASSO decoder. Finally, we show how to unroll WARPd as neural networks. This approximation theory result provides lower bounds for stable and accurate neural networks for inverse problems and sheds light on architecture choices. Code and a gallery of examples are made available online as a MATLAB package.
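A heavily simplified sketch of the two main ingredients, primal-dual iterations plus restarts (this is generic Chambolle-Pock for basis pursuit with a plain warm-started restart loop; it is not WARPd itself and omits its reweighting and accelerated restart schedule):

```python
# Restarted primal-dual iterations for min ||x||_1  s.t.  Ax = b.
import numpy as np

def soft(v, t):
    """Soft-thresholding, the prox of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def pd_cycle(A, b, x0, iters, tau, sigma):
    """One cycle of primal-dual iterations, warm-started at x0."""
    x, x_bar, y = x0.copy(), x0.copy(), np.zeros(len(b))
    for _ in range(iters):
        y = y + sigma * (A @ x_bar - b)         # dual step
        x_new = soft(x - tau * (A.T @ y), tau)  # primal prox step
        x_bar = 2 * x_new - x                   # extrapolation
        x = x_new
    return x

def restarted_pd(A, b, cycles=10, iters=200):
    L = np.linalg.norm(A, 2)        # operator norm; need tau*sigma*L^2 <= 1
    tau = sigma = 0.9 / L
    x = np.zeros(A.shape[1])
    for _ in range(cycles):
        x = pd_cycle(A, b, x, iters, tau, sigma)  # restart from current x
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((60, 200)) / np.sqrt(60)
x_true = np.zeros(200)
x_true[rng.choice(200, 8, replace=False)] = rng.standard_normal(8)
b = A @ x_true
x_hat = restarted_pd(A, b)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```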

In this work, we investigate the regularized solutions and their finite element solutions to the inverse source problems governed by partial differential equations, and establish the stochastic convergence and optimal finite element convergence rates of these solutions, under pointwise measurement data with random noise. Unlike most existing regularization theories, the regularization error estimates are derived without any source conditions, while the error estimates of finite element solutions show their explicit dependence on the noise level, regularization parameter, mesh size, and time step size, which can guide practical choices among these key parameters in real applications. The error estimates also suggest an iterative algorithm for determining an optimal regularization parameter. Numerical experiments are presented to demonstrate the effectiveness of the analytical results.
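As a generic illustration of iteratively selecting a regularization parameter (this is the classical discrepancy principle applied to a toy Tikhonov problem, not the paper's algorithm or setting):

```python
# Choose the Tikhonov parameter so the residual matches the noise level delta
# (Morozov's discrepancy principle), via bisection on log(lambda).
import numpy as np

def tikhonov(A, b, lam):
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

def discrepancy_parameter(A, b, delta, lo=1e-12, hi=1e2, iters=60):
    """The residual ||A x_lam - b|| increases with lambda, so bisect."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        r = np.linalg.norm(A @ tikhonov(A, b, mid) - b)
        if r < delta:
            lo = mid   # residual below noise level: regularize more
        else:
            hi = mid
    return np.sqrt(lo * hi)

rng = np.random.default_rng(2)
A = rng.standard_normal((80, 40))
x_true = rng.standard_normal(40)
noise = 0.05 * rng.standard_normal(80)
b = A @ x_true + noise
lam = discrepancy_parameter(A, b, delta=np.linalg.norm(noise))
x_hat = tikhonov(A, b, lam)
print("lambda:", lam, " rel. error:",
      np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```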

In this paper we try to find a computational interpretation for a strong form of extensionality, which we call ``converse extensionality''. These converse extensionality principles, which arise as the Dialectica interpretation of the axiom of extensionality, were first studied by Howard. In order to give a computational interpretation to these principles, we reconsider Brouwer's apartness relation, a strong constructive form of inequality. Formally, we provide a categorical construction to endow every typed combinatory algebra with an apartness relation. We then exploit that functions reflect apartness, in addition to preserving equality, to prove that the resulting categories of assemblies model a converse extensionality principle.
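For reference, an apartness relation $\#$ is standardly axiomatized by irreflexivity ($\neg(x \# x)$), symmetry ($x \# y \Rightarrow y \# x$), and cotransitivity ($x \# y \Rightarrow x \# z \lor y \# z$); a function $f$ reflects apartness when $f(x) \# f(y) \Rightarrow x \# y$, a constructively stronger counterpart of preserving equality.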

We present a geometric algorithm to compute the geometric kernel of a generic polyhedron. The geometric kernel (or simply kernel) is defined as the set of points from which the whole polyhedron is visible. Whilst the computation of the kernel of a polygon has been addressed extensively in the literature, less has been done for polyhedra. Currently, the principal implementation of kernel estimation is based on the solution of a linear programming problem. We compare our method against it on several examples, showing that our method is more efficient at analysing the elements of a generic tessellation. Details of the technical implementation and a discussion of the pros and cons of the method are also provided.
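A minimal sketch of the underlying geometric fact (not the paper's algorithm): the kernel of a polyhedron is the intersection of the inner half-spaces of its face planes, so membership of a candidate point can be tested face by face. The cube data below is a hypothetical example.

```python
# Test whether a point lies in the kernel by checking every face half-space.
import numpy as np

def in_kernel(point, faces, tol=1e-9):
    """faces: list of (outward_normal, vertex_on_face) pairs."""
    for n, v in faces:
        if np.dot(n, point - v) > tol:  # point on the outer side of a face
            return False
    return True

# Hypothetical axis-aligned unit cube; it is convex, so kernel == polyhedron.
cube_faces = [
    (np.array([ 1., 0., 0.]), np.array([1., 0., 0.])),
    (np.array([-1., 0., 0.]), np.array([0., 0., 0.])),
    (np.array([0.,  1., 0.]), np.array([0., 1., 0.])),
    (np.array([0., -1., 0.]), np.array([0., 0., 0.])),
    (np.array([0., 0.,  1.]), np.array([0., 0., 1.])),
    (np.array([0., 0., -1.]), np.array([0., 0., 0.])),
]
print(in_kernel(np.array([0.5, 0.5, 0.5]), cube_faces))  # True
print(in_kernel(np.array([1.5, 0.5, 0.5]), cube_faces))  # False
```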

We consider the numerical approximation of the ill-posed data assimilation problem for stationary convection-diffusion equations and extend our previous analysis in [Numer. Math. 144, 451--477, 2020] to the convection-dominated regime. Slightly adjusting the stabilized finite element method proposed for dominant diffusion, we draw upon a local error analysis to obtain quasi-optimal convergence along the characteristics of the convective field through the data set. The weight function multiplying the discrete solution is taken to be Lipschitz and a corresponding super approximation result (discrete commutator property) is proven. The effect of data perturbations is included in the analysis and we conclude the paper with some numerical experiments.
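For orientation (our notation, not necessarily the paper's): a model problem of this type is the stationary convection-diffusion equation $-\varepsilon\Delta u + \beta\cdot\nabla u = f$ in $\Omega$, where $u$ is known only through measurements on a subdomain $\omega \subset \Omega$ instead of through boundary conditions; the convection-dominated regime corresponds to $0 < \varepsilon \ll |\beta|$.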

In distributed learning, local SGD (also known as federated averaging) and its simple baseline minibatch SGD are widely studied optimization methods. Most existing analyses of these methods assume independent and unbiased gradient estimates obtained via with-replacement sampling. In contrast, we study shuffling-based variants: minibatch and local Random Reshuffling, which draw stochastic gradients without replacement and are thus closer to practice. For smooth functions satisfying the Polyak-{\L}ojasiewicz condition, we obtain convergence bounds (in the large epoch regime) which show that these shuffling-based variants converge faster than their with-replacement counterparts. Moreover, we prove matching lower bounds showing that our convergence analysis is tight. Finally, we propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
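A minimal sketch of the sampling difference the analysis concerns (generic least-squares SGD on hypothetical data, not the paper's experiments): with-replacement SGD draws i.i.d. indices, while Random Reshuffling permutes the data each epoch and visits every example exactly once.

```python
# With-replacement SGD vs. Random Reshuffling on a least-squares problem.
import numpy as np

def sgd_epoch_with_replacement(w, X, y, lr, rng):
    n = len(y)
    for _ in range(n):
        i = rng.integers(n)                  # i.i.d. index, may repeat
        w -= lr * (X[i] @ w - y[i]) * X[i]   # gradient of (x_i.w - y_i)^2 / 2
    return w

def sgd_epoch_random_reshuffling(w, X, y, lr, rng):
    for i in rng.permutation(len(y)):        # each example exactly once
        w -= lr * (X[i] @ w - y[i]) * X[i]
    return w

rng = np.random.default_rng(3)
X = rng.standard_normal((256, 10))
w_true = rng.standard_normal(10)
y = X @ w_true
for epoch_fn in (sgd_epoch_with_replacement, sgd_epoch_random_reshuffling):
    w = np.zeros(10)
    for _ in range(30):
        w = epoch_fn(w, X, y, lr=0.01, rng=rng)
    print(epoch_fn.__name__, np.linalg.norm(w - w_true))
```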

The present paper surveys neural approaches to conversational AI that have been developed in the last few years. We group conversational systems into three categories: (1) question answering agents, (2) task-oriented dialogue agents, and (3) chatbots. For each category, we present a review of state-of-the-art neural approaches, draw the connection between them and traditional approaches, and discuss the progress that has been made and challenges still being faced, using specific systems and models as case studies.

This paper addresses the problem of formally verifying desirable properties of neural networks, i.e., obtaining provable guarantees that neural networks satisfy specifications relating their inputs and outputs (robustness to bounded-norm adversarial perturbations, for example). Most previous work on this topic was limited in its applicability by the size of the network, the network architecture, and the complexity of the properties to be verified. In contrast, our framework applies to a general class of activation functions and specifications on neural network inputs and outputs. We formulate verification as an optimization problem (seeking the largest violation of the specification) and solve a Lagrangian relaxation of this problem to obtain an upper bound on the worst-case violation of the specification being verified. Our approach is anytime, i.e., it can be stopped at any time and a valid bound on the maximum violation can be obtained. We develop specialized verification algorithms with provable tightness guarantees under special assumptions and demonstrate the practical significance of our general verification approach on a variety of verification tasks.
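The anytime property rests on weak duality (the sign convention below is our illustration, not necessarily the paper's notation): if verification is posed as $\max_x f(x)$ subject to constraints $h(x) \le 0$ encoding the network, then for every multiplier $\lambda \ge 0$ the relaxation satisfies $\max_x \big[f(x) - \lambda^\top h(x)\big] \ge \max_{x:\,h(x)\le 0} f(x)$, so any $\lambda$, optimized or not, yields a valid upper bound on the worst-case violation.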
