
One-shot coupling is a method of bounding the convergence rate between two copies of a Markov chain in total variation distance. The method is divided into two parts: the contraction phase, during which the chains converge in expected distance, and the coalescing phase, which occurs at the last iteration, when there is an attempt to couple. The method closely resembles the common random number technique used for simulation. In this paper, we present a general theorem for finding the upper bound on the Markov chain convergence rate that uses the one-shot coupling method. Our theorem does not require the use of any exogenous variables, such as a drift function or minorization constant. We then apply the general theorem to two families of Markov chains: the random functional autoregressive process and the randomly scaled iterated random function. We provide multiple examples of how the theorem can be used on various models, including ones in high dimensions. These examples illustrate how the theorem's conditions can be verified in a straightforward way. The one-shot coupling method appears to generate tight geometric convergence rate bounds.
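
The contraction phase can be illustrated with a minimal common-random-number sketch, assuming a toy linear autoregressive update X' = phi*X + Z (an assumption for illustration, not one of the paper's examples). The two copies share the same innovations, so their distance contracts geometrically; the one-shot coupling method would then attempt an exact coupling only at the final iteration.

```python
import numpy as np

rng = np.random.default_rng(0)

def ar1_step(x, z, phi=0.5):
    """One update X' = phi * X + Z of a toy autoregressive process."""
    return phi * x + z

# Two copies of the chain started at different points but driven by the
# same innovations (common random numbers), so the distance contracts
# deterministically: |X_t - Y_t| = phi**t * |X_0 - Y_0|.
x, y = 10.0, -5.0
for t in range(1, 21):
    z = rng.normal()                      # shared randomness for both copies
    x, y = ar1_step(x, z), ar1_step(y, z)
    print(f"t={t:2d}  |X_t - Y_t| = {abs(x - y):.3e}")
# In the one-shot coupling method, the final iteration would instead use a
# coupling that tries to make the two copies exactly equal (the coalescing phase).
```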

Related Content

A Markov chain, named after Andrei Markov (A. A. Markov, 1856-1922), is a discrete-event stochastic process with the Markov property: given the present knowledge or state, the past (the history before the present) is irrelevant for predicting the future (the states after the present). At each step of a Markov chain, the system may move from one state to another, or remain in its current state, according to a probability distribution. A change of state is called a transition, and the probabilities associated with the various state changes are called transition probabilities. A random walk is an example of a Markov chain: the state at each step is a vertex of a graph, and at each step the walk may move to any adjacent vertex, each with the same probability (regardless of how the walk arrived at the current vertex).
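
The random-walk example can be made concrete with a short sketch; the graph below is an illustrative assumption.

```python
import random

# Adjacency list of a small undirected graph (illustrative only).
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B"],
}

def random_walk(start, steps, seed=42):
    """Simulate a random walk: from the current vertex, move to a uniformly
    chosen neighbour, independently of how the walk got there (Markov property)."""
    random.seed(seed)
    state, path = start, [start]
    for _ in range(steps):
        state = random.choice(graph[state])  # next state depends only on `state`
        path.append(state)
    return path

print(random_walk("A", 10))
```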

Kernel-based models such as kernel ridge regression and Gaussian processes are ubiquitous in machine learning applications for regression and optimization. It is well known that a serious downside for kernel-based models is the high computational cost; given a dataset of $n$ samples, the cost grows as $\mathcal{O}(n^3)$. Existing sparse approximation methods can yield a significant reduction in the computational cost, effectively reducing the real-world cost down to as low as $\mathcal{O}(n)$ in certain cases. Despite this remarkable empirical success, significant gaps remain in the existing results for the analytical confidence bounds on the error due to approximation. In this work, we provide novel confidence intervals for the Nystr\"om method and the sparse variational Gaussian processes approximation method. Our confidence intervals lead to improved error bounds in both regression and optimization. We establish these confidence intervals using novel interpretations of the approximate (surrogate) posterior variance of the models.
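
As background, a minimal sketch of the classical Nyström low-rank approximation (one of the two approximations analysed) is shown below, assuming an RBF kernel and a landmark count m chosen for illustration. The point is that $K \approx K_{nm} K_{mm}^{+} K_{nm}^\top$ is built without ever forming the full $n \times n$ kernel matrix, which is where the cost reduction comes from.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Y."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale ** 2)

rng = np.random.default_rng(0)
n, d, m = 2000, 5, 100                       # m landmark points, m << n
X = rng.normal(size=(n, d))
landmarks = X[rng.choice(n, size=m, replace=False)]

K_nm = rbf_kernel(X, landmarks)              # n x m
K_mm = rbf_kernel(landmarks, landmarks)      # m x m
# Nystrom approximation: K ~= K_nm K_mm^+ K_nm^T, never forming the n x n matrix K.
K_mm_inv = np.linalg.pinv(K_mm + 1e-8 * np.eye(m))
approx_diag = np.einsum("im,mk,ik->i", K_nm, K_mm_inv, K_nm)
# The exact RBF kernel has unit diagonal, so this measures part of the approximation error.
print("max diagonal error:", np.abs(approx_diag - 1.0).max())
```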

Difference-of-Convex (DC) minimization, referring to the problem of minimizing the difference of two convex functions, has found rich applications in statistical learning and has been studied extensively for decades. However, existing methods are primarily based on multi-stage convex relaxation, leading only to weak optimality of critical points. This paper proposes a coordinate descent method for minimizing DC functions based on sequential nonconvex approximation. Our approach iteratively solves a nonconvex one-dimensional subproblem globally, and it is guaranteed to converge to a coordinate-wise stationary point. We prove that this new optimality condition is always stronger than the critical point condition and the directional point condition when the objective function is weakly convex. For comparison, we also include a naive variant of coordinate descent based on sequential convex approximation in our study. When the objective function satisfies an additional regularity condition called \textit{sharpness}, coordinate descent methods with an appropriate initialization converge \textit{linearly} to the optimal solution set. Also, for many applications of interest, we show that the nonconvex one-dimensional subproblem can be computed exactly and efficiently using a breakpoint searching method. Finally, we have conducted extensive experiments on several statistical learning tasks to show the superiority of our approach. Keywords: Coordinate Descent, DC Minimization, DC Programming, Difference-of-Convex Programs, Nonconvex Optimization, Sparse Optimization, Binary Optimization.
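
To make the idea of coordinate descent with globally solved one-dimensional subproblems concrete, here is a hedged sketch on a standard DC objective: least squares with the capped-l1 penalty $\min(|x_i|,\theta) = |x_i| - \max(|x_i|-\theta, 0)$. This is not the paper's algorithm; in particular, the exact breakpoint search is replaced by a crude dense grid search, and the problem sizes, grid range, and parameters are assumptions.

```python
import numpy as np

def capped_l1(t, theta):
    """Capped-l1 penalty min(|t|, theta) = |t| - max(|t| - theta, 0), a DC function."""
    return np.minimum(np.abs(t), theta)

def dc_coordinate_descent(A, b, lam=0.1, theta=0.5, n_sweeps=30):
    """Cyclic coordinate descent for 0.5*||Ax - b||^2 + lam * sum_i min(|x_i|, theta).
    Each nonconvex 1-D subproblem is solved (approximately) globally by a dense
    grid search, standing in for an exact breakpoint search."""
    n = A.shape[1]
    x = np.zeros(n)
    grid = np.linspace(-3.0, 3.0, 1201)        # candidate values (range is an assumption)
    for _ in range(n_sweeps):
        for i in range(n):
            r_wo = A @ x - b - A[:, i] * x[i]  # residual with coordinate i removed
            vals = 0.5 * ((r_wo[:, None] + np.outer(A[:, i], grid)) ** 2).sum(axis=0) \
                   + lam * capped_l1(grid, theta)
            x[i] = grid[int(np.argmin(vals))]
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 10))
x_true = np.zeros(10)
x_true[:3] = [1.0, -0.8, 0.6]
b = A @ x_true + 0.05 * rng.normal(size=30)
print(np.round(dc_coordinate_descent(A, b), 2))   # roughly recovers the sparse x_true
```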

One of the main reasons for the query model's prominence in quantum complexity is the presence of concrete lower-bounding techniques: the polynomial method and the adversary method. There have been considerable efforts not just to give lower bounds using these methods but also to compare and relate them. We explore the value of these bounds on quantum query complexity for the class of symmetric functions, arguably one of the most natural and basic sets of Boolean functions. We show that the recently introduced measure of spectral sensitivity gives the same value as both these bounds (positive adversary and approximate degree) for every total symmetric Boolean function. We also look at the quantum query complexity of Gap Majority, a partial symmetric function. It has gained importance recently in regard to understanding the composition of randomized query complexity. We characterize the quantum query complexity of Gap Majority and show a lower bound on noisy randomized query complexity (Ben-David and Blais, FOCS 2020) in terms of quantum query complexity. In addition, we study how large certificate complexity and block sensitivity can be, compared to sensitivity (even up to constant factors), for symmetric functions. We show tight separations, i.e., we give upper bounds on possible separations and construct functions achieving them.

This paper analyzes the fundamental limit of the strategic semantic communication problem, in which a transmitter obtains a limited number of indirect observations of an intrinsic semantic information source and can then influence the receiver's decoding by sending a limited number of messages over an imperfect channel. The transmitter and the receiver can have different distortion measures and can make rational decisions about their encoding and decoding strategies, respectively. The decoder can also have some side information (e.g., background knowledge and/or information obtained from previous communications) about the semantic source to assist its interpretation of the semantic information. We focus particularly on the case in which the transmitter can commit to an encoding strategy and study the impact of strategic decision making on the rate distortion of semantic communication. Three equilibrium solutions, the strong Stackelberg equilibrium, the weak Stackelberg equilibrium, and the Nash equilibrium, are studied and compared. The optimal encoding and decoding strategy profiles under the various equilibrium solutions are derived. We prove that committing to an encoding strategy does not always benefit the encoder. We therefore propose a feasible condition under which committing to an encoding strategy always reduces the distortion of semantic communication.

Statistical wisdom suggests that very complex models, interpolating training data, will be poor at prediction on unseen examples. Yet, this aphorism has recently been challenged by the identification of benign overfitting regimes, especially studied in the case of parametric models: generalization capabilities may be preserved despite high model complexity. While it is widely known that fully-grown decision trees interpolate and, in turn, have poor predictive performance, the same behavior has yet to be analyzed for random forests. In this paper, we study the trade-off between interpolation and consistency for several types of random forest algorithms. Theoretically, we prove that interpolation and consistency cannot be achieved simultaneously for non-adaptive random forests. Since adaptivity seems to be the cornerstone for bringing together interpolation and consistency, we introduce and study interpolating Adaptive Centered Forests, which are proved to be consistent in a noiseless scenario. Numerical experiments show that Breiman's random forests are consistent while exactly interpolating, when no bootstrap step is involved. We theoretically control the size of the interpolation area, which converges to zero fast enough that exact interpolation and consistency occur in conjunction.
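
The empirical observation about interpolation without bootstrap can be checked with a short scikit-learn sketch; the dataset, noise level, and forest settings below are assumptions for illustration, not the paper's experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n, d = 500, 5
X = rng.uniform(size=(n, d))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.normal(size=n)   # noisy regression task
X_test = rng.uniform(size=(n, d))
y_test = np.sin(2 * np.pi * X_test[:, 0])

# Fully grown trees (no depth limit) and no bootstrap: every tree sees all the data
# and keeps splitting until leaves are pure, so the training data is fitted exactly.
forest = RandomForestRegressor(n_estimators=100, bootstrap=False,
                               min_samples_leaf=1, random_state=0).fit(X, y)

print("train MSE:", np.mean((forest.predict(X) - y) ** 2))          # ~0: exact interpolation
print("test  MSE:", np.mean((forest.predict(X_test) - y_test) ** 2))
```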

Information leakage is becoming a critical problem as more and more information becomes publicly available by mistake, and machine learning models are trained on that data to provide services. As a result, one's private information could easily be memorized by such trained models. Unfortunately, deleting the information is out of the question because the data is already exposed to the Web or third-party platforms. Moreover, we cannot necessarily control the labeling process or the model training carried out by other parties either. In this setting, we study the problem of targeted disinformation, where the goal is to lower the accuracy of inference attacks on a specific target (e.g., a person's profile) using only data insertion. While our problem is related to data privacy and defenses against exploratory attacks, our techniques are inspired by targeted data poisoning attacks, with some key differences. We show that our problem is best solved by finding the points closest to the target in the input space that will be labeled as a different class. Since we do not control the labeling process, we instead conservatively estimate the labels probabilistically by combining the decision boundaries of multiple classifiers using data programming techniques. We also propose techniques for making the disinformation realistic. Our experiments show that a probabilistic decision boundary can be a good proxy for labelers, and that our approach outperforms other targeted poisoning methods when using end-to-end training on real datasets.
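
The core idea of combining several decision boundaries into a probabilistic label estimate, and then keeping the nearby candidate points predicted as a different class, can be sketched roughly as follows. This is only an illustrative stand-in assuming a synthetic dataset, a simple averaged committee instead of a full data-programming label model, and Gaussian perturbations as candidate points.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=5, random_state=0)

# A small committee of heterogeneous classifiers stands in for the unknown labeler.
committee = [LogisticRegression(max_iter=1000).fit(X, y),
             SVC(probability=True).fit(X, y),
             DecisionTreeClassifier(max_depth=5, random_state=0).fit(X, y)]

def committee_prob(points):
    """Average the committee's class-1 probabilities: a conservative,
    probabilistic label estimate (a simple proxy for data programming)."""
    return np.mean([clf.predict_proba(points)[:, 1] for clf in committee], axis=0)

rng = np.random.default_rng(0)
target = X[0]                                   # hypothetical target point
other_class = 1 - y[0]
candidates = target + 0.3 * rng.normal(size=(200, X.shape[1]))  # points near the target
predicted = (committee_prob(candidates) > 0.5).astype(int)
mask = predicted == other_class
dists = np.linalg.norm(candidates - target, axis=1)
closest = candidates[mask][np.argsort(dists[mask])][:5]         # closest differently-labelled points
print(f"{mask.sum()} candidates predicted as class {other_class}; kept {len(closest)} closest")
```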

In the present paper, we study the analyticity of the leftmost eigenvalue of a linear elliptic partial differential operator with random coefficient and analyze the convergence rate of the quasi-Monte Carlo method for approximating the expectation of this quantity. The random coefficient is assumed to be represented by an affine expansion $a_0(\boldsymbol{x})+\sum_{j\in \mathbb{N}}y_ja_j(\boldsymbol{x})$, where elements of the parameter vector $\boldsymbol{y}=(y_j)_{j\in \mathbb{N}}\in U^\infty$ are independent and identically uniformly distributed on $U:=[-\frac{1}{2},\frac{1}{2}]$. Under the assumption $ \|\sum_{j\in \mathbb{N}}\rho_j|a_j|\|_{L_\infty(D)} <\infty$ with some positive sequence $(\rho_j)_{j\in \mathbb{N}}\in \ell_p(\mathbb{N})$ for $p\in (0,1]$, we show that for any $\boldsymbol{y}\in U^\infty$ the elliptic partial differential operator has a countably infinite number of eigenvalues $(\lambda_j(\boldsymbol{y}))_{j\in \mathbb{N}}$, which can be ordered non-decreasingly. Moreover, the spectral gap $\lambda_2(\boldsymbol{y})-\lambda_1(\boldsymbol{y})$ is uniformly positive in $U^\infty$. From this, we prove the holomorphic extension property of $\lambda_1(\boldsymbol{y})$ to a complex domain in $\mathbb{C}^\infty$ and estimate mixed derivatives of $\lambda_1(\boldsymbol{y})$ with respect to the parameters $\boldsymbol{y}$ by using Cauchy's formula for analytic functions. Based on these bounds, we prove the dimension-independent convergence rate of the quasi-Monte Carlo method for approximating the expectation of $\lambda_1(\boldsymbol{y})$. In this case, the computational cost of the fast component-by-component algorithm for generating the quasi-Monte Carlo $N$-point sets scales linearly in the integration dimension.
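
For readers unfamiliar with quasi-Monte Carlo integration over a high-dimensional parameter box, here is a rough sketch. It uses a smooth toy integrand with decaying coordinate weights standing in for $\boldsymbol{y}\mapsto\lambda_1(\boldsymbol{y})$, a fixed truncation dimension, and a scrambled Sobol' sequence rather than the component-by-component lattice rules analysed in the paper; all of these are assumptions made purely for illustration.

```python
import numpy as np
from scipy.stats import qmc

s = 64                                        # truncation dimension (assumption)
weights = 1.0 / np.arange(1, s + 1) ** 2      # coordinate importance decaying like j^-2

def qmc_mean(n_pts, seed=0):
    """QMC estimate of E[exp(y . weights)] for y uniform on [-1/2, 1/2]^s."""
    sampler = qmc.Sobol(d=s, scramble=True, seed=seed)
    y = sampler.random(n_pts) - 0.5           # shift [0,1)^s -> [-1/2,1/2)^s
    return np.exp(y @ weights).mean()

for m in (8, 10, 12, 14):                     # N = 2^m points
    print(f"N=2^{m:2d}  QMC estimate = {qmc_mean(2 ** m):.8f}")
```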

We obtain new equitightness and $C([0,T];L^p(\mathbb{R}^N))$-convergence results for numerical approximations of generalized porous medium equations of the form $$ \partial_tu-\mathfrak{L}[\varphi(u)]=g\qquad\text{in $\mathbb{R}^N\times(0,T)$}, $$ where $\varphi:\mathbb{R}\to\mathbb{R}$ is continuous and nondecreasing, and $\mathfrak{L}$ is a local or nonlocal diffusion operator. Our results include slow diffusions, strongly degenerate Stefan problems, and fast diffusions above a critical exponent. These results improve the previous $C([0,T];L_{\text{loc}}^p(\mathbb{R}^N))$-convergence obtained in a series of papers on the topic by the authors. To have equitightness and global $L^p$-convergence, some additional restrictions on $\mathfrak{L}$ and $\varphi$ are needed. Most commonly used symmetric operators $\mathfrak{L}$ are still included: the Laplacian, fractional Laplacians, and other generators of symmetric L\'evy processes with some fractional moment. We also discuss extensions to nonlinear possibly strongly degenerate convection-diffusion equations.
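
As a point of reference for the kind of numerical approximation being analysed, below is a very small sketch for the local one-dimensional case $\mathfrak{L}=\partial_{xx}$ with $\varphi(u)=u^2$ (slow diffusion), using an explicit finite-difference scheme with zero boundary values. The grid, time step, and initial datum are assumptions; the schemes studied in the paper are considerably more general (nonlocal operators, monotone/implicit discretisations).

```python
import numpy as np

# Explicit finite differences for u_t = (phi(u))_xx with phi(u) = u^2.
phi = lambda u: u ** 2
N, L, T = 400, 10.0, 1.0
dx = L / N
dt = 0.2 * dx ** 2                            # small explicit step (stability heuristic, not sharp)
x = np.linspace(-L / 2, L / 2, N)
u = np.maximum(1.0 - x ** 2, 0.0)             # compactly supported initial datum
mass0 = dx * u.sum()

t = 0.0
while t < T:
    p = phi(u)
    lap = np.zeros_like(u)
    lap[1:-1] = (p[2:] - 2 * p[1:-1] + p[:-2]) / dx ** 2   # discrete (phi(u))_xx
    u += dt * lap                             # boundary values stay at 0
    t += dt

print(f"mass at t=0: {mass0:.4f}   mass at t={T}: {dx * u.sum():.4f}")   # mass is conserved
```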

We revisit the Bayesian Context Trees (BCT) modelling framework for discrete time series, which was recently found to be very effective in numerous tasks including model selection, estimation and prediction. A novel representation of the induced posterior distribution on model space is derived in terms of a simple branching process, and several consequences of this are explored in theory and in practice. First, it is shown that the branching process representation leads to a simple variable-dimensional Monte Carlo sampler for the joint posterior distribution on models and parameters, which can efficiently produce independent samples. This sampler is found to be more efficient than earlier MCMC samplers for the same tasks. Then, the branching process representation is used to establish the asymptotic consistency of the BCT posterior, including the derivation of an almost-sure convergence rate. Finally, an extensive study is carried out on the performance of the induced Bayesian entropy estimator. Its utility is illustrated through both simulation experiments and real-world applications, where it is found to outperform several state-of-the-art methods.

Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous-time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, such as the simplex, the time-discretisation error can dominate when we are near the boundary of the space. We demonstrate that while current SGMCMC methods for the simplex perform well in certain cases, they struggle with sparse simplex spaces, that is, when many of the components are close to zero. However, most popular large-scale applications of Bayesian inference on simplex spaces, such as network or topic models, are sparse. We argue that this poor performance is due to the biases of SGMCMC caused by the discretization error. To get around this, we propose the stochastic CIR process, which removes all discretization error, and we prove that samples from the stochastic CIR process are asymptotically unbiased. Use of the stochastic CIR process within an SGMCMC algorithm is shown to give substantially better performance for a topic model and a Dirichlet process mixture model than existing SGMCMC approaches.
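
The reason a CIR-based update can avoid discretisation error is that the Cox-Ingersoll-Ross diffusion has a known exact transition law (a scaled noncentral chi-squared distribution). The sketch below simulates a plain CIR process exactly using that law; it is not the paper's stochastic CIR update inside SGMCMC, and the scalar setting and parameter values are assumptions.

```python
import numpy as np
from scipy.stats import ncx2

def cir_exact_step(x, a, b, sigma, dt, rng):
    """Sample X_{t+dt} | X_t = x exactly for dX = a(b - X)dt + sigma*sqrt(X)dW,
    via the noncentral chi-squared transition law (no discretisation error)."""
    c = 2 * a / (sigma ** 2 * (1 - np.exp(-a * dt)))
    df = 4 * a * b / sigma ** 2
    nc = 2 * c * x * np.exp(-a * dt)
    return ncx2.rvs(df, nc, random_state=rng) / (2 * c)

rng = np.random.default_rng(0)
a, b, sigma, dt = 2.0, 0.5, 1.0, 0.01         # illustrative parameters
x, path = 0.1, []
for _ in range(5000):
    x = cir_exact_step(x, a, b, sigma, dt, rng)
    path.append(x)
print("empirical mean:", np.mean(path), "(stationary mean is b =", b, ")")
```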
