
Cyclic block coordinate methods are a fundamental class of optimization methods widely used in practice and implemented as part of standard software packages for statistical learning. Nevertheless, their convergence is generally not well understood, and so far their good practical performance has not been explained by existing convergence analyses. In this work, we introduce a new block coordinate method that applies to the general class of variational inequality (VI) problems with monotone operators. This class includes composite convex optimization problems and convex-concave min-max optimization problems as special cases and has not been addressed by existing work. The resulting convergence bounds match the optimal convergence bounds of full gradient methods, but are stated in terms of a novel gradient Lipschitz condition with respect to a Mahalanobis norm. For $m$ coordinate blocks, the gradient Lipschitz constant in our bounds is never larger than $\sqrt{m}$ times the traditional Euclidean Lipschitz constant, while it can be much smaller. Further, when the operator in the VI has finite-sum structure, we propose a variance-reduced variant of our method that further decreases the per-iteration cost and has better convergence rates in certain regimes. To obtain these results, we use a gradient extrapolation strategy that allows us to view a cyclic collection of block coordinate-wise gradients as one implicit gradient.
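For orientation, the following minimal sketch shows one pass of the plain cyclic block coordinate gradient scheme on a smooth least-squares toy problem. It is not the paper's method (which targets monotone VIs and adds gradient extrapolation); the block partition, step size, and objective are illustrative assumptions.

```python
import numpy as np

def cyclic_block_pass(x, grad, blocks, step):
    """One pass of plain cyclic block coordinate gradient descent:
    sweep the blocks in a fixed order, updating one block at a time."""
    x = x.copy()
    for idx in blocks:
        g = grad(x)              # in practice only the block gradient is formed
        x[idx] -= step * g[idx]
    return x

# Illustration: minimize 0.5 * ||A x - b||^2 with two coordinate blocks.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
grad = lambda x: A.T @ (A @ x - b)
blocks = [np.arange(0, 5), np.arange(5, 10)]
x = np.zeros(10)
for _ in range(300):
    x = cyclic_block_pass(x, grad, blocks, step=1e-2)
```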

Related Content

Finding a solution to the linear system $Ax = b$ with various minimization properties arises in many engineering and computer science applications, including compressed sensing, image processing, and machine learning. In the age of big data, the scalability of stochastic optimization algorithms has made them increasingly important for solving problems of unprecedented size. This paper focuses on the problem of minimizing a strongly convex objective function subject to linear constraints. We consider the dual formulation of this problem and adopt stochastic coordinate descent to solve it. The proposed algorithmic framework, called fast stochastic dual coordinate descent, utilizes an adaptive variant of Polyak's heavy ball momentum and user-defined distributions for sampling. Our adaptive heavy ball momentum technique can efficiently update the parameters using iterative information, overcoming the limitation of the heavy ball momentum method that prior knowledge of certain parameters, such as the singular values of a matrix, is required. We prove that, under strong admissibility of the objective function, the proposed method converges linearly in expectation. By varying the sampling matrix, we recover a comprehensive array of well-known algorithms as special cases, including the randomized sparse Kaczmarz method, the randomized regularized Kaczmarz method, the linearized Bregman iteration, and a variant of the conjugate gradient (CG) method. Numerical experiments are provided to confirm our results.
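Since the framework recovers Kaczmarz-type methods as special cases, a minimal sketch of randomized Kaczmarz for $Ax = b$ with heavy ball momentum may help fix ideas. A constant momentum parameter `beta` is assumed here purely for illustration; the paper's contribution is precisely an adaptive choice that avoids fixing such parameters in advance.

```python
import numpy as np

def kaczmarz_heavy_ball(A, b, beta=0.3, iters=2000, seed=0):
    """Randomized Kaczmarz with a constant heavy ball momentum term.
    Rows are sampled with probability proportional to ||a_i||^2."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms2 = (A * A).sum(axis=1)
    probs = row_norms2 / row_norms2.sum()
    x_prev = x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        resid = A[i] @ x - b[i]
        # Kaczmarz projection step plus momentum in the iterate difference.
        x_new = x - (resid / row_norms2[i]) * A[i] + beta * (x - x_prev)
        x_prev, x = x, x_new
    return x
```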

The classical Minkowski problem for convex bodies has deeply influenced the development of differential geometry. During the past several decades, abundant mathematical theories have been developed for studying the solutions of the Minkowski problem; however, the numerical solution of this problem has been largely left behind, with only a few methods available to achieve that goal. In this article, focusing on the two-dimensional Minkowski problem with Dirichlet boundary conditions, we introduce two solution methods, both based on operator splitting. One of these two methods deals directly with the Dirichlet condition, while the other uses an approximation of this Dirichlet condition. This relaxation of the Dirichlet condition makes the second method better suited than the first to treat those situations where the Minkowski and Dirichlet conditions are not compatible. Both methods are generalizations of the solution method for the canonical Monge-Amp\`{e}re equation discussed by Glowinski et al. (Journal of Scientific Computing, 79(1), 1-47, 2019); as such, they take advantage of a divergence formulation of the Minkowski problem, well suited to a mixed finite element approximation and to the time discretization, via an operator-splitting scheme, of an associated initial value problem. Our methodology can be easily implemented on convex domains of rather general shape (possibly with curved boundaries). The numerical experiments we performed validate both methods and show that if one uses continuous piecewise affine finite element approximations of the smooth solution of the Minkowski problem and of its three second-order derivatives, these two methods provide nearly second-order accuracy in the $L^2$ and $L^{\infty}$ error. The methods discussed in this article can easily be extended to the three-dimensional Minkowski problem.
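As a minimal illustration of the operator-splitting idea underlying both methods (not of the mixed finite element solvers themselves), the sketch below applies Lie (sequential) splitting to a scalar toy ODE; the sub-steps `step_A1` and `step_A2` are hypothetical stand-ins for the sub-problem solvers.

```python
import numpy as np

def lie_splitting_step(phi, dt, step_A1, step_A2):
    """One Lie (sequential) operator-splitting step for
    d(phi)/dt = A1(phi) + A2(phi): advance A1 by dt, then A2 by dt."""
    phi = step_A1(phi, dt)
    phi = step_A2(phi, dt)
    return phi

# Toy example: phi' = -phi + c, split as A1(phi) = -phi (solved exactly)
# and A2(phi) = c (explicit Euler step).
c = 0.5
step_A1 = lambda phi, dt: phi * np.exp(-dt)  # exact flow of phi' = -phi
step_A2 = lambda phi, dt: phi + dt * c       # Euler step for phi' = c
phi = 1.0
for _ in range(100):
    phi = lie_splitting_step(phi, dt=0.01, step_A1=step_A1, step_A2=step_A2)
# After t = 1, phi approximates 0.5 + 0.5 * exp(-1) (first-order accurate).
```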

In this paper, we propose a test procedure based on the LASSO methodology to test the global null hypothesis of no dependence between a response variable and $p$ predictors, where $n$ observations with $n < p$ are available. The proposed procedure is similar to the F-test for a linear model, which evaluates significance based on the ratio of explained to unexplained variance. However, the F-test is not suitable for models where $p \geq n$. This limitation is due to the fact that when $p \geq n$, the unexplained variance is zero and thus the F-statistic can no longer be calculated. In contrast, the proposed extension of the LASSO methodology overcomes this limitation by using the number of non-zero coefficients in the LASSO model as a test statistic after suitably specifying the regularization parameter. The method allows reliable analysis of high-dimensional datasets with as few as $n = 40$ observations. The performance of the method is tested by means of a power study.
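A sketch of the test's core idea, assuming a fixed regularization level `alpha` and using permutation of the response as a simple stand-in for the paper's analytic specification of the regularization parameter and null calibration:

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_null_test(X, y, alpha, n_perm=500, seed=0):
    """Global null test: the test statistic is the number of non-zero
    LASSO coefficients at a fixed regularization level; its null
    distribution is approximated here by permuting y."""
    rng = np.random.default_rng(seed)

    def stat(y_):
        fit = Lasso(alpha=alpha).fit(X, y_)
        return np.count_nonzero(fit.coef_)

    observed = stat(y)
    null = [stat(rng.permutation(y)) for _ in range(n_perm)]
    # One-sided p-value: many selected variables indicate dependence.
    p_value = (1 + sum(s >= observed for s in null)) / (1 + n_perm)
    return observed, p_value
```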

In this paper, we establish the central limit theorem (CLT) for linear spectral statistics (LSS) of large-dimensional sample covariance matrices when the population covariance matrices are not uniformly bounded. This constitutes a nontrivial extension of the Bai-Silverstein theorem (BST) (Ann Probab 32(1):553--605, 2004), a theorem that has strongly influenced the development of high-dimensional statistics, especially in the applications of random matrix theory to statistics. Recently, there has been a growing realization that the assumption of uniform boundedness of the population covariance matrices in BST is not satisfied in some fields, such as economics, where the variances of principal components can diverge as the dimension tends to infinity. Therefore, in this paper, we aim to eliminate this obstacle to the applications of BST. Our new CLT accommodates spiked eigenvalues, which may either be bounded or tend to infinity. A distinguishing feature of our result is that the variance in the new CLT is related to both spiked and bulk eigenvalues, with dominance determined by the divergence rate of the largest spiked eigenvalue. The new CLT for LSS is then applied to test the hypothesis that the population covariance matrix is the identity matrix or a generalized spiked model. The asymptotic distributions of the corrected likelihood ratio test statistic and corrected Nagao's trace test statistic are derived under the alternative hypothesis. Moreover, we provide power comparisons between the two LSS-based tests and Roy's largest root test under certain hypotheses. In particular, we demonstrate that, except for the case where the number of spikes equals 1, the LSS-based tests may exhibit higher power than Roy's largest root test in certain scenarios.
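For intuition about what a CLT for LSS asserts, the sketch below simulates one such statistic under the null: the likelihood-ratio trace statistic $\operatorname{tr}(S) - \log\det(S) - p$ (up to scaling). The dimensions and replication count are arbitrary assumptions made for illustration.

```python
import numpy as np

def lrt_lss(n=400, p=200, n_rep=200, seed=0):
    """Monte Carlo sketch of the likelihood-ratio linear spectral
    statistic L = tr(S) - log det(S) - p under H0: Sigma = I, where
    S is the sample covariance of n i.i.d. N(0, I_p) vectors.
    Centered by its random-matrix limit, L has Gaussian fluctuations
    of order O(1) (no sqrt(p) normalization), which is what a CLT
    for LSS makes precise."""
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_rep):
        X = rng.standard_normal((n, p))
        S = X.T @ X / n
        _, logdet = np.linalg.slogdet(S)
        stats.append(np.trace(S) - logdet - p)
    return np.asarray(stats)
```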

We consider $L^2$-approximation on weighted reproducing kernel Hilbert spaces of functions depending on infinitely many variables. We focus on unrestricted linear information, admitting evaluations of arbitrary continuous linear functionals. We distinguish between ANOVA and non-ANOVA spaces, where, by ANOVA spaces, we refer to function spaces whose norms are induced by an underlying ANOVA function decomposition. In ANOVA spaces, we provide an optimal algorithm to solve the approximation problem using linear information. We determine the upper and lower error bounds on the polynomial convergence rate of $n$-th minimal worst-case errors, which match if the weights decay regularly. For non-ANOVA spaces, we also establish upper and lower error bounds. Our analysis reveals that for weights with a regular and moderate decay behavior, the convergence rate of $n$-th minimal errors is strictly higher in ANOVA than in non-ANOVA spaces.

Two-player graph games have found numerous applications, most notably in the synthesis of reactive systems from temporal specifications, but also in verification. The relevance of infinite-state systems in these areas has led to significant attention towards developing techniques for solving infinite-state games. We propose novel symbolic semi-algorithms for solving infinite-state games with $\omega$-regular winning conditions. The novelty of our approach lies in the introduction of an acceleration technique that enhances fixpoint-based game-solving methods and helps to avoid divergence. Classical fixpoint-based algorithms, when applied to infinite-state games, are bound to diverge in many cases, since they iteratively compute the set of states from which one player has a winning strategy. Our proposed approach can lead to convergence in cases where existing algorithms require an infinite number of iterations. This is achieved by acceleration: computing an infinite set of states from which a simpler sub-strategy can be iterated an unbounded number of times in order to win the game. Ours is the first method for solving infinite-state games to employ acceleration. Thanks to this, it is able to outperform state-of-the-art techniques on a range of benchmarks, as evidenced by our evaluation of a prototype implementation.
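For contrast, here is the classical finite-state attractor fixpoint for a reachability objective, the kind of iteration that fails to terminate on infinite-state games and that the proposed acceleration is designed to help; the explicit graph encoding is an illustrative assumption.

```python
def attractor(vertices, edges, player0_vertices, target):
    """Classical fixpoint computation of the player-0 attractor of
    `target` in a finite two-player reachability game: add vertices
    until nothing changes. On infinite state spaces this iteration
    can run forever.

    edges            : dict vertex -> set of successor vertices
    player0_vertices : set of vertices controlled by player 0
    """
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in vertices - attr:
            succs = edges[v]
            if v in player0_vertices:
                win = bool(succs & attr)              # player 0 picks one edge
            else:
                win = bool(succs) and succs <= attr   # player 1 is forced in
            if win:
                attr.add(v)
                changed = True
    return attr
```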

The Bregman-Kaczmarz method is an iterative method that can solve strongly convex problems with linear constraints using only one or a selected number of rows of the system matrix in each iteration, thereby making it amenable to large-scale systems. To speed up convergence, we investigate acceleration by heavy ball momentum in the so-called dual update. Heavy ball acceleration of the Kaczmarz method with constant parameters has turned out to be difficult to analyze; in particular, to the best of our knowledge, no accelerated convergence rate for the $L^2$-error of the iterates has been proven. Here we propose a way to choose the momentum parameter adaptively by a minimal-error principle, similar to a recently proposed method for the standard randomized Kaczmarz method. The momentum parameter can be chosen to exactly minimize the error in the next iterate or to minimize a relaxed version of the minimal-error principle. The former choice leads to a theoretically optimal step, while the latter is cheaper to compute. We prove improved convergence results compared to the non-accelerated method. Numerical experiments show that the proposed methods can accelerate convergence in practice, also for matrices arising from applications such as computational tomography.
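A minimal sketch of the underlying randomized sparse (Bregman-)Kaczmarz iteration with heavy ball momentum added in the dual update. A constant `beta` is assumed for simplicity; the paper's point is to choose it adaptively by a minimal-error principle.

```python
import numpy as np

def soft_threshold(z, lam):
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def bregman_kaczmarz_momentum(A, b, lam=0.1, beta=0.3, iters=5000, seed=0):
    """Randomized sparse (Bregman-)Kaczmarz with heavy ball momentum
    in the dual variable z; the primal iterate is x = S_lam(z), the
    soft-thresholding of z. Targets min lam*||x||_1 + 0.5*||x||^2
    subject to Ax = b."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms2 = (A * A).sum(axis=1)
    z_prev = z = np.zeros(n)
    for _ in range(iters):
        i = rng.integers(m)
        x = soft_threshold(z, lam)
        resid = A[i] @ x - b[i]
        # Dual Kaczmarz step plus momentum in the dual differences.
        z_new = z - (resid / row_norms2[i]) * A[i] + beta * (z - z_prev)
        z_prev, z = z, z_new
    return soft_threshold(z, lam)
```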

Learned cardinality estimation methods have achieved high precision compared to traditional methods. Among learned methods, query-driven approaches have long faced the problem of data and workload drift. Although data-driven and hybrid methods have been proposed to avoid this problem, even the state of the art among them suffers from high training and estimation costs, limited scalability, instability, and long-tailed distributions on high-cardinality and high-dimensional tables, which seriously affects the practical application of learned cardinality estimators. In this paper, we prove that most of these problems are directly caused by the widely used progressive sampling. We solve this problem by introducing predicate information into the autoregressive model and propose Duet, a stable, efficient, and scalable hybrid method that estimates cardinality directly without sampling or any non-differentiable process. Duet not only reduces the inference complexity from O(n) to O(1) compared to Naru and UAE, but also achieves higher accuracy on high-cardinality and high-dimensional tables. Experimental results show that Duet achieves all the design goals above, is much more practical, and even has a lower inference cost on CPU than most learned methods have on GPU.
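To see why progressive sampling costs O(n) model evaluations per query (n being the number of columns), here is a bare-bones sketch of the procedure over a hypothetical autoregressive model; `cond_prob` and `predicates` are illustrative stand-ins, not Duet's actual interfaces.

```python
import numpy as np

def progressive_sampling_selectivity(cond_prob, predicates, n_cols,
                                     n_samples=64, seed=0):
    """Sketch of progressive sampling over an autoregressive model.
    cond_prob(prefix, col) stands for a learned model returning the
    distribution (dict value -> probability) of column `col` given
    the sampled prefix; predicates[col] maps a value to True/False."""
    rng = np.random.default_rng(seed)
    sel = 0.0
    for _ in range(n_samples):
        prefix, weight = [], 1.0
        for col in range(n_cols):            # one model call per column
            dist = cond_prob(prefix, col)
            mass = sum(p for v, p in dist.items() if predicates[col](v))
            if mass == 0.0:
                weight = 0.0
                break
            # Sample the column value restricted to the predicate.
            vals = [v for v in dist if predicates[col](v)]
            ps = np.array([dist[v] for v in vals]) / mass
            prefix.append(vals[rng.choice(len(vals), p=ps)])
            weight *= mass
        sel += weight
    return sel / n_samples
```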

We prove that black-box variational inference (BBVI) with control variates, in particular the sticking-the-landing (STL) estimator, converges at a geometric (traditionally called "linear") rate under perfect variational family specification. In particular, we prove a quadratic bound on the gradient variance of the STL estimator, one that also encompasses misspecified variational families. Combined with previous work on the quadratic variance condition, this directly implies the convergence of BBVI with projected stochastic gradient descent. We also improve the existing analysis of the regular closed-form entropy gradient estimators, which enables comparison against the STL estimator and provides explicit non-asymptotic complexity guarantees for both.
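A toy computation illustrating why STL can converge geometrically under perfect specification: for a standard Gaussian target and a Gaussian variational family, the STL gradient estimator has exactly zero variance at the optimum. The closed-form derivatives below are specific to this toy setup, not the paper's general analysis.

```python
import numpy as np

def stl_gradient(mu, log_sigma, eps):
    """Sticking-the-landing (STL) ELBO gradient for a toy problem:
    q = N(mu, sigma^2), target p = N(0, 1), reparameterization
    z = mu + sigma * eps. STL treats the parameters inside log q as
    constants, so gradients flow only through z:
    d log p / dz = -z and, with q's parameters frozen,
    d log q / dz = -(z - mu) / sigma^2 = -eps / sigma."""
    sigma = np.exp(log_sigma)
    z = mu + sigma * eps
    dz = -z + eps / sigma  # d/dz [log p(z) - log q(z)], q frozen
    # Chain rule: dz/dmu = 1, dz/dlog_sigma = sigma * eps.
    return dz, dz * sigma * eps

# At the optimum (mu = 0, sigma = 1) we get z = eps and dz = 0, so the
# STL gradient is exactly zero for every eps: zero variance, which is
# what drives the geometric rate under perfect specification.
for eps in np.random.default_rng(0).standard_normal(5):
    print(stl_gradient(0.0, 0.0, eps))  # (0.0, 0.0) up to rounding
```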

Modeling multivariate time series has long been a subject that has attracted researchers from a diverse range of fields, including economics, finance, and traffic. A basic assumption behind multivariate time series forecasting is that its variables depend on one another; upon looking closely, however, it is fair to say that existing methods fail to fully exploit the latent spatial dependencies between pairs of variables. In recent years, meanwhile, graph neural networks (GNNs) have shown high capability in handling relational dependencies. GNNs require well-defined graph structures for information propagation, which means they cannot be applied directly to multivariate time series where the dependencies are not known in advance. In this paper, we propose a general graph neural network framework designed specifically for multivariate time series data. Our approach automatically extracts the uni-directed relations among variables through a graph learning module, into which external knowledge such as variable attributes can be easily integrated. A novel mix-hop propagation layer and a dilated inception layer are further proposed to capture the spatial and temporal dependencies within the time series. The graph learning, graph convolution, and temporal convolution modules are jointly learned in an end-to-end framework. Experimental results show that our proposed model outperforms the state-of-the-art baseline methods on three of four benchmark datasets and achieves on-par performance with other approaches on two traffic datasets that provide extra structural information.
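A sketch of one way such a graph learning module can produce a sparse uni-directed adjacency matrix from trainable node embeddings, assuming an antisymmetric score plus top-k sparsification; the paper's exact layers and activations may differ.

```python
import numpy as np

def learn_unidirected_graph(E1, E2, alpha=3.0, k=4):
    """Two node-embedding tables E1, E2 of shape (n_nodes, d) yield a
    sparse adjacency matrix. The antisymmetric combination
    M1 @ M2.T - M2 @ M1.T ensures A[i, j] and A[j, i] are not both
    large (uni-directed relations); keeping only the top-k entries
    per row sparsifies the learned graph."""
    M1 = np.tanh(alpha * E1)
    M2 = np.tanh(alpha * E2)
    A = np.maximum(np.tanh(alpha * (M1 @ M2.T - M2 @ M1.T)), 0.0)  # ReLU
    # Top-k sparsification per row.
    mask = np.zeros_like(A)
    topk = np.argsort(A, axis=1)[:, -k:]
    np.put_along_axis(mask, topk, 1.0, axis=1)
    return A * mask
```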
