
In this paper, under the assumption that the dimension is much larger than the sample size, i.e., $p \asymp n^{\alpha}, \alpha>1,$ we consider the (unnormalized) sample covariance matrices $Q = \Sigma^{1/2} XX^*\Sigma^{1/2}$, where $X=(x_{ij})$ is a $p \times n$ random matrix with centered i.i.d. entries whose variances are $(pn)^{-1/2}$, and $\Sigma$ is the deterministic population covariance matrix. We establish two classes of central limit theorems (CLTs) for the linear spectral statistics (LSS) of $Q$: the global CLTs on the macroscopic scales and the local CLTs on the mesoscopic scales. We prove that the LSS converge to Gaussian processes whose mean and covariance functions, which depend on $\Sigma$, the ratio $p/n$, and the test functions, can be identified explicitly on both macroscopic and mesoscopic scales. We also show that while the global CLTs depend on the fourth cumulant of $x_{ij}$, the local CLTs do not. Based on these results, we propose two classes of statistics for testing the structure of $\Sigma$, the global statistics and the local statistics, and analyze their superior power under general local alternatives. To the best of our knowledge, this is the first use in hypothesis testing of local LSS statistics, which do not rely on the fourth moment of $x_{ij}$, whereas the literature mostly uses global statistics and requires prior knowledge of the fourth cumulant. Numerical simulations confirm the accuracy and power of our proposed statistics and show better performance than existing methods in the literature.
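As a minimal illustration of the objects in this abstract (not the paper's method), the following NumPy sketch draws one linear spectral statistic $\sum_i f(\lambda_i)$ of $Q$, taking $\Sigma = I$ and Gaussian entries for simplicity; for $f(x)=x$ this is just $\operatorname{tr} Q$, which has mean $\sqrt{pn}$ under the stated variance normalization.

```python
import numpy as np

def lss(p, n, f, seed=None):
    # Sample covariance Q = X X^T with i.i.d. centered entries of
    # variance (p n)^{-1/2}; Sigma = I here for simplicity, while the
    # abstract allows a general population covariance Sigma.
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((p, n)) * (p * n) ** (-0.25)
    Q = X @ X.T
    eigs = np.linalg.eigvalsh(Q)
    return np.sum(f(eigs))
```

With $p=200$, $n=50$, the statistic for $f(x)=x$ concentrates around $\sqrt{pn}=100$; repeating over many independent draws would exhibit the Gaussian fluctuations that the CLTs describe.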

Related content

We couple the L1 discretization of the Caputo fractional derivative in time with a Galerkin scheme in space to devise a linear numerical method for the semilinear subdiffusion equation. Two important features of our setting are nonsmooth initial data and a time-dependent diffusion coefficient. We prove the stability and convergence of the method under weak assumptions on the regularity of the diffusivity. We derive error estimates that are optimal pointwise in space and global in time, and verify them with several numerical experiments.
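For context, the classical L1 discretization mentioned here approximates the Caputo derivative of order $\alpha\in(0,1)$ by differencing a piecewise-linear interpolant of $u$ on the time grid; a minimal sketch on a uniform grid (independent of the paper's Galerkin setting):

```python
import math

def l1_caputo(u, alpha, tau, n):
    # L1 approximation of the Caputo derivative D^alpha u at t_n = n*tau,
    # alpha in (0, 1), from samples u[j] = u(j*tau), j = 0..n:
    #   D^alpha u(t_n) ~ tau^{-alpha}/Gamma(2-alpha)
    #                    * sum_j b_j (u_{n-j} - u_{n-j-1}),
    # with weights b_j = (j+1)^{1-alpha} - j^{1-alpha}.
    b = [(j + 1) ** (1 - alpha) - j ** (1 - alpha) for j in range(n)]
    s = sum(b[j] * (u[n - j] - u[n - j - 1]) for j in range(n))
    return s * tau ** (-alpha) / math.gamma(2 - alpha)
```

Since the L1 scheme is exact on piecewise-linear functions, applying it to $u(t)=t$ reproduces $D^\alpha t = t^{1-\alpha}/\Gamma(2-\alpha)$ up to rounding.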

In this work, we develop a numerical method to study the error estimates of the $\alpha$-stable central limit theorem under sublinear expectation with $\alpha \in(0,2)$, whose limit distribution can be characterized by a fully nonlinear partial integro-differential equation (PIDE). Based on a sequence of independent random variables, we propose a discrete approximation scheme for the fully nonlinear PIDE. With the help of nonlinear stochastic analysis techniques and numerical analysis tools, we establish the error bounds for the discrete approximation scheme, which in turn provides a general error bound for the robust $\alpha$-stable central limit theorem, including the integrable case $\alpha \in(1,2)$ as well as the non-integrable case $\alpha \in(0,1]$. Finally, we provide some concrete examples to illustrate our main results and derive the precise convergence rates.

We expound on some known lower bounds of the quadratic Wasserstein distance between random vectors in $\mathbb{R}^n$ with an emphasis on affine transformations that have been used in manifold learning of data in Wasserstein space. In particular, we give concrete lower bounds for rotated copies of random vectors in $\mathbb{R}^2$ with uncorrelated components by computing the Bures metric between the covariance matrices. We also derive upper bounds for compositions of affine maps which yield a fruitful variety of diffeomorphisms applied to an initial data measure. We apply these bounds to various distributions including those lying on a 1-dimensional manifold in $\mathbb{R}^2$ and illustrate the quality of the bounds. Finally, we give a framework for mimicking handwritten digit or alphabet datasets that can be applied in a manifold learning framework.
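For concreteness, the squared Bures metric between covariance matrices, $\mathcal{B}^2(A,B)=\operatorname{tr}A+\operatorname{tr}B-2\operatorname{tr}\big(A^{1/2}BA^{1/2}\big)^{1/2}$, which together with the squared distance between the means lower-bounds the quadratic Wasserstein distance, can be computed with a short NumPy sketch (function names are illustrative):

```python
import numpy as np

def _sqrtm_psd(M):
    # Square root of a symmetric PSD matrix via eigendecomposition.
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

def bures_sq(A, B):
    # Squared Bures metric: tr A + tr B - 2 tr (A^{1/2} B A^{1/2})^{1/2}.
    rA = _sqrtm_psd(A)
    return np.trace(A) + np.trace(B) - 2.0 * np.trace(_sqrtm_psd(rA @ B @ rA))
```

For the uncorrelated covariance $A=\mathrm{diag}(1,4)$ and its $90^\circ$ rotation $B=\mathrm{diag}(4,1)$, this gives $\mathcal{B}^2(A,B)=2$.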

We consider Gibbs distributions, which are families of probability distributions over a discrete space $\Omega$ with probability mass function of the form $\mu^\Omega_\beta(\omega) \propto e^{\beta H(\omega)}$ for $\beta$ in an interval $[\beta_{\min}, \beta_{\max}]$ and $H( \omega ) \in \{0 \} \cup [1, n]$. The partition function is the normalization factor $Z(\beta)=\sum_{\omega \in\Omega}e^{\beta H(\omega)}$. Two important parameters of these distributions are the log partition ratio $q = \log \tfrac{Z(\beta_{\max})}{Z(\beta_{\min})}$ and the counts $c_x = |H^{-1}(x)|$. These are correlated with system parameters in a number of physical applications and sampling algorithms. Our first main result is to estimate the counts $c_x$ using roughly $\tilde O( \frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and $\tilde O( \frac{n^2}{\varepsilon^2} )$ samples for integer-valued distributions (ignoring some second-order terms and parameters), and we show this is optimal up to logarithmic factors. We illustrate with improved algorithms for counting connected subgraphs, independent sets, and perfect matchings. As a key subroutine, we also develop algorithms to compute the partition function $Z$ using $\tilde O(\frac{q}{\varepsilon^2})$ samples for general Gibbs distributions and using $\tilde O(\frac{n^2}{\varepsilon^2})$ samples for integer-valued distributions.
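The basic identity behind such sample-based estimators is $Z(\beta_2)/Z(\beta_1)=\mathbb{E}_{\mu_{\beta_1}}\big[e^{(\beta_2-\beta_1)H(\omega)}\big]$. The sketch below illustrates this identity on a toy product model where exact sampling is easy; it is an illustration only, not the paper's algorithm.

```python
import math
import random

def sample_H(beta, m, rng):
    # Toy Gibbs model: Omega = {0,1}^m, H(omega) = number of ones, so
    # Z(beta) = (1 + e^beta)^m and each bit is Bernoulli(e^b/(1+e^b)).
    p = math.exp(beta) / (1.0 + math.exp(beta))
    return sum(rng.random() < p for _ in range(m))

def ratio_estimate(beta1, beta2, m, n_samples, seed=0):
    # Monte Carlo estimate of Z(beta2)/Z(beta1) via
    # E_{mu_beta1}[ exp((beta2 - beta1) * H) ].
    rng = random.Random(seed)
    total = sum(math.exp((beta2 - beta1) * sample_H(beta1, m, rng))
                for _ in range(n_samples))
    return total / n_samples
```

For this product model the estimate can be checked against the closed form $\big((1+e^{\beta_2})/(1+e^{\beta_1})\big)^m$; the general algorithms in the abstract work without any such closed form, using only samples.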

The dichromatic number $\vec{\chi}(D)$ of a digraph $D$ is the least integer $k$ for which $D$ has a coloring with $k$ colors such that there is no monochromatic directed cycle in $D$. The digraphs considered here are finite and may have antiparallel arcs, but no parallel arcs. A digraph $D$ is called $k$-critical if each proper subdigraph $D'$ of $D$ satisfies $\vec{\chi}(D')<\vec{\chi}(D)=k$. For integers $k$ and $n$, let $\overrightarrow{\mathrm{ext}}(k,n)$ denote the minimum number of arcs possible in a $k$-critical digraph of order $n$. It is easy to show that $\overrightarrow{\mathrm{ext}}(2,n)=n$ for all $n\geq 2$, and $\overrightarrow{\mathrm{ext}}(3,n)\geq 2n$ for all possible $n$, where equality holds if and only if $n$ is odd and $n\geq 3$. As a main result we prove that if $n, k$ and $p$ are integers with $n=k+p$ and $2\leq p \leq k-1$, then $\overrightarrow{\mathrm{ext}}(k,n)=2({\binom{n}{2}} - (p^2+1))$, and we give an exact characterisation of $k$-critical digraphs for which equality holds. This generalizes a result about critical graphs obtained in 1963 by Tibor Gallai.
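For very small digraphs, the dichromatic number can be checked by brute force directly from the definition (every color class must induce an acyclic subdigraph); a hypothetical sketch, feasible only for a handful of vertices:

```python
from itertools import product

def has_directed_cycle(nodes, arcs):
    # DFS cycle detection on the subdigraph induced by `nodes`.
    adj = {v: [] for v in nodes}
    for u, v in arcs:
        if u in adj and v in adj:
            adj[u].append(v)
    WHITE, GRAY, BLACK = 0, 1, 2
    state = {v: WHITE for v in nodes}
    def dfs(v):
        state[v] = GRAY
        for w in adj[v]:
            if state[w] == GRAY or (state[w] == WHITE and dfs(w)):
                return True
        state[v] = BLACK
        return False
    return any(state[v] == WHITE and dfs(v) for v in nodes)

def dichromatic_number(vertices, arcs):
    # Least k such that some k-coloring leaves every color class acyclic.
    for k in range(1, len(vertices) + 1):
        for coloring in product(range(k), repeat=len(vertices)):
            classes = [ {v for v, c in zip(vertices, coloring) if c == i}
                        for i in range(k) ]
            if all(not has_directed_cycle(cls, arcs) for cls in classes):
                return k
    return len(vertices)
```

The directed triangle has $\vec{\chi}=2$; directed cycles are the standard witnesses for $\overrightarrow{\mathrm{ext}}(2,n)=n$.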

An $(a,b)$-coloring of a graph $G$ assigns to each vertex a $b$-subset of a set of $a$ colors in such a way that the color-sets of adjacent vertices are disjoint. We define general reduction tools for $(a,b)$-coloring of graphs for $2\le a/b\le 3$. In particular, using necessary and sufficient conditions for the existence of an $(a,b)$-coloring of a path with prescribed color-sets on its end-vertices, more complex $(a,b)$-colorability reductions are presented. The utility of these tools is exemplified on finite triangle-free induced subgraphs of the triangular lattice, all of which McDiarmid and Reed conjectured to be $(9,4)$-colorable. Computations on millions of such graphs generated randomly show that our tools allow us to find a $(9,4)$-coloring for each of them, except for one specific regular shape of graphs (which can be $(9,4)$-colored by an easy ad hoc process). We thus obtain computational evidence towards the conjecture of McDiarmid and Reed.
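Verifying an $(a,b)$-coloring is straightforward; a minimal checker (illustrative, not the reduction machinery of the paper):

```python
def is_ab_coloring(edges, coloring, a, b):
    # coloring maps each vertex to a set of b colors drawn from range(a);
    # adjacent vertices must receive disjoint color-sets.
    sets_ok = all(len(S) == b and S <= set(range(a))
                  for S in coloring.values())
    edges_ok = all(not (coloring[u] & coloring[v]) for u, v in edges)
    return sets_ok and edges_ok
```

For example, the 5-cycle is $(5,2)$-colorable, and $a/b=5/2$ lies in the range $[2,3]$ treated here.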

The property that the velocity $\boldsymbol{u}$ belongs to $L^\infty(0,T;L^2(\Omega)^d)$ is an essential requirement in the definition of energy solutions of models for incompressible fluids. It is, therefore, highly desirable that the solutions produced by discretisation methods are uniformly stable in the $L^\infty(0,T;L^2(\Omega)^d)$-norm. In this work, we establish that this is indeed the case for Discontinuous Galerkin (DG) discretisations (in time and space) of non-Newtonian models with $p$-structure, assuming that $p\geq \frac{3d+2}{d+2}$; the time discretisation is equivalent to the Radau IIA Implicit Runge-Kutta method. We also prove (weak) convergence of the numerical scheme to the weak solution of the system; this type of convergence result for schemes based on quadrature seems to be new. As an auxiliary result, we also derive Gagliardo-Nirenberg-type inequalities on DG spaces, which might be of independent interest.

We consider a matching problem in a bipartite graph $G=(A\cup B,E)$ where nodes in $A$ are agents having preferences in partial order over their neighbors, while nodes in $B$ are objects without preferences. We propose a polynomial-time combinatorial algorithm based on LP duality that finds a maximum matching or assignment in $G$ that is popular among all maximum matchings, if there exists one. Our algorithm can also be used to achieve a trade-off between popularity and cardinality by imposing a penalty on unmatched nodes in $A$. We also provide an $O^*(|E|^k)$ algorithm that finds an assignment whose unpopularity margin is at most $k$; this algorithm is essentially optimal, since the problem is $\mathsf{NP}$-complete and $\mathsf{W}[1]$-hard with parameter $k$. We also prove that finding a popular assignment of minimum cost when each edge has an associated binary cost is $\mathsf{NP}$-hard, even if agents have strict preferences. By contrast, we propose a polynomial-time algorithm for the variant of the popular assignment problem with forced/forbidden edges. Finally, we present an application in the context of housing markets.

Given a sample of size $N$, it is often useful to select a subsample of smaller size $n<N$ to be used for statistical estimation or learning. Such a data-selection step is useful to reduce the requirements of data labeling and the computational complexity of learning. We assume we are given $N$ unlabeled samples $\{{\boldsymbol x}_i\}_{i\le N}$ and access to a `surrogate model' that can predict labels $y_i$ better than random guessing. Our goal is to select a subset of the samples, denoted by $\{{\boldsymbol x}_i\}_{i\in G}$, of size $|G|=n<N$. We then acquire labels for this set and use them to train a model via regularized empirical risk minimization. Using a mixture of numerical experiments on real and synthetic data, and mathematical derivations under low- and high-dimensional asymptotics, we show that: $(i)$~data selection can be very effective, in particular beating training on the full sample in some cases; $(ii)$~certain popular choices in data selection methods (e.g., unbiased reweighted subsampling, or influence-function-based subsampling) can be substantially suboptimal.
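One simple instance of a data-selection rule of the kind compared in such studies: keep the $n$ samples on which the surrogate is least confident, then fit a regularized model on the selected subset. The sketch below uses a hypothetical margin score and ridge regression; the names and the specific rule are illustrative, not the paper's prescription.

```python
import numpy as np

def select_and_fit(X, y, margins, n, lam=1.0):
    # Select the n lowest-margin samples (hardest for the surrogate),
    # then fit ridge regression w = (X_G^T X_G + lam I)^{-1} X_G^T y_G
    # on the selected subset G.
    G = np.argsort(margins)[:n]           # chosen index set, |G| = n
    XG, yG = X[G], y[G]
    d = X.shape[1]
    w = np.linalg.solve(XG.T @ XG + lam * np.eye(d), XG.T @ yG)
    return G, w
```

On noiseless synthetic data with a small ridge penalty, the fit on the selected subset recovers the true coefficients whenever the subset is large enough to be well-conditioned.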

We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call $\Delta$-amortized inference ($\Delta$-AI). Our approach is based on the observation that when the sampling of variables in a PGM is seen as a sequence of actions taken by an agent, sparsity of the PGM enables local credit assignment in the agent's policy learning objective. This yields a local constraint that can be turned into a local loss in the style of generative flow networks (GFlowNets) that enables off-policy training but avoids the need to instantiate all the random variables for each parameter update, thus speeding up training considerably. The $\Delta$-AI objective matches the conditional distribution of a variable given its Markov blanket in a tractable learned sampler, which has the structure of a Bayesian network, with the same conditional distribution under the target PGM. As such, the trained sampler recovers marginals and conditional distributions of interest and enables inference of partial subsets of variables. We illustrate $\Delta$-AI's effectiveness for sampling from synthetic PGMs and training latent variable models with sparse factor structure.
