
This paper discusses the approximate distributions of the eigenvalues of a singular Wishart matrix. We derive the approximate joint density of the eigenvalues via a Laplace approximation of the hypergeometric functions of matrix arguments. Furthermore, we show that the distribution of each eigenvalue can be approximated by a chi-square distribution with varying degrees of freedom when the population eigenvalues are infinitely dispersed. The derived result is applied to testing the equality of eigenvalues in two populations.
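
As a rough numerical illustration of this chi-square regime (a sketch under assumed settings, not the paper's derivation), the following Python snippet simulates a singular Wishart matrix whose largest population eigenvalue dominates the rest and compares the rescaled largest sample eigenvalue with a chi-square distribution with n degrees of freedom; the spiked spectrum, the dimensions, and the reference degrees of freedom are illustrative assumptions.

```python
# Monte Carlo sketch: largest eigenvalue of a singular Wishart matrix (n < p)
# with a strongly dispersed population spectrum, compared with a chi-square(n)
# reference. Illustrative only; not the paper's result.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
p, n, reps = 50, 10, 2000                          # dimension p, sample size n < p
pop_eigs = np.array([1e4] + [1.0] * (p - 1))       # widely dispersed population spectrum
sqrt_sigma = np.sqrt(pop_eigs)

scaled_l1 = np.empty(reps)
for k in range(reps):
    X = rng.standard_normal((n, p)) * sqrt_sigma   # rows ~ N(0, diag(pop_eigs))
    W = X.T @ X                                    # singular Wishart matrix, rank n
    l1 = np.linalg.eigvalsh(W)[-1]                 # largest sample eigenvalue
    scaled_l1[k] = l1 / pop_eigs[0]

# Compare the empirical distribution of l1 / lambda_1 with chi-square(n).
ks = stats.kstest(scaled_l1, "chi2", args=(n,))
print(f"mean {scaled_l1.mean():.2f} (chi2 mean {n}), KS p-value {ks.pvalue:.3f}")
```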

Related content

Bayesian inference and kernel methods are well established in machine learning. The neural network Gaussian process in particular provides a concept to investigate neural networks in the limit of infinitely wide hidden layers by using kernel and inference methods. Here we build upon this limit and provide a field-theoretic formalism which covers the generalization properties of infinitely wide networks. We systematically compute generalization properties of linear, non-linear, and deep non-linear networks for kernel matrices with heterogeneous entries. In contrast to currently employed spectral methods, we derive the generalization properties from the statistical properties of the input, elucidating the interplay of input dimensionality, size of the training data set, and variability of the data. We show that data variability leads to a non-Gaussian action reminiscent of a ($\varphi^3+\varphi^4$)-theory. Using our formalism on a synthetic task and on MNIST, we obtain a homogeneous kernel-matrix approximation for the learning curve, as well as corrections due to data variability, which allow us to estimate the generalization properties and to obtain exact results for the bounds of the learning curves in the limit of infinitely many training data points.
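
For orientation (a minimal sketch of standard neural-network-Gaussian-process regression, not the field-theoretic computation described above), the snippet below builds the kernel of an infinitely wide linear network and evaluates the resulting Gaussian-process predictor on synthetic data; the variances sigma_w and sigma_b, the noise level, and the data dimensions are assumed.

```python
# Minimal NNGP sketch: kernel of an infinitely wide linear network and the
# corresponding GP regression predictor. Hyperparameters are assumed.
import numpy as np

def linear_nngp_kernel(X1, X2, sigma_w=1.0, sigma_b=0.1):
    """K(x, x') = sigma_w^2 * <x, x'> / d + sigma_b^2 for a wide linear network."""
    d = X1.shape[1]
    return (sigma_w ** 2) * (X1 @ X2.T) / d + sigma_b ** 2

rng = np.random.default_rng(1)
d, n_train, n_test, noise = 20, 100, 30, 0.1
w_true = rng.standard_normal(d) / np.sqrt(d)

X_tr = rng.standard_normal((n_train, d))
y_tr = X_tr @ w_true + noise * rng.standard_normal(n_train)
X_te = rng.standard_normal((n_test, d))
y_te = X_te @ w_true

K = linear_nngp_kernel(X_tr, X_tr)
K_star = linear_nngp_kernel(X_te, X_tr)
alpha = np.linalg.solve(K + noise ** 2 * np.eye(n_train), y_tr)
y_pred = K_star @ alpha                     # GP posterior mean on the test inputs

print("test MSE:", np.mean((y_pred - y_te) ** 2))
```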

Matrix factorization is an inference problem that has acquired importance due to its vast range of applications, from dictionary learning to recommendation systems and machine learning with deep networks. The study of its fundamental statistical limits represents a true challenge, and despite a decade-long history of efforts in the community, there is still no closed formula able to describe its optimal performance in the case where the rank of the matrix scales linearly with its size. In the present paper, we study this extensive-rank problem, extending the alternative 'decimation' procedure that we recently introduced, and carry out a thorough study of its performance. Decimation aims at recovering one column/line of the factors at a time, by mapping the problem into a sequence of neural network models of associative memory at a tunable temperature. Though sub-optimal, decimation has the advantage of being theoretically analyzable. We extend its scope and analysis to two families of matrices. For a large class of compactly supported priors, we show that the replica-symmetric free entropy of the neural network models takes a universal form in the low-temperature limit. For a sparse Ising prior, we show that the storage capacity of the neural network models diverges as sparsity in the patterns increases, and we introduce a simple algorithm based on a ground-state search that implements decimation and performs matrix factorization, with no need for an informative initialization.
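
The following toy sketch only illustrates the kind of associative-memory ground-state search that decimation relies on: a greedy single-spin-flip descent of a Hopfield-style energy E(s) = -1/2 s^T J s with Hebbian couplings built from assumed random patterns. It is not the paper's decimation algorithm or its mapping from matrix factorization.

```python
# Toy sketch: greedy single-spin-flip ground-state search for a Hopfield-style
# energy E(s) = -1/2 s^T J s. Couplings follow a standard Hebbian rule on
# assumed random patterns; this is not the paper's decimation procedure.
import numpy as np

rng = np.random.default_rng(2)
N, P = 200, 5
patterns = rng.choice([-1, 1], size=(P, N))
J = patterns.T @ patterns / N
np.fill_diagonal(J, 0.0)

def energy(s, J):
    return -0.5 * s @ J @ s

s = rng.choice([-1, 1], size=N)               # random initial configuration
improved = True
while improved:
    improved = False
    for i in rng.permutation(N):
        local_field = J[i] @ s
        if s[i] * local_field < 0:            # flipping spin i strictly lowers the energy
            s[i] = -s[i]
            improved = True

overlaps = patterns @ s / N
print("energy:", energy(s, J), "max overlap with a stored pattern:", np.abs(overlaps).max())
```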

This paper proposes a non-centered parameterization based infinite-dimensional mean-field variational inference (NCP-iMFVI) approach for solving hierarchical Bayesian inverse problems. This method can efficiently generate estimates from the approximate posterior distribution. To avoid the mutual-singularity obstacle that arises in the infinite-dimensional hierarchical approach, we develop a rigorous theory for the non-centered variational Bayesian approach. Since the non-centered parameterization weakens the connection between the parameter and the hyper-parameter, we can introduce the hyper-parameter into all terms of the eigendecomposition of the prior covariance operator. We also show the relationships between the NCP-iMFVI approach and infinite-dimensional hierarchical approaches with centered parameterization. The proposed algorithm is applied to three inverse problems governed by a simple smooth equation, the Helmholtz equation, and the steady-state Darcy flow equation. Numerical results confirm our theoretical findings, illustrate the efficiency of solving iMFVI problems formulated from large-scale linear and nonlinear statistical inverse problems, and verify the mesh-independence property.
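
As a finite-dimensional analogue (with an assumed covariance and hyper-prior, not the paper's infinite-dimensional construction), the snippet below contrasts the centered and non-centered parameterizations of a hierarchical Gaussian prior u | lambda ~ N(0, lambda C): the non-centered draw is expressed through a standard-normal variable that is a priori independent of the hyper-parameter.

```python
# Finite-dimensional analogue of centered vs. non-centered parameterization of a
# hierarchical Gaussian prior u | lam ~ N(0, lam * C). Illustrative only; the
# covariance C and the hyper-prior on lam are assumptions.
import numpy as np

rng = np.random.default_rng(3)
n = 50
x = np.linspace(0, 1, n)
# Assumed prior covariance: squared-exponential kernel on a 1D grid (plus jitter).
C = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1 ** 2) + 1e-8 * np.eye(n)
L = np.linalg.cholesky(C)                     # a C^{1/2} factor

lam = rng.gamma(shape=2.0, scale=1.0)         # hyper-parameter draw (assumed Gamma hyper-prior)

# Centered: u drawn directly from N(0, lam * C); u and lam are strongly coupled.
u_centered = np.sqrt(lam) * (L @ rng.standard_normal(n))

# Non-centered: w ~ N(0, I) is drawn independently of lam, then u = sqrt(lam) * L w,
# which weakens the coupling between the parameter and the hyper-parameter.
w = rng.standard_normal(n)
u_noncentered = np.sqrt(lam) * (L @ w)

print("prior sample norms:", np.linalg.norm(u_centered), np.linalg.norm(u_noncentered))
```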

Rare and Weak models for multiple hypothesis testing assume that only a small proportion of the tested hypotheses concern non-null effects and that the individual effects are only moderately large, so they generally do not stand out individually, for example in a Bonferroni analysis. Such models have been studied in quite a few settings: in some cases, studies focused on an underlying Gaussian means model for the tested hypotheses; in others, on Poisson or Binomial models. Such seemingly different models have, asymptotically, the following common structure. Summarizing the evidence of each individual test by the negative logarithm of its P-value, the model is asymptotically equivalent to a situation in which most negative log P-values have a standard exponential distribution, but a small fraction of the P-values might have an alternative distribution which is approximately noncentral chi-squared with one degree of freedom. This log-chi-squared approximation is different from the log-normal approximation of Bahadur, which is unsuitable for analyzing Rare and Weak multiple testing models. We characterize the asymptotic performance of global tests combining asymptotic log-chi-squared P-values in terms of the chi-squared mixture parameters: the scaling parameter controlling heteroscedasticity, the non-centrality parameter describing the effect size whenever it exists, and the parameter controlling the rarity of the non-null effects. In a phase space involving the last two parameters, we derive a region where all tests are asymptotically powerless. Outside of this region, the Berk-Jones and the Higher Criticism tests have maximal power. Inference techniques based on the minimal P-value, false-discovery-rate control, and Fisher's combination test have sub-optimal asymptotic phase diagrams.
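
For concreteness, here is a sketch of the standard Higher Criticism statistic evaluated on sorted P-values from a rare-and-weak Gaussian means simulation; the sparsity fraction, effect size, and the usual truncation of the maximization range are assumptions made purely for illustration.

```python
# Sketch of the (standard) Higher Criticism statistic on a vector of p-values.
# The rare-and-weak alternative below (fraction eps, shift mu) is assumed only
# for illustration.
import numpy as np
from scipy import stats

def higher_criticism(pvals, alpha0=0.5):
    """HC = max over the smallest alpha0*n p-values of
    sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i)))."""
    p = np.sort(pvals)
    n = len(p)
    i = np.arange(1, n + 1)
    hc = np.sqrt(n) * (i / n - p) / np.sqrt(p * (1 - p))
    keep = (i <= alpha0 * n) & (p > 1.0 / n)      # usual truncation of the range
    return hc[keep].max()

rng = np.random.default_rng(4)
n, eps, mu = 10_000, 0.01, 2.5
z = rng.standard_normal(n)
signal = rng.random(n) < eps
z[signal] += mu                                    # rare, moderately weak non-null effects
pvals = stats.norm.sf(z)                           # one-sided p-values

print("HC under the alternative:", higher_criticism(pvals))
print("HC under the null       :", higher_criticism(stats.norm.sf(rng.standard_normal(n))))
```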

If $X$ is a subset of vertices of a graph $G$, then vertices $u$ and $v$ are $X$-visible if there exists a shortest $u,v$-path $P$ such that $V(P)\cap X \subseteq \{u,v\}$. If every two vertices from $X$ are $X$-visible, then $X$ is a mutual-visibility set. The mutual-visibility number of $G$ is the cardinality of a largest mutual-visibility set of $G$ and has already been investigated. In this paper, a variety of mutual-visibility problems is introduced based on which natural pairs of vertices are required to be $X$-visible. This yields the total, the dual, and the outer mutual-visibility numbers. We first show that these graph invariants are related to each other and to the classical mutual-visibility number, and then we prove that the three newly introduced mutual-visibility problems are computationally difficult. In view of this result, we compute or bound their values for several graph classes that include, for instance, grid graphs and tori. We conclude the study by presenting a comparison of the values of these parameters, based on our computations for some specific families.
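
A brute-force check of the mutual-visibility definition (not one of the algorithms from the paper) can be written with networkx as below; the 4 x 4 grid instance and the choice of X as the four corners are assumed purely as a toy example.

```python
# Brute-force check of the mutual-visibility definition using networkx.
# For each pair u, v in X we look for a shortest u,v-path whose internal
# vertices avoid X.
from itertools import combinations
import networkx as nx

def is_mutual_visibility_set(G, X):
    Xset = set(X)
    for u, v in combinations(Xset, 2):
        visible = any(
            set(P) & Xset <= {u, v}               # V(P) ∩ X ⊆ {u, v}
            for P in nx.all_shortest_paths(G, u, v)
        )
        if not visible:
            return False
    return True

# Small assumed toy instance: the four corners of a 4 x 4 grid graph.
G = nx.grid_2d_graph(4, 4)
X = [(0, 0), (0, 3), (3, 0), (3, 3)]
print(is_mutual_visibility_set(G, X))
```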

We consider parametrized linear-quadratic optimal control problems and provide their online-efficient solutions by combining greedy reduced basis methods and machine learning algorithms. To this end, we first extend the greedy control algorithm, which builds a reduced basis for the manifold of optimal final time adjoint states, to the setting where the objective functional consists of a penalty term measuring the deviation from a desired state and a term describing the control energy. Afterwards, we apply machine learning surrogates to accelerate the online evaluation of the reduced model. The error estimates proven for the greedy procedure are further transferred to the machine learning models and thus allow for efficient a posteriori error certification. We discuss the computational costs of all considered methods in detail and show by means of two numerical examples the tremendous potential of the proposed methodology.
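
As a generic illustration of the greedy ingredient only (a "strong greedy" snapshot selection on an assumed toy parametrized system, not the greedy control algorithm or its a posteriori error estimator), the sketch below repeatedly adds to the reduced basis the high-fidelity solution that is currently worst approximated by that basis.

```python
# Generic "strong greedy" reduced-basis sketch: at each step, select the
# parameter whose high-fidelity solution has the largest projection error onto
# the current basis, and add its normalized residual (a Gram-Schmidt step).
# The parametrized solution map below is an assumed stand-in.
import numpy as np

n, n_params = 200, 100
mus = np.linspace(0.1, 10.0, n_params)

def high_fidelity_solution(mu):
    """Assumed toy solution map: solve (A + mu*I) u = b with diagonal A."""
    A = np.diag(np.arange(1.0, n + 1))
    b = np.ones(n)
    return np.linalg.solve(A + mu * np.eye(n), b)

snapshots = np.column_stack([high_fidelity_solution(mu) for mu in mus])

basis = np.zeros((n, 0))
tol, errors = 1e-8, []
for _ in range(20):
    residuals = snapshots - basis @ (basis.T @ snapshots)   # projection errors
    errs = np.linalg.norm(residuals, axis=0)
    worst = int(np.argmax(errs))
    errors.append(errs[worst])
    if errs[worst] < tol:
        break
    new_vec = residuals[:, worst] / errs[worst]              # orthonormal new basis vector
    basis = np.column_stack([basis, new_vec])

print("selected basis size:", basis.shape[1], "max remaining error:", errors[-1])
```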

Confounding is a significant obstacle to unbiased estimation of causal effects from observational data. For settings with high-dimensional covariates -- such as text data, genomics, or the behavioral social sciences -- researchers have proposed methods to adjust for confounding by adapting machine learning methods to the goal of causal estimation. However, empirical evaluation of these adjustment methods has been challenging and limited. In this work, we build on a promising empirical evaluation strategy that simplifies evaluation design and uses real data: subsampling randomized controlled trials (RCTs) to create confounded observational datasets while using the average causal effects from the RCTs as ground-truth. We contribute a new sampling algorithm, which we call RCT rejection sampling, and provide theoretical guarantees that causal identification holds in the observational data to allow for valid comparisons to the ground-truth RCT. Using synthetic data, we show our algorithm indeed results in low bias when oracle estimators are evaluated on the confounded samples, which is not always the case for a previously proposed algorithm. In addition to this identification result, we highlight several finite data considerations for evaluation designers who plan to use RCT rejection sampling on their own datasets. As a proof of concept, we implement an example evaluation pipeline and walk through these finite data considerations with a novel, real-world RCT -- which we release publicly -- consisting of approximately 70k observations and text data as high-dimensional covariates. Together, these contributions build towards a broader agenda of improved empirical evaluation for causal estimation.
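
A minimal sketch of the general idea of confounding-by-subsampling is given below; the acceptance rule, the one-dimensional covariate, and the outcome model are assumptions for illustration and are not necessarily the paper's exact RCT rejection sampling procedure or its identification conditions.

```python
# Sketch of creating a confounded observational sample from RCT data by
# rejection sampling: units are kept with a probability that depends on both
# covariates and treatment, so the naive difference in means becomes biased.
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
x = rng.standard_normal(n)                       # 1D stand-in for high-dimensional covariates
t = rng.integers(0, 2, size=n)                   # RCT: treatment assigned at random
y = 1.0 * t + 0.5 * x + rng.standard_normal(n)   # assumed outcome model, true ATE = 1.0

# Assumed target propensity e(x): treated units with large x are over-sampled.
e = 1.0 / (1.0 + np.exp(-2.0 * x))
accept_prob = np.where(t == 1, e, 1.0 - e)
keep = rng.random(n) < accept_prob

t_obs, y_obs = t[keep], y[keep]
naive_ate = y_obs[t_obs == 1].mean() - y_obs[t_obs == 0].mean()
print("RCT difference in means      :", y[t == 1].mean() - y[t == 0].mean())
print("naive difference (confounded):", naive_ate)   # biased away from 1.0
```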

Latent linear dynamical systems with Bernoulli observations provide a powerful modeling framework for identifying the temporal dynamics underlying binary time series data, which arise in a variety of contexts such as binary decision-making and discrete stochastic processes (e.g., binned neural spike trains). Here we develop a spectral learning method for fast, efficient fitting of probit-Bernoulli latent linear dynamical system (LDS) models. Our approach extends traditional subspace identification methods to the Bernoulli setting via a transformation of the first and second sample moments. This results in a robust, fixed-cost estimator that avoids the hazards of local optima and the long computation time of iterative fitting procedures such as the expectation-maximization (EM) algorithm. In regimes where data are limited or assumptions about the statistical structure of the data are not met, we demonstrate that the spectral estimate provides a good initialization for Laplace-EM fitting. Finally, we show that the estimator provides substantial benefits in real-world settings by analyzing data from mice performing a sensory decision-making task.

The last decade has witnessed an experimental revolution in data science and machine learning, epitomised by deep learning methods. Indeed, many high-dimensional learning tasks previously thought to be beyond reach -- such as computer vision, playing Go, or protein folding -- are in fact feasible with appropriate computational scale. Remarkably, the essence of deep learning is built from two simple algorithmic principles: first, the notion of representation or feature learning, whereby adapted, often hierarchical, features capture the appropriate notion of regularity for each task, and second, learning by local gradient-descent-type methods, typically implemented as backpropagation. While learning generic functions in high dimensions is a cursed estimation problem, most tasks of interest are not generic and come with essential pre-defined regularities arising from the underlying low-dimensionality and structure of the physical world. This text is concerned with exposing these regularities through unified geometric principles that can be applied throughout a wide spectrum of applications. Such a 'geometric unification' endeavour, in the spirit of Felix Klein's Erlangen Program, serves a dual purpose: on one hand, it provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers. On the other hand, it gives a constructive procedure to incorporate prior physical knowledge into neural architectures and provides a principled way to build future architectures yet to be invented.

Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently. In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated consideration of optimization and statistics leads to fruitful research findings.
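
As a minimal instance of the two-stage recipe (spectral initialization followed by gradient descent) applied to noiseless PSD matrix sensing, the sketch below is illustrative only: the measurement ensemble, the step size, and the iteration count are assumed, and it does not reproduce any specific analysis from the overview.

```python
# Two-stage recipe on rank-r PSD matrix sensing: spectral initialization from
# (1/m) sum_i y_i A_i, then vanilla gradient descent on
# f(U) = 1/(4m) sum_i (<A_i, U U^T> - y_i)^2. Step size and iterations assumed.
import numpy as np

rng = np.random.default_rng(7)
n, r, m = 30, 2, 600

# Ground-truth rank-r PSD matrix, scaled so its spectral norm is O(1).
U_star = rng.standard_normal((n, r)) / np.sqrt(n)
M = U_star @ U_star.T

# Symmetric Gaussian measurement matrices and noiseless measurements y_i = <A_i, M>.
G = rng.standard_normal((m, n, n))
A = (G + G.transpose(0, 2, 1)) / 2
y = np.einsum('kij,ij->k', A, M)

# Stage 1: spectral initialization from the top-r eigenpairs of (1/m) sum_i y_i A_i.
Y = np.einsum('k,kij->ij', y, A) / m
vals, vecs = np.linalg.eigh(Y)
U = vecs[:, -r:] * np.sqrt(np.maximum(vals[-r:], 0.0))

# Stage 2: vanilla gradient descent on the factored least-squares objective.
eta = 0.2                                          # assumed step size
for _ in range(300):
    resid = np.einsum('kij,ij->k', A, U @ U.T) - y
    grad = np.einsum('k,kij->ij', resid, A) @ U / m
    U -= eta * grad

print("relative error:", np.linalg.norm(U @ U.T - M) / np.linalg.norm(M))
```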
