
Maximum likelihood estimation (MLE) is a fundamental problem in statistics. Characteristics of the MLE problem for algebraic statistical models are reflected in the geometry of the likelihood correspondence, a variety that ties together data and their maximum likelihood estimators. We construct the ideal of the likelihood correspondence for the large class of toric models and find a Gröbner basis in the case of complete and joint independence models arising from multi-way contingency tables. These results provide insight into their properties and offer faster computational strategies for solving the MLE problem.
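As a small illustration of the MLE problem for an independence model, the sketch below computes the well-known closed-form estimate for a two-way contingency table, where each cell probability is the product of the row and column proportions. It is not the paper's Gröbner basis construction; the function name `independence_mle` and the example counts are assumptions made for illustration only.

```python
def independence_mle(table):
    """MLE of cell probabilities under row/column independence.

    `table` is a list of rows of non-negative counts (hypothetical input).
    The estimate for cell (i, j) is (row_i total) * (col_j total) / n^2.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[r * c / (n * n) for c in col_totals] for r in row_totals]


# Illustrative 2x2 table of counts (made-up data).
counts = [[10, 20], [30, 40]]
p_hat = independence_mle(counts)
```

For this table the row totals are 30 and 70, the column totals are 40 and 60, and n = 100, so the estimated probability of the first cell is 30 * 40 / 100^2 = 0.12.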

Related content

Likelihood


Estimation/estimator · Relative entropy · Parameter space · INFORMS · Correlation coefficient

February 5, 2024

Quantum Neural Estimation of Entropies

Ziv Goldfeld, Dhrumil Patel, Sreejith Sreekumar, Mark M. Wilde

from arXiv, 14 pages, 2 figures; see also independent works of Shin, Lee, and Jeong at arXiv:2306.14566v1 and Lee, Kwon, and Lee at arXiv:2307.13511v2

Entropy measures quantify the amount of information and correlation present in a quantum system. In practice, when the quantum state is unknown and only copies thereof are available, one must resort to the estimation of such entropy measures. Here we propose a variational quantum algorithm for estimating the von Neumann and Rényi entropies, as well as the measured relative entropy and measured Rényi relative entropy. Our approach first parameterizes a variational formula for the measure of interest by a quantum circuit and a classical neural network, and then optimizes the resulting objective over parameter space. Numerical simulations of our quantum algorithm are provided, using a noiseless quantum simulator. The algorithm provides accurate estimates of the various entropy measures for the examples tested, which makes it a promising approach for use in downstream tasks.

Sample complexity · Networking · ReLU · Sample · Mutually independent

February 4, 2024

On Size-Independent Sample Complexity of ReLU Networks

Mark Sellke

from arXiv, 4 pages

We study the sample complexity of learning ReLU neural networks from the point of view of generalization. Given norm constraints on the weight matrices, a common approach is to estimate the Rademacher complexity of the associated function class. Previously Golowich-Rakhlin-Shamir (2020) obtained a bound independent of the network size (scaling with a product of Frobenius norms) except for a factor of the square-root depth. We give a refinement which often has no explicit depth-dependence at all.

Statistic · Model evaluation · Kalman filter · Approximation · Filter methods

February 2, 2024

Statistical Accuracy of Approximate Filtering Methods

J. A. Carrillo, F. Hoffmann, A. M. Stuart, U. Vaes

from arXiv, To appear in SIAM News

Estimating the statistics of the state of a dynamical system, from partial and noisy observations, is mathematically challenging and finds wide application. Furthermore, the applications are of great societal importance, including problems such as probabilistic weather forecasting and prediction of epidemics. Particle filters provide a well-founded approach to the problem, leading to provably accurate approximations of the statistics. However, these methods perform poorly in high dimensions. In 1994 the idea of ensemble Kalman filtering was introduced by Evensen, leading to a methodology that has been widely adopted in the geophysical sciences and also finds application to quite general inverse problems. However, ensemble Kalman filters have defied rigorous analysis of their statistical accuracy, except in the linear Gaussian setting. In this article we describe recent work which takes first steps to analyze the statistical accuracy of ensemble Kalman filters beyond the linear Gaussian setting. The subject is inherently technical, as it involves the evolution of probability measures according to a nonlinear and nonautonomous dynamical system, and the approximation of this evolution. It can nonetheless be presented in a fairly accessible fashion, understandable with basic knowledge of dynamical systems, numerical analysis and probability.

Kernelization · Optimizer · Minimax · Kernel regression · Generalization error

February 2, 2024

The Optimality of Kernel Classifiers in Sobolev Space

Jianfa Lai, Zhifan Li, Dongming Huang, Qian Lin

from arXiv, 21 pages, 2 figures

Kernel methods are widely used in machine learning, especially for classification problems. However, the theoretical analysis of kernel classification is still limited. This paper investigates the statistical performances of kernel classifiers. With some mild assumptions on the conditional probability $\eta(x)=\mathbb{P}(Y=1\mid X=x)$, we derive an upper bound on the classification excess risk of a kernel classifier using recent advances in the theory of kernel regression. We also obtain a minimax lower bound for Sobolev spaces, which shows the optimality of the proposed classifier. Our theoretical results can be extended to the generalization error of overparameterized neural network classifiers. To make our theoretical results more applicable in realistic settings, we also propose a simple method to estimate the interpolation smoothness of $2\eta(x)-1$ and apply the method to real datasets.

Networking · Generator network · Pair · Learning · Adversarial learning

February 2, 2024

Generative Adversarial Learning of Sinkhorn Algorithm Initializations

Jonathan Geuter, Vaios Laschos

from arXiv, 15 pages, 9 figures

The Sinkhorn algorithm is the state-of-the-art method for approximating solutions of entropic optimal transport (OT) distances between discrete probability distributions. We show that meticulously training a neural network to learn initializations to the algorithm via the entropic OT dual problem can significantly speed up convergence, while maintaining desirable properties of the Sinkhorn algorithm, such as differentiability and parallelizability. We train our predictive network in an adversarial fashion using a second, generating network and a self-supervised bootstrapping loss. The predictive network is universal in the sense that it is able to generalize to any pair of distributions of fixed dimension and cost at inference, and we prove that we can make the generating network universal in the sense that it is capable of producing any pair of distributions during training. Furthermore, we show that our network can even be used as a standalone OT solver to approximate regularized transport distances to a few percent error, which makes it the first meta neural OT solver.
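To make the role of an initialization concrete, here is a minimal sketch of the standard Sinkhorn iteration for entropic OT between two discrete distributions. The optional `v_init` argument is a hypothetical stand-in for a learned initialization; the sketch does not reproduce the paper's network, loss, or parameterization, and the marginals, cost matrix, and regularization value are made-up illustrative inputs.

```python
import math


def sinkhorn(a, b, cost, eps=0.5, v_init=None, n_iter=200):
    """Standard Sinkhorn scaling iterations in pure Python.

    a, b   : source and target marginals (lists summing to 1)
    cost   : cost matrix as a list of rows
    eps    : entropic regularization strength
    v_init : optional starting scaling vector (assumption: this is where a
             learned initialization could be plugged in); defaults to ones
    """
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    v = list(v_init) if v_init is not None else [1.0] * m
    u = [1.0] * n
    for _ in range(n_iter):
        # Alternately rescale to match the row and column marginals.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P_ij = u_i * K_ij * v_j
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


a = [0.5, 0.5]
b = [0.25, 0.75]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost)
```

After convergence the rows of `plan` sum to `a` and the columns sum to `b`. A starting vector closer to the converged scaling would need fewer iterations, which is the motivation for learning initializations.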

Inference · Networking · Processing (programming language) · Estimation/estimator · MoDELS

February 1, 2024

Bayesian Causal Inference with Gaussian Process Networks

Enrico Giudice, Jack Kuipers, Giusi Moffa

Causal discovery and inference from observational data is an essential problem in statistics posing both modeling and computational challenges. These are typically addressed by imposing strict assumptions on the joint distribution such as linearity. We consider the problem of the Bayesian estimation of the effects of hypothetical interventions in the Gaussian Process Network (GPN) model, a flexible causal framework which allows describing the causal relationships nonparametrically. We detail how to perform causal inference on GPNs by simulating the effect of an intervention across the whole network and propagating the effect of the intervention on downstream variables. We further derive a simpler computational approximation by estimating the intervention distribution as a function of local variables only, modeling the conditional distributions via additive Gaussian processes. We extend both frameworks beyond the case of a known causal graph, incorporating uncertainty about the causal structure via Markov chain Monte Carlo methods. Simulation studies show that our approach is able to identify the effects of hypothetical interventions with non-Gaussian, non-linear observational data and accurately reflect the posterior uncertainty of the causal estimates. Finally, we compare the results of our GPN-based causal inference approach to existing methods on a dataset of A. thaliana gene expressions.

Processing (programming language) · Representation theorem · Scenario · Identical · Representation

February 1, 2024

The Algebra of Nondeterministic Finite Automata

Roberto Gorrieri

A process algebra is proposed, whose semantics maps a term to a nondeterministic finite automaton (NFA, for short). We prove a representability theorem: for each NFA $N$, there exists a process algebraic term $p$ such that its semantics is an NFA isomorphic to $N$. Moreover, we provide a concise axiomatization of language equivalence: two NFAs $N_1$ and $N_2$ recognize the same language if and only if the associated terms $p_1$ and $p_2$, respectively, can be equated by means of a set of axioms comprising only 7 axioms plus 3 conditional axioms.

Graphics processing unit · Graph · Identifiable · Neural Networks · Networking

May 31, 2021

On Explainability of Graph Neural Networks via Subgraph Explorations

Hao Yuan, Haiyang Yu, Jie Wang, Kang Li, Shuiwang Ji

from arXiv, Accepted by ICML 2021

We consider the problem of explaining the predictions of graph neural networks (GNNs), which otherwise are considered as black boxes. Existing methods invariably focus on explaining the importance of graph nodes or edges but ignore the substructures of graphs, which are more intuitive and human-intelligible. In this work, we propose a novel method, known as SubgraphX, to explain GNNs by identifying important subgraphs. Given a trained GNN model and an input graph, our SubgraphX explains its predictions by efficiently exploring different subgraphs with Monte Carlo tree search. To make the tree search more effective, we propose to use Shapley values as a measure of subgraph importance, which can also capture the interactions among different subgraphs. To expedite computations, we propose efficient approximation schemes to compute Shapley values for graph data. Our work represents the first attempt to explain GNNs via identifying subgraphs explicitly and directly. Experimental results show that our SubgraphX achieves significantly improved explanations, while keeping computations at a reasonable level.
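The idea of approximating Shapley values by sampling can be sketched with generic permutation sampling, shown below. This is a textbook Monte Carlo estimator and not SubgraphX's own scheme; the function name `shapley_monte_carlo`, the toy players, and the additive value function are all assumptions chosen so the result is easy to check.

```python
import random


def shapley_monte_carlo(players, value, n_samples=200, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled player orderings."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(n_samples):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        prev = value(coalition)
        for p in order:
            # Marginal contribution of p when joining the current coalition.
            coalition.add(p)
            cur = value(coalition)
            totals[p] += cur - prev
            prev = cur
    return {p: t / n_samples for p, t in totals.items()}


# Toy additive game (made-up): a coalition's value is the sum of its weights,
# so each player's Shapley value equals its own weight.
weights = {"a": 1.0, "b": 2.0, "c": 3.0}


def additive_value(coalition):
    return sum(weights[p] for p in coalition)


estimates = shapley_monte_carlo(list(weights), additive_value)
```

In a graph setting the players would be nodes or subgraphs and `value` would query the trained model, which is why reducing the number of evaluations matters.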

Networking · Residual network · Scaling · Weight · Smoothing

May 25, 2021

Scaling Properties of Deep Residual Networks

Alain-Sam Cohen, Rama Cont, Alain Rossier, Renyuan Xu

from arXiv, Published at ICML 2021

Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation or neither of these. These findings cast doubts on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.

Performer · Learned · Curse of dimensionality · Generalization theory · Mathematics

May 9, 2021

The Modern Mathematics of Deep Learning

Julius Berner, Philipp Grohs, Gitta Kutyniok, Philipp Petersen

from arXiv, This review paper will appear as a book chapter in the book "Theory of Deep Learning" by Cambridge University Press

We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful optimization performance despite the non-convexity of the problem, understanding what features are learned, why deep architectures perform exceptionally well in physical problems, and which fine aspects of an architecture affect the behavior of a learning task in which way. We present an overview of modern approaches that yield partial answers to these questions. For selected approaches, we describe the main ideas in more detail.