In the linear regression model, the minimum l2-norm interpolant estimator has received much attention since it was proved to be consistent, under some condition on the covariance matrix $\Sigma$ of the input vector, even though it fits noisy data perfectly; this phenomenon is known as benign overfitting. Motivated by this phenomenon, we study the generalization property of this estimator from a geometrical viewpoint. Our main results extend and improve the convergence rates as well as the deviation probability from [Tsigler and Bartlett]. Our proof differs from the classical bias/variance analysis and is based on the self-induced regularization property introduced in [Bartlett, Montanari and Rakhlin]: the minimum l2-norm interpolant estimator can be written as a sum of a ridge estimator and an overfitting component. The two geometrical properties of random Gaussian matrices at the heart of our analysis are the Dvoretzky-Milman theorem and the isomorphic and restricted isomorphic properties. In particular, the Dvoretzky dimension, which appears naturally in our geometrical viewpoint, coincides with the effective rank and is the key tool for handling the behavior of the design matrix restricted to the subspace where overfitting happens. We extend these results to heavy-tailed scenarios, proving the universality of this phenomenon beyond exponential moment assumptions. This phenomenon was previously unknown and is widely believed to be a significant challenge. It follows from an anisotropic version of the probabilistic Dvoretzky-Milman theorem that holds for heavy-tailed vectors and is of independent interest.
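As a rough illustration of the estimator discussed in this abstract, the sketch below computes a minimum l2-norm interpolant with NumPy's pseudoinverse and checks that it fits noisy data exactly. The isotropic Gaussian design, the dimensions, the noise level, and the names `min_norm_interpolant` and `beta_star` are assumptions made for the example; they are not taken from the paper, and the sketch does not reproduce the paper's ridge-plus-overfitting decomposition or any consistency result.

```python
import numpy as np

def min_norm_interpolant(X, y):
    """Minimum l2-norm solution of X @ beta = y, computed with the pseudoinverse."""
    return np.linalg.pinv(X) @ y

rng = np.random.default_rng(0)
n, p = 20, 200                                    # more features than samples
X = rng.standard_normal((n, p))                   # illustrative isotropic Gaussian design
beta_star = np.zeros(p)
beta_star[:5] = 1.0                               # hypothetical true coefficients
y = X @ beta_star + 0.1 * rng.standard_normal(n)  # noisy responses

beta_hat = min_norm_interpolant(X, y)
fits_exactly = np.allclose(X @ beta_hat, y)       # interpolates the noisy data

# Any other interpolant differs by a null-space direction of X and has a larger norm.
v = rng.standard_normal(p)
v -= np.linalg.pinv(X) @ (X @ v)                  # project v onto the null space of X
other = beta_hat + v

# Ridge regression with a vanishing penalty approaches the same estimator.
lam = 1e-8
ridge = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)
```

Here `other` also interpolates the data but has a strictly larger norm than `beta_hat`, and `ridge` agrees with `beta_hat` up to numerical error, which is the standard "ridgeless limit" view of this estimator.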

Related content

Estimation/Estimator

MoDELS · Comprehensibility · Learning · Transformer models · Transformation

November 7, 2023

LISBET: a self-supervised Transformer model for the automatic segmentation of social behavior motifs

Giuseppe Chindemi,Benoit Girard,Camilla Bellone

Social behavior, defined as the process by which individuals act and react in response to others, is crucial for the function of societies and holds profound implications for mental health. To fully grasp the intricacies of social behavior and identify potential therapeutic targets for addressing social deficits, it is essential to understand its core principles. Although machine learning algorithms have made it easier to study specific aspects of complex behavior, current methodologies tend to focus primarily on single-animal behavior. In this study, we introduce LISBET (seLf-supervIsed Social BEhavioral Transformer), a model designed to detect and segment social interactions. Our model eliminates the need for feature selection and extensive human annotation by using self-supervised learning to detect and quantify social behaviors from dynamic body parts tracking data. LISBET can be used in hypothesis-driven mode to automate behavior classification using supervised finetuning, and in discovery-driven mode to segment social behavior motifs using unsupervised learning. We found that motifs recognized using the discovery-driven approach not only closely match the human annotations but also correlate with the electrophysiological activity of dopaminergic neurons in the Ventral Tegmental Area (VTA). We hope LISBET will help the community improve our understanding of social behaviors and their neural underpinnings.

Vectorization · Operation · Mathematics · Numerical analysis

November 7, 2023

On the generalized vectorization and its inverse

Vitor Curtarelli

Although the vectorization operation is well known and well defined, it is defined only for 2-D matrices, and its inverse is less widely known. This work proposes a generalization of vectorization to higher dimensions and defines its inverse operation mathematically.
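One possible reading of this idea can be sketched with NumPy. The sketch assumes that the generalized vectorization stacks entries with the first index varying fastest (the column-major convention of the usual matrix `vec`); the paper's exact definition may differ, and the names `vec` and `unvec` are hypothetical.

```python
import numpy as np

def vec(a):
    """Stack the entries of an N-D array into a 1-D vector, first index varying fastest."""
    return np.asarray(a).reshape(-1, order="F")

def unvec(v, shape):
    """Inverse of vec: rebuild an array of the given shape from its vectorization."""
    return np.asarray(v).reshape(shape, order="F")

A = np.array([[1, 2],
              [3, 4]])
T = np.arange(24).reshape(2, 3, 4)   # a 3-D example

vec_A = vec(A)                        # columns of A stacked on top of each other
T_back = unvec(vec(T), T.shape)       # round trip through vec and its inverse
```

Note that the inverse needs the original shape as an extra argument, since the flattened vector alone does not determine it.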

Estimation/Estimator · Functional · FC · Extensibility · fMRI

November 7, 2023

Accurate estimation of functional brain connectivity via Bayesian ICA with population-derived priors

Amanda Mejia,David Bolin,Daniel Spencer,Ani Eloyan

from arxiv, 33 pages, 12 figures, plus appendix

Estimation of brain functional connectivity (FC) is essential for understanding the functional organization in the brain and for identifying changes occurring due to neurological disorders, development, treatment, and other phenomena. Independent component analysis (ICA) is a matrix decomposition method that has been used extensively for estimation of brain functional networks and their FC. However, estimation of FC via ICA is often sub-optimal due to the use of ad-hoc methods or need for temporal dimension reduction prior to traditional ICA methods. Bayesian ICA methods can avoid dimension reduction, produce more accurate estimates, and facilitate inference via posterior distributions on the model parameters. In this paper, we propose a novel, computationally efficient Bayesian ICA method with population-derived priors on both the temporal covariance, representing FC, and the spatial components of the model. We propose two algorithms for parameter estimation: a Bayesian Expectation-Maximization algorithm with a Gibbs sampler at the E-step, and a more computationally efficient variational Bayes algorithm. Through extensive simulation studies using realistic fMRI data generation mechanisms, we evaluate the performance of the proposed methods and compare them with existing approaches. Finally, we perform a comprehensive evaluation of the proposed methods using fMRI data from over 400 healthy adults in the Human Connectome Project. Our analyses demonstrate that the proposed Bayesian ICA methods produce highly accurate measures of functional connectivity and spatial brain features. The proposed framework is computationally efficient and applicable to single-subject analysis, making it potentially clinically viable.

Statistic · Threshold · Critic · Optimizer · Sparse

November 7, 2023

Thresholding the higher criticism test statistics for optimality in a heterogeneous setting

Hock Peng Chan

Donoho and Kipnis (2022) showed that the higher criticism (HC) test statistic has a non-Gaussian phase transition, but remarked that it is probably not optimal, in the detection of sparse differences between two large frequency tables when the counts are low. The setting can be considered to be heterogeneous, with cells containing larger total counts more able to detect smaller differences. We provide a general study here of sparse detection arising from such heterogeneous settings, and show that optimality of the HC test statistic requires thresholding: in the case of frequency table comparison, for example, restricting to p-values of cells with total counts exceeding a threshold. The use of thresholding also leads to optimality of the HC test statistic when it is applied to the sparse Poisson means model of Arias-Castro and Wang (2015). The phase transitions we consider here are non-Gaussian, and involve an interplay between the rate functions of the response and sample size distributions. We also show, both theoretically and in a numerical study, that applying thresholding to the Bonferroni test statistic results in better sparse mixture detection in heterogeneous settings.
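The thresholding idea described in this abstract can be sketched as follows. The sketch assumes the standard Donoho-Jin form of the HC statistic, which may differ in detail from the variant studied in the paper; the function names, the `alpha0` fraction, and the `min_count` parameter are illustrative assumptions, and the example p-values and counts are made up.

```python
import math

def higher_criticism(pvalues, alpha0=0.5):
    """Donoho-Jin style HC statistic over the smallest alpha0 fraction of p-values."""
    p = sorted(pvalues)
    n = len(p)
    if n == 0:
        return float("-inf")          # nothing to test
    best = float("-inf")
    for i in range(1, max(1, int(alpha0 * n)) + 1):
        # Clip to avoid division by zero for p-values equal to 0 or 1.
        pi = min(max(p[i - 1], 1e-12), 1 - 1e-12)
        z = math.sqrt(n) * (i / n - pi) / math.sqrt(pi * (1 - pi))
        best = max(best, z)
    return best

def thresholded_hc(pvalues, total_counts, min_count, alpha0=0.5):
    """HC restricted to cells whose total count exceeds min_count."""
    kept = [p for p, c in zip(pvalues, total_counts) if c > min_count]
    return higher_criticism(kept, alpha0)

# Made-up example: four cells, two of which have too few counts to be informative.
pvals = [0.001, 0.9, 0.4, 0.6]
counts = [50, 2, 30, 1]
hc_all = higher_criticism(pvals)
hc_thr = thresholded_hc(pvals, counts, min_count=10)
```

In this toy example the thresholded statistic is computed only from the two cells with counts 50 and 30, so low-count cells no longer dilute the evidence carried by the small p-value.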

Block · Nonconvex · Optimizer · Regularization term · Projection

November 7, 2023

Block majorization-minimization with diminishing radius for constrained nonconvex optimization

Hanbaek Lyu,Yuchen Li

from arxiv, 31 pages, 6 figures. The hyperlinks to equations in the previous version were not compiled correctly. Fixed in the new version

Block majorization-minimization (BMM) is a simple iterative algorithm for nonconvex constrained optimization that sequentially minimizes majorizing surrogates of the objective function in each block coordinate while the other coordinates are held fixed. BMM encompasses a large class of optimization algorithms such as block coordinate descent and its proximal-point variant, expectation-maximization, and block projected gradient descent. We establish that for general constrained nonconvex optimization, BMM with strongly convex surrogates can produce an $\epsilon$-stationary point within $O(\epsilon^{-2}(\log \epsilon^{-1})^{2})$ iterations and asymptotically converges to the set of stationary points. Furthermore, we propose a trust-region variant of BMM that can handle surrogates that are only convex and still obtain the same iteration complexity and asymptotic stationarity. These results hold robustly even when the convex sub-problems are inexactly solved as long as the optimality gaps are summable. As an application, we show that a regularized version of the celebrated multiplicative update algorithm for nonnegative matrix factorization by Lee and Seung has iteration complexity of $O(\epsilon^{-2}(\log \epsilon^{-1})^{2})$. The same result holds for a wide class of regularized nonnegative tensor decomposition algorithms as well as the classical block projected gradient descent algorithm. These theoretical results are validated through various numerical experiments.
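To make the block-wise structure concrete, here is a minimal sketch of the classical (unregularized) Lee-Seung multiplicative update for nonnegative matrix factorization under the Frobenius loss, in which the two factors play the role of the two blocks. It is not the regularized or trust-region variant analyzed in the paper; the function name `nmf_multiplicative`, the small `eps` safeguard, the initialization, and the test matrix are assumptions made for illustration.

```python
import numpy as np

def nmf_multiplicative(V, rank, n_iter=200, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix V by W @ H using alternating multiplicative updates."""
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + 0.1   # strictly positive initialization
    H = rng.random((rank, n)) + 0.1
    losses = []
    for _ in range(n_iter):
        # Block 1: update H with W held fixed.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Block 2: update W with H held fixed.
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        losses.append(0.5 * np.linalg.norm(V - W @ H) ** 2)
    return W, H, losses

rng = np.random.default_rng(1)
V = rng.random((30, 4)) @ rng.random((4, 20))   # nonnegative matrix of rank at most 4
W, H, losses = nmf_multiplicative(V, rank=4)
```

Each update keeps its block nonnegative because it only multiplies by nonnegative ratios, and the recorded `losses` should not increase from one sweep to the next, which is the monotonicity property that majorization-minimization arguments rely on.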

Approximation · Functional · Sample · Robustness · Sparse

November 6, 2023

On efficient algorithms for computing near-best polynomial approximations to high-dimensional, Hilbert-valued functions from limited samples

Ben Adcock,Simone Brugiapaglia,Nick Dexter,Sebastian Moraga

Sparse polynomial approximation has become indispensable for approximating smooth, high- or infinite-dimensional functions from limited samples. This is a key task in computational science and engineering, e.g., surrogate modelling in uncertainty quantification where the function is the solution map of a parametric or stochastic differential equation (DE). Yet, sparse polynomial approximation lacks a complete theory. On the one hand, there is a well-developed theory of best $s$-term polynomial approximation, which asserts exponential or algebraic rates of convergence for holomorphic functions. On the other, there are increasingly mature methods such as (weighted) $\ell^1$-minimization for computing such approximations. While the sample complexity of these methods has been analyzed with compressed sensing, whether they achieve best $s$-term approximation rates is not fully understood. Furthermore, these methods are not algorithms per se, as they involve exact minimizers of nonlinear optimization problems. This paper closes these gaps. Specifically, we consider the following question: are there robust, efficient algorithms for computing approximations to finite- or infinite-dimensional, holomorphic and Hilbert-valued functions from limited samples that achieve best $s$-term rates? We answer this affirmatively by introducing algorithms and theoretical guarantees that assert exponential or algebraic rates of convergence, along with robustness to sampling, algorithmic, and physical discretization errors. We tackle both scalar- and Hilbert-valued functions, this being key to parametric or stochastic DEs. Our results involve significant developments of existing techniques, including a novel restarted primal-dual iteration for solving weighted $\ell^1$-minimization problems in Hilbert spaces. Our theory is supplemented by numerical experiments demonstrating the efficacy of these algorithms.

Estimation/Estimator · MoDELS · Performer · Signal Processing · Analysis

November 6, 2023

Multivariate selfsimilarity: Multiscale eigen-structures for selfsimilarity parameter estimation

Charles-Gérard Lucas,Gustavo Didier,Herwig Wendt,Patrice Abry

from arxiv, 14 pages, 8 figures

Scale-free dynamics, formalized by selfsimilarity, provides a versatile paradigm massively and ubiquitously used to model temporal dynamics in real-world data. However, its practical use has mostly remained univariate so far. By contrast, modern applications often demand multivariate data analysis. Accordingly, models for multivariate selfsimilarity were recently proposed. Nevertheless, they have remained rarely used in practice because of a lack of available robust estimation procedures for the vector of selfsimilarity parameters. Building upon recent mathematical developments, the present work puts forth an efficient estimation procedure based on the theoretical study of the multiscale eigenstructure of the wavelet spectrum of multivariate selfsimilar processes. The estimation performance is studied theoretically in the asymptotic limits of large scale and sample sizes, and computationally for finite-size samples. As a practical outcome, a fully operational and documented multivariate signal processing estimation toolbox is made freely available and is ready for practical use on real-world data. Its potential benefits are illustrated in epileptic seizure prediction from multi-channel EEG data.

Networking · Critic · Learning · Computational cost · GNN

November 5, 2023

A graph-based probabilistic geometric deep learning framework with online enforcement of physical constraints to predict the criticality of defects in porous materials

Vasilis Krokos,Stéphane P. A. Bordas,Pierre Kerfriden

from arxiv, 68 pages; 52 figures

Stress prediction in porous materials and structures is challenging due to the high computational cost associated with direct numerical simulations. Convolutional Neural Network (CNN)-based architectures have recently been proposed as surrogates to approximate and extrapolate the solution of such multiscale simulations. These methodologies are usually limited to 2D problems due to the high computational cost of 3D voxel-based CNNs. We propose a novel geometric learning approach based on a Graph Neural Network (GNN) that efficiently deals with three-dimensional problems by performing convolutions over 2D surfaces only. Following our previous developments using pixel-based CNNs, we train the GNN to automatically add local fine-scale stress corrections to an inexpensively computed coarse stress prediction in the porous structure of interest. Our method is Bayesian and generates densities of stress fields, from which credible intervals may be extracted. As a second scientific contribution, we propose to improve the extrapolation ability of our network by deploying a strategy of online physics-based corrections. Specifically, we condition our probabilistic posterior predictions to satisfy partial equilibrium at the microscale, at the inference stage. This is done using an Ensemble Kalman algorithm, to ensure tractability of the Bayesian conditioning operation. We show that this innovative methodology allows us to alleviate the effect of undesirable biases observed in the outputs of the uncorrected GNN, and improves the accuracy of the predictions in general.

Extensibility · Vectorization · Orthonormal · Rank · Mutually independent

November 5, 2023

Orthonormal representations, vector chromatic number, and extension complexity

Igor Balla

from arxiv, 9 pages; Fixed minor typographical and logical errors

We construct a bipartite generalization of Alon and Szegedy's nearly orthogonal vectors, thereby obtaining strong bounds for several extremal problems involving the Lovász theta function, vector chromatic number, minimum semidefinite rank, nonnegative rank, and extension complexity of polytopes. In particular, we derive a couple of general lower bounds for the vector chromatic number which may be of independent interest.

Functional · Estimation/Estimator · Covariance matrix · Factorized · Specialization

November 4, 2023

Factor-guided estimation of large covariance matrix function with conditional functional sparsity

Dong Li,Xinghao Qiao,Zihan Wang

This paper addresses the fundamental task of estimating covariance matrix functions for high-dimensional functional data/functional time series. We consider two functional factor structures encompassing either functional factors with scalar loadings or scalar factors with functional loadings, and postulate functional sparsity on the covariance of idiosyncratic errors after taking out the common unobserved factors. To facilitate estimation, we rely on the spiked matrix model and its functional generalization, and derive some novel asymptotic identifiability results, based on which we develop DIGIT and FPOET estimators under two functional factor models, respectively. Both estimators involve performing associated eigenanalysis to estimate the covariance of common components, followed by adaptive functional thresholding applied to the residual covariance. We also develop functional information criteria for the purpose of model selection. The convergence rates of estimated factors, loadings, and conditional sparse covariance matrix functions under various functional matrix norms, are respectively established for DIGIT and FPOET estimators. Numerical studies including extensive simulations and two real data applications on mortality rates and functional portfolio allocation are conducted to examine the finite-sample performance of the proposed methodology.