We provide matching upper and lower bounds of order $\sigma^2/\log(d/n)$ for the prediction error of the minimum $\ell_1$-norm interpolator, a.k.a. basis pursuit. Our result is tight up to negligible terms when $d \gg n$, and is the first to imply asymptotic consistency of noisy minimum-norm interpolation for isotropic features and sparse ground truths. Our work complements the literature on "benign overfitting" for minimum $\ell_2$-norm interpolation, where asymptotic consistency can be achieved only when the features are effectively low-dimensional.
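For reference, the estimator discussed above can be written out as follows. The notation is an assumption and is not taken from the abstract: $X \in \mathbb{R}^{n \times d}$ denotes the feature matrix, $y \in \mathbb{R}^n$ the noisy observations, and $w$ the weight vector. The second form is the standard linear-programming reformulation of basis pursuit, obtained by splitting $w$ into its positive and negative parts.

```latex
\hat{w} \in \arg\min_{w \in \mathbb{R}^d} \|w\|_1
\quad \text{subject to} \quad Xw = y,
\qquad \text{equivalently} \qquad
\min_{u, v \in \mathbb{R}^d_{\ge 0}} \mathbf{1}^\top (u + v)
\quad \text{subject to} \quad X(u - v) = y,
\quad \hat{w} = u - v.
```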

Related content

Generalization theory · Neural Networks · Generalization error bounds · Networking · Generalization error

January 14, 2022

Generalization Error Bounds for Iterative Recovery Algorithms Unfolded as Neural Networks

Ekkehard Schnoor, Arash Behboodi, Holger Rauhut

from arxiv, 29 pages, 6 figures

Motivated by the learned iterative soft thresholding algorithm (LISTA), we introduce a general class of neural networks suitable for sparse reconstruction from few linear measurements. By allowing a wide range of degrees of weight-sharing between the layers, we enable a unified analysis for very different neural network types, ranging from recurrent ones to networks more similar to standard feedforward neural networks. Based on training samples, via empirical risk minimization we aim at learning the optimal network parameters and thereby the optimal network that reconstructs signals from their low-dimensional linear measurements. We derive generalization bounds by analyzing the Rademacher complexity of hypothesis classes consisting of such deep networks, that also take into account the thresholding parameters. We obtain estimates of the sample complexity that essentially depend only linearly on the number of parameters and on the depth. We apply our main result to obtain specific generalization bounds for several practical examples, including different algorithms for (implicit) dictionary learning, and convolutional neural networks.
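As background for the abstract above, here is a minimal sketch of the classical (non-learned) iterative soft thresholding algorithm that LISTA-style networks unfold. It is not the authors' network class: the function names, the fixed step size, and the choice to treat each iteration as one "layer" are assumptions made for illustration. In a learned variant, the matrices and thresholds used in each layer would be trainable parameters rather than fixed.

```python
import math


def soft_threshold(v, t):
    """Apply the soft-thresholding operator elementwise: sign(x) * max(|x| - t, 0)."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def ista(A, y, lam, step, num_layers):
    """Run num_layers ISTA iterations for 0.5 * ||A x - y||^2 + lam * ||x||_1.

    Assumption: step is at most 1 / ||A||^2 (squared spectral norm),
    which is the usual condition for convergence.
    """
    At = transpose(A)
    x = [0.0] * len(A[0])
    for _ in range(num_layers):
        residual = [yi - ri for yi, ri in zip(y, matvec(A, x))]
        # Gradient step on the quadratic term, then shrinkage for the l1 term.
        grad_step = [xi + step * gi for xi, gi in zip(x, matvec(At, residual))]
        x = soft_threshold(grad_step, step * lam)
    return x
```

For an identity measurement matrix the iteration reduces to a single shrinkage of the measurements, so `ista([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.5], lam=1.0, step=1.0, num_layers=5)` returns `[2.0, 0.0]`.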

Trace · Discretization · Singular · Model evaluation · Manifold

January 14, 2022

A Geometrically Consistent Trace Finite Element Method For The Laplace-Beltrami Eigenvalue Problem

Song Lu, Xianmin Xu

from arxiv, 23 pages, 6 figures

In this paper, we propose a new trace finite element method for the Laplace-Beltrami eigenvalue problem. The method is formulated directly on a smooth manifold that is implicitly given by a level-set function, and it requires high-order numerical quadrature on the surface. A comprehensive analysis of the method is provided. We show that the eigenvalues of the discrete Laplace-Beltrami operator coincide with only part of the eigenvalues of an embedded problem, which further correspond to the finite eigenvalues of a singular generalized algebraic eigenvalue problem. The finite eigenvalues can be computed efficiently by a rank-completing perturbation algorithm from Hochstenbach et al. (SIAM J. Matrix Anal. Appl., 2019). We prove that the method has an optimal convergence rate. Numerical experiments verify the theoretical analysis and show that geometric consistency can improve numerical accuracy significantly.

Square matrix · Scenario

January 14, 2022

Windmills of the minds: an algorithm for Fermat's Two Squares Theorem

Hing Lun Chan

from arxiv, 14 pages, 6 tables, 10 figures. In Proceedings of the 11th ACM SIGPLAN International Conference on Certified Programs and Proofs (CPP 2022), January 17-18, 2022, Philadelphia, PA, USA

The two squares theorem of Fermat is a gem in number theory, with a spectacular one-sentence "proof from the Book". Here is a formalisation of this proof, with an interpretation using windmill patterns. The underlying theory involves involutions on a finite set, especially the parity of the number of fixed points of the involutions. Starting from an existence proof that is non-constructive, there is an ingenious way to turn it into a constructive one. This gives an algorithm to compute the two squares by iterating the two involutions alternately from a known fixed point.
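The iteration described in this abstract can be sketched in a few lines. This is an illustrative version based on the well-known Zagier involution on triples $(x, y, z)$ with $x^2 + 4yz = p$, not the paper's formalised development; the function names and the trial-division primality check are assumptions added to keep the example self-contained.

```python
def is_prime(n):
    """Trial-division primality test (adequate for small illustrative inputs)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def zagier(x, y, z):
    """Zagier's involution on triples with x*x + 4*y*z == p."""
    if x < y - z:
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:
        return (2 * y - x, y, x - y + z)
    return (x - 2 * y, x - y + z, y)


def flip(x, y, z):
    """The second involution: swap the last two coordinates."""
    return (x, z, y)


def two_squares(p):
    """Return (a, b) with a*a + b*b == p for a prime p congruent to 1 mod 4."""
    if p % 4 != 1 or not is_prime(p):
        raise ValueError("p must be a prime with p % 4 == 1")
    # (1, 1, k) with p = 4k + 1 is the known fixed point of the Zagier involution.
    triple = (1, 1, (p - 1) // 4)
    # Alternate flip and zagier until a fixed point of flip (y == z) is reached.
    while triple[1] != triple[2]:
        triple = zagier(*flip(*triple))
    x, y, _ = triple
    return (x, 2 * y)
```

For example, `two_squares(13)` returns `(3, 2)` and `two_squares(41)` returns `(5, 4)`. The input check matters because the termination argument relies on `p` being prime.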

Statistic · Performer · Inference · Confidence · Variance

January 13, 2022

Interpretation and inference for altmetric indicators arising from sparse data statistics

Lawrence Smolinsky, Bernhard Klingenberg, Brian D. Marx

from arxiv, To appear in the Journal of Informetrics

In 2018 Bornmann and Haunschild (2018a) introduced a new indicator called the Mantel-Haenszel quotient (MHq) to measure alternative metrics (or altmetrics) of scientometric data. In this article we review the Mantel-Haenszel statistics, point out two errors in the literature, and introduce a new indicator. First, we correct the interpretation of MHq and mention that it is still a meaningful indicator. Second, we correct the variance formula for MHq, which leads to narrower confidence intervals. A simulation study shows the superior performance of our variance estimator and confidence intervals. Since MHq does not match its original description in the literature, we propose a new indicator, the Mantel-Haenszel row risk ratio (MHRR), to meet that need. Interpretation and statistical inference for MHRR are discussed. For both MHRR and MHq, a value greater (less) than one means performance is better (worse) than in the reference set called the world.
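As background, the sketch below computes the classical Mantel-Haenszel pooled odds ratio across stratified 2x2 tables. It is not the MHq or MHRR indicator from the abstract, whose exact definitions and corrected variance formula are not given here; the function name and the `(a, b, c, d)` cell layout are assumptions for illustration.

```python
def mantel_haenszel_odds_ratio(strata):
    """Classical Mantel-Haenszel pooled odds ratio.

    strata: iterable of (a, b, c, d) cell counts, one 2x2 table per stratum,
    laid out as [[a, b], [c, d]] (an assumed convention for this sketch).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ZeroDivisionError("pooled odds ratio is undefined for these tables")
    return numerator / denominator
```

With a single stratum the estimate reduces to the ordinary odds ratio `a * d / (b * c)`.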

Optimizer · Global optimization · ReLU · Networking · Regularization term

January 13, 2022

Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs

Tolga Ergen, Mert Pilanci

from arxiv, Accepted to ICML 2021

Understanding the fundamental mechanism behind the success of deep neural networks is one of the key challenges in the modern machine learning literature. Despite numerous attempts, a solid theoretical analysis is yet to be developed. In this paper, we develop a novel unified framework to reveal a hidden regularization mechanism through the lens of convex optimization. We first show that the training of multiple three-layer ReLU sub-networks with weight decay regularization can be equivalently cast as a convex optimization problem in a higher dimensional space, where sparsity is enforced via a group $\ell_1$-norm regularization. Consequently, ReLU networks can be interpreted as high dimensional feature selection methods. More importantly, we then prove that the equivalent convex problem can be globally optimized by a standard convex optimization solver with a polynomial-time complexity with respect to the number of samples and data dimension when the width of the network is fixed. Finally, we numerically validate our theoretical results via experiments involving both synthetic and real datasets.

INFORMS · Graph · Representation learning · Unsupervised · Node

September 15, 2020

Graph InfoClust: Leveraging cluster-level node information for unsupervised graph representation learning

Costas Mavromatis, George Karypis

Unsupervised (or self-supervised) graph representation learning is essential to facilitate various graph data mining tasks when external supervision is unavailable. The challenge is to encode the information about the graph structure and the attributes associated with the nodes and edges into a low dimensional space. Most existing unsupervised methods promote similar representations across nodes that are topologically close. Recently, it was shown that leveraging additional graph-level information, e.g., information that is shared among all nodes, encourages the representations to be mindful of the global properties of the graph, which greatly improves their quality. However, in most graphs, there is significantly more structure that can be captured, e.g., nodes tend to belong to (multiple) clusters that represent structurally similar nodes. Motivated by this observation, we propose a graph representation learning method called Graph InfoClust (GIC), that seeks to additionally capture cluster-level information content. These clusters are computed by a differentiable K-means method and are jointly optimized by maximizing the mutual information between nodes of the same clusters. This optimization leads the node representations to capture richer information and nodal interactions, which improves their quality. Experiments show that GIC outperforms state-of-the-art methods in various downstream tasks (node classification, link prediction, and node clustering) with a 0.9% to 6.1% gain over the best competing approach, on average.

Loss function (machine learning) · Orthogonal · INFORMS · Functional · Learned

January 22, 2019

On orthogonal projections for dimension reduction and applications in variational loss functions for learning problems

Anna Breger, Jose Ignacio Orlando, Pavol Harar, Monika Dörfler, Sophie Klimscha, Christoph Grechenig, Bianca S. Gerendas, Ursula Schmidt-Erfurth, Martin Ehler

The use of orthogonal projections on high-dimensional input and target data in learning frameworks is studied. First, we investigate the relations between two standard objectives in dimension reduction, maximizing variance and preservation of pairwise relative distances. The derivation of their asymptotic correlation and numerical experiments show that a projection usually cannot satisfy both objectives. In a standard classification problem we determine projections on the input data that balance them and compare subsequent results. Next, we extend our application of orthogonal projections to deep learning frameworks. We introduce new variational loss functions that enable integration of additional information via transformations and projections of the target data. In two supervised learning problems, clinical image segmentation and music information classification, the application of the proposed loss functions increases the accuracy.

Likelihood · Estimation/estimator · Maximum likelihood estimation · Maximum likelihood · MoDELS

September 24, 2018

Implicit Maximum Likelihood Estimation

Ke Li, Jitendra Malik

from arxiv, 21 pages, 4 figures. In the interest of promoting discussion, we make the reviews available at //people.eecs.berkeley.edu/~ke.li/papers/imle_reviews.pdf

Implicit probabilistic models are models defined naturally in terms of a sampling procedure and often induce a likelihood function that cannot be expressed explicitly. We develop a simple method for estimating parameters in implicit models that does not require knowledge of the form of the likelihood function or any derived quantities, but can be shown to be equivalent to maximizing likelihood under some conditions. Our result holds in the non-asymptotic parametric setting, where both the capacity of the model and the number of data examples are finite. We also demonstrate encouraging experimental results.

Autoencoder · Regularization term · Interpretability · Critic network · INFORMS

July 19, 2018

Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer

David Berthelot, Colin Raffel, Aurko Roy, Ian Goodfellow

Autoencoders provide a powerful framework for learning compressed representations by encoding all of the information needed to reconstruct a data point in a latent code. In some cases, autoencoders can "interpolate": By decoding the convex combination of the latent codes for two datapoints, the autoencoder can produce an output which semantically mixes characteristics from the datapoints. In this paper, we propose a regularization procedure which encourages interpolated outputs to appear more realistic by fooling a critic network which has been trained to recover the mixing coefficient from interpolated data. We then develop a simple benchmark task where we can quantitatively measure the extent to which various autoencoders can interpolate and show that our regularizer dramatically improves interpolation in this setting. We also demonstrate empirically that our regularizer produces latent codes which are more effective on downstream tasks, suggesting a possible link between interpolation abilities and learning useful representations.
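The interpolation step mentioned in this abstract, decoding a convex combination of two latent codes, can be illustrated with a small sketch. Only the mixing step is shown; the encoder, decoder, and critic are omitted, and the function name and the convention that `alpha` weights the first code are assumptions for illustration.

```python
def interpolate_latents(z1, z2, alpha):
    """Return the convex combination alpha * z1 + (1 - alpha) * z2 of two latent codes."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex combination")
    if len(z1) != len(z2):
        raise ValueError("latent codes must have the same dimension")
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(z1, z2)]
```

In the setting of the abstract, the mixed code would be passed to the decoder, and the critic would be trained to recover `alpha` from the decoded output.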

Denoising · Autoencoder · Adversarial autoencoder · Learned · Unlabelled

January 4, 2018

Denoising Adversarial Autoencoders

Antonia Creswell, Anil Anthony Bharath

from arxiv, submitted to journal

Unsupervised learning is of growing interest because it unlocks the potential held in vast amounts of unlabelled data to learn useful representations for inference. Autoencoders, a form of generative model, may be trained by learning to reconstruct unlabelled input data from a latent representation space. More robust representations may be produced by an autoencoder if it learns to recover clean input samples from corrupted ones. Representations may be further improved by introducing regularisation during training to shape the distribution of the encoded data in latent space. We suggest denoising adversarial autoencoders, which combine denoising and regularisation, shaping the distribution of latent space using adversarial training. We introduce a novel analysis that shows how denoising may be incorporated into the training and sampling of adversarial autoencoders. Experiments are performed to assess the contributions that denoising makes to the learning of representations for classification and sample synthesis. Our results suggest that autoencoders trained using a denoising criterion achieve higher classification performance, and can synthesise samples that are more consistent with the input data than those trained without a corruption process.