五月丁香四月婷婷激情综合-欧美日韩大片一区二区三区

Machine learning algorithms have grown in sophistication over the years and are increasingly deployed for real-life applications. However, when using machine learning techniques in practical settings, particularly in high-risk applications such as medicine and engineering, obtaining the failure probability of the predictive model is critical. We refer to this problem as the risk-assessment task. We focus on regression algorithms and the risk-assessment task of computing the probability of the true label lying inside an interval defined around the model's prediction. We solve the risk-assessment problem using the conformal prediction approach, which provides prediction intervals that are guaranteed to contain the true label with a given probability. Using this coverage property, we prove that our approximated failure probability is conservative in the sense that it is not lower than the true failure probability of the ML algorithm. We conduct extensive experiments to empirically study the accuracy of the proposed method for problems with and without covariate shift. Our analysis focuses on different modeling regimes, dataset sizes, and conformal prediction methodologies.

相關內容

Machine Learning

關注 2240

機器學習（Machine Learning）是一個研究計算學習方法的國際論壇。該雜志發表文章，報告廣泛的學習方法應用于各種學習問題的實質性結果。該雜志的特色論文描述研究的問題和方法，應用研究和研究方法的問題。有關學習問題或方法的論文通過實證研究、理論分析或與心理現象的比較提供了堅實的支持。應用論文展示了如何應用學習方法來解決重要的應用問題。研究方法論文改進了機器學習的研究方法。所有的論文都以其他研究人員可以驗證或復制的方式描述了支持證據。論文還詳細說明了學習的組成部分，并討論了關于知識表示和性能任務的假設。官網地址：

BASIC · 線性組合 · 線性的 · 可約的 · Integration ·

2023 年 11 月 20 日

Generalized extrapolation methods based on compositions of a basic 2nd-order scheme

Sergio Blanes,Fernando Casas,Luke Shaw

from arxiv, 17 figures

We propose new linear combinations of compositions of a basic second-order scheme with appropriately chosen coefficients to construct higher order numerical integrators for differential equations. They can be considered as a generalization of extrapolation methods and multi-product expansions. A general analysis is provided and new methods up to order 8 are built and tested. The new approach is shown to reduce the latency problem when implemented in a parallel environment and leads to schemes that are significantly more efficient than standard extrapolation when the linear combination is delayed by a number of steps.

Performer · 分離的 · Integration · 離散化 · prototype ·

2023 年 11 月 19 日

p-adaptive discontinuous Galerkin method for the shallow water equations on heterogeneous computing architectures

Sara Faghih-Naini,Vadym Aizinger,Sebastian Kuckuk,Richard Angersbach,Harald K?stler

Heterogeneous computing and exploiting integrated CPU-GPU architectures has become a clear current trend since the flattening of Moore's Law. In this work, we propose a numerical and algorithmic re-design of a p-adaptive quadrature-free discontinuous Galerkin method (DG) for the shallow water equations (SWE). Our new approach separates the computations of the non-adaptive (lower-order) and adaptive (higher-order) parts of the discretization form each other. Thereby, we can overlap computations of the lower-order and the higher-order DG solution components. Furthermore, we investigate execution times of main computational kernels and use automatic code generation to optimize their distribution between the CPU and GPU. Several setups, including a prototype of a tsunami simulation in a tide-driven flow scenario, are investigated, and the results show that significant performance improvements can be achieved in suitable setups.

Analysis · Learning · Machine Learning · 示例 · 可理解性 ·

2023 年 11 月 18 日

Biarchetype analysis: simultaneous learning of observations and features based on extremes

Aleix Alcacer,Irene Epifanio,Ximo Gual-Arnau

A new exploratory technique called biarchetype analysis is defined. We extend archetype analysis to find the archetypes of both observations and features simultaneously. The idea of this new unsupervised machine learning tool is to represent observations and features by instances of pure types (biarchetypes) that can be easily interpreted as they are mixtures of observations and features. Furthermore, the observations and features are expressed as mixtures of the biarchetypes, which also helps understand the structure of the data. We propose an algorithm to solve biarchetype analysis. We show that biarchetype analysis offers advantages over biclustering, especially in terms of interpretability. This is because byarchetypes are extreme instances as opposed to the centroids returned by biclustering, which favors human understanding. Biarchetype analysis is applied to several machine learning problems to illustrate its usefulness.

線性的 · 圖 · CASE · Extensibility · 講稿 ·

2023 年 11 月 17 日

The exponential logic of sequentialization

Aurore Alcolei,Luc Pellissier,Alexis Saurin

from arxiv, 17 pages, submitted to MFPS 2023

Linear logic has provided new perspectives on proof-theory, denotational semantics and the study of programming languages. One of its main successes are proof-nets, canonical representations of proofs that lie at the intersection between logic and graph theory. In the case of the minimalist proof-system of multiplicative linear logic without units (MLL), these two aspects are completely fused: proof-nets for this system are graphs satisfying a correctness criterion that can be fully expressed in the language of graphs. For more expressive logical systems (containing logical constants, quantifiers and exponential modalities), this is not completely the case. The purely graphical approach of proof-nets deprives them of any sequential structure that is crucial to represent the order in which arguments are presented, which is necessary for these extensions. Rebuilding this order of presentation - sequentializing the graph - is thus a requirement for a graph to be logical. Presentations and study of the artifacts ensuring that sequentialization can be done, such as boxes or jumps, are an integral part of researches on linear logic. Jumps, extensively studied by Faggian and di Giamberardino, can express intermediate degrees of sequentialization between a sequent calculus proof and a fully desequentialized proof-net. We propose to analyze the logical strength of jumps by internalizing them in an extention of MLL where axioms on a specific formula, the jumping formula, introduce constrains on the possible sequentializations. The jumping formula needs to be treated non-linearly, which we do either axiomatically, or by embedding it in a very controlled fragment of multiplicative-exponential linear logic, uncovering the exponential logic of sequentialization.

Networking · 泛函 · Neural Networks · 近似 · GROUP ·

2023 年 11 月 17 日

Neural approximation of Wasserstein distance via a universal architecture for symmetric and factorwise group invariant functions

Samantha Chen,Yusu Wang

from arxiv, Accepted to NeurIPS 2023

Learning distance functions between complex objects, such as the Wasserstein distance to compare point sets, is a common goal in machine learning applications. However, functions on such complex objects (e.g., point sets and graphs) are often required to be invariant to a wide variety of group actions e.g. permutation or rigid transformation. Therefore, continuous and symmetric product functions (such as distance functions) on such complex objects must also be invariant to the product of such group actions. We call these functions symmetric and factor-wise group invariant (or SFGI functions in short). In this paper, we first present a general neural network architecture for approximating SFGI functions. The main contribution of this paper combines this general neural network with a sketching idea to develop a specific and efficient neural network which can approximate the $p$-th Wasserstein distance between point sets. Very importantly, the required model complexity is independent of the sizes of input point sets. On the theoretical front, to the best of our knowledge, this is the first result showing that there exists a neural network with the capacity to approximate Wasserstein distance with bounded model complexity. Our work provides an interesting integration of sketching ideas for geometric problems with universal approximation of symmetric functions. On the empirical front, we present a range of results showing that our newly proposed neural network architecture performs comparatively or better than other models (including a SOTA Siamese Autoencoder based approach). In particular, our neural network generalizes significantly better and trains much faster than the SOTA Siamese AE. Finally, this line of investigation could be useful in exploring effective neural network design for solving a broad range of geometric optimization problems (e.g., $k$-means in a metric space).

Integration · 不變 · 可約的 · 離散化 · 穩健性 ·

2023 年 11 月 16 日

Conservation properties of the augmented basis update & Galerkin integrator for kinetic problems

Lukas Einkemmer,Jonas Kusch,Steffen Schotth?fer

Numerical simulations of kinetic problems can become prohibitively expensive due to their large memory footprint and computational costs. A method that has proven to successfully reduce these costs is the dynamical low-rank approximation (DLRA). One key question when using DLRA methods is the construction of robust time integrators that preserve the invariances and associated conservation laws of the original problem. In this work, we demonstrate that the augmented basis update & Galerkin integrator (BUG) preserves solution invariances and the associated conservation laws when using a conservative truncation step and an appropriate time and space discretization. We present numerical comparisons to existing conservative integrators and discuss advantages and disadvantages

奇異的 · Microsoft Surface · 有限差分 · AIM · Integration ·

2023 年 11 月 16 日

A numerical method for solving elliptic equations on real closed algebraic curves and surfaces

Wenrui Hao,Jonathan D. Hauenstein,Margaret H. Regan,Tingting Tang

There are many numerical methods for solving partial different equations (PDEs) on manifolds such as classical implicit, finite difference, finite element, and isogeometric analysis methods which aim at improving the interoperability between finite element method and computer aided design (CAD) software. However, these approaches have difficulty when the domain has singularities since the solution at the singularity may be multivalued. This paper develops a novel numerical approach to solve elliptic PDEs on real, closed, connected, orientable, and almost smooth algebraic curves and surfaces. Our method integrates numerical algebraic geometry, differential geometry, and a finite difference scheme which is demonstrated on several examples.

泛化理論 · 黑盒 · 學成 · INFORMS · 監督學習算法 ·

2021 年 10 月 4 日

Information-theoretic generalization bounds for black-box learning algorithms

Hrayr Harutyunyan,Maxim Raginsky,Greg Ver Steeg,Aram Galstyan

from arxiv, NeurIPS 2021

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.

學成 · 深度學習 · Continuity · 貝葉斯推斷 · Networking ·

2020 年 12 月 20 日

Recent advances in deep learning theory

Fengxiang He,Dacheng Tao

Deep learning is usually described as an experiment-driven field under continuous criticizes of lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drives the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns in ethics and security and their relationships with generalizability.

Neural Networks · 優化器 · Networks · 局部極小 · Networking ·

2019 年 12 月 19 日

Optimization for deep learning: theory and algorithms

Ruoyu Sun

from arxiv, 38 pages of main body; 5 pages of appendix; 12 pages of references

When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.