好男人在线观看免费2019-精品自在线观看影片天天看

Nonparametric estimation of nonlocal interaction kernels is crucial in various applications involving interacting particle systems. The inference challenge, situated at the nexus of statistical learning and inverse problems, comes from the nonlocal dependency. A central question is whether the optimal minimax rate of convergence for this problem aligns with the rate of $M^{-\frac{2\beta}{2\beta+1}}$ in classical nonparametric regression, where $M$ is the sample size and $\beta$ represents the smoothness exponent of the radial kernel. Our study confirms this alignment for systems with a finite number of particles. We introduce a tamed least squares estimator (tLSE) that attains the optimal convergence rate for a broad class of exchangeable distributions. The tLSE bridges the smallest eigenvalue of random matrices and Sobolev embedding. This estimator relies on nonasymptotic estimates for the left tail probability of the smallest eigenvalue of the normal matrix. The lower minimax rate is derived using the Fano-Tsybakov hypothesis testing method. Our findings reveal that provided the inverse problem in the large sample limit satisfies a coercivity condition, the left tail probability does not alter the bias-variance tradeoff, and the optimal minimax rate remains intact. Our tLSE method offers a straightforward approach for establishing the optimal minimax rate for models with either local or nonlocal dependency.

相關內容

Minimax

關注 0

Integration · 有向 · 近似 · Performer · state-of-the-art ·

2024 年 1 月 19 日

Efficient third order tensor-oriented directional splitting for exponential integrators

Fabio Cassini

Suitable discretizations through tensor product formulas of popular multidimensional operators (diffusion or diffusion--advection, for instance) lead to matrices with $d$-dimensional Kronecker sum structure. For evolutionary Partial Differential Equations containing such operators and integrated in time with exponential integrators, it is then of paramount importance to efficiently approximate the actions of $\varphi$-functions of the arising matrices. In this work, we show how to produce directional split approximations of third order with respect to the time step size. They conveniently employ tensor-matrix products (the so-called $\mu$-mode product and related Tucker operator, realized in practice with high performance level 3 BLAS), and allow for the effective usage of exponential Runge--Kutta integrators up to order three. The technique can also be efficiently implemented on modern computer hardware such as Graphic Processing Units. The approach has been successfully tested against state-of-the-art techniques on two well-known physical models that lead to Turing patterns, namely the 2D Schnakenberg and the 3D FitzHugh--Nagumo systems, on different architectures.

統計量 · Learning · 估計/估計量 · 方陣 · 線性的 ·

2024 年 1 月 19 日

Least squares approximations in linear statistical inverse learning problems

Tapio Helin

from arxiv, 23 pages

Statistical inverse learning aims at recovering an unknown function $f$ from randomly scattered and possibly noisy point evaluations of another function $g$, connected to $f$ via an ill-posed mathematical model. In this paper we blend statistical inverse learning theory with the classical regularization strategy of applying finite-dimensional projections. Our key finding is that coupling the number of random point evaluations with the choice of projection dimension, one can derive probabilistic convergence rates for the reconstruction error of the maximum likelihood (ML) estimator. Convergence rates in expectation are derived with a ML estimator complemented with a norm-based cut-off operation. Moreover, we prove that the obtained rates are minimax optimal.

MoDELS · 優化器 · 集成 · Learning · 決策函數 ·

2024 年 1 月 18 日

Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree

Giulia Di Teodoro,Marta Monaci,Laura Palagi

from arxiv, 44 pages, 9 figures, 20 tables

The interpretability of models has become a crucial issue in Machine Learning because of algorithmic decisions' growing impact on real-world applications. Tree ensemble methods, such as Random Forests or XgBoost, are powerful learning tools for classification tasks. However, while combining multiple trees may provide higher prediction quality than a single one, it sacrifices the interpretability property resulting in "black-box" models. In light of this, we aim to develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior. First, given a target tree-ensemble model, we develop a hierarchical visualization tool based on a heatmap representation of the forest's feature use, considering the frequency of a feature and the level at which it is selected as an indicator of importance. Next, we propose a mixed-integer linear programming (MILP) formulation for constructing a single optimal multivariate tree that accurately mimics the target model predictions. The goal is to provide an interpretable surrogate model based on oblique hyperplane splits, which uses only the most relevant features according to the defined forest's importance indicators. The MILP model includes a penalty on feature selection based on their frequency in the forest to further induce sparsity of the splits. The natural formulation has been strengthened to improve the computational performance of {mixed-integer} software. Computational experience is carried out on benchmark datasets from the UCI repository using a state-of-the-art off-the-shelf solver. Results show that the proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.

峰值 · Continuity · 泛函 · Automator · 相互獨立的 ·

2024 年 1 月 17 日

The dynamics of belief: continuously monitoring and visualising complex systems

Edwin J. Beggs,John V. Tucker

from arxiv, Comments welcome

The rise of AI in human contexts places new demands on automated systems to be transparent and explainable. We examine some anthropomorphic ideas and principles relevant to such accountablity in order to develop a theoretical framework for thinking about digital systems in complex human contexts and the problem of explaining their behaviour. Structurally, systems are made of modular and hierachical components, which we abstract in a new system model using notions of modes and mode transitions. A mode is an independent component of the system with its own objectives, monitoring data, and algorithms. The behaviour of a mode, including its transitions to other modes, is determined by functions that interpret each mode's monitoring data in the light of its objectives and algorithms. We show how these belief functions can help explain system behaviour by visualising their evaluation as trajectories in higher-dimensional geometric spaces. These ideas are formalised mathematically by abstract and concrete simplicial complexes. We offer three techniques: a framework for design heuristics, a general system theory based on modes, and a geometric visualisation, and apply them in three types of human-centred systems.

Attention · 解碼 · Performer · 變換 · state-of-the-art ·

2024 年 1 月 17 日

A DenseNet-based method for decoding auditory spatial attention with EEG

Xiran Xu,Bo Wang,Yujie Yan,Xihong Wu,Jing Chen

from arxiv, 5 pages, 3 figures, has been accepted by ICASSP 2024

Auditory spatial attention detection (ASAD) aims to decode the attended spatial location with EEG in a multiple-speaker setting. ASAD methods are inspired by the brain lateralization of cortical neural responses during the processing of auditory spatial attention, and show promising performance for the task of auditory attention decoding (AAD) with neural recordings. In the previous ASAD methods, the spatial distribution of EEG electrodes is not fully exploited, which may limit the performance of these methods. In the present work, by transforming the original EEG channels into a two-dimensional (2D) spatial topological map, the EEG data is transformed into a three-dimensional (3D) arrangement containing spatial-temporal information. And then a 3D deep convolutional neural network (DenseNet-3D) is used to extract temporal and spatial features of the neural representation for the attended locations. The results show that the proposed method achieves higher decoding accuracy than the state-of-the-art (SOTA) method (94.3% compared to XANet's 90.6%) with 1-second decision window for the widely used KULeuven (KUL) dataset, and the code to implement our work is available on Github: //github.com/xuxiran/ASAD_DenseNet

儲層計算 · 模型評估 · 動力系統 · 均方誤差 · 均方根 ·

2024 年 1 月 17 日

Reservoir computing with logistic map

R. Arun,M. Sathish Aravindh,A. Venkatesan,M. Lakshmanan

from arxiv, Submitted for publication in Physical Review E

Recent studies on reservoir computing essentially involve a high dimensional dynamical system as the reservoir, which transforms and stores the input as a higher dimensional state, for temporal and nontemporal data processing. We demonstrate here a method to predict temporal and nontemporal tasks by constructing virtual nodes as constituting a reservoir in reservoir computing using a nonlinear map, namely logistic map, and a simple finite trigonometric series. We predict three nonlinear systems, namely Lorenz, R\"ossler, and Hindmarsh-Rose, for temporal tasks and a seventh order polynomial for nontemporal tasks with great accuracy. Also, the prediction is made in the presence of noise and found to closely agree with the target. Remarkably, the logistic map performs well and predicts close to the actual or target values. The low values of the root mean square error confirm the accuracy of this method in terms of efficiency. Our approach removes the necessity of continuous dynamical systems for constructing the reservoir in reservoir computing. Moreover, the accurate prediction for the three different nonlinear systems suggests that this method can be considered a general one and can be applied to predict many systems. Finally, we show that the method also accurately anticipates the time series for the future (self prediction).

Neural Networks · Networking · Processing（編程語言） · 目標檢測 · 控制器 ·

2024 年 1 月 17 日

Design and development of opto-neural processors for simulation of neural networks trained in image detection for potential implementation in hybrid robotics

Sanjana Shetty

Neural networks have been employed for a wide range of processing applications like image processing, motor control, object detection and many others. Living neural networks offer advantages of lower power consumption, faster processing, and biological realism. Optogenetics offers high spatial and temporal control over biological neurons and presents potential in training live neural networks. This work proposes a simulated living neural network trained indirectly by backpropagating STDP based algorithms using precision activation by optogenetics achieving accuracy comparable to traditional neural network training algorithms.

Performer · Integration · 代碼 · 設計 · 并行計算 ·

2024 年 1 月 16 日

Monitoring the development of CFD applications on unstable HPC platforms

Damien Dosimont,Guillaume Houzeaux

from arxiv, ParCFD2023 34th International Conference on Parallel Computational Fluid Dynamics, May 29-31 2023, Cuenca, Ecuador

We tackle the challenging tasks of monitoring on unstable HPC platforms the performance of CFD applications all along their development. We have designed and implemented a monitoring framework, integrated at the end of a CI-CD pipeline. Measures retrieved during the automatic execution of production simulations are analyzed within a visual analytics interface we developed, providing advanced visualizations and interaction. We have validated this approach by monitoring the CFD code Alya over two years, detecting and resolving issues related to the platform, and highlighting performance improvement.

學成 · 深度學習 · Continuity · 貝葉斯推斷 · Networking ·

2020 年 12 月 20 日

Recent advances in deep learning theory

Fengxiang He,Dacheng Tao

Deep learning is usually described as an experiment-driven field under continuous criticizes of lacking theoretical foundations. This problem has been partially fixed by a large volume of literature which has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized in six groups: (1) complexity and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their dynamic systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drives the trajectories of the dynamic systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns in ethics and security and their relationships with generalizability.

哈希學習 · 圖像檢索 · 卷積神經網絡 · 優化器 · 損失函數（機器學習） ·

2020 年 6 月 10 日

A survey on deep hashing for image retrieval

Xiaopeng Zhang

Hashing has been widely used in approximate nearest search for large-scale database retrieval for its computation and storage efficiency. Deep hashing, which devises convolutional neural network architecture to exploit and extract the semantic information or feature of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated and I conclude three main different directions for deep supervised hashing methods. Several comments are made at the end. Moreover, to break through the bottleneck of the existing hashing methods, I propose a Shadow Recurrent Hashing(SRH) method as a try. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function to encourage similar images projected close. To this end, I propose a concept: shadow of the CNN output. During optimization process, the CNN output and its shadow are guiding each other so as to achieve the optimal solution as much as possible. Several experiments on dataset CIFAR-10 show the satisfying performance of SRH.