动漫AV观看网站不卡无码,欧美成人亚洲国产中文精品,新强乱中文字幕在线播放,黄色视频高清在线观看网页,最新中文字幕一区二区在线

Latent linear dynamical systems with Bernoulli observations provide a powerful modeling framework for identifying the temporal dynamics underlying binary time series data, which arise in a variety of contexts such as binary decision-making and discrete stochastic processes (e.g., binned neural spike trains). Here we develop a spectral learning method for fast, efficient fitting of probit-Bernoulli latent linear dynamical system (LDS) models. Our approach extends traditional subspace identification methods to the Bernoulli setting via a transformation of the first and second sample moments. This results in a robust, fixed-cost estimator that avoids the hazards of local optima and the long computation time of iterative fitting procedures like the expectation-maximization (EM) algorithm. In regimes where data is limited or assumptions about the statistical structure of the data are not met, we demonstrate that the spectral estimate provides a good initialization for Laplace-EM fitting. Finally, we show that the estimator provides substantial benefits to real world settings by analyzing data from mice performing a sensory decision-making task.

相關內容

線性的

關注 1

量子機器學習 · Machine Learning · Learning · 泛化理論 · 核化 ·

2023 年 9 月 19 日

Coreset selection can accelerate quantum machine learning models with provable generalization

Yiming Huang,Huiyuan Wang,Yuxuan Du,Xiao Yuan

from arxiv, 25 pages, 7 figures

Quantum neural networks (QNNs) and quantum kernels stand as prominent figures in the realm of quantum machine learning, poised to leverage the nascent capabilities of near-term quantum computers to surmount classical machine learning challenges. Nonetheless, the training efficiency challenge poses a limitation on both QNNs and quantum kernels, curbing their efficacy when applied to extensive datasets. To confront this concern, we present a unified approach: coreset selection, aimed at expediting the training of QNNs and quantum kernels by distilling a judicious subset from the original training dataset. Furthermore, we analyze the generalization error bounds of QNNs and quantum kernels when trained on such coresets, unveiling the comparable performance with those training on the complete original dataset. Through systematic numerical simulations, we illuminate the potential of coreset selection in expediting tasks encompassing synthetic data classification, identification of quantum correlations, and quantum compiling. Our work offers a useful way to improve diverse quantum machine learning models with a theoretical guarantee while reducing the training cost.

核化 · Jupyter · 端到端 · 縮放 · AMD ·

2023 年 9 月 19 日

Julia as a unifying end-to-end workflow language on the Frontier exascale system

William F. Godoy,Pedro Valero-Lara,Caira Anderson,Katrina W. Lee,Ana Gainaru,Rafael Ferreira da Silva,Jeffrey S. Vetter

from arxiv, 11 pages, 8 figures, accepted at the 18th Workshop on Workflows in Support of Large-Scale Science (WORKS23), IEEE/ACM The International Conference for High Performance Computing, Networking, Storage, and Analysis, SC23

We evaluate using Julia as a single language and ecosystem paradigm powered by LLVM to develop workflow components for high-performance computing. We run a Gray-Scott, 2-variable diffusion-reaction application using a memory-bound, 7-point stencil kernel on Frontier, the US Department of Energy's first exascale supercomputer. We evaluate the feasibility, performance, scaling, and trade-offs of (i) the computational kernel on AMD's MI250x GPUs, (ii) weak scaling up to 4,096 MPI processes/GPUs or 512 nodes, (iii) parallel I/O writes using the ADIOS2 library bindings, and (iv) Jupyter Notebooks for interactive data analysis. Our results suggest that although Julia generates a reasonable LLVM-IR kernel, a nearly 50\% performance difference exists vs. native AMD HIP stencil codes when running on the GPUs. As expected, we observed near-zero overhead when using MPI and parallel I/O bindings for system-wide installed implementations. Consequently, Julia emerges as a compelling high-performance and high-productivity workflow composition strategy, as measured on the fastest supercomputer in the world.

知識 (knowledge) · Learning · state-of-the-art · MoDELS · 基準 ·

2023 年 9 月 18 日

Multi-modality Meets Re-learning: Mitigating Negative Transfer in Sequential Recommendation

Bo Peng,Srinivasan Parthasarathy,Xia Ning

Learning effective recommendation models from sparse user interactions represents a fundamental challenge in developing modern sequential recommendation methods. Recently, pre-training-based methods have been developed to tackle this challenge. The key idea behind these methods is to learn transferable knowledge from related tasks (i.e., auxiliary tasks) via pre-training and adapt the knowledge to the task of interest (i.e., target task) to mitigate its data sparsity, thereby enabling more accurate recommendations. Though promising, in this paper, we show that existing methods suffer from the notorious negative transfer issue, where the model adapted from the pre-trained model results in worse performance compared to the model learned from scratch in the target task. To address this issue, we develop a method, denoted as ANT, for transferable sequential recommendation. Compared to existing methods, ANT mitigates negative transfer by 1) incorporating multi-modality item information, including item texts, images and prices, to effectively learn more transferable knowledge from auxiliary tasks; and 2) better capturing task-specific knowledge in the target task using a re-learning-based adaptation strategy. We evaluate ANT against eight state-of-the-art baseline methods on five target tasks. Our experimental results demonstrate that ANT does not suffer from the negative transfer issue on any of the five tasks. The results also demonstrate that ANT substantially outperforms the state-of-the-art baseline methods in five target tasks with an improvement of as much as 15.2%. Our analysis highlights the utility of item texts, images and prices together for sequential recommendation. It also demonstrates that our re-learning-based strategy is more effective than fine-tuning on all five target tasks.

Integration · MoDELS · 3D · 相互獨立的 · CASES ·

2023 年 9 月 18 日

Exponential time integration for 3D compressible atmospheric models

Greg Rainwater,Kevin C. Viner,P. Alex Reinecke

Recent advancements in evaluating matrix-exponential functions have opened the doors to the practical use of exponential time-integration methods in numerical weather prediction (NWP). The success of exponential methods in shallow water simulations has led to the question of whether they can be beneficial in a 3D atmospheric model. In this paper, we take the first step forward by evaluating the behavior of exponential time-integration methods in the Navy's compressible deep-atmosphere nonhydrostatic global model (NEPTUNE-Navy Environmental Prediction sysTem Utilizing a Nonhydrostatic Engine). Simulations are conducted on a set of idealized test cases designed to assess key features of a nonhydrostatic model and demonstrate that exponential integrators capture the desired large and small-scale traits, yielding results comparable to those found in the literature. We propose a new upper boundary absorbing layer independent of reference state and shown to be effective in both idealized and real-data simulations. A real-data forecast using an exponential method with full physics is presented, providing a positive outlook for using exponential integrators for NWP.

MoDELS · xgboost · 穩健性 · Performer · Learning ·

2023 年 9 月 18 日

Deep incremental learning models for financial temporal tabular datasets with distribution shifts

Thomas Wong,Mauricio Barahona

We present a robust deep incremental learning framework for regression tasks on financial temporal tabular datasets which is built upon the incremental use of commonly available tabular and time series prediction models to adapt to distributional shifts typical of financial datasets. The framework uses a simple basic building block (decision trees) to build self-similar models of any required complexity to deliver robust performance under adverse situations such as regime changes, fat-tailed distributions, and low signal-to-noise ratios. As a detailed study, we demonstrate our scheme using XGBoost models trained on the Numerai dataset and show that a two layer deep ensemble of XGBoost models over different model snapshots delivers high quality predictions under different market regimes. We also show that the performance of XGBoost models with different number of boosting rounds in three scenarios (small, standard and large) is monotonically increasing with respect to model size and converges towards the generalisation upper bound. We also evaluate the robustness of the model under variability of different hyperparameters, such as model complexity and data sampling settings. Our model has low hardware requirements as no specialised neural architectures are used and each base model can be independently trained in parallel.

MoDELS · 語音識別 · 自動語音識別 · Performer · 后驗分布 ·

2023 年 9 月 18 日

Training dynamic models using early exits for automatic speech recognition on resource-constrained devices

George August Wright,Umberto Cappellazzo,Salah Zaiem,Desh Raj,Lucas Ondel Yang,Daniele Falavigna,Alessio Brutti

The possibility of dynamically modifying the computational load of neural models at inference time is crucial for on-device processing, where computational power is limited and time-varying. Established approaches for neural model compression exist, but they provide architecturally static models. In this paper, we investigate the use of early-exit architectures, that rely on intermediate exit branches, applied to large-vocabulary speech recognition. This allows for the development of dynamic models that adjust their computational cost to the available resources and recognition performance. Unlike previous works, besides using pre-trained backbones we also train the model from scratch with an early-exit architecture. Experiments on public datasets show that early-exit architectures from scratch not only preserve performance levels when using fewer encoder layers, but also improve task accuracy as compared to using single-exit models or using pre-trained models. Additionally, we investigate an exit selection strategy based on posterior probabilities as an alternative to frame-based entropy.

有偏 · state-of-the-art · MoDELS · 地球 · Learning ·

2023 年 9 月 15 日

Deep learning for bias-correcting CMIP6-class Earth system models

Philipp Hess,Stefan Lange,Christof Sch?tz,Niklas Boers

The accurate representation of precipitation in Earth system models (ESMs) is crucial for reliable projections of the ecological and socioeconomic impacts in response to anthropogenic global warming. The complex cross-scale interactions of processes that produce precipitation are challenging to model, however, inducing potentially strong biases in ESM fields, especially regarding extremes. State-of-the-art bias correction methods only address errors in the simulated frequency distributions locally at every individual grid cell. Improving unrealistic spatial patterns of the ESM output, which would require spatial context, has not been possible so far. Here, we show that a post-processing method based on physically constrained generative adversarial networks (cGANs) can correct biases of a state-of-the-art, CMIP6-class ESM both in local frequency distributions and in the spatial patterns at once. While our method improves local frequency distributions equally well as gold-standard bias-adjustment frameworks, it strongly outperforms any existing methods in the correction of spatial patterns, especially in terms of the characteristic spatial intermittency of precipitation extremes.

假陽性 · MoDELS · 假正例率 · state-of-the-art · 分離的 ·

2023 年 9 月 15 日

Predictive change point detection for heterogeneous data

Anna-Christina Glock,Florian Sobieczky,Johannes Fürnkranz,Peter Filzmoser,Martin Jech

A change point detection (CPD) framework assisted by a predictive machine learning model called "Predict and Compare" is introduced and characterised in relation to other state-of-the-art online CPD routines which it outperforms in terms of false positive rate and out-of-control average run length. The method's focus is on improving standard methods from sequential analysis such as the CUSUM rule in terms of these quality measures. This is achieved by replacing typically used trend estimation functionals such as the running mean with more sophisticated predictive models (Predict step), and comparing their prognosis with actual data (Compare step). The two models used in the Predict step are the ARIMA model and the LSTM recursive neural network. However, the framework is formulated in general terms, so as to allow the use of other prediction or comparison methods than those tested here. The power of the method is demonstrated in a tribological case study in which change points separating the run-in, steady-state, and divergent wear phases are detected in the regime of very few false positives.

Learning · 約束 · 多峰值 · Storage · ForCES ·

2023 年 9 月 15 日

Solving multiphysics-based inverse problems with learned surrogates and constraints

Ziyi Yin,Rafael Orozco,Mathias Louboutin,Felix J. Herrmann

Solving multiphysics-based inverse problems for geological carbon storage monitoring can be challenging when multimodal time-lapse data are expensive to collect and costly to simulate numerically. We overcome these challenges by combining computationally cheap learned surrogates with learned constraints. Not only does this combination lead to vastly improved inversions for the important fluid-flow property, permeability, it also provides a natural platform for inverting multimodal data including well measurements and active-source time-lapse seismic data. By adding a learned constraint, we arrive at a computationally feasible inversion approach that remains accurate. This is accomplished by including a trained deep neural network, known as a normalizing flow, which forces the model iterates to remain in-distribution, thereby safeguarding the accuracy of trained Fourier neural operators that act as surrogates for the computationally expensive multiphase flow simulations involving partial differential equation solves. By means of carefully selected experiments, centered around the problem of geological carbon storage, we demonstrate the efficacy of the proposed constrained optimization method on two different data modalities, namely time-lapse well and time-lapse seismic data. While permeability inversions from both these two modalities have their pluses and minuses, their joint inversion benefits from either, yielding valuable superior permeability inversions and CO2 plume predictions near, and far away, from the monitoring wells.

模型評估 · MoDELS · 學成 · AIM · 特化 ·

2019 年 1 月 14 日

Interpretable machine learning: definitions, methods, and applications

W. James Murdoch,Chandan Singh,Karl Kumbier,Reza Abbasi-Asl,Bin Yu

from arxiv, 11 pages

Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related, and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the Predictive, Descriptive, Relevant (PDR) framework for discussing interpretations. The PDR framework provides three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post-hoc categories, with sub-groups including sparsity, modularity and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often under-appreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.