
Deep Neural Collapse (DNC) refers to the surprisingly rigid structure of the data representations in the final layers of Deep Neural Networks (DNNs). Though the phenomenon has been measured in a variety of settings, its emergence is typically explained via data-agnostic approaches, such as the unconstrained features model. In this work, we introduce a data-dependent setting where DNC forms due to feature learning through the average gradient outer product (AGOP). The AGOP is defined with respect to a learned predictor and is equal to the uncentered covariance matrix of its input-output gradients averaged over the training dataset. The Deep Recursive Feature Machine (Deep RFM) is a method that constructs a neural network by iteratively mapping the data with the AGOP and applying an untrained random feature map. We demonstrate empirically that DNC occurs in Deep RFM across standard settings as a consequence of the projection with the AGOP matrix computed at each layer. Further, we theoretically explain DNC in Deep RFM in an asymptotic setting and as a result of kernel learning. We then provide evidence that this mechanism holds for neural networks more generally. In particular, we show that the right singular vectors and values of the weights can be responsible for the majority of within-class variability collapse for DNNs trained in the feature learning regime. As observed in recent work, this singular structure is highly correlated with that of the AGOP.
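The AGOP definition above can be made concrete with a small NumPy sketch. It assumes a generic predictor `f` mapping a d-dimensional input to a c-dimensional output and approximates its Jacobian by central finite differences; the helper names `jacobian_fd` and `agop` are illustrative and do not come from the paper. For a linear predictor f(x) = Wx the Jacobian is W everywhere, so the AGOP should reduce to W^T W, which gives an easy sanity check.

```python
import numpy as np

def jacobian_fd(f, x, eps=1e-5):
    """Central finite-difference Jacobian of f at x, shape (c, d)."""
    x = np.asarray(x, dtype=float)
    cols = []
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        diff = np.atleast_1d(f(x + step)) - np.atleast_1d(f(x - step))
        cols.append(diff / (2 * eps))
    return np.stack(cols, axis=1)

def agop(f, X, eps=1e-5):
    """Average of J(x)^T J(x) over the rows of X; the result is (d, d)."""
    d = X.shape[1]
    M = np.zeros((d, d))
    for x in X:
        J = jacobian_fd(f, x, eps)
        M += J.T @ J
    return M / X.shape[0]

# Illustrative data and predictor (assumptions, not from the paper).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
X = rng.normal(size=(20, 5))
M_linear = agop(lambda x: W @ x, X)
```

In Deep RFM as described above, a matrix of this kind is used to map the data at each layer before an untrained random feature map is applied; that step is not shown here.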

Related content

Average gradient


Linear · Scalar · Lipschitz · Lipschitz continuous · Discretization

November 13, 2024

Cost-optimal adaptive FEM with linearization and algebraic solver for semilinear elliptic PDEs

Maximilian Brunner,Dirk Praetorius,Julian Streitberger

We consider scalar semilinear elliptic PDEs, where the nonlinearity is strongly monotone, but only locally Lipschitz continuous. To linearize the arising discrete nonlinear problem, we employ a damped Zarantonello iteration, which leads to a linear Poisson-type equation that is symmetric and positive definite. The resulting system is solved by a contractive algebraic solver such as a multigrid method with local smoothing. We formulate a fully adaptive algorithm that equibalances the various error components coming from mesh refinement, iterative linearization, and algebraic solver. We prove that the proposed adaptive iteratively linearized finite element method (AILFEM) guarantees convergence with optimal complexity, where the rates are understood with respect to the overall computational cost (i.e., the computational time). Numerical experiments investigate the involved adaptivity parameters.
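The damped Zarantonello linearization mentioned above can be illustrated on a much simpler stand-in problem. The sketch below assumes a 1D finite-difference discretization of -u'' + u^3 = f on (0, 1) with zero boundary values, a fixed damping parameter `delta`, and a direct linear solve in place of the multigrid solver. It has no mesh adaptivity and is not the AILFEM algorithm from the paper; the function name and all parameter values are illustrative.

```python
import numpy as np

def zarantonello_1d(f_vals, nonlinearity, delta=0.5, n_iter=60):
    """Damped Zarantonello iteration for A u + b(u) = f on a uniform grid.

    Each step solves one symmetric positive definite Poisson-type system:
        A (u_new - u) = -delta * (A u + b(u) - f)
    """
    n = f_vals.size
    h = 1.0 / (n + 1)
    # Standard three-point finite-difference Laplacian (SPD).
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    u = np.zeros(n)
    residual_norms = []
    for _ in range(n_iter):
        r = A @ u + nonlinearity(u) - f_vals
        residual_norms.append(np.linalg.norm(r))
        u = u - delta * np.linalg.solve(A, r)
    return u, residual_norms

# Illustrative setup: f = 1 and the monotone nonlinearity b(u) = u^3.
u_sol, res_hist = zarantonello_1d(np.ones(50), lambda u: u**3)
```

The point of the linearization is that every iteration only needs a solve with the same linear operator, which is where a contractive algebraic solver would be plugged in.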

Noise · MoDELS · CASE · Decomposition · Regularization term

November 13, 2024

Noisy image decomposition: a new structure, texture and noise model based on local adaptivity

Jerome Gilles

from arxiv, arXiv admin note: text overlap with arXiv:2411.05265

These last few years, image decomposition algorithms have been proposed to split an image into two parts: the structures and the textures. These algorithms are not adapted to the case of noisy images because the textures are corrupted by noise. In this paper, we propose a new model which decomposes an image into three parts (structures, textures and noise) based on a local regularization scheme. We compare our results with the recent work of Aujol and Chambolle. We finish by giving another model which combines the advantages of the two previous ones.

Paper · Natural language processing

November 12, 2024

Annotating Constructions with UD: the experience of the Italian Constructicon

Ludovica Pannitto,Beatrice Bernasconi,Lucia Busso,Flavio Pisciotta,Giulia Rambelli,Francesca Masini

The paper describes a first attempt at linking the Italian constructicon to UD resources.

Estimation/Estimator · 3D · Dataset · Pair · Round

November 11, 2024

Extreme Rotation Estimation in the Wild

Hana Bezalel,Dotan Ankri,Ruojin Cai,Hadar Averbuch-Elor

from arxiv, Project webpage: //tau-vailab.github.io/ExtremeRotationsInTheWild/

We present a technique and benchmark dataset for estimating the relative 3D orientation between a pair of Internet images captured in an extreme setting, where the images have limited or non-overlapping fields of view. Prior work targeting extreme rotation estimation assumes constrained 3D environments and emulates perspective images by cropping regions from panoramic views. However, real images captured in the wild are highly diverse, exhibiting variation in both appearance and camera intrinsics. In this work, we propose a Transformer-based method for estimating relative rotations in extreme real-world settings, and contribute the ExtremeLandmarkPairs dataset, assembled from scene-level Internet photo collections. Our evaluation demonstrates that our approach succeeds in estimating the relative rotations in a wide variety of extreme-view Internet image pairs, outperforming various baselines, including dedicated rotation estimation techniques and contemporary 3D reconstruction methods.

AI · MoDELS · Model evaluation · Generalization theory · Language modeling

November 11, 2024

LA4SR: illuminating the dark proteome with generative AI

David R. Nelson,Ashish Kumar Jaiswal,Noha Ismail,Alexandra Mystikou,Kourosh Salehi-Ashtiani

AI language models (LMs) show promise for biological sequence analysis. We re-engineered open-source LMs (GPT-2, BLOOM, DistilRoBERTa, ELECTRA, and Mamba, ranging from 70M to 12B parameters) for microbial sequence classification. The models achieved F1 scores up to 95 and operated 16,580x faster and at 2.9x the recall of BLASTP. They effectively classified the algal dark proteome - uncharacterized proteins comprising about 65% of total proteins - validated on new data including a new, complete Hi-C/Pacbio Chlamydomonas genome. Larger (>1B) LA4SR models reached high accuracy (F1 > 86) when trained on less than 2% of available data, rapidly achieving strong generalization capacity. High accuracy was achieved when training data had intact or scrambled terminal information, demonstrating robust generalization to incomplete sequences. Finally, we provide custom AI explainability software tools for attributing amino acid patterns to AI generative processes and interpret their outputs in evolutionary and biophysical contexts.

Processing (programming language) · MoDELS · Markov · Markov process · Functional

November 11, 2024

A mixture transition distribution modeling for higher-order circular Markov processes

Hiroaki Ogata,Takayuki Shiohama

The stationary higher-order Markov process for circular data is considered. We employ the mixture transition distribution (MTD) model to express the transition density of the process on the circle. The underlying circular transition distribution is based on Wehrly and Johnson's bivariate joint circular models. The structures of the circular autocorrelation function together with the circular partial autocorrelation function are found to be similar to those of the autocorrelation and partial autocorrelation functions of the real-valued autoregressive process when the underlying binding density has zero sine moments. The validity of the model is assessed by applying it to some Monte Carlo simulations and real directional data.

Estimation/Estimator · Same · Lyapunov · Optimizer · Numerical analysis

November 9, 2024

Geometric Ergodicity and Strong Error Estimates for Tamed Schemes of Super-linear SODEs

Zhihui Liu,Xiaoming Wu

from arxiv, 21 pages, 4 figures

We construct a family of explicit tamed Euler-Maruyama (TEM) schemes, which can preserve the same Lyapunov structure for super-linear stochastic ordinary differential equations (SODEs) driven by multiplicative noise. These TEM schemes are shown to inherit the geometric ergodicity of the considered SODEs and converge with optimal strong convergence orders. Numerical experiments verify our theoretical results.
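To show what "taming" means in practice, here is a minimal sketch of the classical drift-tamed Euler-Maruyama step, in which the drift increment h·b(x) is divided by 1 + h·|b(x)| so that it can never exceed 1 in magnitude. This is the textbook taming idea, not the specific family of schemes constructed in the paper, and it makes no claim about preserving Lyapunov structure. The drift x - x^3, the diffusion 0.5x, the step size, and the function name are all illustrative assumptions.

```python
import numpy as np

def tamed_euler_maruyama(drift, diffusion, x0, h, n_steps, rng):
    """Simulate one scalar path of dX = b(X) dt + sigma(X) dW with a tamed drift."""
    path = np.empty(n_steps + 1)
    path[0] = x0
    for n in range(n_steps):
        x = path[n]
        b = drift(x)
        dw = rng.normal(0.0, np.sqrt(h))  # Brownian increment over one step
        # Taming: the drift contribution is bounded by 1 in absolute value.
        path[n + 1] = x + h * b / (1.0 + h * abs(b)) + diffusion(x) * dw
    return path

# Illustrative super-linear drift with multiplicative noise (assumptions).
drift = lambda x: x - x**3
rng = np.random.default_rng(0)
noisy_path = tamed_euler_maruyama(drift, lambda x: 0.5 * x, 50.0, 0.1, 200, rng)

# Noise switched off, to isolate the effect of the taming factor.
quiet_path = tamed_euler_maruyama(drift, lambda x: 0.0, 50.0, 0.1, 200, rng)
```

With the noise switched off, every step of `quiet_path` moves by less than 1, whereas an untamed explicit Euler step from the same starting point and step size would jump by more than 12,000.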

Transformation · Learning · INFORMS · Nanyang Technological University, Singapore · Convolution

November 8, 2024

Autoregressive Adaptive Hypergraph Transformer for Skeleton-based Activity Recognition

Abhisek Ray,Ayush Raj,Maheshkumar H. Kolekar

from arxiv, Accepted to WACV 2025

Extracting multiscale contextual information and higher-order correlations among skeleton sequences using Graph Convolutional Networks (GCNs) alone is inadequate for effective action classification. Hypergraph convolution addresses these issues but cannot harness long-range dependencies. Transformers prove effective in capturing these dependencies and making complex contextual features accessible. We propose an Autoregressive Adaptive HyperGraph Transformer (AutoregAd-HGformer) model for in-phase (autoregressive and discrete) and out-phase (adaptive) hypergraph generation. The vector quantized in-phase hypergraph, equipped with powerful autoregressive learned priors, produces a more robust and informative representation suitable for hyperedge formation. The out-phase hypergraph generator provides a model-agnostic hyperedge learning technique to align the attributes with the input skeleton embedding. The hybrid (supervised and unsupervised) learning in AutoregAd-HGformer explores the action-dependent features along spatial, temporal, and channel dimensions. Extensive experimental results and an ablation study indicate the superiority of our model over state-of-the-art hypergraph architectures on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.

BERT · Language representation · state-of-the-art · Comprehensibility · Question answering

October 11, 2018

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Jacob Devlin,Ming-Wei Chang,Kenton Lee,Kristina Toutanova

from arxiv, 13 pages

We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers. Unlike recent language representation models, BERT is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers. As a result, the pre-trained BERT representations can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks, such as question answering and language inference, without substantial task-specific architecture modifications. BERT is conceptually simple and empirically powerful. It obtains new state-of-the-art results on eleven natural language processing tasks, including pushing the GLUE benchmark to 80.4% (7.6% absolute improvement), MultiNLI accuracy to 86.7 (5.6% absolute improvement) and the SQuAD v1.1 question answering Test F1 to 93.2 (1.5% absolute improvement), outperforming human performance by 2.0%.

Knowledge representation · Things · Recommender system · MoDELS · Edge

May 10, 2018

A Unified Knowledge Representation and Context-aware Recommender System in Internet of Things

Yinhao Li,Awa Alqahtani,Ellis Solaiman,Charith Perera,Prem Prakash Jayaraman,Boualem Benatallah,Rajiv Ranjan

Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, Edge devices, Cloud infrastructure, and their quality of service requirements (QoS), need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. But heterogeneities among different configuration knowledge representation models pose limitations for acquisition, discovery and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative context driven knowledge recommendation.