亚洲精品无码国产爽快A片百度-日韩一区二区三区免费视

Recently, Neural Topic Models (NTM), inspired by variational autoencoders, have attracted a lot of research interest; however, these methods have limited applications in the real world due to the challenge of incorporating human knowledge. This work presents a semi-supervised neural topic modeling method, vONTSS, which uses von Mises-Fisher (vMF) based variational autoencoders and optimal transport. When a few keywords per topic are provided, vONTSS in the semi-supervised setting generates potential topics and optimizes topic-keyword quality and topic classification. Experiments show that vONTSS outperforms existing semi-supervised topic modeling methods in classification accuracy and diversity. vONTSS also supports unsupervised topic modeling. Quantitative and qualitative experiments show that vONTSS in the unsupervised setting outperforms recent NTMs on multiple aspects: vONTSS discovers highly clustered and coherent topics on benchmark datasets. It is also much faster than the state-of-the-art weakly supervised text classification method while achieving similar classification performance. We further prove the equivalence of optimal transport loss and cross-entropy loss at the global minimum.

相關內容

話題模型

關注 0

Integration · TOOLS · 置信度 · INFORMS · 估計/估計量 ·

2023 年 11 月 1 日

Integrating measures of replicability into literature search: Challenges and opportunities

Chuhao Wu,Tatiana Chakravorti,John Carroll,Sarah Rajtmajer

Challenges to reproducibility and replicability have gained widespread attention over the past decade, driven by a number of large replication projects with lukewarm success rates. A nascent work has emerged developing algorithms to estimate, or predict, the replicability of published findings. The current study explores ways in which AI-enabled signals of confidence in research might be integrated into literature search. We interview 17 PhD researchers about their current processes for literature search and ask them to provide feedback on a prototype replicability estimation tool. Our findings suggest that information about replicability can support researchers throughout literature review and research design processes. However, explainability and interpretability of system outputs is critical, and potential drawbacks of AI-enabled confidence assessment need to be further studied before such tools could be widely accepted and deployed. We discuss implications for the design of technological tools to support scholarly activities and advance reproducibility and replicability.

估計/估計量 · 穩健性 · ReLU · Networking · Neural Networks ·

2023 年 10 月 31 日

Robust nonparametric regression based on deep ReLU neural networks

Juntong Chen

from arxiv, 40 pages

In this paper, we consider robust nonparametric regression using deep neural networks with ReLU activation function. While several existing theoretically justified methods are geared towards robustness against identical heavy-tailed noise distributions, the rise of adversarial attacks has emphasized the importance of safeguarding estimation procedures against systematic contamination. We approach this statistical issue by shifting our focus towards estimating conditional distributions. To address it robustly, we introduce a novel estimation procedure based on $\ell$-estimation. Under a mild model assumption, we establish general non-asymptotic risk bounds for the resulting estimators, showcasing their robustness against contamination, outliers, and model misspecification. We then delve into the application of our approach using deep ReLU neural networks. When the model is well-specified and the regression function belongs to an $\alpha$-H\"older class, employing $\ell$-type estimation on suitable networks enables the resulting estimators to achieve the minimax optimal rate of convergence. Additionally, we demonstrate that deep $\ell$-type estimators can circumvent the curse of dimensionality by assuming the regression function closely resembles the composition of several H\"older functions. To attain this, new deep fully-connected ReLU neural networks have been designed to approximate this composition class. This approximation result can be of independent interest.

Performer · 層 · MoDELS · 全 · Networking ·

2023 年 10 月 31 日

Your representations are in the network: composable and parallel adaptation for large scale models

Yonatan Dukler,Alessandro Achille,Hao Yang,Varsha Vivek,Luca Zancato,Benjamin Bowman,Avinash Ravichandran,Charless Fowlkes,Ashwin Swaminathan,Stefano Soatto

from arxiv, Accepted to NeurIPS 2023

We propose InCA, a lightweight method for transfer learning that cross-attends to any activation layer of a pre-trained model. During training, InCA uses a single forward pass to extract multiple activations, which are passed to external cross-attention adapters, trained anew and combined or selected for downstream tasks. We show that, even when selecting a single top-scoring adapter, InCA achieves performance comparable to full fine-tuning, at a cost comparable to fine-tuning just the last layer. For example, with a cross-attention probe 1.3% the size of a pre-trained ViT-L/16 model, we achieve performance within 0.2% of the full fine-tuning paragon at a computational training cost of 51% of the baseline, on average across 11 downstream classification. Unlike other forms of efficient adaptation, InCA does not require backpropagating through the pre-trained model, thus leaving its execution unaltered at both training and inference. The versatility of InCA is best illustrated in fine-grained tasks, which may require accessing information absent in the last layer but accessible in intermediate layer activations. Since the backbone is fixed, InCA allows parallel ensembling as well as parallel execution of multiple tasks. InCA achieves state-of-the-art performance in the ImageNet-to-Sketch multi-task benchmark.

Networking · 判別式模型 · 生成器網絡 · 判別器 · 可行 ·

2023 年 10 月 31 日

Visible to Thermal image Translation for improving visual task in low light conditions

Md Azim Khan

Several visual tasks, such as pedestrian detection and image-to-image translation, are challenging to accomplish in low light using RGB images. Heat variation of objects in thermal images can be used to overcome this. In this work, an end-to-end framework, which consists of a generative network and a detector network, is proposed to translate RGB image into Thermal ones and compare generated thermal images with real data. We have collected images from two different locations using the Parrot Anafi Thermal drone. After that, we created a two-stream network, preprocessed, augmented, the image data, and trained the generator and discriminator models from scratch. The findings demonstrate that it is feasible to translate RGB training data to thermal data using GAN. As a result, thermal data can now be produced more quickly and affordably, which is useful for security and surveillance applications.

Continuity · INTERACT · 相互獨立的 · MoDELS · 原點 ·

2023 年 10 月 30 日

Kinetic description and macroscopic limit of swarming dynamics with continuous leader-follower transitions

Emiliano Cristiani,Nadia Loy,Marta Menci,Andrea Tosin

from arxiv, 30 pages, 7 figures

In this paper, we derive a kinetic description of swarming particle dynamics in an interacting multi-agent system featuring emerging leaders and followers. Agents are classically characterized by their position and velocity plus a continuous parameter quantifying their degree of leadership. The microscopic processes ruling the change of velocity and degree of leadership are independent, non-conservative and non-local in the physical space, so as to account for long-range interactions. Out of the kinetic description, we obtain then a macroscopic model under a hydrodynamic limit reminiscent of that used to tackle the hydrodynamics of weakly dissipative granular gases, thus relying in particular on a regime of small non-conservative and short-range interactions. Numerical simulations in one- and two-dimensional domains show that the limiting macroscopic model is consistent with the original particle dynamics and furthermore can reproduce classical emerging patterns typically observed in swarms.

binary · Networking · Neural Networks · 查準率/準確率 · Networks ·

2023 年 10 月 19 日

Training binary neural networks without floating point precision

Federico Fontana

from arxiv, 74 pages, Master's thesis

The main goal of this work is to improve the efficiency of training binary neural networks, which are low latency and low energy networks. The main contribution of this work is the proposal of two solutions comprised of topology changes and strategy training that allow the network to achieve near the state-of-the-art performance and efficient training. The time required for training and the memory required in the process are two factors that contribute to efficient training.

圖片分類 · 前饋網絡 · INTERACT · Networking · 前饋 ·

2021 年 5 月 7 日

ResMLP: Feedforward networks for image classification with data-efficient training

Hugo Touvron,Piotr Bojanowski,Mathilde Caron,Matthieu Cord,Alaaeldin El-Nouby,Edouard Grave,Armand Joulin,Gabriel Synnaeve,Jakob Verbeek,Hervé Jégou

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.

鏈路預測 · 知識庫 · 基 · Extensibility · INTERACT ·

2020 年 4 月 10 日

Tensor Decompositions for temporal knowledge base completion

Timothée Lacroix,Guillaume Obozinski,Nicolas Usunier

Most algorithms for representation learning and link prediction in relational data have been designed for static data. However, the data they are applied to usually evolves with time, such as friend graphs in social networks or user interactions with items in recommender systems. This is also the case for knowledge bases, which contain facts such as (US, has president, B. Obama, [2009-2017]) that are valid only at certain points in time. For the problem of link prediction under temporal constraints, i.e., answering queries such as (US, has president, ?, 2012), we propose a solution inspired by the canonical decomposition of tensors of order 4. We introduce new regularization schemes and present an extension of ComplEx (Trouillon et al., 2016) that achieves state-of-the-art performance. Additionally, we propose a new dataset for knowledge base completion constructed from Wikidata, larger than previous benchmarks by an order of magnitude, as a new reference for evaluating temporal and non-temporal link prediction methods.

圖形處理器 · 圖 · INTERACT · Performer · Neural Networks ·

2019 年 11 月 6 日

Hyper-SAGNN: a self-attention based graph neural network for hypergraphs

Ruochi Zhang,Yuesong Zou,Jian Ma

Graph representation learning for hypergraphs can be used to extract patterns among higher-order interactions that are critically important in many real world problems. Current approaches designed for hypergraphs, however, are unable to handle different types of hypergraphs and are typically not generic for various learning tasks. Indeed, models that can predict variable-sized heterogeneous hyperedges have not been available. Here we develop a new self-attention based graph neural network called Hyper-SAGNN applicable to homogeneous and heterogeneous hypergraphs with variable hyperedge sizes. We perform extensive evaluations on multiple datasets, including four benchmark network datasets and two single-cell Hi-C datasets in genomics. We demonstrate that Hyper-SAGNN significantly outperforms the state-of-the-art methods on traditional tasks while also achieving great performance on a new task called outsider identification. Hyper-SAGNN will be useful for graph representation learning to uncover complex higher-order interactions in different applications.

Networking · Neural Networks · 卷積神經網絡 · 卷積 · Network Dissection ·

2018 年 4 月 30 日

How convolutional neural network see the world - A survey of convolutional neural network visualization methods

Zhuwei Qin,Funxun Yu,Chenchen Liu,Xiang Chen

from arxiv, 32 pages, 21 figures. Mathematical Foundations of Computing

Nowadays, the Convolutional Neural Networks (CNNs) have achieved impressive performance on many computer vision related tasks, such as object detection, image recognition, image retrieval, etc. These achievements benefit from the CNNs outstanding capability to learn the input features with deep layers of neuron structures and iterative training process. However, these learned features are hard to identify and interpret from a human vision perspective, causing a lack of understanding of the CNNs internal working mechanism. To improve the CNN interpretability, the CNN visualization is well utilized as a qualitative analysis method, which translates the internal features into visually perceptible patterns. And many CNN visualization works have been proposed in the literature to interpret the CNN in perspectives of network structure, operation, and semantic concept. In this paper, we expect to provide a comprehensive survey of several representative CNN visualization methods, including Activation Maximization, Network Inversion, Deconvolutional Neural Networks (DeconvNet), and Network Dissection based visualization. These methods are presented in terms of motivations, algorithms, and experiment results. Based on these visualization methods, we also discuss their practical applications to demonstrate the significance of the CNN interpretability in areas of network design, optimization, security enhancement, etc.