Regularization plays a crucial role in machine learning models, especially for deep neural networks. Existing regularization techniques mainly rely on the i.i.d. assumption and consider only the knowledge from the current sample, without leveraging the neighborhood relationships between samples. In this work, we propose a general regularizer called \textbf{Patch-level Neighborhood Interpolation~(Pani)} that introduces non-local representations into the computation of networks. Our proposal explicitly constructs patch-level graphs in different layers and then linearly interpolates neighboring patch features, serving as a general and effective regularization strategy. Further, we customize our approach for two popular regularization methods, namely Virtual Adversarial Training (VAT) and MixUp together with its variants. The first derived method, \textbf{Pani VAT}, presents a novel way to construct non-local adversarial smoothness by employing patch-level interpolated perturbations. The second, \textbf{Pani MixUp}, extends MixUp: it outperforms MixUp and matches state-of-the-art MixUp variants while holding a significant advantage in computational efficiency. Extensive experiments verify the effectiveness of our Pani approach in both supervised and semi-supervised settings.
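
As a rough illustration of the core operation, the sketch below interpolates each patch of a feature map with its nearest neighbors; the patch size, neighborhood size k, and interpolation weight lam are illustrative choices, not the paper's settings.

# Minimal sketch of patch-level neighborhood interpolation (hypothetical
# simplification of Pani; assumes H and W are divisible by the patch size).
import numpy as np

def pani_interpolate(feat, patch=2, k=3, lam=0.1):
    # feat: (C, H, W) feature map taken from one layer of the network.
    C, H, W = feat.shape
    ph, pw = H // patch, W // patch
    # Split the map into non-overlapping patches and flatten each one.
    patches = feat.reshape(C, ph, patch, pw, patch).transpose(1, 3, 0, 2, 4)
    patches = patches.reshape(ph * pw, -1)          # (num_patches, C*patch*patch)
    # Build a k-nearest-neighbor graph over patches (Euclidean distance).
    d = np.linalg.norm(patches[:, None] - patches[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nbrs = np.argsort(d, axis=1)[:, :k]             # (num_patches, k)
    # Linearly interpolate each patch with the mean of its neighbors.
    mixed = (1 - lam) * patches + lam * patches[nbrs].mean(axis=1)
    # Fold the patches back into a (C, H, W) feature map.
    mixed = mixed.reshape(ph, pw, C, patch, patch).transpose(2, 0, 3, 1, 4)
    return mixed.reshape(C, H, W)

In a full implementation this mixing would be applied to intermediate activations during training, with the interpolation weights treated as part of the regularization schedule.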

Related Content

Task-oriented dialogue (TOD) systems aim to achieve specific goals through interactive dialogue. Such tasks usually involve following specific workflows, i.e., executing a sequence of actions in a particular order. While prior work has focused on supervised learning methods that condition on past actions, these methods do not explicitly optimize for compliance with a desired workflow. In this paper, we propose a novel framework based on reinforcement learning (RL) to generate dialogue responses that are aligned with a given workflow. Our framework consists of ComplianceScorer, a metric designed to evaluate how well a generated response executes the specified action, combined with an RL optimization process that utilizes an interactive sampling technique. We evaluate our approach on two TOD datasets, the Action-Based Conversations Dataset (ABCD) (Chen et al., 2021a) and MultiWOZ 2.2 (Zang et al., 2020), using a range of automated and human evaluation metrics. Our findings indicate that our RL-based framework outperforms baselines and is effective at generating responses that comply with the intended workflows while remaining natural and fluent.
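
A minimal sketch of the reward-weighted objective such a framework could optimize is given below; policy_gradient_loss, the dummy log-probabilities, and the compliance scores are hypothetical stand-ins, since the paper's ComplianceScorer and interactive sampler are not reproduced here.

# Hedged sketch of a REINFORCE-style objective with a compliance reward.
import numpy as np

def policy_gradient_loss(logprobs, rewards):
    # logprobs: per-response log-probabilities under the dialogue policy.
    # rewards: compliance scores in [0, 1] for the sampled responses.
    logprobs = np.asarray(logprobs, dtype=float)
    rewards = np.asarray(rewards, dtype=float)
    baseline = rewards.mean()                 # simple variance-reduction baseline
    advantages = rewards - baseline
    # Maximizing reward-weighted log-likelihood == minimizing its negation.
    return -float(np.mean(advantages * logprobs))

# Illustrative usage with dummy numbers.
loss = policy_gradient_loss(logprobs=[-2.3, -1.7, -3.1], rewards=[0.9, 0.4, 0.1])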

The computation necessary for training Transformer-based language models has skyrocketed in recent years. This trend has motivated research on efficient training algorithms designed to reach better training, validation, and downstream performance faster than standard training. In this work, we revisit three categories of such algorithms: dynamic architectures (layer stacking, layer dropping), batch selection (selective backprop, RHO loss), and efficient optimizers (Lion, Sophia). When pre-training BERT and T5 with a fixed computation budget using such methods, we find that their training, validation, and downstream gains vanish compared to a baseline with a fully-decayed learning rate. We define an evaluation protocol that allows computation to be performed on arbitrary machines by mapping all computation time onto a reference machine, a measure we call reference system time. We discuss the limitations of our proposed protocol and release our code to encourage rigorous research in efficient training procedures: //github.com/JeanKaddour/NoTrainNoGain.
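
A small sketch of what such an accounting could look like is shown below; the step types and per-step timings are illustrative assumptions, not values from the paper.

# Hedged sketch of reference-system-time accounting, assuming each step type
# has been profiled once on the reference machine (all numbers are invented).
REFERENCE_SECONDS_PER_STEP = {
    "full_model_step": 0.42,      # standard forward/backward pass
    "stacked_small_step": 0.21,   # e.g. an early layer-stacking phase
    "eval_step": 0.15,
}

def reference_time(step_counts):
    # step_counts: mapping from step type to how many such steps were run,
    # regardless of which physical machine actually executed them.
    return sum(REFERENCE_SECONDS_PER_STEP[k] * n for k, n in step_counts.items())

budget_used = reference_time({"stacked_small_step": 5_000, "full_model_step": 20_000})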

We study a synthetic-corpus-based approach for language models (LMs) to acquire logical deductive reasoning ability. Previous studies generated deduction examples using specific sets of deduction rules. However, these rules were limited or otherwise arbitrary, which restricted the generalizability of the acquired reasoning ability. We rethink this and adopt a well-grounded set of deduction rules based on formal logic theory, which can derive any other deduction rule when combined in a multistep way. Then, using the proposed corpora, which we name FLD (Formal Logic Deduction), we first evaluate and analyze the logical reasoning ability of the latest LLMs. Even GPT-4 can solve only half of the problems, suggesting that pure logical reasoning isolated from knowledge is still challenging for LLMs and that additional training specialized for logical reasoning is indeed essential. We next empirically verify that LMs trained on FLD corpora acquire more generalizable reasoning ability. Furthermore, we identify the aspects of reasoning ability on which deduction corpora can enhance LMs and those on which they cannot, and discuss future directions for each aspect. The released corpora serve both as learning resources and as challenging benchmarks.
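
As a toy illustration of the kind of multistep deduction involved (not an actual FLD corpus example), chaining a single well-grounded rule such as implication elimination already derives a fact that no premise states directly:

Premises: $A$, $A \rightarrow B$, $B \rightarrow C$.
Step 1 ($\rightarrow$-elimination): from $A$ and $A \rightarrow B$, infer $B$.
Step 2 ($\rightarrow$-elimination): from $B$ and $B \rightarrow C$, infer $C$.
Conclusion: $C$.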

Neural networks and neuromorphic computing play pivotal roles in deep learning and machine vision. Due to their dissipative nature and inherent limitations, traditional semiconductor-based circuits face challenges in realizing ultra-fast and low-power neural networks. However, the spiking behavior characteristic of single flux quantum (SFQ) circuits positions them as promising candidates for spiking neural networks (SNNs). Our previous work showcased a JJ-Soma design capable of operating at tens of gigahertz while consuming only a fraction of the power compared to traditional circuits, as documented in [1]. This paper introduces a compact SFQ-based synapse design that applies positive and negative weighted inputs to the JJ-Soma. Using an RSFQ synapse empowers us to replicate the functionality of a biological neuron, a crucial step in realizing a complete SNN. The JJ-Synapse can operate at ultra-high frequencies, exhibits orders of magnitude lower power consumption than CMOS counterparts, and can be conveniently fabricated using commercial Nb processes. Furthermore, the network's flexibility enables modifications by incorporating cryo-CMOS circuits for weight value adjustments. In our endeavor, we have successfully designed, fabricated, and partially tested the JJ-Synapse within our cryocooler system. Integration with the JJ-Soma further facilitates the realization of a high-speed inference SNN.

Deep neural networks (DNNs) have shown great promise in various domains. Alongside these developments, vulnerabilities associated with DNN training, such as backdoor attacks, are a significant concern. These attacks involve the subtle insertion of triggers during model training, allowing for manipulated predictions. More recently, DNNs for tabular data have gained increasing attention due to the rise of transformer models. Our research presents a comprehensive analysis of backdoor attacks on tabular data using DNNs, particularly focusing on transformer-based networks. Given the inherent complexities of tabular data, we explore the challenges of embedding backdoors. Through systematic experimentation across benchmark datasets, we uncover that transformer-based DNNs for tabular data are highly susceptible to backdoor attacks, even with minimal feature value alterations. Our results indicate nearly perfect attack success rates (approximately 100%) by introducing novel backdoor attack strategies to tabular data. Furthermore, we evaluate several defenses against these attacks, identifying Spectral Signatures as the most effective one. Our findings highlight the urgency to address such vulnerabilities and provide insights into potential countermeasures for securing DNN models against backdoors on tabular data.
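
A minimal sketch of the kind of poisoning step such an attack involves is shown below; the trigger column, trigger value, target label, and poison rate are hypothetical choices for illustration, not the paper's strategies.

# Hedged sketch of a simple tabular backdoor poisoning step.
import numpy as np

def poison_tabular(X, y, trigger_col=3, trigger_val=999.0,
                   target_label=1, poison_rate=0.02, seed=0):
    rng = np.random.default_rng(seed)
    X_p, y_p = X.copy(), y.copy()
    # Pick a small fraction of rows to poison.
    idx = rng.choice(len(X), size=int(poison_rate * len(X)), replace=False)
    X_p[idx, trigger_col] = trigger_val   # minimal feature-value alteration (the trigger)
    y_p[idx] = target_label               # relabel poisoned rows to the attacker's target
    return X_p, y_p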

This study considers a virtual multiuser multiple-input multiple-output system with PSK modulation realized via a reconfigurable intelligent surface-based passive transmitter setup. Under this framework, the study derives the union-bound symbol-error probability, an upper bound on the actual symbol-error probability. Based on this, a symbol-level precoding power-minimization problem is proposed, subject to the constraint that the union-bound symbol-error probability stays below a given requirement. The problem is formulated as a constrained optimization on an oblique manifold and solved via a bisection method. The method successively optimizes the transmit power while evaluating the feasibility of the union-bound symbol-error probability requirement by solving, via the Riemannian conjugate gradient algorithm, an auxiliary problem that depends only on the reflection coefficients of the reconfigurable intelligent surface elements. Numerical results demonstrate the effectiveness of the proposed approach in minimizing the transmit power for different symbol-error probability requirements.
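
A schematic of the bisection outer loop is sketched below; sep_feasible is a hypothetical placeholder for the Riemannian conjugate-gradient subproblem that checks whether the union-bound symbol-error probability requirement can be met at a given transmit power.

# Hedged sketch of bisection on transmit power with a feasibility oracle.
def minimize_power(sep_feasible, p_low=0.0, p_high=10.0, tol=1e-3):
    # Assumes feasibility is monotone in power: more power can only help.
    while p_high - p_low > tol:
        p_mid = 0.5 * (p_low + p_high)
        if sep_feasible(p_mid):
            p_high = p_mid     # requirement met: try a lower power
        else:
            p_low = p_mid      # requirement violated: need more power
    return p_high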

We propose VQ-NeRF, a two-branch neural network model that incorporates Vector Quantization (VQ) to decompose and edit reflectance fields in 3D scenes. Conventional neural reflectance fields use only continuous representations to model 3D scenes, despite the fact that objects are typically composed of discrete materials in reality. This lack of discretization can result in noisy material decomposition and complicated material editing. To address these limitations, our model consists of a continuous branch and a discrete branch. The continuous branch follows the conventional pipeline to predict decomposed materials, while the discrete branch uses the VQ mechanism to quantize continuous materials into individual ones. By discretizing the materials, our model can reduce noise in the decomposition process and generate a segmentation map of discrete materials. Specific materials can be easily selected for further editing by clicking on the corresponding area of the segmentation outcomes. Additionally, we propose a dropout-based VQ codeword ranking strategy to predict the number of materials in a scene, which reduces redundancy in the material segmentation process. To improve usability, we also develop an interactive interface to further assist material editing. We evaluate our model on both computer-generated and real-world scenes, demonstrating its superior performance. To the best of our knowledge, our model is the first to enable discrete material editing in 3D scenes.
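
A bare-bones sketch of the vector-quantization step on continuous material codes is given below; the codebook size and feature dimensions are illustrative, not the paper's configuration.

# Hedged sketch of quantizing continuous material features to discrete codewords.
import numpy as np

def vq_materials(z, codebook):
    # z: (N, D) continuous material features; codebook: (K, D) learned codewords.
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)   # (N, K) squared distances
    idx = d.argmin(axis=1)          # nearest codeword per point -> discrete material id
    return codebook[idx], idx       # quantized materials and their segmentation labels

codebook = np.random.randn(8, 16)   # e.g. 8 candidate materials, 16-dim codes
z_q, seg = vq_materials(np.random.randn(1000, 16), codebook)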

Tactile perception stands as a critical sensory modality for human interaction with the environment. Among various tactile sensor techniques, optical sensor-based approaches have gained traction, notably for producing high-resolution tactile images. This work explores gel elastomer deformation simulation through a physics-based approach. While previous works in this direction usually adopt the explicit material point method (MPM), which has certain limitations in force simulation and rendering, we adopt the finite element method (FEM) and address the challenges in penetration and mesh distortion with incremental potential contact (IPC) method. As a result, we present a simulator named TacIPC, which can ensure numerically stable simulations while accommodating direct rendering and friction modeling. To evaluate TacIPC, we conduct three tasks: pseudo-image quality assessment, deformed geometry estimation, and marker displacement prediction. These tasks show its superior efficacy in reducing the sim-to-real gap. Our method can also seamlessly integrate with existing simulators. More experiments and videos can be found in the supplementary materials and on the website: //sites.google.com/view/tac-ipc.

Uncertainty quantification (UQ) is important for assessing and enhancing the reliability of machine learning models. In deep learning, uncertainties arise not only from the data but also from the training procedure, which often injects substantial noise and bias. These hinder the attainment of statistical guarantees and, moreover, impose computational challenges on UQ due to the need for repeated network retraining. Building upon the recent neural tangent kernel theory, we create statistically guaranteed schemes to principally \emph{characterize}, and \emph{remove}, the uncertainty of over-parameterized neural networks with very low computational effort. In particular, our approach, based on what we call a procedural-noise-correcting (PNC) predictor, removes the procedural uncertainty by using only \emph{one} auxiliary network trained on a suitably labeled dataset, instead of the many retrained networks employed in deep ensembles. Moreover, by combining our PNC predictor with suitable light-computation resampling methods, we build several approaches to construct asymptotically exact-coverage confidence intervals using as few as four trained networks without additional overhead.
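
For intuition only, the snippet below shows a generic normal-approximation confidence interval built from a point prediction and a variance estimate obtained from a small number of trained networks; it illustrates the end goal rather than the paper's exact PNC-based construction.

# Hedged, generic sketch: a normal-approximation confidence interval.
from statistics import NormalDist

def confidence_interval(point_pred, variance_estimate, alpha=0.05):
    z = NormalDist().inv_cdf(1 - alpha / 2)      # e.g. ~1.96 for 95% coverage
    half_width = z * variance_estimate ** 0.5
    return point_pred - half_width, point_pred + half_width

lo, hi = confidence_interval(point_pred=0.82, variance_estimate=0.004)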

Recently, ensembling has been applied to deep metric learning to yield state-of-the-art results. Deep metric learning aims to learn deep neural networks for feature embeddings whose distances satisfy a given constraint. In deep metric learning, an ensemble takes the average of the distances learned by multiple learners. As one important aspect of an ensemble, the learners should be diverse in their feature embeddings. To this end, we propose an attention-based ensemble, which uses multiple attention masks so that each learner can attend to different parts of the object. We also propose a divergence loss that encourages diversity among the learners. The proposed method is applied to standard deep metric learning benchmarks, and experimental results show that it outperforms state-of-the-art methods by a significant margin on image retrieval tasks.
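
A rough sketch of one way such a divergence loss could look is given below, pushing apart the embeddings that different learners produce for the same image; the hinge form and margin value are illustrative assumptions, not necessarily the paper's exact formulation.

# Hedged sketch of a pairwise hinge-style divergence loss between learners.
import numpy as np

def divergence_loss(embeddings, margin=1.0):
    # embeddings: (L, D) -- one D-dim embedding of the same sample per learner.
    L = len(embeddings)
    loss = 0.0
    for i in range(L):
        for j in range(i + 1, L):
            dist = np.linalg.norm(embeddings[i] - embeddings[j])
            loss += max(0.0, margin - dist) ** 2   # penalize learners that agree too much
    return loss / (L * (L - 1) / 2)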
