
In this paper, we exploit the unique properties of a deterministic projected belief network (D-PBN) to take full advantage of trainable compound activation functions (TCAs). A D-PBN is a type of auto-encoder that operates by "backing up" through a feed-forward neural network. TCAs are activation functions with complex monotonic-increasing shapes that change the distribution of the data so that the linear transformation that follows is more effective. Because a D-PBN operates by "backing up", the TCAs are inverted in the reconstruction process, restoring the original distribution of the data and thus taking advantage of a given TCA in both analysis and reconstruction. We show that a D-PBN auto-encoder with TCAs can significantly outperform standard auto-encoders, including variational auto-encoders.
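
As a concrete illustration, below is a minimal NumPy sketch of one plausible TCA parameterization: a strictly increasing sum of scaled sigmoids plus a linear term, invertible by bisection precisely because it is monotone. The exact parameterization used in the paper may differ; all names and constants here are illustrative.

```python
import numpy as np

def tca(x, w, s, b, a=1.0):
    """Compound activation: a strictly increasing sum of scaled sigmoids
    plus a linear term (slope a > 0), so the map is invertible on R."""
    x = np.asarray(x, dtype=float)
    sig = 1.0 / (1.0 + np.exp(-s * (x[..., None] - b)))
    return a * x + sig @ w

def tca_inverse(y, w, s, b, a=1.0, lo=-50.0, hi=50.0, iters=80):
    """Invert the monotone TCA by bisection; a D-PBN needs this inverse
    when "backing up" through the network during reconstruction."""
    y = np.asarray(y, dtype=float)
    lo = np.full_like(y, lo)
    hi = np.full_like(y, hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        too_low = tca(mid, w, s, b, a) < y   # root lies above mid
        lo = np.where(too_low, mid, lo)
        hi = np.where(too_low, hi, mid)
    return 0.5 * (lo + hi)

# Round trip: w and s must be positive so the TCA is strictly increasing.
w = np.array([1.0, 2.0]); s = np.array([1.0, 0.5]); b = np.array([-1.0, 2.0])
x = np.linspace(-3, 3, 5)
assert np.allclose(tca_inverse(tca(x, w, s, b), w, s, b), x, atol=1e-6)
```

The round-trip check mirrors how the reconstruction pass restores the original data distribution after the forward pass has reshaped it.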

Related Content

In this paper, we systematically evaluate the robustness of multi-exit language models against adversarial slowdown. To audit their robustness, we design a slowdown attack that generates natural adversarial text that bypasses early-exit points. We use the resulting WAFFLE attack as a vehicle to conduct a comprehensive evaluation of three multi-exit mechanisms on the GLUE benchmark against adversarial slowdown. We then show that our attack significantly reduces the computational savings provided by the three methods in both white-box and black-box settings: the more complex a mechanism is, the more vulnerable it is to adversarial slowdown. We also perform a linguistic analysis of the perturbed text inputs, identifying common perturbation patterns that our attack generates and comparing them with those of standard adversarial text attacks. Moreover, we show that adversarial training is ineffective at defeating our slowdown attack, whereas input sanitization with a conversational model, e.g., ChatGPT, can remove the perturbations effectively. This result suggests that future work is needed to develop efficient yet robust multi-exit models. Our code is available at: //github.com/ztcoalson/WAFFLE
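
For context, here is a minimal sketch of the confidence-thresholded early-exit pattern that such slowdown attacks target. The three mechanisms evaluated in the paper differ in their exit criteria; `blocks`, `exit_heads`, and the threshold below are illustrative assumptions, not the paper's models.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def early_exit_forward(x, blocks, exit_heads, threshold=0.9):
    """Run network blocks in order; stop at the first internal classifier
    whose max softmax probability clears the threshold. An adversarial
    slowdown input is crafted so that no exit ever becomes confident,
    forcing the full, most expensive forward pass."""
    h = x
    for depth, (block, head) in enumerate(zip(blocks, exit_heads), start=1):
        h = block(h)
        probs = softmax(head(h))
        if probs.max() >= threshold:
            return probs.argmax(), depth   # early exit: compute saved
    return probs.argmax(), depth           # no exit fired: full cost paid
```

The returned depth makes the attack's effect measurable: a successful slowdown pushes the average exit depth toward the full network.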

In this note, we give very simple constructions of unique neighbor expander graphs starting from spectral or combinatorial expander graphs of mild expansion. These constructions, and their analysis, are simple variants of the constructions of LDPC error-correcting codes from expanders given by Sipser-Spielman\cite{SS96} (and Tanner\cite{Tanner81}) and of their analysis. We also show how to obtain expanders with many unique neighbors using similar ideas. There have been many exciting results on this topic recently, starting with Asherov-Dinur\cite{AD23} and Hsieh-McKenzie-Mohanty-Paredes\cite{HMMP23}, who gave a similar construction of unique neighbor expander graphs but used more sophisticated ingredients (such as almost-Ramanujan graphs) and a more involved analysis. Subsequent beautiful works of Cohen-Roth-TaShma\cite{CRT23} and Golowich\cite{Golowich23} gave even stronger objects (lossless expanders), but again using sophisticated ingredients. The main contribution of this work is that our constructions of unique neighbor expanders are much more elementary and come with a simpler analysis.
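
For reference, one common formalization of the central object (parameter conventions vary across the works cited above):

```latex
% For a bipartite graph G = (L, R, E) and a subset S of L, the unique
% neighbors of S are the right vertices seeing exactly one element of S:
\[
\mathrm{UN}(S) \;=\; \{\, r \in R \;:\; |N(r) \cap S| = 1 \,\},
\]
% and G is an (\alpha, \varepsilon) unique-neighbor expander if
\[
|\mathrm{UN}(S)| \;\ge\; \varepsilon\,|S|
\quad \text{for all } S \subseteq L \text{ with } |S| \le \alpha|L|.
\]
```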

In this paper, we propose a novel text-promptable surgical instrument segmentation approach to overcome challenges associated with the diversity and differentiation of surgical instruments in minimally invasive surgeries. We redefine the task as text-promptable, thereby enabling a more nuanced comprehension of surgical instruments and adaptability to new instrument types. Inspired by recent advancements in vision-language models, we leverage pretrained image and text encoders as our model backbone and design a text-promptable mask decoder consisting of attention- and convolution-based prompting schemes for surgical instrument segmentation prediction. Our model leverages multiple text prompts for each surgical instrument through a new mixture-of-prompts mechanism, resulting in enhanced segmentation performance. Additionally, we introduce a hard instrument area reinforcement module to improve image feature comprehension and segmentation precision. Extensive experiments on several surgical instrument segmentation datasets demonstrate our model's superior performance and promising generalization capability. To our knowledge, this is the first implementation of a promptable approach to surgical instrument segmentation, offering significant potential for practical application in robotic-assisted surgery.
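
The abstract does not spell out the mixture mechanism, but a minimal sketch of the general pattern, softmax-gating several text-prompt embeddings for one instrument class by their similarity to the image feature, might look as follows (all names and shapes are assumptions):

```python
import numpy as np

def mixture_of_prompts(img_feat, prompt_embs):
    """Fuse several text-prompt embeddings (K, d) for one instrument class
    into a single conditioning vector, weighting each prompt by its scaled
    dot-product similarity to the image feature (d,)."""
    scores = prompt_embs @ img_feat / np.sqrt(img_feat.shape[-1])
    weights = np.exp(scores - scores.max())   # stable softmax gate
    weights /= weights.sum()
    return weights @ prompt_embs              # convex combination of prompts
```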

In this paper, we propose a joint single-base localization and communication enhancement scheme for the uplink (UL) integrated sensing and communications (ISAC) system with asynchronism. The scheme achieves accurate single-base localization of the user equipment (UE) and significantly improves communication reliability despite the timing offset (TO) caused by the clock asynchronism between the UE and the base station (BS). Our proposed scheme integrates CSI enhancement into the multiple signal classification (MUSIC)-based AoA estimation and thus imposes no extra complexity on the ISAC system. We further exploit a MUSIC-based range estimation method and prove that it suppresses the time-varying TO-related phase terms. Exploiting the AoA and range estimates, we can estimate the location of the UE. Finally, we propose a joint CSI- and data-signal-based localization scheme that coherently exploits the data and CSI signals to improve the AoA and range estimation, further enhancing the single-base localization of the UE. Extensive simulation results show that the enhanced CSI achieves bit error rate performance equivalent to that of the minimum mean square error (MMSE) CSI estimator, and that the proposed joint localization scheme achieves decimeter-level localization accuracy despite the clock asynchronism, improving the localization mean square error (MSE) by about 8 dB over the maximum likelihood (ML)-based benchmark method.
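
The paper's contribution is integrating CSI enhancement into this pipeline; for orientation, a textbook NumPy sketch of MUSIC-based AoA estimation for a uniform linear array is below (array geometry, element spacing, and search grid are assumptions):

```python
import numpy as np

def music_aoa_spectrum(X, n_sources, d_over_lambda=0.5, grid=None):
    """Classic MUSIC pseudospectrum for an M-element uniform linear array.
    X: (M, N) complex snapshot matrix. The n_sources largest peaks of the
    returned spectrum (over 'grid', in degrees) are the AoA estimates."""
    if grid is None:
        grid = np.linspace(-90.0, 90.0, 721)
    M, N = X.shape
    R = X @ X.conj().T / N                      # sample covariance
    _, eigvecs = np.linalg.eigh(R)              # eigenvalues ascending
    En = eigvecs[:, : M - n_sources]            # noise subspace
    m = np.arange(M)[:, None]
    A = np.exp(-2j * np.pi * d_over_lambda * m * np.sin(np.deg2rad(grid)))
    proj = En.conj().T @ A                      # project steering vectors
    return grid, 1.0 / np.sum(np.abs(proj) ** 2, axis=0)
```

Steering vectors of true source directions are orthogonal to the noise subspace, so the reciprocal of the projected energy peaks at the AoAs.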

In this work, we reveal a strong implicit bias of stochastic gradient descent (SGD) that drives overly expressive networks to much simpler subnetworks, thereby dramatically reducing the number of independent parameters, and improving generalization. To reveal this bias, we identify invariant sets, or subsets of parameter space that remain unmodified by SGD. We focus on two classes of invariant sets that correspond to simpler (sparse or low-rank) subnetworks and commonly appear in modern architectures. Our analysis uncovers that SGD exhibits a property of stochastic attractivity towards these simpler invariant sets. We establish a sufficient condition for stochastic attractivity based on a competition between the loss landscape's curvature around the invariant set and the noise introduced by stochastic gradients. Remarkably, we find that an increased level of noise strengthens attractivity, leading to the emergence of attractive invariant sets associated with saddle-points or local maxima of the train loss. We observe empirically the existence of attractive invariant sets in trained deep neural networks, implying that SGD dynamics often collapses to simple subnetworks with either vanishing or redundant neurons. We further demonstrate how this simplifying process of stochastic collapse benefits generalization in a linear teacher-student framework. Finally, through this analysis, we mechanistically explain why early training with large learning rates for extended periods benefits subsequent generalization.
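
A minimal NumPy illustration of the simplest such invariant set, a "vanishing" neuron whose incoming and outgoing weights are zero and therefore receive exactly zero gradient, so SGD never moves them. The attractivity analysis (why SGD noise drives nearby parameters onto such sets) is the paper's contribution and is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4)); v = rng.normal(size=8)
W[3] = 0.0; v[3] = 0.0     # put neuron 3 on the "vanishing" invariant set

def sgd_step(W, v, x, y, lr=0.1):
    """One SGD step on the squared loss of a two-layer ReLU network
    f(x) = v . relu(W x), with manually derived gradients."""
    h = np.maximum(W @ x, 0.0)
    err = v @ h - y
    grad_v = err * h                            # zero at h_i = relu(0) = 0
    grad_W = err * np.outer(v * (h > 0), x)     # zero where v_i = 0
    return W - lr * grad_W, v - lr * grad_v

for _ in range(1000):
    x = rng.normal(size=4); y = rng.normal()
    W, v = sgd_step(W, v, x, y)

print(np.abs(W[3]).max(), abs(v[3]))   # both remain exactly 0.0
```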

In this work, we show that a pair of entangled qubits can be used to compute a product privately. More precisely, two participants with a private input from a finite field can perform local operations on a shared, Bell-like quantum state, and when these qubits are later sent to a third participant, the third participant can determine the product of the inputs, but without learning more about the individual inputs. We give a concrete way to realize this product computation for arbitrary finite fields of prime order.
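
A minimal NumPy illustration of the setting only: two participants share the Bell state (|00⟩+|11⟩)/√2 and each applies a local operation to their own qubit before forwarding it to the third participant. The input-dependent phase rotations below are placeholders, not the paper's construction for private multiplication.

```python
import numpy as np

# |Phi+> = (|00> + |11>)/sqrt(2): one qubit per participant.
bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

def apply_local(state, U_alice, U_bob):
    """Local operations only: each participant acts on their own qubit,
    which on the joint state is the Kronecker product of the unitaries."""
    return np.kron(U_alice, U_bob) @ state

def phase(t, p):
    """Hypothetical input-dependent phase gate diag(1, e^{2*pi*i*t/p});
    a placeholder, not the operations used in the paper's protocol."""
    return np.diag([1.0, np.exp(2j * np.pi * t / p)])

p, a, b = 5, 2, 3                 # field F_p, private inputs a and b
state = apply_local(bell, phase(a, p), phase(b, p))
# Both qubits are then sent to the third participant, who measures them.
print(np.round(state, 3))
```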

In this paper, we study a spline collocation method for the numerical solution of the optimal transport problem. We mainly solve the Monge-Ampère equation with the second boundary condition numerically by proposing a center matching algorithm. We prove pointwise convergence of our iterative algorithm under the assumption that the spline iterates are bounded. We use the Monge-Ampère equation with the Dirichlet boundary condition, together with some known solutions of the Monge-Ampère equation with the second boundary condition, to demonstrate the effectiveness of our algorithm. We then use our method to solve some real-life problems. One application is using optimal transportation to convert fisheye-view images into standard rectangular images.
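
For reference, the second boundary value problem referred to above, in its standard optimal-transport (Brenier) form for source and target densities f on Ω and g on Ω*, is:

```latex
% Second boundary value problem for the Monge--Ampere equation:
% find a convex potential u such that
\[
\det D^2 u(x) \;=\; \frac{f(x)}{g(\nabla u(x))} \quad \text{in } \Omega,
\qquad
\nabla u(\Omega) = \Omega^{*},
\]
% so that the optimal map T = \nabla u pushes the density f forward to g.
```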

In this paper, we design a regularization-free algorithm for high-dimensional support vector machines (SVMs) by integrating over-parameterization with Nesterov's smoothing method, and provide theoretical guarantees for the induced implicit regularization phenomenon. In particular, we construct an over-parameterized hinge loss function and estimate the true parameters by leveraging regularization-free gradient descent on this loss function. The utilization of Nesterov's method enhances the computational efficiency of our algorithm, especially in terms of determining the stopping criterion and reducing computational complexity. With appropriate choices of initialization, step size, and smoothness parameter, we demonstrate that unregularized gradient descent achieves a near-oracle statistical convergence rate. Additionally, we verify our theoretical findings through a variety of numerical experiments and compare the proposed method with explicit regularization. Our results illustrate the advantages of employing implicit regularization via gradient descent in conjunction with over-parameterization in sparse SVMs.
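
A minimal sketch of the two ingredients, assuming the standard Nesterov smoothing of the hinge loss (whose gradient is -clip((1-z)/mu, 0, 1)) and the Hadamard-product over-parameterization w = u*u - v*v; the step size, initialization scale, and iteration count are illustrative, not the paper's tuned choices.

```python
import numpy as np

def smoothed_hinge_grad(z, mu=0.1):
    """Derivative of the Nesterov-smoothed hinge max(0, 1 - z):
    -alpha*, with alpha* = clip((1 - z)/mu, 0, 1)."""
    return -np.clip((1.0 - z) / mu, 0.0, 1.0)

def overparam_svm(X, y, lr=0.01, steps=5000, mu=0.1, init=1e-3):
    """Regularization-free gradient descent on the smoothed hinge loss
    with the over-parameterization w = u*u - v*v, whose implicit bias
    favors sparse solutions (a sketch of the paper's setup)."""
    n, d = X.shape
    u = np.full(d, init); v = np.full(d, init)    # small, balanced init
    for _ in range(steps):
        w = u * u - v * v
        g = smoothed_hinge_grad(y * (X @ w), mu)  # per-sample dloss/dz
        grad_w = X.T @ (g * y) / n                # chain rule, z = y x.w
        u -= lr * 2.0 * grad_w * u                # dw/du = 2u
        v += lr * 2.0 * grad_w * v                # dw/dv = -2v
    return u * u - v * v
```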

This paper makes two contributions. First, we introduce "Language-World," an extension of the Meta-World benchmark that allows a large language model to operate in a simulated robotic environment using semi-structured natural language queries and scripted skills described in natural language. By using the same set of tasks as Meta-World, Language-World results can be directly compared to Meta-World results, providing a point of comparison between recent methods using large language models (LLMs) and those using deep reinforcement learning. Second, we introduce Plan Conditioned Behavioral Cloning (PCBC), a method that finetunes the behavior of high-level plans using end-to-end demonstrations. Using Language-World, we show that PCBC achieves strong performance in a variety of few-shot regimes, often achieving task generalization from as little as a single demonstration. Language-World is available as open-source software at //github.com/krzentner/language-world/.
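
The abstract leaves the PCBC architecture unspecified; a heavily simplified sketch of the conditioning pattern, a linear policy fit by behavioral cloning to demonstrations while conditioned on a fixed embedding of the high-level plan, is below (all shapes and names are assumptions):

```python
import numpy as np

def pcbc_step(theta, obs, acts, plan_emb, lr=1e-2):
    """One gradient step of plan-conditioned behavioral cloning with a
    linear policy: the plan (e.g. an LLM-written sequence of skills) is
    embedded once, concatenated to every observation, and the policy is
    fit end-to-end to the demonstrated actions via squared error.
    obs: (n, d_o), acts: (n, d_a), plan_emb: (d_p,), theta: (d_o+d_p, d_a)."""
    n = obs.shape[0]
    inp = np.hstack([obs, np.tile(plan_emb, (n, 1))])  # condition on plan
    pred = inp @ theta
    grad = inp.T @ (pred - acts) / n
    return theta - lr * grad
```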

One principal approach to illuminating a black-box neural network is feature attribution, i.e., identifying the importance of the input features for the network's prediction. The predictive information of features has recently been proposed as a proxy for their importance. So far, predictive information has been identified only for latent features, by placing an information bottleneck within the network. We propose a method to identify features with predictive information in the input domain. The method yields a fine-grained identification of the information in input features and is agnostic to network architecture. The core idea of our method is to place a bottleneck on the input that only lets input features associated with predictive latent features pass through. We compare our method with several feature attribution methods using mainstream feature attribution evaluation experiments. The code is publicly available.
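
A simplified sketch of a mask-based input bottleneck in that spirit: features with mask values near zero are replaced by noise, and the mask is trained to keep the prediction loss low while staying small on average. The paper's information-theoretic objective is more principled; the beta-sparsity surrogate and all names here are assumptions.

```python
import numpy as np

def input_bottleneck_attribution(x, loss_and_grad, steps=300, lr=0.1,
                                 beta=0.05, seed=0):
    """Learn a per-feature mask lam = sigmoid(theta) in (0, 1); features
    with lam ~ 0 are replaced by noise drawn from the input statistics.
    loss_and_grad(x_tilde) must return (loss, dloss/dx_tilde). The final
    mask serves as the attribution map over input features."""
    rng = np.random.default_rng(seed)
    theta = np.zeros_like(x)
    mu, sigma = x.mean(), x.std() + 1e-8
    for _ in range(steps):
        lam = 1.0 / (1.0 + np.exp(-theta))
        eps = rng.normal(mu, sigma, size=x.shape)   # uninformative noise
        x_tilde = lam * x + (1.0 - lam) * eps       # the input bottleneck
        _, g = loss_and_grad(x_tilde)
        grad_lam = g * (x - eps) + beta             # beta keeps mask sparse
        theta -= lr * grad_lam * lam * (1.0 - lam)  # chain rule (sigmoid)
    return 1.0 / (1.0 + np.exp(-theta))
```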
