亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Stochastic-gradient sampling methods are often used to perform Bayesian inference on neural networks. It has been observed that the methods in which notions of differential geometry are included tend to have better performances, with the Riemannian metric improving posterior exploration by accounting for the local curvature. However, the existing methods often resort to simple diagonal metrics to remain computationally efficient. This loses some of the gains. We propose two non-diagonal metrics that can be used in stochastic-gradient samplers to improve convergence and exploration but have only a minor computational overhead over diagonal metrics. We show that for fully connected neural networks (NNs) with sparsity-inducing priors and convolutional NNs with correlated priors, using these metrics can provide improvements. For some other choices the posterior is sufficiently easy also for the simpler metrics.

相關內容

神經網絡(Neural Networks)是世界上三個最古老的神經建模學會的檔案期刊:國際神經網絡學會(INNS)、歐洲神經網絡學會(ENNS)和日本神經網絡學會(JNNS)。神經網絡提供了一個論壇,以發展和培育一個國際社會的學者和實踐者感興趣的所有方面的神經網絡和相關方法的計算智能。神經網絡歡迎高質量論文的提交,有助于全面的神經網絡研究,從行為和大腦建模,學習算法,通過數學和計算分析,系統的工程和技術應用,大量使用神經網絡的概念和技術。這一獨特而廣泛的范圍促進了生物和技術研究之間的思想交流,并有助于促進對生物啟發的計算智能感興趣的跨學科社區的發展。因此,神經網絡編委會代表的專家領域包括心理學,神經生物學,計算機科學,工程,數學,物理。該雜志發表文章、信件和評論以及給編輯的信件、社論、時事、軟件調查和專利信息。文章發表在五個部分之一:認知科學,神經科學,學習系統,數學和計算分析、工程和應用。 官網地址:

Language models (LMs) have already demonstrated remarkable abilities in understanding and generating both natural and formal language. Despite these advances, their integration with real-world environments such as large-scale knowledge bases (KBs) remains an underdeveloped area, affecting applications such as semantic parsing and indulging in "hallucinated" information. This paper is an experimental investigation aimed at uncovering the robustness challenges that LMs encounter when tasked with knowledge base question answering (KBQA). The investigation covers scenarios with inconsistent data distribution between training and inference, such as generalization to unseen domains, adaptation to various language variations, and transferability across different datasets. Our comprehensive experiments reveal that even when employed with our proposed data augmentation techniques, advanced small and large language models exhibit poor performance in various dimensions. While the LM is a promising technology, the robustness of the current form in dealing with complex environments is fragile and of limited practicality because of the data distribution issue. This calls for future research on data collection and LM learning paradims.

Deep neural networks have shown impressive performance for image-based disease detection. Performance is commonly evaluated through clinical validation on independent test sets to demonstrate clinically acceptable accuracy. Reporting good performance metrics on test sets, however, is not always a sufficient indication of the generalizability and robustness of an algorithm. In particular, when the test data is drawn from the same distribution as the training data, the iid test set performance can be an unreliable estimate of the accuracy on new data. In this paper, we employ stress testing to assess model robustness and subgroup performance disparities in disease detection models. We design progressive stress testing using five different bidirectional and unidirectional image perturbations with six different severity levels. As a use case, we apply stress tests to measure the robustness of disease detection models for chest X-ray and skin lesion images, and demonstrate the importance of studying class and domain-specific model behaviour. Our experiments indicate that some models may yield more robust and equitable performance than others. We also find that pretraining characteristics play an important role in downstream robustness. We conclude that progressive stress testing is a viable and important tool and should become standard practice in the clinical validation of image-based disease detection models.

Smishing, also known as SMS phishing, is a type of fraudulent communication in which an attacker disguises SMS communications to deceive a target into providing their sensitive data. Smishing attacks use a variety of tactics; however, they have a similar goal of stealing money or personally identifying information (PII) from a victim. In response to these attacks, a wide variety of anti-smishing tools have been developed to block or filter these communications. Despite this, the number of phishing attacks continue to rise. In this paper, we developed a test bed for measuring the effectiveness of popular anti-smishing tools against fresh smishing attacks. To collect fresh smishing data, we introduce Smishtank.com, a collaborative online resource for reporting and collecting smishing data sets. The SMS messages were validated by a security expert and an in-depth qualitative analysis was performed on the collected messages to provide further insights. To compare tool effectiveness, we experimented with 20 smishing and benign messages across 3 key segments of the SMS messaging delivery ecosystem. Our results revealed significant room for improvement in all 3 areas against our smishing set. Most anti-phishing apps and bulk messaging services didn't filter smishing messages beyond the carrier blocking. The 2 apps that blocked the most smish also blocked 85-100\% of benign messages. Finally, while carriers did not block any benign messages, they were only able to reach a 25-35\% blocking rate for smishing messages. Our work provides insights into the performance of anti-smishing tools and the roles they play in the message blocking process. This paper would enable the research community and industry to be better informed on the current state of anti-smishing technology on the SMS platform.

In recent studies, linear recurrent neural networks (LRNNs) have achieved Transformer-level performance in natural language modeling and long-range modeling while offering rapid parallel training and constant inference costs. With the resurged interest in LRNNs, we study whether they can learn the hidden rules in training sequences, such as the grammatical structures of regular language. We theoretically analyze some existing LRNNs and discover their limitations on regular language. Motivated by the analysis, we propose a new LRNN equipped with a block-diagonal and input-dependent transition matrix. Experiments suggest that the proposed model is the only LRNN that can perform length extrapolation on regular language tasks such as Sum, Even Pair, and Modular Arithmetic.

Several recent contributions in the field of iterative STFT phase retrieval have demonstrated that the performance of the classical Griffin-Lim method can be considerably improved upon. By using the same projection operators as Griffin-Lim, but combining them in innovative ways, these approaches achieve better results in terms of both reconstruction quality and required number of iterations, while retaining a similar computational complexity per iteration. However, like Griffin-Lim, these algorithms operate in an offline manner and thus require an entire spectrogram as input, which is an unrealistic requirement for many real-world speech communication applications. We propose to extend RTISI -- an existing online (frame-by-frame) variant of the Griffin-Lim algorithm -- into a flexible framework that enables straightforward online implementation of any algorithm based on iterative projections. We further employ this framework to implement online variants of the fast Griffin-Lim algorithm, the accelerated Griffin-Lim algorithm, and two algorithms from the optics domain. Evaluation results on speech signals show that, similarly to the offline case, these algorithms can achieve a considerable performance gain compared to RTISI.

Neural fields, which represent signals as a function parameterized by a neural network, are a promising alternative to traditional discrete vector or grid-based representations. Compared to discrete representations, neural representations both scale well with increasing resolution, are continuous, and can be many-times differentiable. However, given a dataset of signals that we would like to represent, having to optimize a separate neural field for each signal is inefficient, and cannot capitalize on shared information or structures among signals. Existing generalization methods view this as a meta-learning problem and employ gradient-based meta-learning to learn an initialization which is then fine-tuned with test-time optimization, or learn hypernetworks to produce the weights of a neural field. We instead propose a new paradigm that views the large-scale training of neural representations as a part of a partially-observed neural process framework, and leverage neural process algorithms to solve this task. We demonstrate that this approach outperforms both state-of-the-art gradient-based meta-learning approaches and hypernetwork approaches.

Unsupervised contrastive learning methods have recently seen significant improvements, particularly through data augmentation strategies that aim to produce robust and generalizable representations. However, prevailing data augmentation methods, whether hand designed or based on foundation models, tend to rely heavily on prior knowledge or external data. This dependence often compromises their effectiveness and efficiency. Furthermore, the applicability of most existing data augmentation strategies is limited when transitioning to other research domains, especially science-related data. This limitation stems from the paucity of prior knowledge and labeled data available in these domains. To address these challenges, we introduce DiffAug-a novel and efficient Diffusion-based data Augmentation technique. DiffAug aims to ensure that the augmented and original data share a smoothed latent space, which is achieved through diffusion steps. Uniquely, unlike traditional methods, DiffAug first mines sufficient prior semantic knowledge about the neighborhood. This provides a constraint to guide the diffusion steps, eliminating the need for labels, external data/models, or prior knowledge. Designed as an architecture-agnostic framework, DiffAug provides consistent improvements. Specifically, it improves image classification and clustering accuracy by 1.6%~4.5%. When applied to biological data, DiffAug improves performance by up to 10.1%, with an average improvement of 5.8%. DiffAug shows good performance in both vision and biological domains.

Recently, graph neural networks (GNNs) have been widely used for document classification. However, most existing methods are based on static word co-occurrence graphs without sentence-level information, which poses three challenges:(1) word ambiguity, (2) word synonymity, and (3) dynamic contextual dependency. To address these challenges, we propose a novel GNN-based sparse structure learning model for inductive document classification. Specifically, a document-level graph is initially generated by a disjoint union of sentence-level word co-occurrence graphs. Our model collects a set of trainable edges connecting disjoint words between sentences and employs structure learning to sparsely select edges with dynamic contextual dependencies. Graphs with sparse structures can jointly exploit local and global contextual information in documents through GNNs. For inductive learning, the refined document graph is further fed into a general readout function for graph-level classification and optimization in an end-to-end manner. Extensive experiments on several real-world datasets demonstrate that the proposed model outperforms most state-of-the-art results, and reveal the necessity to learn sparse structures for each document.

It has been shown that deep neural networks are prone to overfitting on biased training data. Towards addressing this issue, meta-learning employs a meta model for correcting the training bias. Despite the promising performances, super slow training is currently the bottleneck in the meta learning approaches. In this paper, we introduce a novel Faster Meta Update Strategy (FaMUS) to replace the most expensive step in the meta gradient computation with a faster layer-wise approximation. We empirically find that FaMUS yields not only a reasonably accurate but also a low-variance approximation of the meta gradient. We conduct extensive experiments to verify the proposed method on two tasks. We show our method is able to save two-thirds of the training time while still maintaining the comparable or achieving even better generalization performance. In particular, our method achieves the state-of-the-art performance on both synthetic and realistic noisy labels, and obtains promising performance on long-tailed recognition on standard benchmarks.

Graph Neural Networks (GNNs) are widely used for analyzing graph-structured data. Most GNN methods are highly sensitive to the quality of graph structures and usually require a perfect graph structure for learning informative embeddings. However, the pervasiveness of noise in graphs necessitates learning robust representations for real-world problems. To improve the robustness of GNN models, many studies have been proposed around the central concept of Graph Structure Learning (GSL), which aims to jointly learn an optimized graph structure and corresponding representations. Towards this end, in the presented survey, we broadly review recent progress of GSL methods for learning robust representations. Specifically, we first formulate a general paradigm of GSL, and then review state-of-the-art methods classified by how they model graph structures, followed by applications that incorporate the idea of GSL in other graph tasks. Finally, we point out some issues in current studies and discuss future directions.

北京阿比特科技有限公司