动漫AV观看网站不卡无码,久久久久精品电影

Neural networks are often overconfident about their predictions, which undermines their reliability and trustworthiness. In this work, we present a novel technique, named Error-Driven Uncertainty Aware Training (EUAT), which aims to enhance the ability of neural models to estimate their uncertainty correctly, namely to be highly uncertain when they output inaccurate predictions and low uncertain when their output is accurate. The EUAT approach operates during the model's training phase by selectively employing two loss functions depending on whether the training examples are correctly or incorrectly predicted by the model. This allows for pursuing the twofold goal of i) minimizing model uncertainty for correctly predicted inputs and ii) maximizing uncertainty for mispredicted inputs, while preserving the model's misprediction rate. We evaluate EUAT using diverse neural models and datasets in the image recognition domains considering both non-adversarial and adversarial settings. The results show that EUAT outperforms existing approaches for uncertainty estimation (including other uncertainty-aware training techniques, calibration, ensembles, and DEUP) by providing uncertainty estimates that not only have higher quality when evaluated via statistical metrics (e.g., correlation with residuals) but also when employed to build binary classifiers that decide whether the model's output can be trusted or not and under distributional data shifts.

相關內容

MoDELS

關注 43

ACM/IEEE第23屆模型驅動工程語言和系統國際會議，是模型驅動軟件和系統工程的首要會議系列，由ACM-SIGSOFT和IEEE-TCSE支持組織。自1998年以來，模型涵蓋了建模的各個方面，從語言和方法到工具和應用程序。模特的參加者來自不同的背景，包括研究人員、學者、工程師和工業專業人士。MODELS 2019是一個論壇，參與者可以圍繞建模和模型驅動的軟件和系統交流前沿研究成果和創新實踐經驗。今年的版本將為建模社區提供進一步推進建模基礎的機會，并在網絡物理系統、嵌入式系統、社會技術系統、云計算、大數據、機器學習、安全、開源等新興領域提出建模的創新應用以及可持續性。官網鏈接： · Networking · 最優化 · MoDELS · 評論員 ·

2024 年 6 月 13 日

Jacobian-Enhanced Neural Networks

Steven H. Berguin

from arxiv, 10 pages, 5 figures

Jacobian-Enhanced Neural Networks (JENN) are densely connected multi-layer perceptrons, whose training process is modified to predict partial derivatives accurately. Their main benefit is better accuracy with fewer training points compared to standard neural networks. These attributes are particularly desirable in the field of computer-aided design, where there is often the need to replace computationally expensive, physics-based models with fast running approximations, known as surrogate models or meta-models. Since a surrogate emulates the original model accurately in near-real time, it yields a speed benefit that can be used to carry out orders of magnitude more function calls quickly. However, in the special case of gradient-enhanced methods, there is the additional value proposition that partial derivatives are accurate, which is a critical property for one important use-case: surrogate-based optimization. This work derives the complete theory and exemplifies its superiority over standard neural nets for surrogate-based optimization.

FAST · 評論員 · 優化器 · 值域 · 穩健性 ·

2024 年 6 月 11 日

Fast Adaptive Meta-Heuristic for Large-Scale Facility Location Problem

Bahram Alidaee,Haibo Wang

from arxiv, 18 pages

Facility location problems have been a major research area of interest in the last several decades. In particular, uncapacitated location problems (ULP) have enormous applications. Variations of ULP often appear, especially as large-scale subproblems in more complex combinatorial optimization problems. Although many researchers have studied different versions of ULP (e.g., uncapacitated facility location problem (UCFLP) and p-Median problem), most of these authors have considered small to moderately sized problems. In this paper, we address the ULP and provide a fast adaptive meta-heuristic for large-scale problems. The approach is based on critical event memory tabu search. For the diversification component of the algorithm, we have chosen a procedure based on a sequencing problem commonly used for traveling salesman-type problems. The efficacy of this approach is evaluated across a diverse range of benchmark problems sourced from the Internet, with a comprehensive comparison against four prominent algorithms in the literature. The proposed adaptive critical event tabu search (ACETS) demonstrates remarkable effectiveness for large-scale problems. The algorithm successfully solved all problems optimally within a short computing time. Notably, ACETS discovered three best new solutions for benchmark problems, specifically for Asymmetric 500A-1, Asymmetric 750A-1, and Symmetric 750B-4, underscoring its innovative and robust nature.

語言模型化 · 條件獨立的 · 相互獨立的 · Performer · 圖 ·

2024 年 6 月 11 日

Large Language Models for Constrained-Based Causal Discovery

Kai-Hendrik Cohrs,Gherardo Varando,Emiliano Diaz,Vasileios Sitokonstantinou,Gustau Camps-Valls

Causality is essential for understanding complex systems, such as the economy, the brain, and the climate. Constructing causal graphs often relies on either data-driven or expert-driven approaches, both fraught with challenges. The former methods, like the celebrated PC algorithm, face issues with data requirements and assumptions of causal sufficiency, while the latter demand substantial time and domain knowledge. This work explores the capabilities of Large Language Models (LLMs) as an alternative to domain experts for causal graph generation. We frame conditional independence queries as prompts to LLMs and employ the PC algorithm with the answers. The performance of the LLM-based conditional independence oracle on systems with known causal graphs shows a high degree of variability. We improve the performance through a proposed statistical-inspired voting schema that allows some control over false-positive and false-negative rates. Inspecting the chain-of-thought argumentation, we find causal reasoning to justify its answer to a probabilistic query. We show evidence that knowledge-based CIT could eventually become a complementary tool for data-driven causal discovery.

MoDELS · Performance · 大語言模型 · Performer · 值域 ·

2024 年 6 月 10 日

Low-Rank Quantization-Aware Training for LLMs

Yelysei Bondarenko,Riccardo Del Chiaro,Markus Nagel

Large language models (LLMs) are omnipresent, however their practical deployment is challenging due to their ever increasing computational and memory demands. Quantization is one of the most effective ways to make them more compute and memory efficient. Quantization-aware training (QAT) methods, generally produce the best quantized performance, however it comes at the cost of potentially long training time and excessive memory usage, making it impractical when applying for LLMs. Inspired by parameter-efficient fine-tuning (PEFT) and low-rank adaptation (LoRA) literature, we propose LR-QAT -- a lightweight and memory-efficient QAT algorithm for LLMs. LR-QAT employs several components to save memory without sacrificing predictive performance: (a) low-rank auxiliary weights that are aware of the quantization grid; (b) a downcasting operator using fixed-point or double-packed integers and (c) checkpointing. Unlike most related work, our method (i) is inference-efficient, leading to no additional overhead compared to traditional PTQ; (ii) can be seen as a general extended pretraining framework, meaning that the resulting model can still be utilized for any downstream task afterwards; (iii) can be applied across a wide range of quantization settings, such as different choices quantization granularity, activation quantization, and seamlessly combined with many PTQ techniques. We apply LR-QAT to the LLaMA-2/3 and Mistral model families and validate its effectiveness on several downstream tasks. Our method outperforms common post-training quantization (PTQ) approaches and reaches the same model performance as full-model QAT at the fraction of its memory usage. Specifically, we can train a 7B LLM on a single consumer grade GPU with 24GB of memory.

INTERACT · Extensibility · 數學 · 統計量 · 自助法/自舉法 ·

2024 年 6 月 9 日

Evolving Collective Behavior in Self-Organizing Particle Systems

Devendra Parkar,Kirtus G. Leyba,Raylene A. Faerber,Joshua J. Daymude

from arxiv, 10 pages, 8 figures, 3 tables; accepted for publication at ALIFE 2024

Local interactions drive emergent collective behavior, which pervades biological and social complex systems. But uncovering the interactions that produce a desired behavior remains a core challenge. In this paper, we present EvoSOPS, an evolutionary framework that searches landscapes of stochastic distributed algorithms for those that achieve a mathematically specified target behavior. These algorithms govern self-organizing particle systems (SOPS) comprising individuals with no persistent memory and strictly local sensing and movement. For aggregation, phototaxing, and separation behaviors, EvoSOPS discovers algorithms that achieve 4.2-15.3% higher fitness than those from the existing "stochastic approach to SOPS" based on mathematical theory from statistical physics. EvoSOPS is also flexibly applied to new behaviors such as object coating where the stochastic approach would require bespoke, extensive analysis. Finally, we distill insights from the diverse, best-fitness genomes produced for aggregation across repeated EvoSOPS runs to demonstrate how EvoSOPS can bootstrap future theoretical investigations into SOPS algorithms for new behaviors.

優化器 · 穩健性 · 在線 · Principle · 相互獨立的 ·

2024 年 6 月 7 日

Self-Improving Robust Preference Optimization

Eugene Choi,Arash Ahmadian,Matthieu Geist,Oilvier Pietquin,Mohammad Gheshlaghi Azar

Both online and offline RLHF methods such as PPO and DPO have been extremely successful in aligning AI with human preferences. Despite their success, the existing methods suffer from a fundamental problem that their optimal solution is highly task-dependent (i.e., not robust to out-of-distribution (OOD) tasks). Here we address this challenge by proposing Self-Improving Robust Preference Optimization SRPO, a practical and mathematically principled offline RLHF framework that is completely robust to the changes in the task. The key idea of SRPO is to cast the problem of learning from human preferences as a self-improvement process, which can be mathematically expressed in terms of a min-max objective that aims at joint optimization of self-improvement policy and the generative policy in an adversarial fashion. The solution for this optimization problem is independent of the training task and thus it is robust to its changes. We then show that this objective can be re-expressed in the form of a non-adversarial offline loss which can be optimized using standard supervised optimization techniques at scale without any need for reward model and online inference. We show the effectiveness of SRPO in terms of AI Win-Rate (WR) against human (GOLD) completions. In particular, when SRPO is evaluated on the OOD XSUM dataset, it outperforms the celebrated DPO by a clear margin of 15% after 5 self-revisions, achieving WR of 90%.

MINE · 哈希學習 · FAST · Performer · Extensibility ·

2024 年 6 月 6 日

Fast Redescription Mining Using Locality-Sensitive Hashing

Maiju Karjalainen,Esther Galbrun,Pauli Miettinen

from arxiv, 20 pages, 4 figures, to appear at ECML-PKDD 2024

Redescription mining is a data analysis technique that has found applications in diverse fields. The most used redescription mining approaches involve two phases: finding matching pairs among data attributes and extending the pairs. This process is relatively efficient when the number of attributes remains limited and when the attributes are Boolean, but becomes almost intractable when the data consist of many numerical attributes. In this paper, we present new algorithms that perform the matching and extension orders of magnitude faster than the existing approaches. Our algorithms are based on locality-sensitive hashing with a tailored approach to handle the discretisation of numerical attributes as used in redescription mining.

線性的 · 代碼 · 樣本 · 有偏 · Alphabet ·

2024 年 6 月 6 日

Punctured Low-Bias Codes Behave Like Random Linear Codes

Venkatesan Guruswami,Jonathan Mosheiff

Random linear codes are a workhorse in coding theory, and are used to show the existence of codes with the best known or even near-optimal trade-offs in many noise models. However, they have little structure besides linearity, and are not amenable to tractable error-correction algorithms. In this work, we prove a general derandomization result applicable to random linear codes. Namely, in settings where the coding-theoretic property of interest is "local" (in the sense of forbidding certain bad configurations involving few vectors -- code distance and list-decodability being notable examples), one can replace random linear codes (RLCs) with a significantly derandomized variant with essentially no loss in parameters. Specifically, instead of randomly sampling coordinates of the (long) Hadamard code (which is an equivalent way to describe RLCs), one can randomly sample coordinates of any code with low bias. Over large alphabets, the low bias requirement can be weakened to just large distance. Furthermore, large distance suffices even with a small alphabet in order to match the current best known bounds for RLC list-decodability. In particular, by virtue of our result, all current (and future) achievability bounds for list-decodability of random linear codes extend automatically to random puncturings of any low-bias (or large alphabet) "mother" code. We also show that our punctured codes emulate the behavior of RLCs on stochastic channels, thus giving a derandomization of RLCs in the context of achieving Shannon capacity as well. Thus, we have a randomness-efficient way to sample codes achieving capacity in both worst-case and stochastic settings that can further inherit algebraic or other algorithmically useful structural properties of the mother code.

模態 · MoDELS · 掩碼 · Better · 設計 ·

2024 年 6 月 6 日

Attribute-Aware Implicit Modality Alignment for Text Attribute Person Search

Xin Wang,Fangfang Liu,Zheng Li,Caili Guo

Text attribute person search aims to find specific pedestrians through given textual attributes, which is very meaningful in the scene of searching for designated pedestrians through witness descriptions. The key challenge is the significant modality gap between textual attributes and images. Previous methods focused on achieving explicit representation and alignment through unimodal pre-trained models. Nevertheless, the absence of inter-modality correspondence in these models may lead to distortions in the local information of intra-modality. Moreover, these methods only considered the alignment of inter-modality and ignored the differences between different attribute categories. To mitigate the above problems, we propose an Attribute-Aware Implicit Modality Alignment (AIMA) framework to learn the correspondence of local representations between textual attributes and images and combine global representation matching to narrow the modality gap. Firstly, we introduce the CLIP model as the backbone and design prompt templates to transform attribute combinations into structured sentences. This facilitates the model's ability to better understand and match image details. Next, we design a Masked Attribute Prediction (MAP) module that predicts the masked attributes after the interaction of image and masked textual attribute features through multi-modal interaction, thereby achieving implicit local relationship alignment. Finally, we propose an Attribute-IoU Guided Intra-Modal Contrastive (A-IoU IMC) loss, aligning the distribution of different textual attributes in the embedding space with their IoU distribution, achieving better semantic arrangement. Extensive experiments on the Market-1501 Attribute, PETA, and PA100K datasets show that the performance of our proposed method significantly surpasses the current state-of-the-art methods.

控制器 · INTERACT · state-of-the-art · 模型評估 · Next ·

2020 年 8 月 3 日

Controllable Multi-Interest Framework for Recommendation

Yukuo Cen,Jianwei Zhang,Xu Zou,Chang Zhou,Hongxia Yang,Jie Tang

from arxiv, Accepted to KDD 2020

Recently, neural networks have been widely used in e-commerce recommender systems, owing to the rapid development of deep learning. We formalize the recommender system as a sequential recommendation problem, intending to predict the next items that the user might be interacted with. Recent works usually give an overall embedding from a user's behavior sequence. However, a unified user embedding cannot reflect the user's multiple interests during a period. In this paper, we propose a novel controllable multi-interest framework for the sequential recommendation, called ComiRec. Our multi-interest module captures multiple interests from user behavior sequences, which can be exploited for retrieving candidate items from the large-scale item pool. These items are then fed into an aggregation module to obtain the overall recommendation. The aggregation module leverages a controllable factor to balance the recommendation accuracy and diversity. We conduct experiments for the sequential recommendation on two real-world datasets, Amazon and Taobao. Experimental results demonstrate that our framework achieves significant improvements over state-of-the-art models. Our framework has also been successfully deployed on the offline Alibaba distributed cloud platform.