99欧美日韩精品一区二区红桃,激情欧美综合,91最新地址永久备用入口发布,亚洲欧美视频在线观看网址

Neural networks offer good approximation to many tasks but consistently fail to reach perfect generalization, even when theoretical work shows that such perfect solutions can be expressed by certain architectures. Using the task of formal language learning, we focus on one simple formal language and show that the theoretically correct solution is in fact not an optimum of commonly used objectives -- even with regularization techniques that according to common wisdom should lead to simple weights and good generalization (L1, L2) or other meta-heuristics (early-stopping, dropout). However, replacing standard targets with the Minimum Description Length objective (MDL) results in the correct solution being an optimum.

相關內容

極小點

關注 0

Networking · Neural Networks · 置信度 · 判別器 · 測試數據 ·

2024 年 3 月 27 日

Preventing Arbitrarily High Confidence on Far-Away Data in Point-Estimated Discriminative Neural Networks

Ahmad Rashid,Serena Hacker,Guojun Zhang,Agustinus Kristiadi,Pascal Poupart

from arxiv, Accepted at AISTATS 2024

Discriminatively trained, deterministic neural networks are the de facto choice for classification problems. However, even though they achieve state-of-the-art results on in-domain test sets, they tend to be overconfident on out-of-distribution (OOD) data. For instance, ReLU networks - a popular class of neural network architectures - have been shown to almost always yield high confidence predictions when the test data are far away from the training set, even when they are trained with OOD data. We overcome this problem by adding a term to the output of the neural network that corresponds to the logit of an extra class, that we design to dominate the logits of the original classes as we move away from the training data.This technique provably prevents arbitrarily high confidence on far-away test data while maintaining a simple discriminative point-estimate training. Evaluation on various benchmarks demonstrates strong performance against competitive baselines on both far-away and realistic OOD data.

統計量 · INTERACT · 估計/估計量 · 集成 · 3D ·

2024 年 3 月 27 日

Neural Fields for Interactive Visualization of Statistical Dependencies in 3D Simulation Ensembles

Fatemeh Farokhmanesh,Kevin H?hlein,Christoph Neuhauser,Tobias Necker,Martin Weissmann,Takemasa Miyoshi,Rüdiger Westermann

We present the first neural network that has learned to compactly represent and can efficiently reconstruct the statistical dependencies between the values of physical variables at different spatial locations in large 3D simulation ensembles. Going beyond linear dependencies, we consider mutual information as a measure of non-linear dependence. We demonstrate learning and reconstruction with a large weather forecast ensemble comprising 1000 members, each storing multiple physical variables at a 250 x 352 x 20 simulation grid. By circumventing compute-intensive statistical estimators at runtime, we demonstrate significantly reduced memory and computation requirements for reconstructing the major dependence structures. This enables embedding the estimator into a GPU-accelerated direct volume renderer and interactively visualizing all mutual dependencies for a selected domain point.

Cognition · Agent · MoDELS · Prompt · Performer ·

2024 年 3 月 26 日

Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration

Zhenhailong Wang,Shaoguang Mao,Wenshan Wu,Tao Ge,Furu Wei,Heng Ji

from arxiv, Accepted as a main conference paper at NAACL 2024

Human intelligence thrives on cognitive synergy, where collaboration among different minds yield superior outcomes compared to isolated individuals. In this work, we propose Solo Performance Prompting (SPP), which transforms a single LLM into a cognitive synergist by engaging in multi-turn self-collaboration with multiple personas. A cognitive synergist is an intelligent agent that collaboratively combines multiple minds' strengths and knowledge to enhance problem-solving in complex tasks. By dynamically identifying and simulating different personas based on task inputs, SPP unleashes the potential of cognitive synergy in LLMs. Our in-depth analysis shows that assigning multiple fine-grained personas in LLMs improves problem-solving abilities compared to using a single or fixed number of personas. We evaluate SPP on three challenging tasks: Trivia Creative Writing, Codenames Collaborative, and Logic Grid Puzzle, encompassing both knowledge-intensive and reasoning-intensive types. Unlike previous works, such as Chain-of-Thought, that solely enhance the reasoning abilities in LLMs, experimental results demonstrate that SPP effectively reduces factual hallucination, and maintains strong reasoning capabilities. Additionally, comparative experiments show that cognitive synergy only emerges in GPT-4 and does not appear in less capable models, such as GPT-3.5-turbo and Llama2-13b-chat, which draws an interesting analogy to human development. Code, data, and prompts can be found at: //github.com/MikeWangWZHL/Solo-Performance-Prompting.git.

優化器 · 路徑 · Extensibility · 評論員 · 回合 ·

2024 年 3 月 25 日

A Semi-Lagrangian Approach for Time and Energy Path Planning Optimization in Static Flow Fields

Víctor C. da S. Campos,Armando A. Neto,Douglas G. Macharet

from arxiv, 12 pages, initial paper submission; Preprint submitted to the IEEE Transactions on Intelligent Transportation Systems

Efficient path planning for autonomous mobile robots is a critical problem across numerous domains, where optimizing both time and energy consumption is paramount. This paper introduces a novel methodology that considers the dynamic influence of an environmental flow field and considers geometric constraints, including obstacles and forbidden zones, enriching the complexity of the planning problem. We formulate it as a multi-objective optimal control problem, propose a novel transformation called Harmonic Transformation, and apply a semi-Lagrangian scheme to solve it. The set of Pareto efficient solutions is obtained considering two distinct approaches: a deterministic method and an evolutionary-based one, both of which are designed to make use of the proposed Harmonic Transformation. Through an extensive analysis of these approaches, we demonstrate their efficacy in finding optimized paths.

Learning · Machine Learning · 總回報 · MoDELS · Performer ·

2024 年 3 月 25 日

Return to Tradition: Learning Reliable Heuristics with Classical Machine Learning

Dillon Z. Chen,Felipe Trevizan,Sylvie Thiébaux

from arxiv, Extended version of ICAPS 2024 paper

Current approaches for learning for planning have yet to achieve competitive performance against classical planners in several domains, and have poor overall performance. In this work, we construct novel graph representations of lifted planning tasks and use the WL algorithm to generate features from them. These features are used with classical machine learning methods which have up to 2 orders of magnitude fewer parameters and train up to 3 orders of magnitude faster than the state-of-the-art deep learning for planning models. Our novel approach, WL-GOOSE, reliably learns heuristics from scratch and outperforms the $h^{\text{FF}}$ heuristic in a fair competition setting. It also outperforms or ties with LAMA on 4 out of 10 domains on coverage and 7 out of 10 domains on plan quality. WL-GOOSE is the first learning for planning model which achieves these feats. Furthermore, we study the connections between our novel WL feature generation method, previous theoretically flavoured learning architectures, and Description Logic Features for planning.

Subspace · Learning · 樣例 · MoDELS · 相互獨立的 ·

2024 年 3 月 24 日

Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals

Rui Zheng,Yuhao Zhou,Zhiheng Xi,Tao Gui,Qi Zhang,Xuanjing Huang

from arxiv, Accepted by COLING 2024

Deep neural networks (DNNs) are notoriously vulnerable to adversarial attacks that place carefully crafted perturbations on normal examples to fool DNNs. To better understand such attacks, a characterization of the features carried by adversarial examples is needed. In this paper, we tackle this challenge by inspecting the subspaces of sample features through spectral analysis. We first empirically show that the features of either clean signals or adversarial perturbations are redundant and span in low-dimensional linear subspaces respectively with minimal overlap, and the classical low-dimensional subspace projection can suppress perturbation features out of the subspace of clean signals. This makes it possible for DNNs to learn a subspace where only features of clean signals exist while those of perturbations are discarded, which can facilitate the distinction of adversarial examples. To prevent the residual perturbations that is inevitable in subspace learning, we propose an independence criterion to disentangle clean signals from perturbations. Experimental results show that the proposed strategy enables the model to inherently suppress adversaries, which not only boosts model robustness but also motivates new directions of effective adversarial defense.

Taxonomy · 評論員 · Learning · INFORMS · AIM ·

2024 年 3 月 22 日

A Survey on Safe Multi-Modal Learning System

Tianyi Zhao,Liangliang Zhang,Yao Ma,Lu Cheng

In the rapidly evolving landscape of artificial intelligence, multimodal learning systems (MMLS) have gained traction for their ability to process and integrate information from diverse modality inputs. Their expanding use in vital sectors such as healthcare has made safety assurance a critical concern. However, the absence of systematic research into their safety is a significant barrier to progress in this field. To bridge the gap, we present the first taxonomy that systematically categorizes and assesses MMLS safety. This taxonomy is structured around four fundamental pillars that are critical to ensuring the safety of MMLS: robustness, alignment, monitoring, and controllability. Leveraging this taxonomy, we review existing methodologies, benchmarks, and the current state of research, while also pinpointing the principal limitations and gaps in knowledge. Finally, we discuss unique challenges in MMLS safety. In illuminating these challenges, we aim to pave the way for future research, proposing potential directions that could lead to significant advancements in the safety protocols of MMLS.

有向 · 有偏 · MoDELS · 估計/估計量 · Performer ·

2024 年 3 月 22 日

Spectral Initialization for High-Dimensional Phase Retrieval with Biased Spatial Directions

Pierre Bousseyroux,Marc Potters

from arxiv, 13 pages, 4 figures

We explore a spectral initialization method that plays a central role in contemporary research on signal estimation in nonconvex scenarios. In a noiseless phase retrieval framework, we precisely analyze the method's performance in the high-dimensional limit when sensing vectors follow a multivariate Gaussian distribution for two rotationally invariant models of the covariance matrix C. In the first model C is a projector on a lower dimensional space while in the second it is a Wishart matrix. Our analytical results extend the well-established case when C is the identity matrix. Our examination shows that the introduction of biased spatial directions leads to a substantial improvement in the spectral method's effectiveness, particularly when the number of measurements is less than the signal's dimension. This extension also consistently reveals a phase transition phenomenon dependent on the ratio between sample size and signal dimension. Surprisingly, both of these models share the same threshold value.

變換 · Processing（編程語言） · Extensibility · Branch · 推斷 ·

2024 年 3 月 22 日

Allspark: Workload Orchestration for Visual Transformers on Processing In-Memory Systems

Mengke Ge,Junpeng Wang,Binhan Chen,Yingjian Zhong,Haitao Du,Song Chen,Yi Kang

from arxiv, The article is currently under review by IEEE Transactions on Computers, and has been submitted to HPCA'2024 and ISCA'2024

The advent of Transformers has revolutionized computer vision, offering a powerful alternative to convolutional neural networks (CNNs), especially with the local attention mechanism that excels at capturing local structures within the input and achieve state-of-the-art performance. Processing in-memory (PIM) architecture offers extensive parallelism, low data movement costs, and scalable memory bandwidth, making it a promising solution to accelerate Transformer with memory-intensive operations. However, the crucial challenge lies in efficiently deploying the entire model onto a resource-limited PIM system while parallelizing each transformer block with potentially many computational branches based on local attention mechanisms. We present Allspark, which focuses on workload orchestration for visual Transformers on PIM systems, aiming at minimizing inference latency. Firstly, to fully utilize the massive parallelism of PIM, Allspark empolys a finer-grained partitioning scheme for computational branches, and format a systematic layout and interleaved dataflow with maximized data locality and reduced data movement. Secondly, Allspark formulates the scheduling of the complete model on a resource-limited distributed PIM system as an integer linear programming (ILP) problem. Thirdly, as local-global data interactions exhibit complex yet regular dependencies, Allspark provides a greedy-based mapping method to allocate computational branches onto the PIM system and minimize NoC communication costs. Extensive experiments on 3D-stacked DRAM-based PIM systems show that Allspark brings 1.2x-24.0x inference speedup for various visual Transformers over baselines, and that Allspark-enriched PIM system yields average speedups of 2.3x and energy savings of 20x-55x over Nvidia V100 GPU.

知識 (knowledge) · Processing（編程語言） · 圖 · NLP · 知識圖譜 ·

2022 年 9 月 30 日

A Decade of Knowledge Graphs in Natural Language Processing: A Survey

Phillip Schneider,Tim Schopf,Juraj Vladika,Mikhail Galkin,Elena Simperl,Florian Matthes

from arxiv, Accepted to AACL-IJCNLP 2022

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.