
Causal dynamics models (CDMs) have demonstrated significant potential in addressing various challenges in reinforcement learning. To learn CDMs, recent studies have performed causal discovery to capture the causal dependencies among environmental variables. However, the learning of CDMs is still confined to small-scale environments due to computational complexity and sample efficiency constraints. This paper aims to extend CDMs to large-scale object-oriented environments, which consist of a multitude of objects classified into different categories. We introduce the Object-Oriented CDM (OOCDM) that shares causalities and parameters among objects belonging to the same class. Furthermore, we propose a learning method for OOCDM that enables it to adapt to a varying number of objects. Experiments on large-scale tasks indicate that OOCDM outperforms existing CDMs in terms of causal discovery, prediction accuracy, generalization, and computational efficiency.
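
The abstract does not spell out how OOCDM shares parameters, so the following is only a minimal sketch of the general idea of class-level sharing, not the authors' model. The `Obj` type, the class names, and the linear rule `next = a * state + b` are all hypothetical, and causal discovery is left out entirely. It shows why a model with one parameter set per class can be applied to scenes with any number of objects.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Obj:
    cls: str      # class name, e.g. "ball" (hypothetical)
    state: float  # one scalar attribute, for brevity

# One (a, b) parameter pair per CLASS, not per object (hypothetical linear dynamics).
CLASS_PARAMS = {"ball": (0.5, 1.0), "wall": (1.0, 0.0)}

def predict_next(objects, params=CLASS_PARAMS):
    """Apply each class's shared rule next = a * state + b to every object of that class."""
    result = []
    for obj in objects:
        a, b = params[obj.cls]
        result.append(Obj(obj.cls, a * obj.state + b))
    return result

# The same two parameter pairs serve a 2-object scene and a 4-object scene.
small_scene = [Obj("ball", 4.0), Obj("wall", 3.0)]
large_scene = small_scene + [Obj("ball", 0.0), Obj("ball", 2.0)]
```

Because the parameters are keyed by class, adding more balls to a scene changes the amount of computation but not the number of parameters.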

Related content

tuning · Prompt · MoDELS · Generalization theory · Diversity

July 1, 2024

Patch-Prompt Aligned Bayesian Prompt Tuning for Vision-Language Models

Xinyang Liu,Dongsheng Wang,Bowei Fang,Miaoge Li,Zhibin Duan,Yishi Xu,Bo Chen,Mingyuan Zhou

from arxiv, Accepted by UAI 2024

For downstream applications of vision-language pre-trained models, there has been significant interest in constructing effective prompts. Existing works on prompt engineering, which either require laborious manual designs or optimize the prompt tuning as a point estimation problem, may fail to describe diverse characteristics of categories and limit their applications. We introduce a Bayesian probabilistic resolution to prompt tuning, where the label-specific stochastic prompts are generated hierarchically by first sampling a latent vector from an underlying distribution and then employing a lightweight generative model. Importantly, we semantically regularize the tuning process by minimizing the statistical distance between the visual patches and linguistic prompts, which pushes the stochastic label representations to faithfully capture diverse visual concepts, instead of overfitting the training categories. We evaluate the effectiveness of our approach on four tasks: few-shot image recognition, base-to-new generalization, dataset transfer learning, and domain shifts. Extensive results over 15 datasets show promising transferability and generalization performance of our proposed model, both quantitatively and qualitatively.

MoDELS · Model evaluation · SPICE · CASE · Less

July 1, 2024

A Hybrid Delay Model for Interconnected Multi-Input Gates

Arman Ferdowsi,Matthias Függer,Josef Salzmann,Ulrich Schmid

Dynamic digital timing analysis is a less accurate but fast alternative to highly accurate but slow analog simulations of digital circuits. It relies on gate delay models, which allow the determination of input-to-output delays of a gate on a per-transition basis. Accurate delay models not only consider the effect of preceding output transitions here but also delay variations induced by multi-input switching (MIS) effects in the case of multi-input gates. Starting out from a first-order hybrid delay model for CMOS two-input NOR gates, we develop a hybrid delay model for Muller C gates and show how to augment these models and their analytic delay formulas by a first-order interconnect. Moreover, we conduct a systematic evaluation of the resulting modeling accuracy: Using SPICE simulations, we quantify the MIS effects on the gate delays under various wire lengths, load capacitances, and input strengths for two different CMOS technologies, comparing these results to the predictions of appropriately parameterized versions of our new gate delay models. Overall, our experimental results reveal that they capture all MIS effects with a surprisingly good accuracy despite being first-order only.

Learning · DNN · Annotation · Feature extraction · MoDELS

July 1, 2024

Deep Active Audio Feature Learning in Resource-Constrained Environments

Md Mohaimenuzzaman,Christoph Bergmeir,Bernd Meyer

The scarcity of labelled data makes training Deep Neural Network (DNN) models in bioacoustic applications challenging. In typical bioacoustics applications, manually labelling the required amount of data can be prohibitively expensive. To effectively identify both new and current classes, DNN models must continue to learn new features from a modest amount of fresh data. Active Learning (AL) is an approach that can help with this learning while requiring little labelling effort. Nevertheless, the use of fixed feature extraction approaches limits feature quality, resulting in underutilization of the benefits of AL. We describe an AL framework that addresses this issue by incorporating feature extraction into the AL loop and refining the feature extractor after each round of manual annotation. In addition, we use raw audio processing rather than spectrograms, which is a novel approach. Experiments reveal that the proposed AL framework requires 14.3%, 66.7%, and 47.4% less labelling effort on benchmark audio datasets ESC-50, UrbanSound8k, and InsectWingBeat, respectively, for a large DNN model and similar savings on a microcontroller-based counterpart. Furthermore, we showcase the practical relevance of our study by incorporating data from conservation biology projects. All code is publicly available on GitHub.

Networking · Learning · 6G · Agent · INFORMS

June 30, 2024

Intelligible Protocol Learning for Resource Allocation in 6G O-RAN Slicing

Farhad Rezazadeh,Hatim Chergui,Shuaib Siddiqui,Josep Mangues,Houbing Song,Walid Saad,Mehdi Bennis

from arxiv, 8 pages, 6 Figures

An adaptive standardized protocol is essential for addressing inter-slice resource contention and conflict in network slicing. Traditional protocol standardization is a cumbersome task that yields hardcoded predefined protocols, resulting in increased costs and delayed rollout. Going beyond these limitations, this paper proposes a novel multi-agent deep reinforcement learning (MADRL) communication framework called standalone explainable protocol (STEP) for future sixth-generation (6G) open radio access network (O-RAN) slicing. As new conditions arise and affect network operation, resource orchestration agents adapt their communication messages to promote the emergence of a protocol on-the-fly, which enables the mitigation of conflict and resource contention between network slices. STEP weaves together the notion of information bottleneck (IB) theory with deep Q-network (DQN) learning concepts. By incorporating a stochastic bottleneck layer -- inspired by variational autoencoders (VAEs) -- STEP imposes an information-theoretic constraint for emergent inter-agent communication. This ensures that agents exchange concise and meaningful information, preventing resource waste and enhancing the overall system performance. The learned protocols enhance interpretability, laying a robust foundation for standardizing next-generation 6G networks. By considering an O-RAN compliant network slicing resource allocation problem, a conflict resolution protocol is developed. In particular, the results demonstrate that, on average, STEP reduces inter-slice conflicts by up to 6.06x compared to a predefined protocol method. Furthermore, in comparison with an MADRL baseline, STEP achieves 1.4x and 3.5x lower resource underutilization and latency, respectively.
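
The abstract says only that STEP uses a stochastic bottleneck layer inspired by VAEs. As a rough sketch of what such a layer usually involves, and not STEP's actual implementation, the code below assumes a diagonal Gaussian message distribution, a standard normal prior, and a hypothetical weight `beta` on the information term. It shows the reparameterized sampling of a message and the KL penalty that limits how much information the message carries.

```python
import math
import random

def sample_message(mu, log_var, rng):
    """Reparameterized sample z = mu + sigma * eps with eps ~ N(0, 1), per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over message dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))

def bottleneck_loss(task_loss, mu, log_var, beta=0.01):
    """Task loss (e.g. a DQN temporal-difference loss) plus a weighted KL penalty.

    `beta` is a hypothetical trade-off weight, not a value taken from the paper.
    """
    return task_loss + beta * kl_to_standard_normal(mu, log_var)

rng = random.Random(0)
message = sample_message([0.2, -0.1], [0.0, 0.0], rng)
```

A message distribution equal to the prior costs nothing, while a distribution that moves away from it pays a penalty, which is what pushes agents toward concise messages in this kind of setup.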

Networking · Processing (programming language) · Functional · Networks · Multiset

June 29, 2024

Efficient Computation in Congested Anonymous Dynamic Networks

Giuseppe A. Di Luna,Giovanni Viglietta

from arxiv, 26 pages, 2 figures

An anonymous dynamic network is a network of indistinguishable processes whose communication links may appear or disappear unpredictably over time. Previous research has shown that deterministically computing an arbitrary function of a multiset of input values given to these processes takes only a linear number of communication rounds (Di Luna-Viglietta, FOCS 2022). However, fast algorithms for anonymous dynamic networks rely on the construction and transmission of large data structures called "history trees", whose size is polynomial in the number of processes. This approach is unfeasible if the network is congested, and only messages of logarithmic size can be sent through its links. Observe that sending a large message piece by piece over several rounds is not in itself a solution, due to the anonymity of the processes combined with the dynamic nature of the network. Moreover, it is known that certain basic tasks such as all-to-all token dissemination (by means of single-token forwarding) require $\Omega(n^2/\log n)$ rounds in congested networks (Dutta et al., SODA 2013). In this work, we develop a series of practical and efficient techniques that make it possible to use history trees in congested anonymous dynamic networks. Among other applications, we show how to compute arbitrary functions in such networks in $O(n^3)$ communication rounds, greatly improving upon previous state-of-the-art algorithms for congested networks.

Integration · MoDELS · Language modeling · Robustness · Layer

June 28, 2024

Integrating Pre-Trained Language Model with Physical Layer Communications

Ju-Hyung Lee,Dong-Ho Lee,Joohan Lee,Jay Pujara

The burgeoning field of on-device AI communication, where devices exchange information directly through embedded foundation models, such as language models (LMs), requires robust, efficient, and generalizable communication frameworks. However, integrating these frameworks with existing wireless systems and effectively managing noise and bit errors pose significant challenges. In this work, we introduce a practical on-device AI communication framework, integrated with physical layer (PHY) communication functions, demonstrated through its performance on a link-level simulator. Our framework incorporates end-to-end training with channel noise to enhance resilience, incorporates vector quantized variational autoencoders (VQ-VAE) for efficient and robust communication, and utilizes pre-trained encoder-decoder transformers for improved generalization capabilities. Simulations, across various communication scenarios, reveal that our framework achieves a 50% reduction in transmission size while demonstrating substantial generalization ability and noise robustness under standardized 3GPP channel models.
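
To make the role of vector quantization in reducing transmission size more concrete, here is a minimal sketch of the quantization step used in VQ-VAE-style models. It is an illustration under stated assumptions, not the paper's pipeline: the four-entry `codebook` is hand-picked (in a VQ-VAE it would be learned), and how the framework maps indices onto PHY-layer bits is not described in the abstract.

```python
import math

def quantize(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))

def bits_per_index(codebook_size):
    """Number of bits needed to address one entry of a codebook of the given size."""
    return math.ceil(math.log2(codebook_size))

# Toy codebook (hypothetical); the sender transmits only the index, not the floats.
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
latent = (0.9, 0.2)
index = quantize(latent, codebook)
reconstructed = codebook[index]  # what the receiver looks up from the index
```

In this toy case the sender transmits a 2-bit index instead of two floating-point values, and the receiver recovers the nearest codebook vector.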

Tokenizer · Graph · Node · Graph Transformer · INFORMS

June 27, 2024

Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers

Jinsong Chen,Hanpeng Liu,John E. Hopcroft,Kun He

While tokenized graph Transformers have demonstrated strong performance in node classification tasks, their reliance on a limited subset of nodes with high similarity scores for constructing token sequences overlooks valuable information from other nodes, hindering their ability to fully harness graph information for learning optimal node representations. To address this limitation, we propose a novel graph Transformer called GCFormer. Unlike previous approaches, GCFormer develops a hybrid token generator to create two types of token sequences, positive and negative, to capture diverse graph information. A tailored Transformer-based backbone is then adopted to learn meaningful node representations from these generated token sequences. Additionally, GCFormer introduces contrastive learning to extract valuable information from both positive and negative token sequences, enhancing the quality of learned node representations. Extensive experimental results across various datasets, including homophily and heterophily graphs, demonstrate the superiority of GCFormer in node classification when compared to representative graph neural networks (GNNs) and graph Transformers.

Multi-task learning · Learning · Optimizer · Adam · Performer

June 27, 2024

Examining Common Paradigms in Multi-Task Learning

Cathrin Elich,Lukas Kirchdorfer,Jan M. Köhler,Lukas Schott

from arxiv

While multi-task learning (MTL) has gained significant attention in recent years, its underlying mechanisms remain poorly understood. Recent methods did not yield consistent performance improvements over single task learning (STL) baselines, underscoring the importance of gaining more profound insights about challenges specific to MTL. In our study, we investigate paradigms in MTL in the context of STL: First, the impact of the choice of optimizer has only been mildly investigated in MTL. We show the pivotal role of common STL tools such as the Adam optimizer in MTL empirically in various experiments. To further investigate Adam's effectiveness, we theoretically derive a partial loss-scale invariance under mild assumptions. Second, the notion of gradient conflicts has often been phrased as a specific problem in MTL. We delve into the role of gradient conflicts in MTL and compare it to STL. For angular gradient alignment we find no evidence that this is a unique problem in MTL. We emphasize differences in gradient magnitude as the main distinguishing factor. Overall, we find surprising similarities between STL and MTL, which suggests considering methods from both fields in a broader context.
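
The abstract does not give the derivation. As a small illustration of what loss-scale invariance means for Adam in the simplest single-loss case, the sketch below runs a scalar Adam update on a gradient stream and on the same stream multiplied by a constant. It is not the authors' code and does not reproduce their partial, multi-task result. It assumes `eps = 0`; with the usual small positive `eps` the invariance holds only approximately.

```python
import math

def adam_updates(grads, lr=1e-3, beta1=0.9, beta2=0.999, eps=0.0):
    """Return the sequence of Adam parameter updates for a stream of scalar gradients."""
    m = 0.0  # first-moment (mean) estimate
    v = 0.0  # second-moment (uncentered variance) estimate
    updates = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        updates.append(-lr * m_hat / (math.sqrt(v_hat) + eps))
    return updates

grads = [0.5, -1.2, 0.3, 2.0]  # arbitrary non-zero example gradients
scale = 100.0                  # multiplying the loss by 100 multiplies every gradient by 100
base = adam_updates(grads)
scaled = adam_updates([scale * g for g in grads])
```

The two update sequences agree up to floating-point rounding because the scale factor cancels between the first-moment estimate and the square root of the second-moment estimate. Plain gradient descent, by contrast, would take steps 100 times larger on the scaled loss.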

Discretization · MoDELS · Sample · FAST · Processing (programming language)

June 27, 2024

Fast Sampling via Discrete Non-Markov Diffusion Models

Zixiang Chen,Huizhuo Yuan,Yongqian Li,Yiwen Kou,Junkai Zhang,Quanquan Gu

from arxiv, 33 pages, 5 figures, 12 tables

Discrete diffusion models have emerged as powerful tools for high-quality data generation. Despite their success in discrete spaces, such as text generation tasks, the acceleration of discrete diffusion models remains underexplored. In this paper, we propose a discrete non-Markov diffusion model, which admits an accelerated reverse sampling for discrete data generation. Our method significantly reduces the number of function evaluations (i.e., calls to the neural network), making the sampling process much faster. Furthermore, we study the transition from finite to infinite step sampling, offering new insights into bridging the gap between discrete and continuous-time processes for discrete diffusion models. Extensive experiments on natural language generation and machine translation tasks demonstrate the superior performance of our method in terms of both generation speed and sample quality compared to existing methods for discrete diffusion models.

Learned · Representation learning · MoDELS · CASES · contrastive

June 3, 2021

Improving Event Causality Identification via Self-Supervised Representation Learning on External Causal Statement

Xinyu Zuo,Pengfei Cao,Yubo Chen,Kang Liu,Jun Zhao,Weihua Peng,Yuguang Chen

from arxiv, Accepted to Findings of ACL 2021

Current models for event causality identification (ECI) mainly adopt a supervised framework, which relies heavily on labeled data for training. Unfortunately, the scale of current annotated datasets is relatively limited and cannot provide sufficient support for models to capture useful indicators from causal statements, especially when handling new, unseen cases. To alleviate this problem, we propose a novel approach, named CauSeRL, which leverages external causal statements for event causality identification. First, we design a self-supervised framework to learn context-specific causal patterns from external causal statements. Then, we adopt a contrastive transfer strategy to incorporate the learned context-specific causal patterns into the target ECI model. Experimental results show that our method significantly outperforms previous methods on EventStoryLine and Causal-TimeBank (+2.0 and +3.4 points on F1 value respectively).