亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Empathetic response generation aims to comprehend the cognitive and emotional states in dialogue utterances and generate proper responses. Psychological theories posit that comprehending emotional and cognitive states necessitates iteratively capturing and understanding associated words across dialogue utterances. However, existing approaches regard dialogue utterances as either a long sequence or independent utterances for comprehension, which are prone to overlook the associated words between them. To address this issue, we propose an Iterative Associative Memory Model (IAMM) for empathetic response generation. Specifically, we employ a novel second-order interaction attention mechanism to iteratively capture vital associated words between dialogue utterances and situations, dialogue history, and a memory module (for storing associated words), thereby accurately and nuancedly comprehending the utterances. We conduct experiments on the Empathetic-Dialogue dataset. Both automatic and human evaluations validate the efficacy of the model. Variant experiments on LLMs also demonstrate that attending to associated words improves empathetic comprehension and expression.

相關內容

Modern neural network architectures still struggle to learn algorithmic procedures that require to systematically apply compositional rules to solve out-of-distribution problem instances. In this work, we focus on formula simplification problems, a class of synthetic benchmarks used to study the systematic generalization capabilities of neural architectures. We propose a modular architecture designed to learn a general procedure for solving nested mathematical formulas by only relying on a minimal set of training examples. Inspired by rewriting systems, a classic framework in symbolic artificial intelligence, we include in the architecture three specialized and interacting modules: the Selector, trained to identify solvable sub-expressions; the Solver, mapping sub-expressions to their values; and the Combiner, replacing sub-expressions in the original formula with the solution provided by the Solver. We benchmark our system against the Neural Data Router, a recent model specialized for systematic generalization, and a state-of-the-art large language model (GPT-4) probed with advanced prompting strategies. We demonstrate that our approach achieves a higher degree of out-of-distribution generalization compared to these alternative approaches on three different types of formula simplification problems, and we discuss its limitations by analyzing its failures.

We study push-based sampling and transmission policies for a status update system consisting of a general finite-state continuous-time Markov chain (CTMC) information source with known dynamics, with the goal of minimizing the average age of incorrect information (AoII). The problem setting we investigate involves an exponentially distributed delay channel for transmissions and a constraint on the average sampling rate. We first show that the optimum sampling and transmission policy is a 'multi-threshold policy', where the thresholds depend on both the estimation value and the state of the original process, and sampling and transmission need to be initiated when the instantaneous AoII exceeds the corresponding threshold, called the estimation- and state-aware transmission (ESAT) policy. Subsequently, we formulate the problem of finding the thresholds as a constrained semi-Markov decision process (CSMDP) and the Lagrangian approach. Additionally, we propose two lower complexity sub-optimum policies, namely the estimation-aware transmission (EAT) policy, and the single-threshold (ST) policy, for which it is possible to obtain these thresholds for CTMCs with relatively larger number of states. The underlying CSMDP formulation relies on the 'multi-regime phase-type' (MRPH) distribution which is a generalization of the well-known phase-type distribution, which allows us to obtain the distribution of time until absorption in a CTMC whose transition rates change with respect to time in a piece-wise manner. The effectiveness of the proposed ESAT, EAT and ST sampling and transmission policies are shown through numerical examples, along with comparisons with a baseline scheme that transmits packets according to a Poisson process in out-of-sync periods.

Brain decoding that classifies cognitive states using the functional fluctuations of the brain can provide insightful information for understanding the brain mechanisms of cognitive functions. Among the common procedures of decoding the brain cognitive states with functional magnetic resonance imaging (fMRI), extracting the time series of each brain region after brain parcellation traditionally averages across the voxels within a brain region. This neglects the spatial information among the voxels and the requirement of extracting information for the downstream tasks. In this study, we propose to use a fully connected neural network that is jointly trained with the brain decoder to perform an adaptively weighted average across the voxels within each brain region. We perform extensive evaluations by cognitive state decoding, manifold learning, and interpretability analysis on the Human Connectome Project (HCP) dataset. The performance comparison of the cognitive state decoding presents an accuracy increase of up to 5\% and stable accuracy improvement under different time window sizes, resampling sizes, and training data sizes. The results of manifold learning show that our method presents a considerable separability among cognitive states and basically excludes subject-specific information. The interpretability analysis shows that our method can identify reasonable brain regions corresponding to each cognitive state. Our study would aid the improvement of the basic pipeline of fMRI processing.

Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks, and hence late-stage fusion of final representations or predictions from each modality (`late-fusion') is still a dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer based architecture that uses `fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and only share what is necessary. We find that such a strategy improves fusion performance, at the same time reducing computational cost. We conduct thorough ablation studies, and achieve state-of-the-art results on multiple audio-visual classification benchmarks including Audioset, Epic-Kitchens and VGGSound. All code and models will be released.

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.

Graph Neural Networks (GNNs) have recently become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g. evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches being at the same time more computationally efficient. We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.

Embedding entities and relations into a continuous multi-dimensional vector space have become the dominant method for knowledge graph embedding in representation learning. However, most existing models ignore to represent hierarchical knowledge, such as the similarities and dissimilarities of entities in one domain. We proposed to learn a Domain Representations over existing knowledge graph embedding models, such that entities that have similar attributes are organized into the same domain. Such hierarchical knowledge of domains can give further evidence in link prediction. Experimental results show that domain embeddings give a significant improvement over the most recent state-of-art baseline knowledge graph embedding models.

We study the problem of textual relation embedding with distant supervision. To combat the wrong labeling problem of distant supervision, we propose to embed textual relations with global statistics of relations, i.e., the co-occurrence statistics of textual and knowledge base relations collected from the entire corpus. This approach turns out to be more robust to the training noise introduced by distant supervision. On a popular relation extraction dataset, we show that the learned textual relation embedding can be used to augment existing relation extraction models and significantly improve their performance. Most remarkably, for the top 1,000 relational facts discovered by the best existing model, the precision can be improved from 83.9% to 89.3%.

Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.

The amount of publicly available biomedical literature has been growing rapidly in recent years, yet question answering systems still struggle to exploit the full potential of this source of data. In a preliminary processing step, many question answering systems rely on retrieval models for identifying relevant documents and passages. This paper proposes a weighted cosine distance retrieval scheme based on neural network word embeddings. Our experiments are based on publicly available data and tasks from the BioASQ biomedical question answering challenge and demonstrate significant performance gains over a wide range of state-of-the-art models.

北京阿比特科技有限公司