Recently, the community has witnessed the advancement of Large Language Models (LLMs), which have shown remarkable performance on various downstream tasks. Led by powerful models like ChatGPT and Claude, LLMs are revolutionizing how users engage with software, assuming more than mere tools but intelligent assistants. Consequently, evaluating LLMs' anthropomorphic capabilities becomes increasingly important in contemporary discourse. Utilizing the emotion appraisal theory from psychology, we propose to evaluate the empathy ability of LLMs, i.e., how their feelings change when presented with specific situations. After a careful and comprehensive survey, we collect a dataset containing over 400 situations that have proven effective in eliciting the eight emotions central to our study. Categorizing the situations into 36 factors, we conduct a human evaluation involving more than 1,200 subjects worldwide. With the human evaluation results as references, our evaluation includes five LLMs, covering both commercial and open-source models, including variations in model sizes, featuring the latest iterations, such as GPT-4 and LLaMA 2. A conclusion can be drawn from the results that, despite several misalignments, LLMs can generally respond appropriately to certain situations. Nevertheless, they fall short in alignment with the emotional behaviors of human beings and cannot establish connections between similar situations. Our collected dataset of situations, the human evaluation results, and the code of our testing framework, dubbed EmotionBench, is made publicly in //github.com/CUHK-ARISE/EmotionBench. We aspire to contribute to the advancement of LLMs regarding better alignment with the emotional behaviors of human beings, thereby enhancing their utility and applicability as intelligent assistants.
In the contemporary landscape of computing education, the ubiquity of Generative Artificial Intelligence has significantly disrupted traditional assessment methods, rendering them obsolete and prompting educators to seek innovative alternatives. This research paper explores the challenges posed by Generative AI in the assessment domain and the persistent attempts to circumvent its impact. Despite various efforts to devise workarounds, the academic community is yet to find a comprehensive solution. Amidst this struggle, ungrading emerges as a potential yet under-appreciated solution to the assessment dilemma. Ungrading, a pedagogical approach that involves moving away from traditional grading systems, has faced resistance due to its perceived complexity and the reluctance of educators to depart from conventional assessment practices. However, as the inadequacies of current assessment methods become increasingly evident in the face of Generative AI, the time is ripe to reconsider and embrace ungrading.
Recently it has been shown that tensor networks (TNs) have the ability to represent the expected return of a single-agent finite Markov decision process (FMDP). The TN represents a distribution model, where all possible trajectories are considered. When extending these ideas to a multi-agent setting, distribution models suffer from the curse of dimensionality: the exponential relation between the number of possible trajectories and the number of agents. The key advantage of using TNs in this setting is that there exists a large number of established optimisation and decomposition techniques that are specific to TNs, that one can apply to ensure the most efficient representation is found. In this report, these methods are used to form a TN that represents the expected return of a multi-agent reinforcement learning (MARL) task. This model is then applied to a 2 agent random walker example, where it was shown that the policy is correctly optimised using a DMRG technique. Finally, I demonstrate the use of an exact decomposition technique, reducing the number of elements in the tensors by 97.5%, without experiencing any loss of information.
The latest assessments of the emerging technologies for reconfigurable intelligent surfaces (RISs) have indicated the concept's significant potential for localization and sensing, either as individual or simultaneously realized tasks. However, in the vast majority of those studies, the RIS state (i.e., its position and rotation angles) is required to be known a priori. In this paper, we address the problem of the joint three-dimensional (3D) localization of a hybrid RIS (HRIS) and a user. The most cost- and power-efficient hybrid version of an RIS is equipped with a single reception radio-frequency chain and meta-atoms capable of simultaneous reconfigurable reflection and sensing. This dual functionality is controlled by adjustable power splitters embedded at each hybrid meta-atom. Focusing on a downlink scenario where a multi-antenna base station transmits multicarrier signals to a user via an HRIS, we propose a multistage approach to jointly estimate the metasurface's 3D position and 3D rotation matrix (i.e., 6D parameter estimation) as well as the user's 3D position. Our simulation results verify the validity of the proposed estimator via extensive comparisons of the root-mean-square error of the state estimations with the Cram\'{e}r-Rao lower bound (CRB), which is analytically derived. Furthermore, it is showcased that there exists an optimal hybrid reconfigurable intelligent surface (HRIS) power splitting ratio for the desired multi-parameter estimation problem. We also study the robustness of the proposed method in the presence of scattering points in the wireless propagation environment.
Correlation coefficients play a pivotal role in quantifying linear relationships between random variables. Yet, their application to time series data is very challenging due to temporal dependencies. This paper introduces a novel approach to estimate the statistical significance of correlation coefficients in time series data, addressing the limitations of traditional methods based on the concept of effective degrees of freedom (or effective sample size, ESS). These effective degrees of freedom represent the independent sample size that would yield comparable test statistics under the assumption of no temporal correlation. We propose to assume a parametric Gaussian form for the autocorrelation function. We show that this assumption, motivated by a Laplace approximation, enables a simple estimator of the ESS that depends only on the temporal derivatives of the time series. Through numerical experiments, we show that the proposed approach yields accurate statistics while significantly reducing computational overhead. In addition, we evaluate the adequacy of our approach on real physiological signals, for assessing the connectivity measures in electrophysiology and detecting correlated arm movements in motion capture data. Our methodology provides a simple tool for researchers working with time series data, enabling robust hypothesis testing in the presence of temporal dependencies.
Soccer Simulation 2D (SS2D) is a simulation of a real soccer game in two dimensions. In soccer, passing behavior is an essential action for keeping the ball in possession of our team and creating goal opportunities. Similarly, for SS2D, predicting the passing behaviors of both opponents and our teammates helps manage resources and score more goals. Therefore, in this research, we have tried to address the modeling of passing behavior of soccer 2D players using Deep Neural Networks (DNN) and Random Forest (RF). We propose an embedded data extraction module that can record the decision-making of agents in an online format. Afterward, we apply four data sorting techniques for training data preparation. After, we evaluate the trained models' performance playing against 6 top teams of RoboCup 2019 that have distinctive playing strategies. Finally, we examine the importance of different feature groups on the prediction of a passing strategy. All results in each step of this work prove our suggested methodology's effectiveness and improve the performance of the pass prediction in Soccer Simulation 2D games ranging from 5\% (e.g., playing against the same team) to 10\% (e.g., playing against Robocup top teams).
Crowdfunding in the realm of the Social Web has received substantial attention, with prior research examining various aspects of campaigns, including project objectives, durations, and influential project categories for successful fundraising. These factors are crucial for entrepreneurs seeking donor support. However, the terrain of charity crowdfunding within the Social Web remains relatively unexplored, lacking comprehension of the motivations driving donations that often lack concrete reciprocation. Distinct from conventional crowdfunding that offers tangible returns, charity crowdfunding relies on intangible rewards like tax advantages, recognition posts, or advisory roles. Such details are often embedded within campaign narratives, yet, the analysis of textual content in charity crowdfunding is limited. This study introduces an inventive text analytics framework, utilizing Latent Dirichlet Allocation (LDA) to extract latent themes from textual descriptions of charity campaigns. The study has explored four different themes, two each in campaign and incentive descriptions. Campaign description themes are focused on child and elderly health mainly the ones who are diagnosed with terminal diseases. Incentive description themes are based on tax benefits, certificates, and appreciation posts. These themes, combined with numerical parameters, predict campaign success. The study was successful in using Random Forest Classifier to predict success of the campaign using both thematic and numerical parameters. The study distinguishes thematic categories, particularly medical need-based charity and general causes, based on project and incentive descriptions. In conclusion, this research bridges the gap by showcasing topic modelling utility in uncharted charity crowdfunding domains.
Recently, a considerable literature has grown up around the theme of Graph Convolutional Network (GCN). How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation links only), or focusing on representation learning for nodes only instead of jointly propagating and updating the embeddings of both nodes and edges for target-driven objectives. This paper addresses these limitations by proposing a novel framework, namely the Knowledge Embedding based Graph Convolutional Network (KE-GCN), which combines the power of GCNs in graph-based belief propagation and the strengths of advanced knowledge embedding (a.k.a. knowledge graph embedding) methods, and goes beyond. Our theoretical analysis shows that KE-GCN offers an elegant unification of several well-known GCN methods as specific cases, with a new perspective of graph convolution. Experimental results on benchmark datasets show the advantageous performance of KE-GCN over strong baseline methods in the tasks of knowledge graph alignment and entity classification.
How can we estimate the importance of nodes in a knowledge graph (KG)? A KG is a multi-relational graph that has proven valuable for many tasks including question answering and semantic search. In this paper, we present GENI, a method for tackling the problem of estimating node importance in KGs, which enables several downstream applications such as item recommendation and resource allocation. While a number of approaches have been developed to address this problem for general graphs, they do not fully utilize information available in KGs, or lack flexibility needed to model complex relationship between entities and their importance. To address these limitations, we explore supervised machine learning algorithms. In particular, building upon recent advancement of graph neural networks (GNNs), we develop GENI, a GNN-based method designed to deal with distinctive challenges involved with predicting node importance in KGs. Our method performs an aggregation of importance scores instead of aggregating node embeddings via predicate-aware attention mechanism and flexible centrality adjustment. In our evaluation of GENI and existing methods on predicting node importance in real-world KGs with different characteristics, GENI achieves 5-17% higher NDCG@100 than the state of the art.
Knowledge graphs (KGs), which could provide essential relational information between entities, have been widely utilized in various knowledge-driven applications. Since the overall human knowledge is innumerable that still grows explosively and changes frequently, knowledge construction and update inevitably involve automatic mechanisms with less human supervision, which usually bring in plenty of noises and conflicts to KGs. However, most conventional knowledge representation learning methods assume that all triple facts in existing KGs share the same significance without any noises. To address this problem, we propose a novel confidence-aware knowledge representation learning framework (CKRL), which detects possible noises in KGs while learning knowledge representations with confidence simultaneously. Specifically, we introduce the triple confidence to conventional translation-based methods for knowledge representation learning. To make triple confidence more flexible and universal, we only utilize the internal structural information in KGs, and propose three kinds of triple confidences considering both local and global structural information. In experiments, We evaluate our models on knowledge graph noise detection, knowledge graph completion and triple classification. Experimental results demonstrate that our confidence-aware models achieve significant and consistent improvements on all tasks, which confirms the capability of CKRL modeling confidence with structural information in both KG noise detection and knowledge representation learning.
Detecting carried objects is one of the requirements for developing systems to reason about activities involving people and objects. We present an approach to detect carried objects from a single video frame with a novel method that incorporates features from multiple scales. Initially, a foreground mask in a video frame is segmented into multi-scale superpixels. Then the human-like regions in the segmented area are identified by matching a set of extracted features from superpixels against learned features in a codebook. A carried object probability map is generated using the complement of the matching probabilities of superpixels to human-like regions and background information. A group of superpixels with high carried object probability and strong edge support is then merged to obtain the shape of the carried object. We applied our method to two challenging datasets, and results show that our method is competitive with or better than the state-of-the-art.