Current state-of-the-art recommender systems predominantly rely on either implicit or explicit feedback from users to suggest new items. While effective in recommending novel options, these conventional systems often use uninterpretable embeddings. This lack of transparency not only limits user understanding of why certain items are suggested but also reduces the user's ability to easily scrutinize and edit their preferences. For example, if a user's interests change, they would need to make significant changes to their interaction history to adjust the model's recommendations. To address these limitations, we introduce a novel method that utilizes user reviews to craft personalized, natural language profiles describing users' preferences. Through these descriptive profiles, our system provides transparent recommendations in natural language. Our evaluations show that this novel approach maintains a performance level on par with established recommender systems, but with the added benefits of transparency and user control. By letting users scrutinize why certain items are recommended, our approach makes it easier for them to verify and adjust their recommendations and gives them greater autonomy over them.
Recommendation systems, as widely implemented nowadays on various platforms, recommend relevant items to users based on their preferences. Classical methods that rely on user-item interaction matrices have limitations, especially in scenarios where interaction data for new items is lacking. Knowledge graph (KG)-based recommendation systems have emerged as a promising solution. However, most KG-based methods adopt node embeddings, which do not provide personalized recommendations for different users and cannot generalize well to new items. To address these limitations, we propose Knowledge-enhanced User-Centric subgraph Network (KUCNet), a subgraph learning approach with a graph neural network (GNN) for effective recommendation. KUCNet constructs a U-I subgraph for each user-item pair that captures both the historical information of user-item interactions and the side information provided in the KG. An attention-based GNN is designed to encode the U-I subgraphs for recommendation. For efficiency, a pruned user-centric computation graph is further introduced so that multiple U-I subgraphs can be computed simultaneously and their size can be pruned by Personalized PageRank. Our proposed method achieves accurate, efficient, and interpretable recommendations, especially for new items. Experimental results demonstrate the superiority of KUCNet over state-of-the-art KG-based and collaborative filtering (CF)-based methods.
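The abstract above mentions pruning the user-centric computation graph with Personalized PageRank but does not spell out the procedure. The following is only a generic sketch of that idea, not KUCNet's actual implementation: it computes Personalized PageRank scores by power iteration from a single source node (standing in for a user) and keeps the highest-scoring nodes. The function names, the dense adjacency matrix, the restart probability `alpha`, and the `keep` budget are all assumptions made for illustration.

```python
import numpy as np

def personalized_pagerank(adj, source, alpha=0.15, iters=100):
    """Power-iteration Personalized PageRank from one source node.

    adj    : (n, n) adjacency matrix, adj[i, j] > 0 means an edge i -> j
    alpha  : restart probability (hypothetical default)
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1, keepdims=True)
    # Row-normalise; nodes without outgoing edges keep an all-zero row.
    P = np.divide(adj, out_deg, out=np.zeros_like(adj), where=out_deg > 0)
    restart = np.zeros(n)
    restart[source] = 1.0
    p = restart.copy()
    for _ in range(iters):
        # With probability alpha jump back to the source, else follow an edge.
        p = alpha * restart + (1.0 - alpha) * (P.T @ p)
    return p

def prune_nodes(adj, source, keep, **ppr_kwargs):
    """Return the indices of the `keep` nodes with the highest PPR score."""
    scores = personalized_pagerank(adj, source, **ppr_kwargs)
    top = np.argsort(-scores)[:keep]
    return sorted(int(i) for i in top)

# Toy undirected path graph 0 - 1 - 2 - 3, pruned around node 0.
path = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])
kept = prune_nodes(path, source=0, keep=2)
```

On a real knowledge graph one would use a sparse matrix and restrict message passing to the retained nodes; the dense version here is only meant to make the scoring step concrete.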
Previous stance detection studies typically concentrate on evaluating stances within individual instances, thereby exhibiting limitations in effectively modeling multi-party discussions concerning the same specific topic, as naturally transpire in authentic social media interactions. This constraint arises primarily due to the scarcity of datasets that authentically replicate real social media contexts, hindering the research progress of conversational stance detection. In this paper, we introduce a new multi-turn conversation stance detection dataset (called \textbf{MT-CSD}), which encompasses multiple targets for conversational stance detection. To derive stances from this challenging dataset, we propose a global-local attention network (\textbf{GLAN}) to address both long and short-range dependencies inherent in conversational data. Notably, even state-of-the-art stance detection methods, exemplified by GLAN, exhibit an accuracy of only 50.47\%, highlighting the persistent challenges in conversational stance detection. Furthermore, our MT-CSD dataset serves as a valuable resource to catalyze advancements in cross-domain stance detection, where a classifier is adapted from a different yet related target. We believe that MT-CSD will contribute to advancing real-world applications of stance detection research. Our source code, data, and models are available at \url{//github.com/nfq729/MT-CSD}.
NSFW (Not Safe for Work) content, in the context of a dialogue, can have severe side effects on users in open-domain dialogue systems. However, research on detecting NSFW language, especially sexually explicit content, within a dialogue context has significantly lagged behind. To address this issue, we introduce CensorChat, a dialogue monitoring dataset aimed at NSFW dialogue detection. Leveraging knowledge distillation techniques involving GPT-4 and ChatGPT, this dataset offers a cost-effective means of constructing NSFW content detectors. The process entails collecting real-life human-machine interaction data and breaking it down into single utterances and single-turn dialogues, with the chatbot delivering the final utterance. ChatGPT is employed to annotate unlabeled data, serving as a training set. Rationale validation and test sets are constructed using ChatGPT and GPT-4 as annotators, with a self-criticism strategy for resolving discrepancies in labeling. A BERT model is fine-tuned as a text classifier on pseudo-labeled data, and its performance is assessed. The study emphasizes the importance of AI systems prioritizing user safety and well-being in digital conversations while respecting freedom of expression. The proposed approach not only advances NSFW content detection but also aligns with evolving user protection needs in AI-driven dialogues.
Cross-modal retrieval (CMR) aims to establish interaction between different modalities, among which supervised CMR is emerging due to its flexibility in learning semantic category discrimination. Despite the remarkable performance of previous supervised CMR methods, much of their success can be attributed to well-annotated data. However, even for unimodal data, precise annotation is expensive and time-consuming, and it becomes more challenging in the multimodal scenario. In practice, massive multimodal data are collected from the Internet with coarse annotation, which inevitably introduces noisy labels. Training with such misleading labels brings two key challenges -- forcing the multimodal samples to \emph{align incorrect semantics} and \emph{widen the heterogeneous gap}, resulting in poor retrieval performance. To tackle these challenges, this work proposes UOT-RCL, a Unified framework based on Optimal Transport (OT) for Robust Cross-modal Retrieval. First, we propose a semantic alignment based on partial OT to progressively correct the noisy labels, where a novel cross-modal consistent cost function is designed to blend different modalities and provide a precise transport cost. Second, to narrow the discrepancy in multi-modal data, an OT-based relation alignment is proposed to infer the semantic-level cross-modal matching. Both components leverage the inherent correlation among multi-modal data to facilitate an effective cost function. Experiments on three widely used cross-modal retrieval datasets demonstrate that UOT-RCL surpasses the state-of-the-art approaches and significantly improves robustness against noisy labels.
With the increasing amount of data available to scientists in disciplines as diverse as bioinformatics, physics, and remote sensing, scientific workflow systems are becoming increasingly important for composing and executing scalable data analysis pipelines. When writing such workflows, users need to specify the resources to be reserved for tasks so that sufficient resources are allocated on the target cluster infrastructure. Crucially, underestimating a task's memory requirements can result in task failures. Therefore, users often resort to overprovisioning, resulting in significant resource wastage and decreased throughput. In this paper, we propose a novel online method that uses monitoring time series data to predict task memory usage in order to reduce the memory wastage of scientific workflow tasks. Our method predicts a task's runtime, divides it into k equally-sized segments, and learns the peak memory value for each segment depending on the total file input size. We evaluate the prototype implementation of our method using workflows from the publicly available nf-core repository, showing an average memory wastage reduction of 29.48% compared to the best state-of-the-art approach.
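To make the segment-wise idea in the abstract above more concrete, here is a minimal sketch under stated assumptions. It splits each monitored memory trace into k segments, takes the peak per segment, and fits one model per segment that maps total input size to that peak. The abstract does not say which model is learned, so the per-segment linear least-squares fit, the function names, and the use of equally spaced samples as a stand-in for equal time slices are all assumptions; the runtime-prediction step is omitted.

```python
import numpy as np

def fit_segment_models(input_sizes, memory_traces, k):
    """Fit peak_memory ~ slope * input_size + intercept for each of k segments.

    input_sizes   : total input size of each historical task run
    memory_traces : one memory time series per run (each with >= k samples)
    Returns an array of shape (2, k): row 0 = slopes, row 1 = intercepts.
    """
    sizes = np.asarray(input_sizes, dtype=float)
    # peaks[i, j] = peak memory of run i inside segment j
    peaks = np.array([
        [segment.max()
         for segment in np.array_split(np.asarray(trace, dtype=float), k)]
        for trace in memory_traces
    ])
    design = np.column_stack([sizes, np.ones_like(sizes)])
    coef, *_ = np.linalg.lstsq(design, peaks, rcond=None)
    return coef

def predict_segment_peaks(coef, input_size):
    """Predicted peak memory for each segment of a new task."""
    return np.array([float(input_size), 1.0]) @ coef

# Toy history: memory doubles in the second half of every run.
history_sizes = [1.0, 2.0, 3.0]
history_traces = [[s, s, 2 * s, 2 * s] for s in history_sizes]
models = fit_segment_models(history_sizes, history_traces, k=2)
allocation = predict_segment_peaks(models, 4.0)
```

A scheduler could then reserve `allocation[j]` during segment j instead of a single worst-case value for the whole task; a practical system would presumably also add a safety margin and handle underprediction, neither of which is shown here.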
The next Point-of-Interest (POI) recommendation task aims to provide a dynamic ranking of POIs based on users' current check-in trajectories. The recommendation performance of this task is contingent upon a comprehensive understanding of users' personalized behavioral patterns through Location-based Social Network (LBSN) data. While prior studies have adeptly captured sequential patterns and transitional relationships within users' check-in trajectories, a noticeable gap persists in devising a mechanism for discerning specialized behavioral patterns during distinct time slots, such as noon, afternoon, or evening. In this paper, we introduce an innovative data structure termed the ``Mobility Tree'', tailored for hierarchically describing users' check-in records. The Mobility Tree encompasses multi-granularity time slot nodes to learn user preferences across varying temporal periods. Meanwhile, we propose the Mobility Tree Network (MTNet), a multitask framework for personalized preference learning based on Mobility Trees. We develop a four-step node interaction operation to propagate feature information from the leaf nodes to the root node. Additionally, we adopt a multitask training strategy to push the model towards learning a robust representation. The comprehensive experimental results demonstrate the superiority of MTNet over ten state-of-the-art next POI recommendation models across three real-world LBSN datasets, substantiating the efficacy of time slot preference learning facilitated by the Mobility Tree.
Ensembles over neural network weights trained from different random initializations, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter-efficient. In this paper, we design ensembles not only over weights, but also over hyperparameters to improve the state of the art in both settings. For best performance independent of budget, we propose hyper-deep ensembles, a simple procedure that involves a random search over different hyperparameters, themselves stratified across multiple random initializations. Its strong performance highlights the benefit of combining models with both weight and hyperparameter diversity. We further propose a parameter-efficient version, hyper-batch ensembles, which builds on the layer structure of batch ensembles and self-tuning networks. The computational and memory costs of our method are notably lower than those of typical ensembles. On image classification tasks, with MLP, LeNet, and Wide ResNet 28-10 architectures, our methodology improves upon both deep and batch ensembles.
Providing model-generated explanations in recommender systems is important to user experience. State-of-the-art recommendation algorithms -- especially the collaborative filtering (CF) based approaches with shallow or deep models -- usually work with various unstructured information sources for recommendation, such as textual reviews, visual images, and various forms of implicit or explicit feedback. Though structured knowledge bases were considered in content-based approaches, they have been largely ignored recently due to the availability of vast amounts of data and the learning power of many complex models. However, structured knowledge bases exhibit unique advantages in personalized recommendation systems. When explicit knowledge about users and items is considered for recommendation, the system can provide highly customized recommendations based on users' historical behaviors, and the knowledge is helpful for providing informed explanations regarding the recommended items. In this work, we propose to reason over knowledge base embeddings for explainable recommendation. Specifically, we propose a knowledge base representation learning framework to embed heterogeneous entities for recommendation, and based on the embedded knowledge base, a soft matching algorithm is proposed to generate personalized explanations for the recommended items. Experimental results on real-world e-commerce datasets verify the superior recommendation performance and the explainability power of our approach compared with state-of-the-art baselines.
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion, using a strongly convex regularizer. This allows us to relax both the optimal value and solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators, and relate them to inference in graphical models. We derive two particular instantiations of our framework, a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
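As an illustration of the smoothing idea in the abstract above, the sketch below uses one particular strongly convex regularizer, the negative entropy scaled by a temperature `gamma`, which is an assumption since the abstract does not name a regularizer. With that choice the smoothed max becomes a scaled log-sum-exp whose gradient is a softmax, and plugging the corresponding smoothed min into the DTW recursion gives a differentiable alignment cost. Function names and the toy cost matrix are hypothetical.

```python
import numpy as np

def smoothed_max(x, gamma=1.0):
    """Entropy-regularised max: gamma * log(sum(exp(x / gamma)))."""
    x = np.asarray(x, dtype=float)
    m = x.max()  # subtract the max for numerical stability
    return m + gamma * np.log(np.exp((x - m) / gamma).sum())

def smoothed_max_grad(x, gamma=1.0):
    """Gradient of smoothed_max: a softmax, i.e. a relaxed argmax."""
    x = np.asarray(x, dtype=float)
    e = np.exp((x - x.max()) / gamma)
    return e / e.sum()

def smoothed_min(x, gamma=1.0):
    """Smoothed min obtained by negating the smoothed max."""
    return -smoothed_max(-np.asarray(x, dtype=float), gamma)

def smoothed_dtw(cost, gamma=1.0):
    """DTW value with the hard min in the recursion replaced by smoothed_min.

    cost[i, j] is the cost of aligning element i of one series with
    element j of the other.
    """
    cost = np.asarray(cost, dtype=float)
    n, m = cost.shape
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            previous = [R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]]
            R[i, j] = cost[i - 1, j - 1] + smoothed_min(previous, gamma)
    return R[n, m]

toy_cost = np.array([[0.0, 1.0],
                     [1.0, 0.0]])
soft_value = smoothed_dtw(toy_cost, gamma=0.01)
```

As `gamma` approaches zero, the smoothed operators approach the hard max and min, so the smoothed DTW value approaches the classical one; larger values of `gamma` spread the gradient over more alignment paths.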
Recommender systems play a crucial role in mitigating the problem of information overload by suggesting personalized items or services to users. The vast majority of traditional recommender systems consider the recommendation procedure as a static process and make recommendations following a fixed strategy. In this paper, we propose a novel recommender system with the capability of continuously improving its strategies during the interactions with users. We model the sequential interactions between users and a recommender system as a Markov Decision Process (MDP) and leverage Reinforcement Learning (RL) to automatically learn the optimal strategies by recommending items on a trial-and-error basis and receiving reinforcement for these items from users' feedback. In particular, we introduce an online user-agent interacting environment simulator, which can pre-train and evaluate model parameters offline before applying the model online. Moreover, we validate the importance of list-wise recommendations during the interactions between users and the agent, and develop a novel approach to incorporate them into the proposed framework LIRD for list-wise recommendations. The experimental results based on a real-world e-commerce dataset demonstrate the effectiveness of the proposed framework.