Best-of-N (BoN) sampling with a reward model has been shown to be an effective strategy for aligning Large Language Models (LLMs) to human preferences at the time of decoding. BoN sampling is susceptible to a problem known as reward hacking. Because the reward model is an imperfect proxy for the true objective, over-optimizing its value can compromise its performance on the true objective. A common solution to prevent reward hacking in preference learning techniques is to optimize a reward using proximity regularization (e.g., KL regularization), which ensures that the language model remains close to the reference model. In this research, we propose Regularized Best-of-N (RBoN), a variant of BoN that aims to mitigate reward hacking by incorporating a proximity term in response selection, similar to preference learning techniques. We evaluate two variants of RBoN on the AlpacaFarm dataset and find that they outperform BoN, especially when the proxy reward model has a low correlation with the true objective.
Graph Neural Networks (GNNs) have emerged as potent tools for predicting outcomes in graph-structured data. Despite their efficacy, a significant drawback of GNNs lies in their limited ability to provide robust uncertainty estimates, posing challenges to their reliability in contexts where errors carry significant consequences. Moreover, GNNs typically excel in in-distribution settings, assuming that training and test data follow identical distributions: a condition often unmet in real-world graph data scenarios. In this article, we leverage conformal prediction, a widely recognized statistical technique for quantifying uncertainty by transforming predictive model outputs into prediction sets, to address uncertainty quantification in GNN predictions amidst conditional shift \footnote{Representing the change in conditional probability distribution $P(label |input)$ from source domain to target domain.} in graph-based semi-supervised learning (SSL). Additionally, we propose a novel loss function aimed at refining model predictions by minimizing conditional shift in latent stages. Termed Conditional Shift Robust (CondSR) conformal prediction for GNNs, our approach CondSR is model-agnostic and adaptable to various classification models. We validate the effectiveness of our method on standard graph benchmark datasets, integrating it with state-of-the-art GNNs in node classification tasks. The code implementation is publicly available for further exploration and experimentation.
Out-of-distribution (OOD) detection is a critical task in machine learning that seeks to identify abnormal samples. Traditionally, unsupervised methods utilize a deep generative model for OOD detection. However, such approaches necessitate a different model when evaluating abnormality against a new distribution. With the emergence of foundational generative models, this paper explores whether a single generalist model can also perform OOD detection across diverse tasks. To that end, we introduce our method, Diffusion Paths, (DiffPath) in this work. DiffPath proposes to utilize a single diffusion model originally trained to perform unconditional generation for OOD detection. Specifically, we introduce a novel technique of measuring the rate-of-change and curvature of the diffusion paths connecting samples to the standard normal. Extensive experiments show that with a single model, DiffPath outperforms prior work on a variety of OOD tasks involving different distributions. Our code is publicly available at //github.com/clear-nus/diffpath.
We model a Markov decision process, parametrized by an unknown parameter, and study the asymptotic behavior of a sampling-based algorithm, called Thompson sampling. The standard definition of regret is not always suitable to evaluate a policy, especially when the underlying chain structure is general. We show that the standard (expected) regret can grow (super-)linearly and fails to capture the notion of learning in realistic settings with non-trivial state evolution. By decomposing the standard (expected) regret, we develop a new metric, called the expected residual regret, which forgets the immutable consequences of past actions. Instead, it measures regret against the optimal reward moving forward from the current period. We show that the expected residual regret of the Thompson sampling algorithm is upper bounded by a term which converges exponentially fast to 0. We present conditions under which the posterior sampling error of Thompson sampling converges to 0 almost surely. We then introduce the probabilistic version of the expected residual regret and present conditions under which it converges to 0 almost surely. Thus, we provide a viable concept of learning for sampling algorithms which will serve useful in broader settings than had been considered previously.
Neural Networks (NNs) have been successfully employed to represent the state evolution of complex dynamical systems. Such models, referred to as NN dynamic models (NNDMs), use iterative noisy predictions of NN to estimate a distribution of system trajectories over time. Despite their accuracy, safety analysis of NNDMs is known to be a challenging problem and remains largely unexplored. To address this issue, in this paper, we introduce a method of providing safety guarantees for NNDMs. Our approach is based on stochastic barrier functions, whose relation with safety are analogous to that of Lyapunov functions with stability. We first show a method of synthesizing stochastic barrier functions for NNDMs via a convex optimization problem, which in turn provides a lower bound on the system's safety probability. A key step in our method is the employment of the recent convex approximation results for NNs to find piece-wise linear bounds, which allow the formulation of the barrier function synthesis problem as a sum-of-squares optimization program. If the obtained safety probability is above the desired threshold, the system is certified. Otherwise, we introduce a method of generating controls for the system that robustly maximizes the safety probability in a minimally-invasive manner. We exploit the convexity property of the barrier function to formulate the optimal control synthesis problem as a linear program. Experimental results illustrate the efficacy of the method. Namely, they show that the method can scale to multi-dimensional NNDMs with multiple layers and hundreds of neurons per layer, and that the controller can significantly improve the safety probability.
Stable partitioned techniques for simulating unsteady fluid-structure interaction (FSI) are known to be computationally expensive when high added-mass is involved. Multiple coupling strategies have been developed to accelerate these simulations, but often use predictors in the form of simple finite-difference extrapolations. In this work, we propose a non-intrusive data-driven predictor that couples reduced-order models of both the solid and fluid subproblems, providing an initial guess for the nonlinear problem of the next time step calculation. Each reduced order model is composed of a nonlinear encoder-regressor-decoder architecture and is equipped with an adaptive update strategy that adds robustness for extrapolation. In doing so, the proposed methodology leverages physics-based insights from high-fidelity solvers, thus establishing a physics-aware machine learning predictor. Using three strongly coupled FSI examples, this study demonstrates the improved convergence obtained with the new predictor and the overall computational speedup realized compared to classical approaches.
In the quest for accurate and interpretable AI models, eXplainable AI (XAI) has become crucial. Fuzzy Cognitive Maps (FCMs) stand out as an advanced XAI method because of their ability to synergistically combine and exploit both expert knowledge and data-driven insights, providing transparency and intrinsic interpretability. This letter introduces and investigates the "Total Causal Effect Calculation for FCMs" (TCEC-FCM) algorithm, an innovative approach that, for the first time, enables the efficient calculation of total causal effects among concepts in large-scale FCMs by leveraging binary search and graph traversal techniques, thereby overcoming the challenge of exhaustive causal path exploration that hinder existing methods. We evaluate the proposed method across various synthetic FCMs that demonstrate TCEC-FCM's superior performance over exhaustive methods, marking a significant advancement in causal effect analysis within FCMs, thus broadening their usability for modern complex XAI applications.
Few-shot segmentation (FSS) aims to train a model which can segment the object from novel classes with a few labeled samples. The insufficient generalization ability of models leads to unsatisfactory performance when the models lack enough labeled data from the novel classes. Considering that there are abundant unlabeled data available, it is promising to improve the generalization ability by exploiting these various data. For leveraging unlabeled data, we propose a novel method, named Image to Pseudo-Episode (IPE), to generate pseudo-episodes from unlabeled data. Specifically, our method contains two modules, i.e., the pseudo-label generation module and the episode generation module. The former module generates pseudo-labels from unlabeled images by the spectral clustering algorithm, and the latter module generates pseudo-episodes from pseudo-labeled images by data augmentation methods. Extensive experiments on PASCAL-$5^i$ and COCO-$20^i$ demonstrate that our method achieves the state-of-the-art performance for FSS.
Most current click-through rate prediction(CTR)models create explicit or implicit high-order feature crosses through Hadamard product or inner product, with little attention to the importance of feature crossing; only few models are either limited to the second-order explicit feature crossing, implicitly to high-order feature crossing, or can learn the importance of high-order explicit feature crossing but fail to provide good interpretability for the model. This paper proposes a new model, FiiNet (Multiple Order Feature Interaction Importance Neural Networks). The model first uses the selective kernel network (SKNet) to explicitly construct multi-order feature crosses. It dynamically learns the importance of feature interaction combinations in a fine grained manner, increasing the attention weight of important feature cross combinations and reducing the weight of featureless crosses. To verify that the FiiNet model can dynamically learn the importance of feature interaction combinations in a fine-grained manner and improve the model's recommendation performance and interpretability, this paper compares it with many click-through rate prediction models on two real datasets, proving that the FiiNet model incorporating the selective kernel network can effectively improve the recommendation effect and provide better interpretability. FiiNet model implementations are available in PyTorch.
Retrieval-Augmented Generation (RAG) merges retrieval methods with deep learning advancements to address the static limitations of large language models (LLMs) by enabling the dynamic integration of up-to-date external information. This methodology, focusing primarily on the text domain, provides a cost-effective solution to the generation of plausible but incorrect responses by LLMs, thereby enhancing the accuracy and reliability of their outputs through the use of real-world data. As RAG grows in complexity and incorporates multiple concepts that can influence its performance, this paper organizes the RAG paradigm into four categories: pre-retrieval, retrieval, post-retrieval, and generation, offering a detailed perspective from the retrieval viewpoint. It outlines RAG's evolution and discusses the field's progression through the analysis of significant studies. Additionally, the paper introduces evaluation methods for RAG, addressing the challenges faced and proposing future research directions. By offering an organized framework and categorization, the study aims to consolidate existing research on RAG, clarify its technological underpinnings, and highlight its potential to broaden the adaptability and applications of LLMs.
Visual Question Answering (VQA) models have struggled with counting objects in natural images so far. We identify a fundamental problem due to soft attention in these models as a cause. To circumvent this problem, we propose a neural network component that allows robust counting from object proposals. Experiments on a toy task show the effectiveness of this component and we obtain state-of-the-art accuracy on the number category of the VQA v2 dataset without negatively affecting other categories, even outperforming ensemble models with our single model. On a difficult balanced pair metric, the component gives a substantial improvement in counting over a strong baseline by 6.6%.