Markowitz's criterion aims to balance expected return and risk when optimizing the portfolio. The expected return level is usually fixed according to the risk appetite of an investor, then the risk is minimized at this fixed return level. However, the investor may not know which return level is suitable for her/him and the current financial circumstance. It motivates us to find a novel approach that adaptively optimizes this return level and the portfolio at the same time. It not only relieves the trouble of deciding the return level during an investment but also gets more adaptive to the ever-changing financial market than a subjective return level. In order to solve the new model, we propose an exact, convergent, and efficient Krasnoselskii-Mann Proximity Algorithm based on the proximity operator and Krasnoselskii-Mann momentum technique. Extensive experiments show that the proposed method achieves significant improvements over state-of-the-art methods in portfolio optimization. This finding may contribute a new perspective on the relationship between return and risk in portfolio optimization.
Human values and their measurement are long-standing interdisciplinary inquiry. Recent advances in AI have sparked renewed interest in this area, with large language models (LLMs) emerging as both tools and subjects of value measurement. This work introduces Generative Psychometrics for Values (GPV), an LLM-based, data-driven value measurement paradigm, theoretically grounded in text-revealed selective perceptions. The core idea is to dynamically parse unstructured texts into perceptions akin to static stimuli in traditional psychometrics, measure the value orientations they reveal, and aggregate the results. Applying GPV to human-authored blogs, we demonstrate its stability, validity, and superiority over prior psychological tools. Then, extending GPV to LLM value measurement, we advance the current art with 1) a psychometric methodology that measures LLM values based on their scalable and free-form outputs, enabling context-specific measurement; 2) a comparative analysis of measurement paradigms, indicating response biases of prior methods; and 3) an attempt to bridge LLM values and their safety, revealing the predictive power of different value systems and the impacts of various values on LLM safety. Through interdisciplinary efforts, we aim to leverage AI for next-generation psychometrics and psychometrics for value-aligned AI.
Neural operators effectively solve PDE problems from data without knowing the explicit equations, which learn the map from the input sequences of observed samples to the predicted values. Most existing works build the model in the original geometric space, leading to high computational costs when the number of sample points is large. We present the Latent Neural Operator (LNO) solving PDEs in the latent space. In particular, we first propose Physics-Cross-Attention (PhCA) transforming representation from the geometric space to the latent space, then learn the operator in the latent space, and finally recover the real-world geometric space via the inverse PhCA map. Our model retains flexibility that can decode values in any position not limited to locations defined in the training set, and therefore can naturally perform interpolation and extrapolation tasks particularly useful for inverse problems. Moreover, the proposed LNO improves both prediction accuracy and computational efficiency. Experiments show that LNO reduces the GPU memory by 50%, speeds up training 1.8 times, and reaches state-of-the-art accuracy on four out of six benchmarks for forward problems and a benchmark for inverse problem. Code is available at //github.com/L-I-M-I-T/LatentNeuralOperator.
In causal inference, treatment effects are typically estimated under the ignorability, or unconfoundedness, assumption, which is often unrealistic in observational data. By relaxing this assumption and conducting a sensitivity analysis, we introduce novel bounds and derive confidence intervals for the Average Potential Outcome (APO) - a standard metric for evaluating continuous-valued treatment or exposure effects. We demonstrate that these bounds are sharp under a continuous sensitivity model, in the sense that they give the smallest possible interval under this model, and propose a doubly robust version of our estimators. In a comparative analysis with the method of Jesson et al. (2022) (arXiv:2204.10022), using both simulated and real datasets, we show that our approach not only yields sharper bounds but also achieves good coverage of the true APO, with significantly reduced computation times.
Understanding relations arising out of interactions among entities can be very difficult, and predicting them is even more challenging. This problem has many applications in various fields, such as financial networks and e-commerce. These relations can involve much more complexities than just involving more than two entities. One such scenario is evolving recursive relations between multiple entities, and so far, this is still an open problem. This work addresses the problem of forecasting higher-order interaction events that can be multi-relational and recursive. We pose the problem in the framework of representation learning of temporal hypergraphs that can capture complex relationships involving multiple entities. The proposed model, \textit{Relational Recursive Hyperedge Temporal Point Process} (RRHyperTPP) uses an encoder that learns a dynamic node representation based on the historical interaction patterns and then a hyperedge link prediction-based decoder to model the occurrence of interaction events. These learned representations are then used for downstream tasks involving forecasting the type and time of interactions. The main challenge in learning from hyperedge events is that the number of possible hyperedges grows exponentially with the number of nodes in the network. This will make the computation of negative log-likelihood of the temporal point process expensive, as the calculation of survival function requires a summation over all possible hyperedges. In our work, we develop a noise contrastive estimation method to learn the parameters of our model, and we have experimentally shown that our models perform better than previous state-of-the-art methods for interaction forecasting.
Constrained Markov decision processes (CMDPs), in which the agent optimizes expected payoffs while keeping the expected cost below a given threshold, are the leading framework for safe sequential decision making under stochastic uncertainty. Among algorithms for planning and learning in CMDPs, methods based on Monte Carlo tree search (MCTS) have particular importance due to their efficiency and extendibility to more complex frameworks (such as partially observable settings and games). However, current MCTS-based methods for CMDPs either struggle with finding safe (i.e., constraint-satisfying) policies, or are too conservative and do not find valuable policies. We introduce Threshold UCT (T-UCT), an online MCTS-based algorithm for CMDP planning. Unlike previous MCTS-based CMDP planners, T-UCT explicitly estimates Pareto curves of cost-utility trade-offs throughout the search tree, using these together with a novel action selection and threshold update rules to seek safe and valuable policies. Our experiments demonstrate that our approach significantly outperforms state-of-the-art methods from the literature.
Forecasting relations between entities is paramount in the current era of data and AI. However, it is often overlooked that real-world relationships are inherently directional, involve more than two entities, and can change with time. In this paper, we provide a comprehensive solution to the problem of forecasting directional relations in a general setting, where relations are higher-order, i.e., directed hyperedges in a hypergraph. This problem has not been previously explored in the existing literature. The primary challenge in solving this problem is that the number of possible hyperedges is exponential in the number of nodes at each event time. To overcome this, we propose a sequential generative approach that segments the forecasting process into multiple stages, each contingent upon the preceding stages, thereby reducing the search space involved in predictions of hyperedges. The first stage involves a temporal point process-based node event forecasting module that identifies the subset of nodes involved in an event. The second stage is a candidate generation module that predicts hyperedge sizes and adjacency vectors for nodes observing events. The final stage is a directed hyperedge predictor that identifies the truth by searching over the set of candidate hyperedges. To validate the effectiveness of our model, we compiled five datasets and conducted an extensive empirical study to assess each downstream task. Our proposed method achieves a performance gain of 32\% and 41\% compared to the state-of-the-art pairwise and hyperedge event forecasting models, respectively, for the event type prediction.
Photoacoustic imaging (PAI) suffers from inherent limitations that can degrade the quality of reconstructed results, such as noise, artifacts and incomplete data acquisition caused by sparse sampling or partial array detection. In this study, we proposed a new optimization method for both two-dimensional (2D) and three-dimensional (3D) PAI reconstruction results, called the regularized iteration method with shape prior. The shape prior is a probability matrix derived from the reconstruction results of multiple sets of random partial array signals in a computational imaging system using any reconstruction algorithm, such as Delay-and-Sum (DAS) and Back-Projection (BP). In the probability matrix, high-probability locations indicate high consistency among multiple reconstruction results at those positions, suggesting a high likelihood of representing the true imaging results. In contrast, low-probability locations indicate higher randomness, leaning more towards noise or artifacts. As a shape prior, this probability matrix guides the iteration and regularization of the entire array signal reconstruction results using the original reconstruction algorithm (the same algorithm for processing random partial array signals). The method takes advantage of the property that the similarity of the object to be imitated is higher than that of noise or artifact in the results reconstructed by multiple sets of random partial array signals of the entire imaging system. The probability matrix is taken as a prerequisite for improving the original reconstruction results, and the optimizer is used to further iterate the imaging results to remove noise and artifacts and improve the imaging fidelity. Especially in the case involving sparse view which brings more artifacts, the effect is remarkable. Simulation and real experiments have both demonstrated the superiority of this method.
Graphs are important data representations for describing objects and their relationships, which appear in a wide diversity of real-world scenarios. As one of a critical problem in this area, graph generation considers learning the distributions of given graphs and generating more novel graphs. Owing to their wide range of applications, generative models for graphs, which have a rich history, however, are traditionally hand-crafted and only capable of modeling a few statistical properties of graphs. Recent advances in deep generative models for graph generation is an important step towards improving the fidelity of generated graphs and paves the way for new kinds of applications. This article provides an extensive overview of the literature in the field of deep generative models for graph generation. Firstly, the formal definition of deep generative models for the graph generation and the preliminary knowledge are provided. Secondly, taxonomies of deep generative models for both unconditional and conditional graph generation are proposed respectively; the existing works of each are compared and analyzed. After that, an overview of the evaluation metrics in this specific domain is provided. Finally, the applications that deep graph generation enables are summarized and five promising future research directions are highlighted.
Data processing and analytics are fundamental and pervasive. Algorithms play a vital role in data processing and analytics where many algorithm designs have incorporated heuristics and general rules from human knowledge and experience to improve their effectiveness. Recently, reinforcement learning, deep reinforcement learning (DRL) in particular, is increasingly explored and exploited in many areas because it can learn better strategies in complicated environments it is interacting with than statically designed algorithms. Motivated by this trend, we provide a comprehensive review of recent works focusing on utilizing DRL to improve data processing and analytics. First, we present an introduction to key concepts, theories, and methods in DRL. Next, we discuss DRL deployment on database systems, facilitating data processing and analytics in various aspects, including data organization, scheduling, tuning, and indexing. Then, we survey the application of DRL in data processing and analytics, ranging from data preparation, natural language processing to healthcare, fintech, etc. Finally, we discuss important open challenges and future research directions of using DRL in data processing and analytics.
Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.