Uncertainty quantification techniques such as time-dependent generalized polynomial chaos (TD-gPC) use an adaptive orthogonal basis to better represent the stochastic part of the solution space (also known as the random function space) over time. However, because the random function space is constructed using tensor products, TD-gPC-based methods are known to suffer from the curse of dimensionality. In this paper, we introduce a new numerical method called 'flow-driven spectral chaos' (FSC), which overcomes this curse of dimensionality at the random-function-space level. The proposed method is not only computationally more efficient than existing TD-gPC-based methods but also far more accurate. The FSC method uses the concept of 'enriched stochastic flow maps' to track the evolution of a finite-dimensional random function space efficiently in time. To transfer probability information from one random function space to another, two approaches are developed and studied here: in the first, the probability information is transferred in the mean-square sense; in the second, the transfer is done exactly using a new theorem developed for this purpose. The FSC method can quantify uncertainties with high fidelity, especially for the long-time response of stochastic dynamical systems governed by ODEs of arbitrary order. Six representative numerical examples, including a nonlinear problem (the Van der Pol oscillator), are presented to demonstrate the performance of the FSC method and corroborate the claims of its superior numerical properties. Finally, a parametric, high-dimensional stochastic problem is used to demonstrate that when the FSC method is used in conjunction with Monte Carlo integration, the curse of dimensionality can be overcome altogether.
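To make the curse of dimensionality at the random-function-space level concrete, here is a small illustrative sketch (not taken from the paper; the function names are ours) counting the basis functions in a full tensor-product polynomial chaos basis versus a total-degree truncation:

```python
from math import comb

def tensor_product_basis_size(d, p):
    """Full tensor-product basis of 1-D polynomials up to degree p in
    each of d stochastic dimensions: (p + 1)^d terms, exponential in d."""
    return (p + 1) ** d

def total_degree_basis_size(d, p):
    """Total-degree truncation (terms of combined degree <= p):
    C(d + p, p) terms -- smaller, but still combinatorial in d."""
    return comb(d + p, p)

for d in (1, 2, 5, 10, 20):
    print(d, tensor_product_basis_size(d, 3), total_degree_basis_size(d, 3))
```

For degree p = 3, the tensor-product basis already requires over a million functions at d = 10 stochastic dimensions, which is the growth that dimensionality-avoiding constructions aim to escape.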
This paper is devoted to the robust approximation, via a variational phase-field approach, of multiphase mean curvature flows with possibly highly contrasted mobilities. The case of harmonically additive mobilities was addressed recently using a suitable metric to define the gradient flow of the approximate phase-field energy. We generalize this approach to arbitrary nonnegative mobilities by decomposing them as sums of harmonically additive mobilities. We establish the consistency of the resulting method by analyzing the sharp-interface limit of the flow: a formal expansion of the phase field shows that the method is of second order. We propose a simple numerical scheme to approximate the solutions to our new model. Finally, we present numerical experiments in dimensions 2 and 3 that illustrate the interest and effectiveness of our approach, in particular for approximating flows in which the mobility of some phases is zero.
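As a point of reference for readers unfamiliar with phase-field approximations of curvature flow, here is a minimal, hypothetical sketch of a single explicit Allen-Cahn step in 1-D with a constant scalar mobility; it is a simplified single-phase stand-in for exposition, not the paper's multiphase scheme with contrasted mobilities:

```python
def allen_cahn_step(u, dx, dt, eps, mobility):
    """One explicit Euler step of the Allen-Cahn equation
    u_t = mobility * (eps * u_xx - W'(u) / eps), with the double-well
    potential W(u) = u^2 (1 - u)^2, so W'(u) = 2 u (1 - u) (1 - 2 u).
    Periodic boundary conditions; u is a list of nodal values."""
    n = len(u)
    new = [0.0] * n
    for i in range(n):
        lap = (u[(i - 1) % n] - 2.0 * u[i] + u[(i + 1) % n]) / dx**2
        dW = 2.0 * u[i] * (1.0 - u[i]) * (1.0 - 2.0 * u[i])
        new[i] = u[i] + dt * mobility * (eps * lap - dW / eps)
    return new
```

The pure phases u = 0 and u = 1 are equilibria, and values in between are pushed toward the nearest well, which is the mechanism the variational multiphase model builds on.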
Node clustering is a powerful tool in the analysis of networks. We introduce DIGRAC, a graph neural network framework that obtains node embeddings for directed networks in a self-supervised manner, including a novel probabilistic imbalance loss, and that can be used for network clustering. We propose directed flow imbalance measures, which are tightly tied to directionality, to reveal clusters in the network even when there is no density difference between clusters. In contrast to standard approaches in the literature, directionality is not treated here as a nuisance but as the main signal. DIGRAC optimizes directed flow imbalance for clustering without requiring label supervision, unlike existing graph neural network methods, and can naturally incorporate node features, unlike existing spectral methods. Extensive experimental results on synthetic data, in the form of directed stochastic block models, and on real-world data at different scales demonstrate that our method attains state-of-the-art results on directed graph clustering when compared against 10 state-of-the-art methods from the literature, across a wide range of noise and sparsity levels, graph structures, and topologies, and even outperforms supervised methods.
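To illustrate what a probabilistic flow imbalance between two clusters might look like, here is a hedged sketch in plain Python; this is one plausible form of such a measure written for exposition, and the exact loss used by DIGRAC may differ:

```python
def cluster_flow(A, P, i, j):
    """Expected edge weight flowing from cluster i to cluster j under a
    soft assignment matrix P (rows: nodes, columns: cluster probabilities)
    and a directed adjacency matrix A."""
    n = len(A)
    return sum(P[u][i] * A[u][v] * P[v][j] for u in range(n) for v in range(n))

def pairwise_imbalance(A, P, i, j):
    """Normalized flow imbalance in [0, 1]: 0 when flows between the two
    clusters are balanced, 1 when all flow goes one way."""
    f_ij, f_ji = cluster_flow(A, P, i, j), cluster_flow(A, P, j, i)
    total = f_ij + f_ji
    return abs(f_ij - f_ji) / total if total > 0 else 0.0
```

A self-supervised objective can then maximize such imbalances over cluster pairs, rewarding partitions in which edges flow predominantly in one direction, even when edge densities are identical across clusters.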
Modeling and controlling high-dimensional, nonlinear robotic systems remains a challenging task. While various model- and learning-based approaches have been proposed to address these challenges, they broadly lack generalizability across control tasks and rarely preserve the structure of the dynamics. In this work, we propose a new data-driven approach for extracting low-dimensional models from data using Spectral Submanifold Reduction (SSMR). In contrast to other data-driven methods, which fit dynamical models to training trajectories, we identify the dynamics on generic, low-dimensional attractors embedded in the full phase space of the robotic system. This allows us to obtain computationally tractable models for control that preserve the system's dominant dynamics and better track trajectories radically different from the training data. We demonstrate the superior performance and generalizability of SSMR in dynamic trajectory-tracking tasks vis-à-vis the state of the art, including Koopman operator-based approaches.
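As a much-simplified illustration of fitting reduced dynamics on low-dimensional coordinates, the sketch below projects a trajectory onto a fixed linear direction and fits scalar linear dynamics by least squares; note this is a POD-style linear stand-in written for exposition only, not SSMR's nonlinear invariant spectral-submanifold construction:

```python
def fit_reduced_model(traj, direction):
    """Project a trajectory (list of state vectors) onto a given
    low-dimensional direction and fit scalar linear dynamics
    z_{t+1} = a * z_t by least squares; returns the coefficient a."""
    z = [sum(d * s for d, s in zip(direction, x)) for x in traj]
    num = sum(z[t + 1] * z[t] for t in range(len(z) - 1))
    den = sum(z[t] * z[t] for t in range(len(z) - 1))
    return num / den
```

The appeal of reduction methods like SSMR is that the identified low-dimensional model, unlike this linear toy, captures the dominant nonlinear dynamics and therefore remains predictive far from the training trajectories.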
Feature transformation aims to extract a good representation (feature) space by mathematically transforming existing features. It is crucial for addressing the curse of dimensionality, enhancing model generalization, overcoming data sparsity, and expanding the applicability of classic models. Current research focuses on domain-knowledge-based feature engineering or on learning latent representations; however, these methods are not fully automated and cannot produce a traceable, optimal representation space. Can these limitations be addressed concurrently when rebuilding a feature space for a machine learning task? In this extension study, we present a self-optimizing framework for feature transformation. To achieve better performance, we improve on the preliminary work by (1) obtaining an advanced state representation that enables reinforced agents to better comprehend the current feature set; and (2) resolving Q-value overestimation in reinforced agents so that they learn unbiased and effective policies. Finally, to make the experiments more convincing than those in the preliminary work, we add an outlier detection task with five datasets, evaluate various state representation approaches, and compare different training strategies. Extensive experiments and case studies show that our approach is both more effective than the preliminary work and superior to competing methods.
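Q-value overestimation arises because plain Q-learning uses the same estimates both to select and to evaluate the greedy action. A standard remedy is Double Q-learning, sketched below in tabular form as a hypothetical illustration (the paper's agents and state representation are more involved):

```python
import random

def double_q_update(Q1, Q2, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Double Q-learning update: one table selects the greedy
    next action, the other evaluates it, which reduces the maximization
    bias (Q-value overestimation) of plain Q-learning. Q1 and Q2 map
    state -> {action: value} and are updated in place."""
    if random.random() < 0.5:
        a_star = max(Q1[s_next], key=Q1[s_next].get)   # select with Q1
        Q1[s][a] += alpha * (r + gamma * Q2[s_next][a_star] - Q1[s][a])  # evaluate with Q2
    else:
        a_star = max(Q2[s_next], key=Q2[s_next].get)   # select with Q2
        Q2[s][a] += alpha * (r + gamma * Q1[s_next][a_star] - Q2[s][a])  # evaluate with Q1
```

Decoupling selection from evaluation is the core idea behind debiasing the learned policy, whatever form the agent's state representation takes.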
In this paper, we consider a drift-diffusion charge transport model for perovskite solar cells, in which electrons and holes may diffuse linearly (Boltzmann approximation) or nonlinearly (e.g., due to Fermi-Dirac statistics). To incorporate volume exclusion effects, we rely on the Fermi-Dirac integral of order -1 when modeling moving anionic vacancies within the perovskite layer, which is sandwiched between electron and hole transport layers. After non-dimensionalization, we first prove a continuous entropy-dissipation inequality for the model. Then, we formulate a corresponding two-point-flux finite volume scheme on Voronoi meshes and show an analogous discrete entropy-dissipation inequality. This inequality allows us to establish the existence of a solution to the nonlinear discrete system with the help of a corollary of Brouwer's fixed-point theorem and the minimization of a convex functional. Finally, we verify our theoretically proven properties numerically, simulate a realistic device setup, and show exponential decay in time of both the L^2 error and a physically and analytically meaningful relative entropy.
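For readers unfamiliar with two-point-flux finite volume schemes for drift-diffusion, here is the classical Scharfetter-Gummel flux for the linear-diffusion (Boltzmann) case in scaled units, under one common sign convention; it is a baseline illustration, not the paper's generalized entropy-stable scheme:

```python
from math import exp

def bernoulli(x):
    """Bernoulli function B(x) = x / (e^x - 1), with the removable
    singularity at x = 0 handled explicitly."""
    if abs(x) < 1e-10:
        return 1.0 - x / 2.0
    return x / (exp(x) - 1.0)

def scharfetter_gummel_flux(u_K, u_L, psi_K, psi_L):
    """Two-point Scharfetter-Gummel flux between neighboring control
    volumes K and L in scaled units:
    F = B(psi_K - psi_L) * u_L - B(psi_L - psi_K) * u_K,
    where u is the carrier density and psi the electrostatic potential."""
    dpsi = psi_K - psi_L
    return bernoulli(dpsi) * u_L - bernoulli(-dpsi) * u_K
```

Two sanity checks: with equal potentials the flux reduces to pure diffusion, u_L - u_K, and it vanishes at thermal equilibrium u proportional to e^(-psi), the discrete analogue of zero entropy dissipation at steady state.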
This work considers Gaussian process interpolation with a periodized version of the Matérn covariance function (Stein, 1999, Section 6.7) with Fourier coefficients $\phi(\alpha^2 + j^2)^{-\nu - 1/2}$. Convergence rates are studied for the joint maximum likelihood estimation of $\nu$ and $\phi$ when the data is sampled according to the model. The mean integrated squared error is also analyzed with fixed and estimated parameters, showing that maximum likelihood estimation yields asymptotically the same error as if the ground truth were known. Finally, the case where the observed function is a 'deterministic' element of a continuous Sobolev space is also considered, suggesting that bounding assumptions on some parameters can lead to different estimates.
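As a small numerical companion, the sketch below evaluates the periodized covariance from its Fourier coefficients $\phi(\alpha^2 + j^2)^{-\nu - 1/2}$ by truncating the cosine series; the truncation length and function names are our own illustrative choices:

```python
from math import cos, pi

def matern_fourier_coeff(j, phi, alpha, nu):
    """Fourier coefficient c_j = phi * (alpha^2 + j^2)^(-nu - 1/2)
    of the periodized Matern covariance."""
    return phi * (alpha**2 + j**2) ** (-nu - 0.5)

def periodic_matern_cov(h, phi, alpha, nu, n_terms=200):
    """Truncated Fourier-series evaluation of the periodic covariance
    k(h) = c_0 + 2 * sum_{j >= 1} c_j * cos(2 * pi * j * h)."""
    k = matern_fourier_coeff(0, phi, alpha, nu)
    for j in range(1, n_terms + 1):
        k += 2.0 * matern_fourier_coeff(j, phi, alpha, nu) * cos(2 * pi * j * h)
    return k
```

The polynomial decay of the coefficients in $j$ is what ties the smoothness parameter $\nu$ to the Sobolev regularity of the sample paths.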
We consider the stochastic gradient descent (SGD) algorithm driven by a general stochastic sequence, including i.i.d. noise and random walks on an arbitrary graph, among others, and analyze it in the asymptotic sense. Specifically, we employ the notion of 'efficiency ordering', a well-analyzed tool for comparing the performance of Markov chain Monte Carlo (MCMC) samplers, for SGD algorithms in the form of the Loewner ordering of the covariance matrices associated with the scaled iterate errors in the long term. Using this ordering, we show that input sequences that are more efficient for MCMC sampling also lead to smaller covariance of the errors for SGD algorithms in the limit. This also implies that an arbitrarily weighted MSE of the SGD iterates becomes smaller in the limit when driven by more efficient chains. Our finding is of particular interest in applications such as decentralized optimization and swarm learning, where SGD is implemented in a random-walk fashion on the underlying communication graph for cost and/or data-privacy reasons. We demonstrate how certain non-Markovian processes, for which typical mixing-time-based non-asymptotic bounds are intractable, can outperform their Markovian counterparts in the sense of efficiency ordering for SGD. We further show the utility of our method by applying it to gradient descent with shuffling and to mini-batch gradient descent, reaffirming key results from the existing literature under a unified framework. Empirically, we also observe efficiency ordering for variants of SGD such as accelerated SGD and Adam, opening up the possibility of extending our notion of efficiency ordering to a broader family of stochastic optimization algorithms.
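The Loewner ordering underlying efficiency ordering says A >= B exactly when A - B is positive semidefinite. Below is a minimal stdlib-only check for small symmetric matrices, written as a hypothetical illustration:

```python
def is_psd(M, tol=1e-12):
    """Check positive semidefiniteness of a small symmetric matrix via an
    attempted Cholesky factorization, then verify the factorization
    reproduces M (which catches semidefinite edge cases)."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = M[i][i] - s
                if d < -tol:
                    return False          # negative pivot: not PSD
                L[i][j] = max(d, 0.0) ** 0.5
            else:
                L[i][j] = (M[i][j] - s) / L[j][j] if L[j][j] > tol else 0.0
    return all(abs(sum(L[i][k] * L[j][k] for k in range(n)) - M[i][j]) <= 1e-8
               for i in range(n) for j in range(n))

def loewner_geq(A, B):
    """A >= B in the Loewner order iff A - B is positive semidefinite."""
    n = len(A)
    D = [[A[i][j] - B[i][j] for j in range(n)] for i in range(n)]
    return is_psd(D)
```

In the paper's setting, A and B would be the limiting covariance matrices of the scaled iterate errors under two different driving sequences, and `loewner_geq(A, B)` would certify that the second sequence is more efficient.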
Much of the literature on optimal design of bandit algorithms is based on minimization of expected regret. It is well known that designs that are optimal over certain exponential families can achieve expected regret that grows logarithmically in the number of arm plays, at a rate governed by the Lai-Robbins lower bound. In this paper, we show that when one uses such optimized designs, the regret distribution of the associated algorithms necessarily has a very heavy tail, specifically that of a truncated Cauchy distribution. Furthermore, for $p > 1$, the $p$-th moment of the regret distribution grows much faster than poly-logarithmically, in particular as a power of the total number of arm plays. We show that optimized UCB bandit designs are also fragile in an additional sense: when the problem is even slightly mis-specified, the regret can grow much faster than the conventional theory suggests. Our arguments are based on standard change-of-measure ideas and indicate that the most likely way for regret to become larger than expected is for the optimal arm to return below-average rewards in the first few plays, causing the algorithm to believe that the arm is sub-optimal. To alleviate the fragility issues exposed, we show that UCB algorithms can be modified so as to ensure a desired degree of robustness to mis-specification. In doing so, we also provide a sharp trade-off between the amount of UCB exploration and the tail exponent of the resulting regret distribution.
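For concreteness, here is a textbook UCB1 sketch (the classical index mean + sqrt(2 ln t / n)); it is representative of the optimized index designs discussed above, though not identical to any particular one in the paper:

```python
import random
from math import log, sqrt

def ucb1(pull, n_arms, horizon, seed=0):
    """Standard UCB1: play each arm once, then at each round pick the arm
    maximizing empirical mean + sqrt(2 * ln(t) / n_pulls). `pull(a)` is a
    caller-supplied function returning a stochastic reward for arm a."""
    random.seed(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            a = t - 1                      # initialization: one pull per arm
        else:
            a = max(range(n_arms),
                    key=lambda i: means[i] + sqrt(2.0 * log(t) / counts[i]))
        r = pull(a)
        counts[a] += 1
        means[a] += (r - means[a]) / counts[a]   # running mean update
    return counts, means
```

The fragility result can be read off this structure: a run of unlucky rewards from the best arm in its first few pulls depresses its index, and the logarithmic exploration bonus recovers from that mistake only very slowly.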
In this work we present a hierarchical framework for solving discrete stochastic pursuit-evasion games (PEGs) in large grid worlds. Given a partition of the grid world into superstates (e.g., "rooms"), the proposed approach creates a two-resolution decision-making process consisting of a set of local PEGs at the original state level and an aggregated PEG at the superstate level. Because both the local games and the aggregated game have much smaller cardinality, they can easily be solved to a Nash equilibrium. To connect the decision-making at the two resolutions, we use the Nash values of the local PEGs as the rewards for the aggregated game. Through numerical simulations, we show that the proposed hierarchical framework significantly reduces the computational overhead while still maintaining a satisfactory level of performance when competing against the flat Nash policies.
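To sketch the "Nash values of local games as rewards for the aggregated game" idea, the code below approximates the value of a zero-sum matrix game by fictitious play, a simple stand-in for whatever equilibrium solver the paper actually uses; the returned value of each local game would populate the aggregated game's reward matrix:

```python
def matrix_game_value(M, iters=20000):
    """Approximate the value of a zero-sum matrix game (row player
    maximizes, column player minimizes) by fictitious play: each player
    best-responds to the opponent's empirical action history."""
    rows, cols = len(M), len(M[0])
    row_payoff = [0.0] * rows  # cumulative payoff of each row vs column history
    col_payoff = [0.0] * cols  # cumulative payoff of each column vs row history
    r, c = 0, 0
    for _ in range(iters):
        for j in range(cols):
            col_payoff[j] += M[r][j]
        for i in range(rows):
            row_payoff[i] += M[i][c]
        r = max(range(rows), key=lambda i: row_payoff[i])
        c = min(range(cols), key=lambda j: col_payoff[j])
    # max(row_payoff)/iters over-estimates the value; min(col_payoff)/iters
    # under-estimates it; average the two bounds.
    return 0.5 * (max(row_payoff) + min(col_payoff)) / iters
```

In the hierarchical scheme, each superstate pair contributes one such small game, so the expensive flat game over the full grid is never formed.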
Recommender systems are among the most important information services on today's Internet. Recently, graph neural networks have become the new state-of-the-art approach to building recommender systems. In this survey, we conduct a comprehensive review of the literature on graph neural network-based recommender systems. We first introduce the background and history of the development of both recommender systems and graph neural networks. For recommender systems, existing works can in general be categorized along four aspects: stage, scenario, objective, and application. For graph neural networks, existing methods fall into two categories: spectral models and spatial ones. We then discuss the motivation for applying graph neural networks to recommender systems, which mainly consists of the high-order connectivity, the structural properties of the data, and the enhanced supervision signal. We then systematically analyze the challenges in graph construction, embedding propagation/aggregation, model optimization, and computational efficiency. Afterward, and most importantly, we provide a comprehensive overview of the multitude of existing works on graph neural network-based recommender systems, following the taxonomy above. Finally, we discuss open problems and promising future directions in this area. We summarize the representative papers along with their code repositories at //github.com/tsinghua-fib-lab/GNN-Recommender-Systems.
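As a minimal illustration of the embedding propagation/aggregation step mentioned above, here is a hypothetical mean-aggregation sketch, the simplest spatial-GNN update; real models add learned weight matrices and nonlinearities:

```python
def gcn_propagate(A, H):
    """One embedding-propagation step of a spatial graph neural network:
    mean-aggregate neighbor features including a self-loop, i.e.
    H'[v] = mean of H[u] over u in N(v) union {v}. A is an adjacency
    matrix (nonzero = edge); H is a list of node feature vectors."""
    n, d = len(H), len(H[0])
    out = []
    for v in range(n):
        nbrs = [u for u in range(n) if A[v][u]] + [v]
        out.append([sum(H[u][k] for u in nbrs) / len(nbrs) for k in range(d)])
    return out
```

Stacking several such steps is what lets GNN-based recommenders exploit high-order connectivity: after k steps, a user's embedding mixes in information from items and users up to k hops away.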