We consider the problem of comparing several samples of stochastic processes with respect to their second-order structure, and describing the main modes of variation in this second order structure, if present. These tasks can be seen as an Analysis of Variance (ANOVA) and a Principal Component Analysis (PCA) of covariance operators, respectively. They arise naturally in functional data analysis, where several populations are to be contrasted relative to the nature of their dispersion around their means, rather than relative to their means themselves. We contribute a novel approach based on optimal (multi)transport, where each covariance can be identified with a a centred Gaussian process of corresponding covariance. By means of constructing the optimal simultaneous coupling of these Gaussian processes, we contrast the (linear) maps that achieve it with the identity with respect to a norm-induced distance. The resulting test statistic, calibrated by permutation, is seen to distinctly outperform the state-of-the-art, and to furnish considerable power even under local alternatives. This effect is seen to be genuinely functional, and is related to the potential for perfect discrimination in infinite dimensions. In the event of a rejection of the null hypothesis stipulating equality, a geometric interpretation of the transport maps allows us to construct a (tangent space) PCA revealing the main modes of variation. As a necessary step to developing our methodology, we prove results on the existence and boundedness of optimal multitransport maps. These are of independent interest in the theory of transport of Gaussian processes. The transportation ANOVA and PCA are illustrated on a variety of simulated and real examples.
When estimating a Global Average Treatment Effect (GATE) under network interference, units can have widely different relationships to the treatment depending on a combination of the structure of their network neighborhood, the structure of the interference mechanism, and how the treatment was distributed in their neighborhood. In this work, we introduce a sequential procedure to generate and select graph- and treatment-based covariates for GATE estimation under regression adjustment. We show that it is possible to simultaneously achieve low bias and considerably reduce variance with such a procedure. To tackle inferential complications caused by our feature generation and selection process, we introduce a way to construct confidence intervals based on a block bootstrap. We illustrate that our selection procedure and subsequent estimator can achieve good performance in terms of root mean squared error in several semi-synthetic experiments with Bernoulli designs, comparing favorably to an oracle estimator that takes advantage of regression adjustments for the known underlying interference structure. We apply our method to a real world experimental dataset with strong evidence of interference and demonstrate that it can estimate the GATE reasonably well without knowing the interference process a priori.
In the usual Bayesian setting, a full probabilistic model is required to link the data and parameters, and the form of this model and the inference and prediction mechanisms are specified via de Finetti's representation. In general, such a formulation is not robust to model mis-specification of its component parts. An alternative approach is to draw inference based on loss functions, where the quantity of interest is defined as a minimizer of some expected loss, and to construct posterior distributions based on the loss-based formulation; this strategy underpins the construction of the Gibbs posterior. We develop a Bayesian non-parametric approach; specifically, we generalize the Bayesian bootstrap, and specify a Dirichlet process model for the distribution of the observables. We implement this using direct prior-to-posterior calculations, but also using predictive sampling. We also study the assessment of posterior validity for non-standard Bayesian calculations, and provide an efficient way to calibrate the scaling parameter in the Gibbs posterior so that it can achieve the desired coverage rate. We show that the developed non-standard Bayesian updating procedures yield valid posterior distributions in terms of consistency and asymptotic normality under model mis-specification. Simulation studies show that the proposed methods can recover the true value of the parameter efficiently and achieve frequentist coverage even when the sample size is small. Finally, we apply our methods to evaluate the causal impact of speed cameras on traffic collisions in England.
Objective: Electroencephalography signals are recorded as a multidimensional dataset. We propose a new framework based on the augmented covariance extracted from an autoregressive model to improve motor imagery classification. Methods: From the autoregressive model can be derived the Yule-Walker equations, which show the emergence of a symmetric positive definite matrix: the augmented covariance matrix. The state-of the art for classifying covariance matrices is based on Riemannian Geometry. A fairly natural idea is therefore to extend the standard approach using these augmented covariance matrices. The methodology for creating the augmented covariance matrix shows a natural connection with the delay embedding theorem proposed by Takens for dynamical systems. Such an embedding method is based on the knowledge of two parameters: the delay and the embedding dimension, respectively related to the lag and the order of the autoregressive model. This approach provides new methods to compute the hyper-parameters in addition to standard grid search. Results: The augmented covariance matrix performed noticeably better than any state-of-the-art methods. We will test our approach on several datasets and several subjects using the MOABB framework, using both within-session and cross-session evaluation. Conclusion: The improvement in results is due to the fact that the augmented covariance matrix incorporates not only spatial but also temporal information, incorporating nonlinear components of the signal through an embedding procedure, which allows the leveraging of dynamical systems algorithms. Significance: These results extend the concepts and the results of the Riemannian distance based classification algorithm.
Online platforms often incentivize consumers to improve user engagement and platform revenue. Since different consumers might respond differently to incentives, individual-level budget allocation is an essential task in marketing campaigns. Recent advances in this field often address the budget allocation problem using a two-stage paradigm: the first stage estimates the individual-level treatment effects using causal inference algorithms, and the second stage invokes integer programming techniques to find the optimal budget allocation solution. Since the objectives of these two stages might not be perfectly aligned, such a two-stage paradigm could hurt the overall marketing effectiveness. In this paper, we propose a novel end-to-end framework to directly optimize the business goal under budget constraints. Our core idea is to construct a regularizer to represent the marketing goal and optimize it efficiently using gradient estimation techniques. As such, the obtained models can learn to maximize the marketing goal directly and precisely. We extensively evaluate our proposed method in both offline and online experiments, and experimental results demonstrate that our method outperforms current state-of-the-art methods. Our proposed method is currently deployed to allocate marketing budgets for hundreds of millions of users on a short video platform and achieves significant business goal improvements. Our code will be publicly available.
Although extensive research in emergency collision avoidance has been carried out for straight or curved roads in a highway scenario, a general method that could be implemented for all road environments has not been thoroughly explored. Moreover, most current algorithms don't consider collision mitigation in an emergency. This functionality is essential since the problem may have no feasible solution. We propose a safe controller using model predictive control and artificial potential function to address these problems. A new artificial potential function inspired by line charge is proposed as the cost function for our model predictive controller. The vehicle dynamics and actuator limitations are set as constraints. The new artificial potential function considers the shape of all objects. In particular, the artificial potential function we proposed has the flexibility to fit the shape of the road structures, such as the intersection. We could also realize collision mitigation for a specific part of the vehicle by increasing the charge quantity at the corresponding place. We have tested our methods in 192 cases from 8 different scenarios in simulation with two different models. The simulation results show that the success rate of the proposed safe controller is 20% higher than using HJ-reachability with system decomposition by using a unicycle model. It could also decrease 43% of collision that happens at the pre-assigned part. The method is further validated in a dynamic bicycle model.
With the proliferation of mobile devices, an increasing amount of population data is being collected, and there is growing demand to use the large-scale, multidimensional data in real-world situations. We introduced functional data analysis (FDA) into the problem of predicting the hourly population of different districts of Tokyo. FDA is a methodology that treats and analyzes longitudinal data as curves, which reduces the number of parameters and makes it easier to handle high-dimensional data. Specifically, by assuming a Gaussian process, we avoided the large covariance matrix parameters of the multivariate normal distribution. In addition, the data were time and spatially dependent between districts. To capture these characteristics, a Bayesian factor model was introduced, which modeled the time series of a small number of common factors and expressed the spatial structure in terms of factor loading matrices. Furthermore, the factor loading matrices were made identifiable and sparse to ensure the interpretability of the model. We also proposed a method for selecting factors using the Bayesian shrinkage method. We studied the forecast accuracy and interpretability of the proposed method through numerical experiments and data analysis. We found that the flexibility of our proposed method could be extended to reflect further time series features, which contributed to the accuracy.
Switching costs, which capture the costs for changing policies, are regarded as a critical metric in reinforcement learning (RL), in addition to the standard metric of losses (or rewards). However, existing studies on switching costs (with a coefficient $\beta$ that is strictly positive and is independent of $T$) have mainly focused on static RL, where the loss distribution is assumed to be fixed during the learning process, and thus practical scenarios where the loss distribution could be non-stationary or even adversarial are not considered. While adversarial RL better models this type of practical scenarios, an open problem remains: how to develop a provably efficient algorithm for adversarial RL with switching costs? This paper makes the first effort towards solving this problem. First, we provide a regret lower-bound that shows that the regret of any algorithm must be larger than $\tilde{\Omega}( ( H S A )^{1/3} T^{2/3} )$, where $T$, $S$, $A$ and $H$ are the number of episodes, states, actions and layers in each episode, respectively. Our lower bound indicates that, due to the fundamental challenge of switching costs in adversarial RL, the best achieved regret (whose dependency on $T$ is $\tilde{O}(\sqrt{T})$) in static RL with switching costs (as well as adversarial RL without switching costs) is no longer achievable. Moreover, we propose two novel switching-reduced algorithms with regrets that match our lower bound when the transition function is known, and match our lower bound within a small factor of $\tilde{O}( H^{1/3} )$ when the transition function is unknown. Our regret analysis demonstrates the near-optimal performance of them.
As the availability of omics data has increased in the last few years, more multi-omics data have been generated, that is, high-dimensional molecular data consisting of several types such as genomic, transcriptomic, or proteomic data, all obtained from the same patients. Such data lend themselves to being used as covariates in automatic outcome prediction because each omics type may contribute unique information, possibly improving predictions compared to using only one omics data type. Frequently, however, in the training data and the data to which automatic prediction rules should be applied, the test data, the different omics data types are not available for all patients. We refer to this type of data as block-wise missing multi-omics data. First, we provide a literature review on existing prediction methods applicable to such data. Subsequently, using a collection of 13 publicly available multi-omics data sets, we compare the predictive performances of several of these approaches for different block-wise missingness patterns. Finally, we discuss the results of this empirical comparison study and draw some tentative conclusions.
Bi-clustering is a technique that allows for the simultaneous clustering of observations and features in a dataset. This technique is often used in bioinformatics, text mining, and time series analysis. An important advantage of biclustering algorithm is the ability to uncover multiple ``views'' (i.e., through rows and column groupings) in the data. Several Gaussian mixture model based biclustering approach currently exist in the literature. However, they impose severe restrictions on the structure of the covariance matrix. Here, we propose a Gaussian mixture model-based bi-clustering approach that provides a more flexible block-diagonal covariance structure. We show that the clustering accuracy of the proposed model is comparable to other known techniques but our approach provides a more flexible covariance structure and has substantially lower computational time. We demonstrate the application of the proposed model in bioinformatics and topic modelling.
Purpose of review: We review recent advances in algorithmic development and validation for modeling and control of soft robots leveraging the Koopman operator theory. Recent findings: We identify the following trends in recent research efforts in this area. (1) The design of lifting functions used in the data-driven approximation of the Koopman operator is critical for soft robots. (2) Robustness considerations are emphasized. Works are proposed to reduce the effect of uncertainty and noise during the process of modeling and control. (3) The Koopman operator has been embedded into different model-based control structures to drive the soft robots. Summary: Because of their compliance and nonlinearities, modeling and control of soft robots face key challenges. To resolve these challenges, Koopman operator-based approaches have been proposed, in an effort to express the nonlinear system in a linear manner. The Koopman operator enables global linearization to reduce nonlinearities and/or serves as model constraints in model-based control algorithms for soft robots. Various implementations in soft robotic systems are illustrated and summarized in the review.