Many standard estimators, when applied to adaptively collected data, fail to be asymptotically normal, thereby complicating the construction of confidence intervals. We address this challenge in a semi-parametric context: estimating the parameter vector of a generalized linear regression model contaminated by a non-parametric nuisance component. We construct suitably weighted estimating equations that account for adaptivity in data collection, and provide conditions under which the associated estimates are asymptotically normal. Our results characterize the degree of "explorability" required for asymptotic normality to hold. For the simpler problem of estimating a linear functional, we provide similar guarantees under much weaker assumptions. We illustrate our general theory with concrete consequences for various problems, including standard linear bandits and sparse generalized bandits, and compare with other methods via simulation studies.
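As a toy illustration of why adaptivity breaks naive estimates and how collection-time weighting can help (a minimal sketch, not the paper's weighted estimating equations; all constants are invented), the following simulates an epsilon-greedy two-armed bandit and compares naive arm means with inverse-propensity-weighted means computed from the logged sampling probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.3, 0.5])
T, eps = 5000, 0.1

arms, rewards, probs = [], [], []
counts, sums = np.zeros(2), np.zeros(2)
for t in range(T):
    est = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)
    p = np.full(2, eps / 2)
    p[int(np.argmax(est))] += 1 - eps      # epsilon-greedy sampling probabilities
    a = rng.choice(2, p=p)
    r = rng.normal(true_means[a], 1.0)
    arms.append(a); rewards.append(r); probs.append(p[a])
    counts[a] += 1; sums[a] += r

arms, rewards, probs = map(np.array, (arms, rewards, probs))
naive = [rewards[arms == a].mean() for a in range(2)]
ipw = [(rewards * (arms == a) / probs).mean() for a in range(2)]
print("naive sample means:       ", np.round(naive, 3))
print("propensity-weighted means:", np.round(ipw, 3))
```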
We develop a framework for self-induced phase changes in programmable matter in which a collection of agents with limited computational and communication capabilities can collectively perform appropriate global tasks in response to local stimuli that dynamically appear and disappear. Agents reside on graph vertices, where each stimulus is only recognized locally, and agents communicate via token passing along edges to alert other agents to transition to an "aware" state when stimuli are present and an "unaware" state when the stimuli disappear. We present an Adaptive Stimuli Algorithm that is robust to competing waves of messages as multiple stimuli change, possibly adversarially. Moreover, in addition to handling arbitrary stimulus dynamics, the algorithm can handle agents reconfiguring the connections (edges) of the graph over time in a controlled way. As an application, we show how this Adaptive Stimuli Algorithm on reconfigurable graphs can be used to solve the foraging problem, where food sources may be discovered, removed, or shifted at arbitrary times. We would like the agents to consistently self-organize using only local interactions, such that if the food remains in position long enough, the agents transition to a gather phase, collectively forming a single large component with small perimeter around the food. Alternatively, if no food source has existed recently, the agents should self-induce a switch to a search phase in which they distribute themselves randomly throughout the lattice region to search for food. Unlike previous approaches to foraging, this process is indefinitely repeatable. Like a physical phase change, microscopic changes such as the deletion or addition of a single food source trigger these macroscopic, system-wide transitions as agents share information about the environment and respond locally to produce the desired collective response.
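A deliberately simplified sketch of the token-passing idea (not the Adaptive Stimuli Algorithm itself, and ignoring reconfiguration, competing waves, and the "unaware" wave, which could be simulated analogously): agents on a graph flip to an "aware" state as tokens spread outward from stimulus vertices. The graph and round count are illustrative.

```python
def propagate(adj, sources, rounds):
    """Toy synchronous token-passing wave: vertices hosting a stimulus emit
    'aware' tokens that spread one hop per round; everyone else starts 'unaware'."""
    state = {v: v in sources for v in adj}          # True = aware
    frontier = set(sources)
    for _ in range(rounds):
        nxt = set()
        for v in frontier:
            for u in adj[v]:
                if not state[u]:
                    state[u] = True                 # token flips neighbor to aware
                    nxt.add(u)
        frontier = nxt
    return state

# 6-cycle with a single stimulus at vertex 0
adj = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(propagate(adj, sources={0}, rounds=2))        # vertices within 2 hops are aware
```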
We consider the problem of online interval scheduling on a single machine, where intervals arrive online in an order chosen by an adversary, and the algorithm must output a set of non-conflicting intervals. Traditionally in scheduling theory, it is assumed that intervals arrive in order of increasing start times. We drop that assumption and allow intervals to arrive in any possible order; we call this variant any-order interval selection (AOIS). We assume that some online acceptances can be revoked, but a feasible solution must always be maintained. For unweighted intervals and deterministic algorithms, the competitive ratio of this problem is unbounded. Under the assumption that there are at most $k$ different interval lengths, we give a simple algorithm that achieves a competitive ratio of $2k$ and show that it is optimal among deterministic algorithms and a restricted class of randomized algorithms we call memoryless, contributing to an open question posed by Adler and Azar (2003), namely whether a randomized algorithm without access to history can achieve a constant competitive ratio. We connect our model to the problem of call control on the line and show how the algorithms of Garay et al. (1997) can be applied to our setting, resulting in an optimal algorithm for the case of proportional weights. We also discuss the case of intervals with arbitrary weights and show how to convert the single-length algorithm of Fung et al. (2014) into a classify-and-randomly-select algorithm that achieves a competitive ratio of $2k$. Finally, we consider the case of intervals arriving in a random order and show that, for single-length instances, a one-directional algorithm (i.e., one that replaces intervals in only one direction) is the only deterministic memoryless algorithm that can possibly benefit from random arrivals.
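To make the revocable-acceptance model concrete, here is a toy reading of a one-directional replacement rule for equal-length intervals (our own illustrative interpretation, not necessarily the exact algorithm analyzed in the paper): accept a conflict-free interval outright, and otherwise replace a single conflicting accepted interval only when the newcomer ends strictly earlier.

```python
def one_directional(intervals):
    """Toy one-directional revoking rule for equal-length intervals:
    accept if conflict-free; if the newcomer conflicts with exactly one accepted
    interval and ends strictly earlier, revoke that interval and accept the newcomer.
    Feasibility (no conflicts among accepted intervals) is always maintained."""
    accepted = []
    for s, e in intervals:                      # online arrival order
        conflicts = [(a, b) for a, b in accepted if s < b and a < e]
        if not conflicts:
            accepted.append((s, e))
        elif len(conflicts) == 1 and e < conflicts[0][1]:
            accepted.remove(conflicts[0])       # revoke, replace toward the left
            accepted.append((s, e))
    return sorted(accepted)

print(one_directional([(2, 3), (0, 1), (2.5, 3.5), (1.5, 2.5)]))
```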
We propose a novel nonparametric regression framework subject to the positive-definiteness constraint. It offers a highly modular approach for estimating covariance functions of stationary processes. Our method can impose positive definiteness, as well as isotropy and monotonicity, on the estimators, and its hyperparameters can be selected via cross-validation. We define our estimators by taking integral transforms of kernel-based distribution surrogates. We then use the iterated density estimation evolutionary algorithm, a variant of estimation of distribution algorithms, to fit the estimators. We also extend our method to estimate covariance functions for point-referenced data. Compared to alternative approaches, our method provides more reliable estimates of long-range dependence. Several numerical studies are performed to demonstrate the efficacy of our method, and we illustrate it using precipitation data from the Spatial Interpolation Comparison 97 project.
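The positive-definiteness constraint can be appreciated through Bochner's theorem: any nonnegative mixture of cosines is a valid stationary covariance in one dimension. The sketch below is our own minimal illustration of this principle, using nonnegative least squares rather than the paper's integral transforms of kernel-based surrogates or the IDEA fitting procedure; it fits such a mixture to empirical autocovariances of a simulated AR(1) series.

```python
import numpy as np
from scipy.optimize import nnls

# Toy data: AR(1) series with true covariance proportional to 0.8**h
rng = np.random.default_rng(1)
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + rng.normal()

lags = np.arange(0, 15)
emp = np.array([np.cov(x[:n - h], x[h:])[0, 1] for h in lags])   # empirical autocovariances

# Positive-definite fit via Bochner: C(h) = sum_j w_j * cos(omega_j * h), w_j >= 0
omegas = np.linspace(0.0, np.pi, 40)
design = np.cos(np.outer(lags, omegas))
w, _ = nnls(design, emp)                    # nonnegative spectral weights
fitted = design @ w
print(np.round(np.c_[emp, fitted], 3))      # empirical vs. positive-definite fit
```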
Modern advanced manufacturing and advanced materials design often require searches of relatively high-dimensional process-control parameter spaces for settings that result in optimal structure, property, and performance parameters. The mapping from the former to the latter must be determined from noisy experiments or from expensive simulations. We abstract this problem to a mathematical framework in which an unknown function from a control space to a design space must be ascertained by means of expensive noisy measurements, with the goal of locating control settings that generate desired design features within specified tolerances and with quantified uncertainty. We describe targeted adaptive design (TAD), a new algorithm that performs this sampling task efficiently. At each iterative stage, TAD creates a Gaussian process surrogate model of the unknown mapping and proposes a new batch of control settings to sample experimentally by optimizing the updated log-predictive likelihood of the target design. TAD either stops upon locating a solution whose uncertainties fit inside the tolerance box or uses a measure of expected future information to determine that the search space has been exhausted with no solution. TAD thus embodies the exploration-exploitation tension in a manner that recalls, but is essentially different from, Bayesian optimization and optimal experimental design.
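A stripped-down, one-dimensional caricature of this idea (single control variable, single design quantity, one new point per stage rather than a batch, no stopping rule; the target value, noise level, and toy mapping are all invented for illustration): fit a Gaussian process surrogate and pick the next control setting by maximizing the log-predictive likelihood of the target design value.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(2)
f = lambda x: np.sin(3 * x) + 0.5 * x          # unknown control -> design mapping
y_target, noise = 1.0, 0.1                     # desired design value, measurement noise

X = rng.uniform(-2, 2, size=(5, 1))            # initial noisy measurements
y = f(X).ravel() + noise * rng.normal(size=5)

for it in range(10):
    gp = GaussianProcessRegressor(RBF(1.0) + WhiteKernel(noise**2)).fit(X, y)
    cand = np.linspace(-2, 2, 200)[:, None]
    mu, sd = gp.predict(cand, return_std=True)
    acq = norm.logpdf(y_target, loc=mu, scale=sd)   # log-predictive likelihood of target
    x_new = cand[np.argmax(acq)]
    y_new = f(x_new)[0] + noise * rng.normal()      # "run the experiment"
    X, y = np.vstack([X, x_new[None, :]]), np.append(y, y_new)

best = X[np.argmin(np.abs(y - y_target))]
print("control setting closest to target design:", np.round(best, 3))
```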
Implicit generative modeling (IGM) aims to produce samples of synthetic data matching the characteristics of a target data distribution. Recent work (e.g. score-matching networks, diffusion models) has approached the IGM problem from the perspective of pushing synthetic source data toward the target distribution via dynamical perturbations or flows in the ambient space. We introduce the score difference (SD) between arbitrary target and source distributions as a flow that optimally reduces the Kullback-Leibler divergence between them while also solving the Schr\"odinger bridge problem. We apply the SD flow to convenient proxy distributions, which are aligned if and only if the original distributions are aligned. We demonstrate the formal equivalence of this formulation to denoising diffusion models under certain conditions. However, unlike diffusion models, SD flow places no restrictions on the prior distribution. We also show that the training of generative adversarial networks includes a hidden data-optimization sub-problem, which induces the SD flow under certain choices of loss function when the discriminator is optimal. As a result, the SD flow provides a theoretical link between model classes that, taken together, address all three challenges of the "generative modeling trilemma": high sample quality, mode coverage, and fast sampling.
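As a minimal sanity check of the score-difference idea in a setting where everything is available in closed form (one-dimensional Gaussians, with the evolving source crudely approximated by a Gaussian fit at every step; this does not reflect the paper's proxy-distribution construction or learned scores), particles drawn from the source are pushed along the difference of target and current scores:

```python
import numpy as np

rng = np.random.default_rng(3)

# Score of N(m, s^2): grad_x log p(x) = -(x - m) / s^2
score = lambda x, m, s: -(x - m) / s**2
m_src, s_src = -2.0, 1.0          # source distribution
m_tgt, s_tgt = 3.0, 0.5           # target distribution

x = rng.normal(m_src, s_src, size=10_000)          # particles from the source
step = 0.05
for _ in range(400):
    # score difference: target score minus the score of the *current* particle
    # distribution (approximated here by a Gaussian fit to the particles)
    m_cur, s_cur = x.mean(), x.std()
    x = x + step * (score(x, m_tgt, s_tgt) - score(x, m_cur, s_cur))

print("particle mean/std:", round(x.mean(), 3), round(x.std(), 3))   # approaches 3.0, 0.5
```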
Learning from multi-view data is an emerging problem in machine learning research, and nonnegative matrix factorization (NMF) is a popular dimensionality-reduction method for integrating information from multiple views. These views often provide not only consensus but also complementary information. However, most multi-view NMF algorithms either assign equal weight to each view or tune the weights empirically via line search, which can be infeasible without prior knowledge of the views or computationally expensive. In this paper, we propose a weighted multi-view NMF (WM-NMF) algorithm. In particular, we aim to close a critical technical gap by learning both view-specific weights and observation-specific reconstruction weights to quantify each view's information content. The introduced weighting scheme alleviates the adverse effects of uninformative views and amplifies the positive effects of important views by assigning them smaller and larger weights, respectively. Experimental results confirm the effectiveness and advantages of the proposed algorithm in terms of achieving better clustering performance and dealing with noisy data compared to existing algorithms.
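A deliberately simplified sketch of view weighting (not the WM-NMF updates from the paper, and omitting observation-specific weights): views are rescaled by their current weights, a shared factorization is fit to the stacked data with scikit-learn's NMF, and weights are refreshed from per-view reconstruction errors so that noisier views contribute less. All dimensions and noise levels are made up.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(4)
n, k = 100, 3
W_true = rng.random((n, k))
views = [W_true @ rng.random((k, d)) + noise * rng.random((n, d))
         for d, noise in [(20, 0.01), (30, 0.01), (25, 0.8)]]   # third view is noisy

weights = np.ones(len(views)) / len(views)
for it in range(5):
    stacked = np.hstack([np.sqrt(w) * X for w, X in zip(weights, views)])
    model = NMF(n_components=k, init="nndsvda", max_iter=500)
    W = model.fit_transform(stacked)                  # shared sample factors
    H = model.components_
    # per-view reconstruction errors -> smaller weight for noisier views
    errs, start = [], 0
    for w, X in zip(weights, views):
        d = X.shape[1]
        errs.append(np.linalg.norm(np.sqrt(w) * X - W @ H[:, start:start + d]) ** 2 / w)
        start += d
    weights = np.exp(-np.array(errs) / np.mean(errs))
    weights /= weights.sum()

print("learned view weights:", np.round(weights, 3))  # noisy view gets down-weighted
```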
Functional data provide a powerful means of capturing and analyzing complex patterns and relationships in a variety of fields, allowing for more precise modeling, visualization, and decision-making. For example, in healthcare, functional data such as medical images can help doctors make more accurate diagnoses and develop more effective treatment plans. However, understanding the causal relationships between functional predictors and time-to-event outcomes remains a challenge. To address this, we propose a functional causal framework comprising a functional accelerated failure time (FAFT) model and three causal approaches. The regression-adjustment approach is based on the conditional FAFT with subsequent marginalization over confounders, while the functional-inverse-probability-weighting approach is based on the marginal FAFT with well-defined functional propensity scores. The doubly robust approach combines the strengths of both methods and achieves a balance condition through the weighted residuals between imputed observations and regression-adjustment outcomes. Our framework accurately estimates causal effects, predicts outcomes, and is robust to different censoring rates. We demonstrate the power of our framework with simulations and real-world data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. Our findings identify more precise subregions of the hippocampus that align with medical research, highlighting the potential of this work for improving healthcare outcomes.
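The functional-inverse-probability-weighting idea can be sketched under strong simplifications (no censoring, a linear toy data-generating process, FPCA reduced to ordinary PCA on densely observed curves; none of this is the FAFT machinery from the paper): estimate a propensity score from leading functional principal component scores and reweight a naive contrast of log event times.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)
n, grid = 500, np.linspace(0, 1, 50)

# Functional confounder: random curves; treatment and log event time both depend on them
scores_true = rng.normal(size=(n, 2))
curves = scores_true @ np.vstack([np.sin(2 * np.pi * grid), np.cos(2 * np.pi * grid)])
A = rng.binomial(1, 1 / (1 + np.exp(-scores_true[:, 0])))        # treatment assignment
logT = 1.0 * A + 0.8 * scores_true[:, 0] + rng.normal(scale=0.3, size=n)

# "Functional propensity score": logistic regression on leading FPCA scores
fpc = PCA(n_components=2).fit_transform(curves)
ps = LogisticRegression().fit(fpc, A).predict_proba(fpc)[:, 1]

w = A / ps + (1 - A) / (1 - ps)                                  # IPW weights
naive = logT[A == 1].mean() - logT[A == 0].mean()
ipw = (np.sum(w * A * logT) / np.sum(w * A)
       - np.sum(w * (1 - A) * logT) / np.sum(w * (1 - A)))
print("naive contrast:", round(naive, 3), " IPW estimate:", round(ipw, 3), "(truth 1.0)")
```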
The promotion time cure rate model (PCM) is an extensively studied model for the analysis of time-to-event data in the presence of a cured subgroup. Several strategies have been proposed in the literature to model the latency part of the PCM; however, relatively few strategies investigate the effects of covariates on the incidence part. In this regard, most existing studies assume the boundary separating the cured and non-cured subjects with respect to the covariates to be linear, and as such they can only capture simple effects of the covariates on the cured/non-cured probability. In this manuscript, we propose a new promotion time cure model that uses the support vector machine (SVM) to model the incidence part. The proposed model inherits the features of the SVM and provides flexibility in capturing non-linearity in the data. To the best of our knowledge, this is the first work that integrates the SVM with the PCM. For the estimation of model parameters, we develop an expectation-maximization (EM) algorithm in which we use the sequential minimal optimization technique together with Platt scaling to obtain the posterior probabilities of being cured or uncured. A detailed simulation study shows that the proposed model outperforms the existing logistic regression-based PCM as well as the spline regression-based PCM, which is also known to capture non-linearity in the data, both in terms of the bias and mean squared error of different quantities of interest and in terms of the predictive and classification accuracies of cure. Finally, we illustrate the applicability and superiority of our model using data from a study of leukemia patients who underwent bone marrow transplantation.
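The incidence-modeling idea alone (ignoring latency, censoring, and the EM loop entirely) can be illustrated by letting a Platt-scaled SVM recover a non-linear cured/uncured boundary that a logistic model cannot; scikit-learn's SVC(probability=True) performs Platt scaling internally. The simulated boundary and sample sizes are invented for this toy comparison.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
n = 2000
X = rng.uniform(-2, 2, size=(n, 2))
# Non-linear incidence boundary: mostly cured inside a disc, uncured outside
p_uncured = 1 / (1 + np.exp(-4 * (X[:, 0] ** 2 + X[:, 1] ** 2 - 1.5)))
uncured = rng.binomial(1, p_uncured)

Xtr, Xte, ytr, yte = train_test_split(X, uncured, random_state=0)
svm = SVC(kernel="rbf", probability=True).fit(Xtr, ytr)   # Platt scaling inside
logit = LogisticRegression().fit(Xtr, ytr)

print("SVM accuracy:     ", round(svm.score(Xte, yte), 3))
print("logistic accuracy:", round(logit.score(Xte, yte), 3))
```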
Finding saddle points of dynamical systems is an important problem in practical applications such as the study of rare events in molecular systems. Gentlest ascent dynamics (GAD) is one of a number of algorithms that attempt to find saddle points in dynamical systems. It works by deriving a new dynamical system in which saddle points of the original system become stable equilibria. GAD has recently been generalized to the study of dynamical systems on manifolds (differential algebraic equations) described by equality constraints and given in an extrinsic formulation. In this paper, we present an extension of GAD to manifolds defined by point clouds, formulated using the intrinsic viewpoint. These point clouds are adaptively sampled during an iterative process that drives the system from the initial conformation (typically in the neighborhood of a stable equilibrium) to a saddle point. Our method requires the reactant (initial conformation), does not require the explicit constraint equations to be specified, and is purely data-driven.
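For intuition (an unconstrained, two-dimensional textbook example rather than the manifold/point-cloud setting of the paper), gentlest ascent dynamics reverses the force along an adaptively updated direction v while relaxing v toward the softest Hessian mode; on a double-well potential the iteration climbs from near a minimum to the saddle:

```python
import numpy as np

# Double-well potential V(x, y) = (x^2 - 1)^2 + y^2: minima at (+-1, 0), saddle at (0, 0)
grad = lambda x: np.array([4 * x[0] * (x[0] ** 2 - 1), 2 * x[1]])
hess = lambda x: np.array([[12 * x[0] ** 2 - 4, 0.0], [0.0, 2.0]])

x = np.array([0.8, 0.3])             # start near a minimum
v = np.array([1.0, 0.0])             # initial ascent direction
dt = 1e-2
for _ in range(5000):
    g, H = grad(x), hess(x)
    x = x - dt * (g - 2 * np.dot(v, g) * v)          # GAD: reverse the force along v
    w = v - dt * (H @ v - np.dot(v, H @ v) * v)      # relax v toward the softest mode
    v = w / np.linalg.norm(w)

print("converged point:", np.round(x, 4))            # approximately the saddle (0, 0)
```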
Tensor factor models (TFM) are appealing dimension-reduction tools for high-order, large-dimensional tensor time series, with wide applications in economics, finance, and medical imaging. In this paper, we propose a projection estimator for the Tucker-decomposition-based TFM and provide its least-squares interpretation, which parallels that of principal component analysis (PCA) for the vector factor model. The projection technique simultaneously reduces the dimensionality of the signal component and the magnitude of the idiosyncratic component tensor, thus increasing the signal-to-noise ratio. We derive convergence rates for the projection estimators of the loadings and the common factor tensor, which are faster than those of the naive PCA-based estimator. Our results are obtained under mild conditions that allow the idiosyncratic components to be weakly cross- and auto-correlated. We also provide a novel iterative procedure based on the eigenvalue-ratio principle to determine the factor numbers. Extensive numerical studies are conducted to investigate the empirical performance of the proposed projection estimators relative to state-of-the-art alternatives.
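A minimal sketch of the projection idea in the order-two (matrix factor model) special case, using plain eigen-decompositions and invented dimensions (the general Tucker-tensor algorithm and its theory are in the paper): an initial PCA estimate of one loading space is refined by first projecting the data onto the other estimated loading space, which damps the idiosyncratic noise.

```python
import numpy as np

rng = np.random.default_rng(7)
T, p1, p2, k1, k2 = 200, 30, 30, 3, 3
R = rng.normal(size=(p1, k1))                        # row loadings
C = rng.normal(size=(p2, k2))                        # column loadings
X = np.stack([R @ rng.normal(size=(k1, k2)) @ C.T + 2.0 * rng.normal(size=(p1, p2))
              for _ in range(T)])                    # X_t = R F_t C' + E_t

def top_eigvecs(M, k):
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argsort(vals)[::-1][:k]]

# Naive PCA estimates, then projection refinements
R_naive = top_eigvecs(sum(Xt @ Xt.T for Xt in X), k1)
C_hat = top_eigvecs(sum(Xt.T @ Xt for Xt in X), k2)
R_hat = R_naive
for _ in range(2):
    Y = X @ C_hat                                    # project onto estimated column space
    R_hat = top_eigvecs(sum(Yt @ Yt.T for Yt in Y), k1)
    Z = np.transpose(X, (0, 2, 1)) @ R_hat
    C_hat = top_eigvecs(sum(Zt @ Zt.T for Zt in Z), k2)

# Compare estimated and true row-loading spaces via their projection matrices
P = lambda A: A @ np.linalg.pinv(A.T @ A) @ A.T
err = lambda A: np.linalg.norm(P(A) - P(R)) / np.sqrt(p1)
print("naive PCA space error:     ", round(err(R_naive), 3))
print("projection estimator error:", round(err(R_hat), 3))
```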