We consider a class of high-dimensional spatial filtering problems, where the spatial locations of the observations are unknown and driven by the unobserved signal. This problem is exceptionally challenging: not only is the signal high-dimensional, but the model for the signal also yields longer-range time dependencies on this object. Motivated by this model, we revisit a lesser-known and $\textit{exact}$ computational methodology from Centanni $\&$ Minozzo (2006a) (see also Martin et al. (2013)) designed for filtering of point processes. We adapt the methodology to this new class of problems. The algorithm is implemented on a high-dimensional (of the order of $10^4$) rotating shallow water model with real and synthetic observational data from ocean drifters. In comparison to existing methodology, we demonstrate a significant improvement in speed and accuracy.
The design of particle simulation methods for collisional plasma physics has always represented a challenge due to the unbounded total collisional cross section, which prevents a natural extension of the classical Direct Simulation Monte Carlo (DSMC) method devised for the Boltzmann equation. One way to overcome this problem is to design Monte Carlo algorithms that are robust in the so-called grazing collision limit. In the first part of this manuscript, we focus on the construction of collision algorithms for the Landau-Fokker-Planck equation that are based on the grazing collision asymptotics and avoid the use of iterative solvers. Subsequently, we discuss problems involving uncertainties and show how to develop a stochastic Galerkin projection of the particle dynamics that makes it possible to recover spectral accuracy for smooth solutions in the random space. Several classical numerical tests are reported to validate the present approach.
We introduce a next of kin of the generalized additive model for location, scale, and shape (GAMLSS), aiming at distribution-free and parsimonious regression modelling for arbitrary outcomes. We replace the strict parametric distribution underlying such a model by a transformation function, which in turn is estimated from data. Doing so not only makes the model distribution-free but also allows us to limit the number of linear or smooth model terms to a pair of location-scale predictor functions. We derive the likelihood for continuous, discrete, and randomly censored observations, along with corresponding score functions. A plethora of existing algorithms is leveraged for model estimation, including constrained maximum-likelihood, the original GAMLSS algorithm, and transformation trees. Parameter interpretability in the resulting models is closely connected to model selection. We propose the application of a novel best subset selection procedure to achieve especially simple ways of interpretation. All techniques are motivated and illustrated by a collection of applications from different domains, including crossing and partial proportional hazards, complex count regression, non-linear ordinal regression, and growth curves. All analyses are reproducible with the help of the "tram" add-on package to the R system for statistical computing and graphics.
Suitable representations of dynamical systems can simplify their analysis and control. Along this line of thought, this paper considers the input decoupling problem for input-affine Lagrangian dynamics, namely the problem of finding a transformation of the generalized coordinates that decouples the input channels. We identify a class of systems for which this problem is solvable. Such systems are called collocated because the decoupling variables correspond to the coordinates on which the actuators directly perform work. Under mild conditions on the input matrix, a simple test is presented to verify whether a system is collocated. By exploiting power invariance, it is proven that a change of coordinates decouples the input channels if and only if the dynamics is collocated. We illustrate the theoretical results by considering several Lagrangian systems, focusing on underactuated mechanical systems, for which novel controllers that exploit input decoupling are designed.
While conformal predictors reap the benefits of rigorous statistical guarantees for their error frequency, the size of their corresponding prediction sets is critical to their practical utility. Unfortunately, there is currently a lack of finite-sample analysis and guarantees for their prediction set sizes. To address this shortfall, we theoretically quantify the expected size of the prediction set under the split conformal prediction framework. As this precise formulation cannot usually be calculated directly, we further derive point estimates and high probability intervals that can be easily computed, providing a practical method for characterizing the expected prediction set size across different possible realizations of the test and calibration data. Additionally, we corroborate the efficacy of our results with experiments on real-world datasets, for both regression and classification problems.
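As a rough illustration of the quantity studied above, the following sketch runs split conformal prediction for classification and averages the prediction set size over a few test points. It is a plain empirical average, not the point estimates or intervals derived in the paper. The score `1 - p_y`, the function names, and the toy numbers are all assumptions made for this example.

```python
import math


def conformal_threshold(cal_scores, alpha):
    """Split-conformal threshold: the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: every label must be included.
        return math.inf
    return sorted(cal_scores)[k - 1]


def prediction_set(probs, threshold):
    """Labels whose nonconformity score 1 - p_y (an assumed choice) is within the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]


def mean_set_size(test_probs, threshold):
    """Empirical average prediction set size over the test points."""
    sizes = [len(prediction_set(p, threshold)) for p in test_probs]
    return sum(sizes) / len(sizes)


# Hypothetical calibration scores (1 - predicted probability of the true label).
cal_scores = [0.3, 0.1, 0.7, 0.5, 0.2, 0.6, 0.4]
threshold = conformal_threshold(cal_scores, alpha=0.25)

# Hypothetical predicted class probabilities for three test points.
test_probs = [[0.7, 0.2, 0.1], [0.45, 0.45, 0.1], [0.5, 0.3, 0.2]]
average_size = mean_set_size(test_probs, threshold)
```

Repeating this over many random calibration/test splits would give a Monte Carlo view of how the set size varies across realizations of the data.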
In this paper, we develop an {\em epsilon admissible subsets} (EAS) model selection approach for performing group variable selection in the high-dimensional multivariate regression setting. This EAS strategy is designed to estimate a posterior-like, generalized fiducial distribution over a parsimonious class of models in the setting of correlated predictors and/or in the absence of a sparsity assumption. The effectiveness of our approach, to this end, is demonstrated empirically in simulation studies, and is compared to other state-of-the-art model/variable selection procedures. Furthermore, assuming a matrix-Normal linear model we show that the EAS strategy achieves {\em strong model selection consistency} in the high-dimensional setting if there does exist a sparse, true data generating set of predictors. In contrast to Bayesian approaches for model selection, our generalized fiducial approach completely avoids the problem of simultaneously having to specify arbitrary prior distributions for model parameters and penalize model complexity; our approach allows for inference directly on the model complexity. Implementation of the method is illustrated through yeast data to identify significant cell-cycle regulating transcription factors.
We study Stochastic Gradient Descent with AdaGrad stepsizes: a popular adaptive (self-tuning) method for first-order stochastic optimization. Despite being well studied, existing analyses of this method suffer from various shortcomings: they either assume some knowledge of the problem parameters, impose strong global Lipschitz conditions, or fail to give bounds that hold with high probability. We provide a comprehensive analysis of this basic method without any of these limitations, in both the convex and non-convex (smooth) cases, that additionally supports a general ``affine variance'' noise model and provides sharp rates of convergence in both the low-noise and high-noise~regimes.
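For orientation, here is a minimal sketch of SGD with a scalar AdaGrad stepsize (often called AdaGrad-norm), run on a toy quadratic. It assumes the stepsize $\eta / \sqrt{b_0^2 + \sum_{s \le t} \|g_s\|^2}$ and reads ``affine variance'' as noise whose variance is at most $\sigma_0^2 + \sigma_1^2 \|\nabla f(x)\|^2$. The abstract spells out neither formula, so both are assumptions, as are the function names, the objective, and all constants.

```python
import math
import random


def sgd_adagrad_norm(stochastic_grad, x0, eta=1.0, b0=1e-8, steps=500):
    """SGD with a scalar AdaGrad stepsize.

    stepsize_t = eta / sqrt(b0^2 + sum of squared gradient norms seen so far),
    so no smoothness or noise parameters need to be known in advance.
    """
    x = list(x0)
    accum = b0 ** 2
    for _ in range(steps):
        g = stochastic_grad(x)
        accum += sum(gi * gi for gi in g)
        step = eta / math.sqrt(accum)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def make_noisy_quadratic_grad(sigma0=0.1, sigma1=0.1, seed=0):
    """Gradient oracle for f(x) = 0.5 * ||x||^2 with affine-variance Gaussian noise (toy setup)."""
    rng = random.Random(seed)

    def oracle(x):
        grad_norm_sq = sum(xi * xi for xi in x)  # the true gradient of f is x itself
        # Total noise variance sigma0^2 + sigma1^2 * ||grad||^2, split evenly over coordinates.
        std = math.sqrt((sigma0 ** 2 + sigma1 ** 2 * grad_norm_sq) / len(x))
        return [xi + rng.gauss(0.0, std) for xi in x]

    return oracle


x_final = sgd_adagrad_norm(make_noisy_quadratic_grad(), x0=[3.0, -4.0])
final_norm = math.sqrt(sum(xi * xi for xi in x_final))
```

The accumulator grows quickly while gradients are large and slowly once only noise remains, which is the self-tuning behaviour the abstract refers to.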
We present a complete proof synthesis method for the eight type systems of Barendregt's cube extended with $\eta$-conversion. Because these systems satisfy the proofs-as-objects paradigm, the proof synthesis method is a one-level process merging unification and resolution. We then present a variant of this method, which is incomplete but much more efficient. Finally, we show how to turn this algorithm into a unification algorithm.
Diffusion models have demonstrated excellent potential for generating diverse images. However, their performance often suffers from slow generation due to iterative denoising. Knowledge distillation has recently been proposed as a remedy that can reduce the number of inference steps to one or a few without significant quality degradation. However, existing distillation methods either require significant amounts of offline computation for generating synthetic training data from the teacher model or need to perform expensive online learning with the help of real data. In this work, we present a novel technique called BOOT that overcomes these limitations with an efficient data-free distillation algorithm. The core idea is to learn a time-conditioned model that predicts the output of a pre-trained diffusion model teacher given any time step. Such a model can be efficiently trained based on bootstrapping from two consecutive sampled steps. Furthermore, our method can be easily adapted to large-scale text-to-image diffusion models, which are challenging for conventional methods because the training sets are often large and difficult to access. We demonstrate the effectiveness of our approach on several benchmark datasets in the DDIM setting, achieving comparable generation quality while being orders of magnitude faster than the diffusion teacher. The text-to-image results show that the proposed approach is able to handle highly complex distributions, shedding light on more efficient generative modeling.
This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects, and hence the estimation error can be substantial. We therefore propose an alternative approach to constructing the estimators such that the error is greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical and the proposed estimators in estimating the causal quantities. The comparison is carried out across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is striking if the causal effects are accounted for correctly.
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
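The formula $(1-\beta^{n})/(1-\beta)$ can be turned into per-class loss weights along the following lines. This is a minimal sketch. Two choices in it are assumptions rather than details from the abstract: weighting each class by the inverse of its effective number, and rescaling the weights to sum to the number of classes. The function names and class counts are hypothetical.

```python
def effective_number(n, beta):
    """Effective number of samples (1 - beta**n) / (1 - beta), for beta in [0, 1)."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return (1.0 - beta ** n) / (1.0 - beta)


def class_balanced_weights(counts, beta):
    """Per-class weights inversely proportional to the effective number of samples.

    Assumed convention: weights are rescaled so they sum to the number of classes.
    """
    raw = [1.0 / effective_number(n, beta) for n in counts]
    scale = len(counts) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical long-tailed class counts: one head class and two tail classes.
counts = [5000, 500, 50]
weights = class_balanced_weights(counts, beta=0.999)
```

With $\beta = 0$ every class has effective number 1 and the weights are uniform (no re-weighting), while $\beta \to 1$ approaches inverse-frequency weighting, so $\beta$ interpolates between the two.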