This work considers Gaussian process interpolation with a periodized version of the Mat{\'e}rn covariance function (Stein, 1999, Section 6.7) with Fourier coefficients $\phi$($\alpha$^2 + j^2)^(--$\nu$--1/2). Convergence rates are studied for the joint maximum likelihood estimation of $\nu$ and $\phi$ when the data is sampled according to the model. The mean integrated squared error is also analyzed with fixed and estimated parameters, showing that maximum likelihood estimation yields asymptotically the same error as if the ground truth was known. Finally, the case where the observed function is a ''deterministic'' element of a continuous Sobolev space is also considered, suggesting that bounding assumptions on some parameters can lead to different estimates.
We introduce two synthetic likelihood methods for Simulation-Based Inference (SBI), to conduct either amortized or targeted inference from experimental observations when a high-fidelity simulator is available. Both methods learn a conditional energy-based model (EBM) of the likelihood using synthetic data generated by the simulator, conditioned on parameters drawn from a proposal distribution. The learned likelihood can then be combined with any prior to obtain a posterior estimate, from which samples can be drawn using MCMC. Our methods uniquely combine a flexible Energy-Based Model and the minimization of a KL loss: this is in contrast to other synthetic likelihood methods, which either rely on normalizing flows, or minimize score-based objectives; choices that come with known pitfalls. Our first method, Amortized Unnormalized Neural Likelihood Estimation (AUNLE), introduces a tilting trick during training that allows to significantly lower the computational cost of inference by enabling the use of efficient MCMC techniques. Our second method, Sequential UNLE (SUNLE), employs a robust doubly intractable approach in order to re-use simulation data and improve posterior accuracy on a specific dataset. We demonstrate the properties of both methods on a range of synthetic datasets, and apply them to a neuroscience model of the pyloric network in the crab Cancer Borealis, matching the performance of other synthetic likelihood methods at a fraction of the simulation budget.
We present an alternating least squares type numerical optimization scheme to estimate conditionally-independent mixture models in $\mathbb{R}^n$, with minimal additional distributional assumptions. Following the method of moments, we tackle a coupled system of low-rank tensor decomposition problems. The steep costs associated with high-dimensional tensors are avoided, through the development of specialized tensor-free operations. Numerical experiments illustrate the performance of the algorithm and its applicability to various models and applications. In many cases the results exhibit improved reliability over the expectation-maximization algorithm, with similar time and storage costs. We also provide some supporting theory, establishing identifiability and local linear convergence.
We propose a method for maximum likelihood estimation of path choice model parameters and arc travel time using data of different levels of granularity. Hitherto these two tasks have been tackled separately under strong assumptions. Using a small example, we illustrate that this can lead to biased results. Results on both real (New York yellow cab) and simulated data show strong performance of our method compared to existing baselines.
Building and maintaining a catalog of resident space objects involves several tasks, ranging from observations to data analysis. Once acquired, the knowledge of a space object needs to be updated following a dedicated observing schedule. Dynamics mismodeling and unknown maneuvers can alter the catalog's accuracy, resulting in uncorrelated observations originating from the same object. Starting from two independent orbits, this work presents a novel approach to detect and estimate maneuvers of resident space objects, which allows for correlation recovery. The estimation is performed with successive convex optimization without a-priori assumption on the thrust arcs structure and thrust direction.
Tyler's M-estimator is a well known procedure for robust and heavy-tailed covariance estimation. Tyler himself suggested an iterative fixed-point algorithm for computing his estimator however, it requires super-linear (in the size of the data) runtime per iteration, which maybe prohibitive in large scale. In this work we propose, to the best of our knowledge, the first Frank-Wolfe-based algorithms for computing Tyler's estimator. One variant uses standard Frank-Wolfe steps, the second also considers \textit{away-steps} (AFW), and the third is a \textit{geodesic} version of AFW (GAFW). AFW provably requires, up to a log factor, only linear time per iteration, while GAFW runs in linear time (up to a log factor) in a large $n$ (number of data-points) regime. All three variants are shown to provably converge to the optimal solution with sublinear rate, under standard assumptions, despite the fact that the underlying optimization problem is not convex nor smooth. Under an additional fairly mild assumption, that holds with probability 1 when the (normalized) data-points are i.i.d. samples from a continuous distribution supported on the entire unit sphere, AFW and GAFW are proved to converge with linear rates. Importantly, all three variants are parameter-free and use adaptive step-sizes.
Quantum machine learning is an emerging field at the intersection of machine learning and quantum computing. Classical cross entropy plays a central role in machine learning. We define its quantum generalization, the quantum cross entropy, prove its lower bounds, and investigate its relation to quantum fidelity. In the classical case, minimizing cross entropy is equivalent to maximizing likelihood. In the quantum case, when the quantum cross entropy is constructed from quantum data undisturbed by quantum measurements, this relation holds. Classical cross entropy is equal to negative log-likelihood. When we obtain quantum cross entropy through empirical density matrix based on measurement outcomes, the quantum cross entropy is lower-bounded by negative log-likelihood. These two different scenarios illustrate the information loss when making quantum measurements. We conclude that to achieve the goal of full quantum machine learning, it is crucial to utilize the deferred measurement principle.
The presence of outliers can significantly degrade the performance of ellipse fitting methods. We develop an ellipse fitting method that is robust to outliers based on the maximum correntropy criterion with variable center (MCC-VC), where a Laplacian kernel is used. For single ellipse fitting, we formulate a non-convex optimization problem to estimate the kernel bandwidth and center and divide it into two subproblems, each estimating one parameter. We design sufficiently accurate convex approximation to each subproblem such that computationally efficient closed-form solutions are obtained. The two subproblems are solved in an alternate manner until convergence is reached. We also investigate coupled ellipses fitting. While there exist multiple ellipses fitting methods that can be used for coupled ellipses fitting, we develop a couple ellipses fitting method by exploiting the special structure. Having unknown association between data points and ellipses, we introduce an association vector for each data point and formulate a non-convex mixed-integer optimization problem to estimate the data associations, which is approximately solved by relaxing it into a second-order cone program. Using the estimated data associations, we extend the proposed method to achieve the final coupled ellipses fitting. The proposed method is shown to have significantly better performance over the existing methods in both simulated data and real images.
We study the problem of learning generalized linear models under adversarial corruptions. We analyze a classical heuristic called the iterative trimmed maximum likelihood estimator which is known to be effective against label corruptions in practice. Under label corruptions, we prove that this simple estimator achieves minimax near-optimal risk on a wide range of generalized linear models, including Gaussian regression, Poisson regression and Binomial regression. Finally, we extend the estimator to the more challenging setting of label and covariate corruptions and demonstrate its robustness and optimality in that setting as well.
A difficulty in MSE estimation occurs because we do not specify a full distribution for the survey weights. This obfuscates the use of fully parametric bootstrap procedures. To overcome this challenge, we develop a novel MSE estimator. We estimate the leading term in the MSE, which is the MSE of the best predictor (constructed with the true parameters), using the same simulated samples used to construct the basic predictor. We then exploit the asymptotic normal distribution of the parameter estimators to estimate the second term in the MSE, which reflects variability in the estimated parameters. We incorporate a correction for the bias of the estimator of the leading term without the use of computationally intensive double-bootstrap procedures. We further develop calibrated prediction intervals that rely less on normal theory than standard prediction intervals. We empirically demonstrate the validity of the proposed procedures through extensive simulation studies. We apply the methods to predict several functions of sheet and rill erosion for Iowa counties using data from a complex agricultural survey.
This study presents a theoretical structure for the monocular pose estimation problem using the total least squares. The unit-vector line-of-sight observations of the features are extracted from the monocular camera images. First, the optimization framework is formulated for the pose estimation problem with observation vectors extracted from unit vectors from the camera center-of-projection, pointing towards the image features. The attitude and position solutions obtained via the derived optimization framework are proven to reach the Cram\'er-Rao lower bound under the small angle approximation of the attitude errors. Specifically, The Fisher Information Matrix and the Cram\'er-Rao bounds are evaluated and compared to the analytical derivations of the error-covariance expressions to rigorously prove the optimality of the estimates. The sensor data for the measurement model is provided through a series of vector observations, and two fully populated noise-covariance matrices are assumed for the body and reference observation data. The inverse of the former matrices appear in terms of a series of weight matrices in the cost function. The proposed solution is simulated in a Monte-Carlo framework with 10,000 samples to validate the error-covariance analysis.