We consider the problem of parameter estimation for a stochastic McKean-Vlasov equation, and the associated system of weakly interacting particles. We first establish consistency and asymptotic normality of the offline maximum likelihood estimator for the interacting particle system in the limit as the number of particles $N\rightarrow\infty$. We then propose an online estimator for the parameters of the McKean-Vlasov SDE, which evolves according to a continuous-time stochastic gradient descent algorithm on the asymptotic log-likelihood of the interacting particle system. We prove that this estimator converges in $\mathbb{L}^1$ to the stationary points of the asymptotic log-likelihood of the McKean-Vlasov SDE in the joint limit as $N\rightarrow\infty$ and $t\rightarrow\infty$, under suitable assumptions which guarantee ergodicity and uniform-in-time propagation of chaos. We then demonstrate, under the additional assumption of global strong concavity, that our estimator converges in $\mathbb{L}^2$ to the unique maximiser of this asymptotic log-likelihood function, and establish an $\mathbb{L}^2$ convergence rate. We also obtain analogous results under the assumption that, rather than observing multiple trajectories of the interacting particle system, we instead observe multiple independent replicates of the McKean-Vlasov SDE itself or, less realistically, a single sample path of the McKean-Vlasov SDE and its law. Our theoretical results are demonstrated via two numerical examples: a linear mean-field model and a stochastic opinion dynamics model.
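The continuous-time stochastic gradient descent idea above can be illustrated with a minimal sketch: simulate a linear mean-field interacting particle system via Euler-Maruyama and update a parameter estimate online using the Girsanov log-likelihood gradient $\nabla_\theta b\,(dX - b\,dt)/\sigma^2$. The model, parameter values, and step-size schedule below are hypothetical choices for illustration, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

N, sigma, dt, T = 200, 1.0, 1e-3, 50.0
theta_true = np.array([0.5, 0.3])   # hypothetical (theta1, theta2)
theta = np.array([0.0, 0.0])        # online estimate, started at zero

def drift(x, th):
    # linear mean-field drift: -theta1 * x - theta2 * (x - mean(x))
    return -th[0] * x - th[1] * (x - x.mean())

x = rng.normal(size=N)
for k in range(int(T / dt)):
    grad_b = np.stack([-x, -(x - x.mean())])     # d(drift)/d(theta), shape (2, N)
    dW = rng.normal(scale=np.sqrt(dt), size=N)
    dx = drift(x, theta_true) * dt + sigma * dW  # observed particle increments
    lr = 1.0 / (1.0 + 0.01 * k * dt)             # slowly decaying step size
    # stochastic gradient ascent on the particle-system log-likelihood
    theta = theta + lr * (grad_b @ (dx - drift(x, theta) * dt)) / (sigma**2 * N)
    x = x + dx
```

Note that in this linear model the two drift terms are nearly collinear when the empirical mean is close to zero, so the sum $\theta_1+\theta_2$ is recovered much more accurately than the individual parameters; this reflects the role the measure dependence plays in identifiability.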
The additive hazards model specifies the effect of covariates on the hazard in an additive way, in contrast to the popular Cox model, in which it is multiplicative. As a non-parametric model, it offers a very flexible way of modeling time-varying covariate effects. It is most commonly estimated by ordinary least squares. In this paper we consider the case where covariates are bounded, and derive the maximum likelihood estimator under the constraint that the hazard is non-negative for all covariate values in their domain. We describe an efficient algorithm to find the maximum likelihood estimator. The method is contrasted with the ordinary least squares approach in a simulation study, and is illustrated on a realistic data set.
A joint sparse-regression-code (SPARC) and low-density-parity-check (LDPC) coding scheme for multiple-input multiple-output (MIMO) massive unsourced random access (URA) is proposed in this paper. In contrast to the state-of-the-art covariance-based maximum likelihood (CB-ML) detection scheme, we first split each user's message into two parts. The first part is encoded by SPARCs and used to recover part of the message, the corresponding channel coefficients, and the interleaving patterns by compressed sensing. The second part is encoded by LDPC codes and then interleaved according to the interleave-division multiple access (IDMA) scheme. Decoding of the second part is based on belief propagation (BP) combined with successive interference cancellation (SIC). Numerical results show that our scheme outperforms the CB-ML scheme when the number of antennas at the base station is smaller than the number of active users. The complexity of our scheme is of order $\mathcal{O}\left(2^{B_p}ML+\widehat{K}ML\right)$, lower than that of the CB-ML scheme. Moreover, our scheme achieves higher spectral efficiency (nearly $15$ times larger) than CB-ML, as we split messages into only two parts.
This study develops an asymptotic theory for estimating the time-varying characteristics of locally stationary functional time series. We investigate a kernel-based method to estimate the time-varying covariance operator and the time-varying mean function of a locally stationary functional time series. In particular, we derive the convergence rate of the kernel estimator of the covariance operator and the associated eigenvalues and eigenfunctions, and establish a central limit theorem for the kernel-based locally weighted sample mean. As applications of our results, we discuss the prediction of locally stationary functional time series and methods for testing the equality of time-varying mean functions in two functional samples.
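The kernel-based locally weighted sample mean described above can be sketched in a few lines: curves observed at rescaled times $t/T$ are averaged with kernel weights centered at a target time $t_0$. The data-generating process, kernel, and bandwidth below are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

T, m = 500, 50                 # number of curves, grid points per curve
u = np.linspace(0, 1, m)       # within-curve evaluation grid
t = np.arange(T) / T           # rescaled observation times in [0, 1]

# hypothetical locally stationary sample: slowly varying mean plus noise
mu = lambda s: np.sin(2 * np.pi * s)[:, None] * u[None, :]
X = mu(t) + 0.2 * rng.normal(size=(T, m))

def local_mean(t0, h):
    """Kernel-weighted sample mean at rescaled time t0 (Epanechnikov kernel)."""
    z = (t - t0) / h
    w = np.where(np.abs(z) <= 1, 0.75 * (1 - z**2), 0.0)
    return (w[:, None] * X).sum(axis=0) / w.sum()

est = local_mean(0.25, h=0.1)  # estimate of the mean function at t0 = 0.25
```

The bandwidth $h$ governs the usual bias-variance trade-off: a larger window borrows strength from more curves but smooths over the time variation of the mean function.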
We consider a moving boundary problem with kinetic condition that describes the diffusion of solvent into rubber, and study semi-discrete finite element approximations of the corresponding weak solutions. We report on both a priori and a posteriori error estimates for the mass concentration of the diffusants and, respectively, for the a priori unknown position of the moving boundary. Our working techniques include integral and energy-based estimates for a nonlinear parabolic problem posed in a transformed fixed domain, combined with a suitable use of the interpolation-trace inequality to handle the interface terms. Numerical illustrations of our FEM approximations are within the experimental range and show good agreement with our theoretical investigation. This work is a preliminary investigation, necessary before extending the current moving boundary model to account explicitly for the mechanics of hyperelastic rods and thereby capture a directional swelling of the underlying elastomer.
A new clustering accuracy measure is proposed to determine the unknown number of clusters and to assess the quality of a clustering of a data set given in any dimensional space. Our validity index applies the classical nonparametric univariate kernel density estimation method to the interpoint distances computed between the members of the data set. Being based on interpoint distances, it is free of the curse of dimensionality and therefore efficiently computable in high-dimensional situations where the number of study variables can be larger than the sample size. The proposed measure is compatible with any clustering algorithm and with every kind of data set for which an interpoint distance measure with a density function can be defined. A simulation study demonstrates its superiority over widely used cluster validity indices such as the average silhouette width and the Dunn index, while its applicability is shown on a high-dimensional biostatistical study of the Alon data set and a large astrostatistical application to time series of light curves of new variable stars.
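The key ingredient above, a univariate kernel density estimate built from interpoint distances, can be sketched as follows. The separation score comparing within-cluster and between-cluster distance densities is an illustrative stand-in, not the paper's actual validity index, and the two-blob data set is hypothetical.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)

# hypothetical data: two well-separated Gaussian blobs in 100 dimensions
X = np.vstack([rng.normal(0, 1, (40, 100)), rng.normal(4, 1, (40, 100))])
labels = np.array([0] * 40 + [1] * 40)

def pairwise_dists(A, B=None):
    if B is None:  # all distinct pairs within one set
        return np.array([np.linalg.norm(A[i] - A[j])
                         for i, j in combinations(range(len(A)), 2)])
    return np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2).ravel()

def kde(samples, grid, h):
    """Univariate Gaussian kernel density estimate evaluated on `grid`."""
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

within = np.concatenate([pairwise_dists(X[labels == c]) for c in (0, 1)])
between = pairwise_dists(X[labels == 0], X[labels == 1])
grid = np.linspace(0.0, max(within.max(), between.max()), 200)
# illustrative separation score: overlap of the two distance densities
dg = grid[1] - grid[0]
overlap = (np.minimum(kde(within, grid, 0.5), kde(between, grid, 0.5)) * dg).sum()
```

Note that the densities live on the one-dimensional distance scale regardless of the ambient dimension, which is what sidesteps the curse of dimensionality.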
The aim of this thesis is to develop a theoretical framework for studying parameter estimation of quantum channels. We study the task of estimating unknown parameters encoded in a channel in the sequential setting. A sequential strategy is the most general way to use a channel multiple times. Our goal is to establish lower bounds (called Cramér-Rao bounds) on the estimation error. The bounds we develop are universally applicable; i.e., they apply to all permissible quantum dynamics. We consider the use of catalysts to enhance the power of a channel estimation strategy; this is termed amortization. The power of a channel for parameter estimation is determined by its Fisher information. Thus, we study how much a catalyst quantum state can enhance the Fisher information of a channel by defining the amortized Fisher information. We establish our bounds by proving that, for certain Fisher information quantities, catalyst states do not improve the performance of a sequential estimation protocol compared to a parallel one. The technical term for this is an amortization collapse. We use this to establish bounds when estimating one parameter, or multiple parameters simultaneously. Our bounds apply universally, and we also cast them as optimization problems. For the single-parameter case, we establish bounds for general quantum channels using both the symmetric logarithmic derivative (SLD) Fisher information and the right logarithmic derivative (RLD) Fisher information. The task of estimating multiple parameters simultaneously is more involved than the single-parameter case, because the Cramér-Rao bounds take the form of matrix inequalities. We establish a scalar Cramér-Rao bound for multiparameter channel estimation using the RLD Fisher information. For both single- and multiparameter estimation, we provide a no-go condition for the so-called Heisenberg scaling using our RLD-based bound.
Stochastic majorization-minimization (SMM) is an online extension of the classical principle of majorization-minimization, which consists of sampling i.i.d. data points from a fixed data distribution and minimizing a recursively defined majorizing surrogate of an objective function. In this paper, we introduce stochastic block majorization-minimization, where the surrogates can now be only block multi-convex and a single block is optimized at a time within a diminishing radius. Relaxing the standard strong convexity requirements for surrogates in SMM, our framework gives wider applicability, including online CANDECOMP/PARAFAC (CP) dictionary learning, and yields greater computational efficiency, especially when the problem dimension is large. We provide an extensive convergence analysis of the proposed algorithm, which we derive under possibly dependent data streams, relaxing the standard i.i.d. assumption on data samples. We show that the proposed algorithm converges almost surely to the set of stationary points of a nonconvex objective under constraints at a rate $O((\log n)^{1+\epsilon}/n^{1/2})$ for the empirical loss function and $O((\log n)^{1+\epsilon}/n^{1/4})$ for the expected loss function, where $n$ denotes the number of data samples processed. Under an additional assumption, the latter convergence rate can be improved to $O((\log n)^{1+\epsilon}/n^{1/2})$. Our results provide the first convergence rate bounds for various online matrix and tensor decomposition algorithms under a general Markovian data setting.
The aim of this paper is to study the recovery of a spatially dependent potential in a (sub)diffusion equation from overposed final time data. We construct a monotone operator one of whose fixed points is the unknown potential. The uniqueness of the identification is theoretically verified by using the monotonicity of the operator and a fixed point argument. Moreover, we show a conditional stability in Hilbert spaces under some suitable conditions on the problem data. Next, a fully discrete scheme is developed, using the Galerkin finite element method in space and a finite difference method in time, and a fixed point iteration is applied to reconstruct the potential. We prove the linear convergence of the iterative algorithm by the contraction mapping theorem, and present a thorough error analysis for the reconstructed potential. Our derived \textsl{a priori} error estimate provides a guideline for choosing the discretization parameters according to the noise level. The analysis relies heavily on some suitable nonstandard error estimates for the direct problem, as well as the aforementioned conditional stability. Numerical experiments are provided to illustrate and complement our theoretical analysis.
We investigate the quality of space approximation of a class of stochastic integral equations of convolution type with Gaussian noise. Such equations arise, for example, when considering mild solutions of stochastic fractional order partial differential equations, but also when considering mild solutions of classical stochastic partial differential equations. The key requirement for the equations is a smoothing property of the deterministic evolution operator, which is typical in parabolic type problems. We show that if one has access to nonsmooth data estimates for the deterministic error operator of a space discretization procedure, together with its derivative, then one obtains error estimates in pathwise H\"older norms with rates that can be read off the deterministic error rates. We illustrate the main result by considering a class of stochastic fractional order partial differential equations and space approximations performed by spectral Galerkin methods and finite elements. We also improve an existing result on the stochastic heat equation.
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a black-box differential equation solver. These continuous-depth models have constant memory cost, adapt their evaluation strategy to each input, and can explicitly trade numerical precision for speed. We demonstrate these properties in continuous-depth residual networks and continuous-time latent variable models. We also construct continuous normalizing flows, a generative model that can be trained by maximum likelihood without partitioning or ordering the data dimensions. For training, we show how to scalably backpropagate through any ODE solver, without access to its internal operations. This allows end-to-end training of ODEs within larger models.
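The forward pass of such a continuous-depth model can be sketched very compactly: a small network defines the hidden-state derivative $dh/dt = f(h, t)$, and the output is obtained by integrating it. The sketch below uses randomly initialized weights and a fixed-step Euler solver for clarity; actual implementations use adaptive black-box solvers and the adjoint method for backpropagation, which this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(3)

# tiny MLP f(h, t) giving the hidden-state time derivative dh/dt;
# weights are random placeholders standing in for trained parameters
W1, b1 = rng.normal(0, 0.5, (16, 3)), np.zeros(16)   # input: (h1, h2, t)
W2, b2 = rng.normal(0, 0.5, (2, 16)), np.zeros(2)

def f(h, t):
    z = np.tanh(W1 @ np.append(h, t) + b1)
    return W2 @ z + b2

def odeint_euler(h0, t0=0.0, t1=1.0, steps=100):
    """Fixed-step Euler integration of dh/dt = f(h, t) from t0 to t1."""
    h, dt = np.array(h0, dtype=float), (t1 - t0) / steps
    for k in range(steps):
        h = h + dt * f(h, t0 + k * dt)
    return h

out = odeint_euler([1.0, -1.0])   # "depth" is now the integration interval
```

Halving the step count (or, with an adaptive solver, loosening the tolerance) trades numerical precision for speed without changing the model, which is the property highlighted above.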