We study the approximation by a finite-volume scheme of a heat equation forced by a Lipschitz-continuous multiplicative noise in the sense of It\^o. More precisely, we consider a discretization that is semi-implicit in time and uses a two-point flux approximation (TPFA) scheme in space. We adapt the method based on Prokhorov's theorem to obtain a convergence-in-distribution result; Skorokhod's representation theorem then yields convergence of the scheme towards a martingale solution, and the Gy\"{o}ngy-Krylov argument is used to prove convergence in probability of the scheme towards the unique variational solution of our parabolic problem.
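For orientation, the TPFA flux referred to above takes the standard two-point form (stated here in its generic version; the paper's scheme additionally carries the semi-implicit time stepping and the stochastic forcing): for an interior face $\sigma = K|L$ between control volumes $K$ and $L$,
\[
F_{K,\sigma} = \frac{|\sigma|}{d_{K,L}}\,(u_K - u_L),
\]
where $|\sigma|$ is the measure of the face and $d_{K,L}$ is the distance between the cell centers of $K$ and $L$.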
Finite element methods are well known to admit robust optimal convergence on simplicial meshes satisfying the maximum angle condition. How to generalize this condition to polyhedra, however, has remained open in the literature. In this work, we argue that such a generalization is possible for virtual element methods (VEMs). In particular, we develop an anisotropic analysis framework for VEMs in which the virtual spaces and projection spaces remain abstract and can be problem-adapted, carrying forward the ``virtual'' spirit of VEMs. Three anisotropic cases are analyzed within this framework: (1) elements contain a non-shrinking inscribed ball but are not necessarily star-shaped with respect to it; (2) elements are cut arbitrarily from a background Cartesian mesh and may shrink to arbitrarily small size; (3) elements contain different materials, so that the virtual spaces involve discontinuous coefficients. The error estimates are guaranteed to be independent of the polyhedral element shapes. The present work substantially improves existing theoretical results in the literature and broadens the scope of application of VEMs.
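For reference, the maximum angle condition invoked above is the classical one of Babu\v{s}ka-Aziz (2D) and K\v{r}\'{i}\v{z}ek (3D): there exists $\theta^* < \pi$ such that every angle of every triangle (respectively, every dihedral and face angle of every tetrahedron) is bounded by $\theta^*$, uniformly over the mesh family.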
Tensor decomposition serves as a powerful primitive in statistics and machine learning. In this paper, we focus on using power iteration to decompose an overcomplete random tensor. Past work studying the properties of tensor power iteration either requires a non-trivial data-independent initialization or is restricted to the undercomplete regime. Moreover, several papers implicitly suggest that logarithmically many iterations (in terms of the input dimension) are sufficient for the power method to recover one of the tensor components. In this paper, we analyze the dynamics of tensor power iteration from random initialization in the overcomplete regime. Surprisingly, we show that polynomially many steps are necessary for convergence of tensor power iteration to any of the true components, which refutes the previous conjecture. On the other hand, our numerical experiments suggest that tensor power iteration successfully recovers tensor components for a broad range of parameters, even though it takes at least polynomially many steps to converge. To further complement our empirical evidence, we prove that a popular objective function for tensor decomposition is strictly increasing along the power iteration path. Our proof is based on the Gaussian conditioning technique, which has been applied to analyze the approximate message passing (AMP) algorithm. The major ingredient of our argument is a conditioning lemma that allows us to extend AMP-type analysis to the non-proportional limit and to polynomially many iterations of the power method.
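As a concrete reference point, a minimal sketch of the basic iteration under study, for a symmetric third-order tensor: repeatedly contract two modes with the current iterate and renormalize. This is a generic illustration checked on a trivial rank-1 tensor, not the overcomplete random setting analyzed in the paper.
\begin{verbatim}
import numpy as np

def tensor_power_iteration(T, num_iters=100, rng=None):
    # Power iteration x <- T(I, x, x) / ||T(I, x, x)|| for a symmetric
    # 3-tensor T of shape (d, d, d), started from a random unit vector.
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal(T.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(num_iters):
        y = np.einsum('ijk,j,k->i', T, x, x)  # contract two modes with x
        x = y / np.linalg.norm(y)
    return x

# Sanity check on a rank-1 tensor T = a (x) a (x) a: recovers a.
a = np.array([3.0, 4.0]) / 5.0
T = np.einsum('i,j,k->ijk', a, a, a)
print(tensor_power_iteration(T, num_iters=20))  # ~ [0.6, 0.8]
\end{verbatim}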
Zernike radial polynomials play a significant role in application areas such as optics design, imaging systems, and image processing systems. Currently, there are two kinds of numerical schemes for computing Zernike radial polynomials automatically with computer programs: one is based on the definition, in which the factorial operations may lead to overflow and the high-order derivatives are troublesome; the other is based on recursion, which is either unstable or of high computational complexity. In this paper, we focus on balanced binary tree (BBT) schemes for computing Zernike radial polynomials: first, we establish an elegant formula for the computation; second, we propose recursive and iterative algorithms based on the BBT; third, we rigorously analyze the computational complexity of the algorithms; finally, we verify and validate the performance of the BBT schemes by measuring running times. Theoretical analysis shows that the computational complexities of the BBT recursive and iterative algorithms are exponential and quadratic, respectively, which agrees well with the running-time tests. Experiments show that, across different computing platforms, the BBT iterative algorithm (BBTIA) takes about $1\sim 10$ microseconds, which makes it stable and efficient for real-time applications.
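For context, a minimal sketch of the definition-based scheme criticized above (not the BBT algorithm proposed in the paper): the radial polynomial is evaluated directly from its factorial sum, which exhibits exactly the overflow risk mentioned, since the factorials grow combinatorially in $n$ (Python's arbitrary-precision integers mask this, but fixed-width arithmetic overflows quickly).
\begin{verbatim}
from math import factorial

def zernike_radial(n, m, rho):
    # Definition-based evaluation of the Zernike radial polynomial
    # R_n^m(rho); the explicit factorials are the source of the
    # overflow problem in fixed-width arithmetic.
    m = abs(m)
    if (n - m) % 2 != 0:
        return 0.0  # R_n^m vanishes when n - m is odd
    total = 0.0
    for k in range((n - m) // 2 + 1):
        coef = ((-1) ** k * factorial(n - k)
                / (factorial(k)
                   * factorial((n + m) // 2 - k)
                   * factorial((n - m) // 2 - k)))
        total += coef * rho ** (n - 2 * k)
    return total

print(zernike_radial(4, 2, 0.5))  # R_4^2(rho) = 4 rho^4 - 3 rho^2 = -0.5
\end{verbatim}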
We present a novel mathematical framework for computing the number of maintenance cycles in a system with critical and non-critical components, where "critical" (CR) means that the component's failure is fatal for the system's operation and renders any further repairs inapplicable, whereas "non-critical" (NC) means that the component can undergo corrective maintenance (replacement or minimal repair) whenever it fails, provided that the CR component is still in operation. Whenever the NC component fails, the CR component can optionally be preventively replaced. We extend traditional renewal theory (classical or generalized) to various maintenance scenarios for a system composed of one CR and one NC component, in order to compute the average number of renewals of the NC component under the restriction ("bound") imposed by the CR component. We also develop closed-form approximations for the proposed "bounded" renewal functions. We validate our formulas by simulations on a variety of component lifetime distributions, including actual lifetime distributions of wind turbine components.
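For reference, the classical renewal function that the "bounded" variants generalize is the expected number of renewals $M(t) = \mathbb{E}[N(t)]$ in $[0,t]$, which satisfies the standard renewal equation
\[
M(t) = F(t) + \int_0^t M(t - s)\, \mathrm{d}F(s),
\]
with $F$ the lifetime distribution of the NC component (stated here in its unbounded form; the "bounded" renewal functions additionally account for the survival of the CR component).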
We present a novel implicit scheme for the numerical solution of time-dependent conservation laws. The core idea is to exploit and approximate the mixed space-time derivative of the solution that arises naturally when deriving some second-order accurate schemes in time. Such an approach is introduced in the context of the Lax-Wendroff (or Cauchy-Kowalevski) procedure when the second time derivative is not completely replaced by space derivatives using the PDE, but the mixed derivative is kept. If approximated in a suitable way, the resulting compact implicit scheme produces algebraic systems with a more convenient structure than those arising from fully implicit schemes. We derive a high-resolution TVD form of the implicit scheme for some representative hyperbolic equations in one dimension, and we include illustrative numerical experiments.
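To make the origin of the mixed derivative explicit (a standard computation, independent of the particular discretization): for $u_t + f(u)_x = 0$, a second-order Taylor expansion in time gives
\[
u(x, t + \Delta t) = u + \Delta t\, u_t + \tfrac{\Delta t^2}{2}\, u_{tt} + \mathcal{O}(\Delta t^3),
\qquad
u_{tt} = -\partial_x \partial_t f(u) = -\partial_x\big(f'(u)\, u_t\big).
\]
The classical Lax-Wendroff procedure substitutes $u_t = -f(u)_x$ once more to obtain pure space derivatives, $u_{tt} = \partial_x\big(f'(u)\, f(u)_x\big)$, whereas the approach above keeps and approximates the mixed space-time derivative $\partial_x \partial_t f(u)$.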
Multiplicative error models (MEMs) are commonly used for real-valued time series, but they cannot be applied to discrete-valued count time series, as the multiplication involved would not preserve the integer nature of the data. We propose the concept of a multiplicative operator for counts, together with several specific instances thereof, which we then use to develop MEMs for count time series (CMEMs). If equipped with a linear conditional mean, the resulting CMEMs are closely related to the class of so-called integer-valued generalized autoregressive conditional heteroskedasticity (INGARCH) models and can be used as a semi-parametric extension thereof. We derive important stochastic properties of different types of INGARCH-CMEMs as well as relevant estimation approaches, namely variants of quasi-maximum likelihood and weighted least squares estimation. Their performance and application are demonstrated with simulations as well as two real-world data examples.
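For orientation, the linear conditional mean linking CMEMs to INGARCH models is, in the first-order case (the standard INGARCH(1,1) form; the paper treats more general specifications),
\[
\lambda_t = \mathbb{E}[X_t \mid \mathcal{F}_{t-1}] = \omega + \alpha X_{t-1} + \beta \lambda_{t-1},
\qquad \omega > 0,\ \alpha, \beta \ge 0,
\]
with the CMEM generating the count $X_t$ from $\lambda_t$ through a multiplicative operator for counts rather than an ordinary product.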
This paper makes three contributions. First, it generalizes the Lindeberg\textendash Feller and Lyapunov central limit theorems to Hilbert spaces by way of $L^2$. Second, it generalizes these results to spaces in which sample failure and missingness can occur. Finally, it shows that satisfaction of the Lindeberg\textendash Feller condition in such spaces guarantees the consistency of all inferences from the partial functional data with respect to the completely observed data. These latter two results are especially important given the increasing attention to statistical inference with partially observed functional data. This paper goes beyond previous research by providing simple boundedness conditions which guarantee that \textit{all} inferences, as opposed to some proper subset of them, will be consistently estimated. This is shown primarily by aggregating conditional expectations with respect to the space of missingness patterns. This paper appears to be the first to apply this technique.
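For reference, the Lindeberg\textendash Feller condition being generalized reads, in its standard form for a triangular array $\{X_{nk}\}$ of independent, mean-zero random elements with $s_n^2 = \sum_k \mathbb{E}\|X_{nk}\|^2$,
\[
\frac{1}{s_n^2} \sum_k \mathbb{E}\big[\|X_{nk}\|^2 \,\mathbf{1}\{\|X_{nk}\| > \varepsilon s_n\}\big] \longrightarrow 0
\quad \text{for every } \varepsilon > 0,
\]
with $\|\cdot\|$ the Hilbert-space norm; the paper's contribution is to carry this condition over to settings with sample failure and missingness.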
We investigate the problem of recovering a partially observed high-rank matrix whose columns obey a nonlinear structure, such as lying in a union of subspaces, on an algebraic variety, or in clusters. The recovery problem is formulated as the rank minimization of a nonlinear feature map applied to the original matrix, which is then further approximated by a constrained non-convex optimization problem involving the Grassmann manifold. We propose two sets of algorithms, one arising from Riemannian optimization and the other as an alternating minimization scheme, both of which include first- and second-order variants. Both sets of algorithms have theoretical guarantees. In particular, for the alternating minimization, we establish global convergence and worst-case complexity bounds. Additionally, using the Kurdyka-\L{}ojasiewicz property, we show that the alternating minimization converges to a unique limit point. We provide extensive numerical results for the recovery of unions of subspaces and clustering under entry sampling and dense Gaussian sampling. Our methods are competitive with existing approaches, and, in particular, high accuracy is achieved in the recovery using Riemannian second-order methods.
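Schematically (our paraphrase of the formulation described above, not the paper's exact program), the recovery problem reads
\[
\min_{X} \operatorname{rank}\big(\Phi(X)\big)
\quad \text{subject to} \quad X_{ij} = M_{ij} \ \ \text{for } (i,j) \in \Omega,
\]
where $\Phi$ is the nonlinear feature map applied columnwise and $\Omega$ indexes the observed entries; this is then relaxed to a constrained non-convex problem over the Grassmann manifold.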
Sampling methods (e.g., node-wise, layer-wise, or subgraph sampling) have become an indispensable strategy for speeding up the training of large-scale Graph Neural Networks (GNNs). However, existing sampling methods are mostly based on graph structural information and ignore the dynamics of optimization, which leads to high variance in estimating the stochastic gradients. The high-variance issue can be very pronounced in extremely large graphs, where it results in slow convergence and poor generalization. In this paper, we theoretically analyze the variance of sampling methods and show that, due to the composite structure of the empirical risk, the variance of any sampling method can be decomposed into \textit{embedding approximation variance} in the forward stage and \textit{stochastic gradient variance} in the backward stage, and that mitigating both types of variance is necessary to obtain a faster convergence rate. We propose a decoupled variance reduction strategy that employs (approximate) gradient information to adaptively sample nodes with minimal variance and explicitly reduces the variance introduced by the embedding approximation. We show theoretically and empirically that the proposed method, even with smaller mini-batch sizes, enjoys a faster convergence rate and better generalization than existing methods.
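Schematically (an illustrative bound, not the paper's precise statement): writing $\nabla \mathcal{L}$ for the full-batch gradient, $g$ for the stochastic gradient computed with exact embeddings, and $\tilde g$ for the gradient actually computed from sampled (approximate) embeddings, the mean-squared error splits as
\[
\mathbb{E}\|\tilde g - \nabla \mathcal{L}\|^2
\;\le\; 2\,\mathbb{E}\|\tilde g - g\|^2 \;+\; 2\,\mathbb{E}\|g - \nabla \mathcal{L}\|^2,
\]
the first term reflecting the embedding approximation variance (forward stage) and the second the stochastic gradient variance (backward stage); the decoupled strategy targets the two terms separately.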
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
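A minimal sketch of the resulting re-weighting (the normalization convention here is ours; the paper may scale the weights differently):
\begin{verbatim}
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    # Effective number of samples per class: E_n = (1 - beta^n) / (1 - beta).
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    # Weight each class by 1 / E_n, normalized to sum to the class count.
    weights = 1.0 / effective_num
    return weights * len(n) / weights.sum()

# Long-tailed toy example: head class with 10000 samples, tail with 10.
print(class_balanced_weights([10000, 1000, 100, 10]))
\end{verbatim}
These per-class weights multiply the corresponding loss terms, yielding the class-balanced loss.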