This work puts forth low-complexity Riemannian subspace descent algorithms for the minimization of functions over the symmetric positive definite (SPD) manifold. Different from the existing Riemannian gradient descent variants, the proposed approach utilizes carefully chosen subspaces that allow the update to be written as a product of the Cholesky factor of the iterate and a sparse matrix. The resulting updates avoid the costly matrix operations like matrix exponentiation and dense matrix multiplication, which are generally required in almost all other Riemannian optimization algorithms on SPD manifold. We further identify a broad class of functions, arising in diverse applications, such as kernel matrix learning, covariance estimation of Gaussian distributions, maximum likelihood parameter estimation of elliptically contoured distributions, and parameter estimation in Gaussian mixture model problems, over which the Riemannian gradients can be calculated efficiently. The proposed uni-directional and multi-directional Riemannian subspace descent variants incur per-iteration complexities of $\O(n)$ and $\O(n^2)$ respectively, as compared to the $\O(n^3)$ or higher complexity incurred by all existing Riemannian gradient descent variants. The superior runtime and low per-iteration complexity of the proposed algorithms is also demonstrated via numerical tests on large-scale covariance estimation and matrix square root problems.
Faithfully summarizing the knowledge encoded by a deep neural network (DNN) into a few symbolic primitive patterns without losing much information represents a core challenge in explainable AI. To this end, Ren et al. (2023c) have derived a series of theorems to prove that the inference score of a DNN can be explained as a small set of interactions between input variables. However, the lack of generalization power makes it still hard to consider such interactions as faithful primitive patterns encoded by the DNN. Therefore, given different DNNs trained for the same task, we develop a new method to extract interactions that are shared by these DNNs. Experiments show that the extracted interactions can better reflect common knowledge shared by different DNNs.
In this paper, we develop a class of high-order conservative methods for simulating non-equilibrium radiation diffusion problems. Numerically, this system poses significant challenges due to strong nonlinearity within the stiff source terms and the degeneracy of nonlinear diffusion terms. Explicit methods require impractically small time steps, while implicit methods, which offer stability, come with the challenge to guarantee the convergence of nonlinear iterative solvers. To overcome these challenges, we propose a predictor-corrector approach and design proper implicit-explicit time discretizations. In the predictor step, the system is reformulated into a nonconservative form and linear diffusion terms are introduced as a penalization to mitigate strong nonlinearities. We then employ a Picard iteration to secure convergence in handling the nonlinear aspects. The corrector step guarantees the conservation of total energy, which is vital for accurately simulating the speeds of propagating sharp fronts in this system. For spatial approximations, we utilize local discontinuous Galerkin finite element methods, coupled with positive-preserving and TVB limiters. We validate the orders of accuracy, conservation properties, and suitability of using large time steps for our proposed methods, through numerical experiments conducted on one- and two-dimensional spatial problems. In both homogeneous and heterogeneous non-equilibrium radiation diffusion problems, we attain a time stability condition comparable to that of a fully implicit time discretization. Such an approach is also applicable to many other reaction-diffusion systems.
The scheduling problem is a key class of optimization problems and has various kinds of applications both in practical and theoretical scenarios. In the scheduling problem, probabilistic analysis is a basic tool for investigating performance of scheduling algorithms, and therefore has been carried out by plenty amount of prior works. However, probabilistic analysis has several potential problems. For example, current research interest in the scheduling problem is limited to i.i.d. scenarios, due to its simplicity for analysis. This paper provides a new framework for probabilistic analysis in the scheduling problem and aims to deal with such problems. As a consequence, we obtain several theorems including a theoretical limit of the scheduling problem which can be applied to \emph{general, non-i.i.d. probability distributions}. Several information theoretic techniques, such as \emph{information-spectrum method}, turned out to be useful to prove our results. Since the scheduling problem has relations to many other research fields, our framework hopefully yields other interesting applications in the future.
This manuscript examines the problem of nonlinear stochastic fractional neutral integro-differential equations with weakly singular kernels. Our focus is on obtaining precise estimates to cover all possible cases of Abel-type singular kernels. Initially, we establish the existence, uniqueness, and continuous dependence on the initial value of the true solution, assuming a local Lipschitz condition and linear growth condition. Additionally, we develop the Euler-Maruyama method for the numerical solution of the equation and prove its strong convergence under the same conditions as the well-posedness. Moreover, we determine the accurate convergence rate of this method under global Lipschitz conditions and linear growth conditions. And also we have proven generalized Gronwall inequality with a multi-weakly singularity.
It is known that standard stochastic Galerkin methods encounter challenges when solving partial differential equations with high-dimensional random inputs, which are typically caused by the large number of stochastic basis functions required. It becomes crucial to properly choose effective basis functions, such that the dimension of the stochastic approximation space can be reduced. In this work, we focus on the stochastic Galerkin approximation associated with generalized polynomial chaos (gPC), and explore the gPC expansion based on the analysis of variance (ANOVA) decomposition. A concise form of the gPC expansion is presented for each component function of the ANOVA expansion, and an adaptive ANOVA procedure is proposed to construct the overall stochastic Galerkin system. Numerical results demonstrate the efficiency of our proposed adaptive ANOVA stochastic Galerkin method for both diffusion and Helmholtz problems.
We study the problem of parameter estimation for large exchangeable interacting particle systems when a sample of discrete observations from a single particle is known. We propose a novel method based on martingale estimating functions constructed by employing the eigenvalues and eigenfunctions of the generator of the mean field limit, where the law of the process is replaced by the (unique) invariant measure of the mean field dynamics. We then prove that our estimator is asymptotically unbiased and asymptotically normal when the number of observations and the number of particles tend to infinity, and we provide a rate of convergence towards the exact value of the parameters. Finally, we present several numerical experiments which show the accuracy of our estimator and corroborate our theoretical findings, even in the case the mean field dynamics exhibit more than one steady states.
A new sparse semiparametric model is proposed, which incorporates the influence of two functional random variables in a scalar response in a flexible and interpretable manner. One of the functional covariates is included through a single-index structure, while the other is included linearly through the high-dimensional vector formed by its discretised observations. For this model, two new algorithms are presented for selecting relevant variables in the linear part and estimating the model. Both procedures utilise the functional origin of linear covariates. Finite sample experiments demonstrated the scope of application of both algorithms: the first method is a fast algorithm that provides a solution (without loss in predictive ability) for the significant computational time required by standard variable selection methods for estimating this model, and the second algorithm completes the set of relevant linear covariates provided by the first, thus improving its predictive efficiency. Some asymptotic results theoretically support both procedures. A real data application demonstrated the applicability of the presented methodology from a predictive perspective in terms of the interpretability of outputs and low computational cost.
In this work, a Generalized Finite Difference (GFD) scheme is presented for effectively computing the numerical solution of a parabolic-elliptic system modelling a bacterial strain with density-suppressed motility. The GFD method is a meshless method known for its simplicity for solving non-linear boundary value problems over irregular geometries. The paper first introduces the basic elements of the GFD method, and then an explicit-implicit scheme is derived. The convergence of the method is proven under a bound for the time step, and an algorithm is provided for its computational implementation. Finally, some examples are considered comparing the results obtained with a regular mesh and an irregular cloud of points.
In arXiv:2305.03945 [math.NA], a first-order optimization algorithm has been introduced to solve time-implicit schemes of reaction-diffusion equations. In this research, we conduct theoretical studies on this first-order algorithm equipped with a quadratic regularization term. We provide sufficient conditions under which the proposed algorithm and its time-continuous limit converge exponentially fast to a desired time-implicit numerical solution. We show both theoretically and numerically that the convergence rate is independent of the grid size, which makes our method suitable for large-scale problems. The efficiency of our algorithm has been verified via a series of numerical examples conducted on various types of reaction-diffusion equations. The choice of optimal hyperparameters as well as comparisons with some classical root-finding algorithms are also discussed in the numerical section.
Asymptotic analysis for related inference problems often involves similar steps and proofs. These intermediate results could be shared across problems if each of them is made self-contained and easily identified. However, asymptotic analysis using Taylor expansions is limited for result borrowing because it is a step-to-step procedural approach. This article introduces EEsy, a modular system for estimating finite and infinitely dimensional parameters in related inference problems. It is based on the infinite-dimensional Z-estimation theorem, Donsker and Glivenko-Cantelli preservation theorems, and weight calibration techniques. This article identifies the systematic nature of these tools and consolidates them into one system containing several modules, which can be built, shared, and extended in a modular manner. This change to the structure of method development allows related methods to be developed in parallel and complex problems to be solved collaboratively, expediting the development of new analytical methods. This article considers four related inference problems -- estimating parameters with random sampling, two-phase sampling, auxiliary information incorporation, and model misspecification. We illustrate this modular approach by systematically developing 9 parameter estimators and 18 variance estimators for the four related inference problems regarding semi-parametric additive hazards models. Simulation studies show the obtained asymptotic results for these 27 estimators are valid. In the end, I describe how this system can simplify the use of empirical process theory, a powerful but challenging tool to be adopted by the broad community of methods developers. I discuss challenges and the extension of this system to other inference problems.