The paper studies asymptotic properties of parameter estimators for multidimensional stochastic differential equations driven by Brownian motions from high-frequency discrete data. Consistency and central limit properties of a class of estimators of the diffusion parameter and an approximate maximum likelihood estimator of the drift parameter based on a discretized likelihood function have been established in a suitable scaling regime involving the time-gap between the observations and the overall time span. Our framework is more general than that typically considered in the literature and, thus, has the potential to be applicable to a wider range of stochastic models.
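To make the two kinds of estimators concrete, the following is a minimal sketch under stated assumptions, not the paper's general framework. It assumes a one-dimensional Ornstein-Uhlenbeck model dX = -theta X dt + sigma dW, simulated with an Euler-Maruyama scheme. The diffusion parameter is estimated from the realized quadratic variation, and the drift parameter from the maximizer of the Euler-discretized Gaussian likelihood. The model and the names `simulate_ou`, `estimate_sigma2`, and `estimate_theta` are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_ou(theta, sigma, x0, dt, n, seed=0):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dW (illustrative model)."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(n):
        x = xs[-1]
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        xs.append(x - theta * x * dt + noise)
    return xs


def estimate_sigma2(xs, dt):
    """Diffusion estimate: realized quadratic variation divided by the time span."""
    n = len(xs) - 1
    qv = sum((xs[i + 1] - xs[i]) ** 2 for i in range(n))
    return qv / (n * dt)


def estimate_theta(xs, dt):
    """Drift estimate: maximizer of the Euler-discretized Gaussian likelihood.

    For the drift -theta * x the discretized log-likelihood is quadratic in
    theta, so the maximizer has a closed form that does not involve sigma.
    """
    n = len(xs) - 1
    num = sum(xs[i] * (xs[i + 1] - xs[i]) for i in range(n))
    den = dt * sum(xs[i] ** 2 for i in range(n))
    return -num / den


if __name__ == "__main__":
    dt, n = 0.01, 200_000  # small time gap, long overall time span
    path = simulate_ou(theta=1.0, sigma=0.5, x0=0.0, dt=dt, n=n, seed=1)
    print("sigma^2 estimate:", estimate_sigma2(path, dt))
    print("theta estimate:", estimate_theta(path, dt))
```

In this toy setting the diffusion estimate sharpens as the time gap shrinks, while the drift estimate needs the overall time span to grow, which is the kind of scaling regime the abstract refers to.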
In this paper, a highly parallel and derivative-free martingale neural network learning method is proposed to solve Hamilton-Jacobi-Bellman (HJB) equations arising from stochastic optimal control problems (SOCPs), as well as general quasilinear parabolic partial differential equations (PDEs). In both cases, the PDEs are reformulated into a martingale formulation such that loss functions will not require the computation of the gradient or Hessian matrix of the PDE solution, while its implementation can be parallelized in both time and spatial domains. Moreover, the martingale conditions for the PDEs are enforced using a Galerkin method in conjunction with adversarial learning techniques, eliminating the need for direct computation of the conditional expectations associated with the martingale property. For SOCPs, a derivative-free implementation of the maximum principle for optimal controls is also introduced. The numerical results demonstrate the effectiveness and efficiency of the proposed method, which is capable of solving HJB and quasilinear parabolic PDEs accurately in dimensions as high as 10,000.
This work studies the parameter-dependent diffusion equation in a two-dimensional domain consisting of locally mirror symmetric layers. It is assumed that the diffusion coefficient is a constant in each layer. The goal is to find approximate parameter-to-solution maps that have a small number of terms. It is shown that in the case of two layers one can find a solution formula consisting of three terms with explicit dependencies on the diffusion coefficient. The formula is based on decomposing the solution into orthogonal parts related to both of the layers and the interface between them. This formula is then expanded to an approximate one for the multi-layer case. We give an analytical formula for square layers and use the finite element formulation for more general layers. The results are illustrated with numerical examples and have applications for reduced basis methods by analyzing the Kolmogorov n-width.
The regularity of solutions to the stochastic nonlinear wave equation plays a critical role in the accuracy and efficiency of numerical algorithms. Rough or discontinuous initial conditions pose significant challenges, often leading to a loss of accuracy and reduced computational efficiency in existing methods. In this study, we address these challenges by developing a novel and efficient numerical algorithm specifically designed for computing rough solutions of the stochastic nonlinear wave equation, while significantly relaxing the regularity requirements on the initial data. By leveraging the intrinsic structure of the stochastic nonlinear wave equation and employing advanced tools from harmonic analysis, we construct a time discretization method that achieves robust convergence for initial values \((u^{0}, v^{0}) \in H^{\gamma} \times H^{\gamma-1}\) for all \(\gamma > 0\). Notably, our method attains an improved error rate of \(O(\tau^{2\gamma-})\) in one and two dimensions for \(\gamma \in (0, \frac{1}{2}]\), and \(O(\tau^{\max(\gamma, 2\gamma - \frac{1}{2}-)})\) in three dimensions for \(\gamma \in (0, \frac{3}{4}]\), where \(\tau\) denotes the time step size. These convergence rates surpass those of existing numerical methods under the same regularity conditions, underscoring the advantage of our approach. To validate the performance of our method, we present extensive numerical experiments that demonstrate its superior accuracy and computational efficiency compared to state-of-the-art methods. These results highlight the potential of our approach to enable accurate and efficient simulations of stochastic wave phenomena even in the presence of challenging initial conditions.
In this manuscript we present the tensor-train reduced basis method, a novel projection-based reduced-order model for the efficient solution of parameterized partial differential equations. Despite their popularity and considerable computational advantages with respect to their full order counterparts, reduced-order models are typically characterized by a considerable offline computational cost. The proposed approach addresses this issue by efficiently representing high dimensional finite element quantities with the tensor train format. This method entails numerous benefits, namely, the smaller number of operations required to compute the reduced subspaces, the cheaper hyper-reduction strategy employed to reduce the complexity of the PDE residual and Jacobian, and the decreased dimensionality of the projection subspaces for a fixed accuracy. We provide a posteriori estimates that demonstrate the accuracy of the proposed method, and we test its computational performance for the heat equation and transient linear elasticity on three-dimensional Cartesian geometries.
We present a novel class of projected gradient (PG) methods for minimizing a smooth but not necessarily convex function over a convex compact set. We first provide a novel analysis of the "vanilla" PG method, achieving the best-known iteration complexity for finding an approximate stationary point of the problem. We then develop an "auto-conditioned" projected gradient (AC-PG) variant that achieves the same iteration complexity without requiring the input of the Lipschitz constant of the gradient or any line search procedure. The key idea is to estimate the Lipschitz constant using first-order information gathered from the previous iterations, and to show that the error caused by underestimating the Lipschitz constant can be properly controlled. We then generalize the PG methods to the stochastic setting, by proposing a stochastic projected gradient (SPG) method and a variance-reduced stochastic gradient (VR-SPG) method, achieving new complexity bounds in different oracle settings. We also present auto-conditioned stepsize policies for both stochastic PG methods and establish comparable convergence guarantees.
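The idea of replacing a known Lipschitz constant with an estimate built from earlier iterates can be sketched as follows. This is a simplified illustration and not the AC-PG stepsize policy from the paper: it assumes a box constraint, uses the ratio of gradient change to iterate change as a local Lipschitz estimate, and keeps the running maximum of those ratios. The names `project_box` and `pg_local_lipschitz`, and the starting guess `L0`, are hypothetical.

```python
import math


def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^d."""
    return [min(max(xi, lo), hi) for xi in x]


def pg_local_lipschitz(grad, x0, lo, hi, L0=1.0, max_iter=1000, tol=1e-10):
    """Projected gradient with a Lipschitz estimate taken from past iterates.

    Assumption for illustration: L is updated as the running maximum of
    ||g_new - g|| / ||x_new - x||. No line search is performed and no true
    Lipschitz constant is supplied by the caller.
    """
    x = project_box(x0, lo, hi)
    g = grad(x)
    L = L0
    for _ in range(max_iter):
        x_new = project_box([xi - gi / L for xi, gi in zip(x, g)], lo, hi)
        step = math.sqrt(sum((a - b) ** 2 for a, b in zip(x_new, x)))
        if step < tol:  # projected-gradient step is negligible: stop
            return x_new
        g_new = grad(x_new)
        g_change = math.sqrt(sum((a - b) ** 2 for a, b in zip(g_new, g)))
        L = max(L, g_change / step)  # first-order information only
        x, g = x_new, g_new
    return x


if __name__ == "__main__":
    # Separable quadratic whose unconstrained minimizer (2.0, 0.3) lies
    # outside the box [0, 1]^2 in the first coordinate.
    def grad(x):
        return [1.0 * (x[0] - 2.0), 10.0 * (x[1] - 0.3)]

    print(pg_local_lipschitz(grad, [0.0, 0.0], 0.0, 1.0, L0=0.1))
```

Starting from a deliberately small `L0` shows the point made in the abstract: early steps taken with an underestimated constant may overshoot, and the method has to absorb that error as the estimate improves.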
The notion of a non-deterministic logical matrix (where connectives are interpreted as multi-functions) extends the traditional semantics for propositional logics based on logical matrices (where connectives are interpreted as functions). This extension allows for finitely characterizing a much wider class of logics, and has proven decisive in a myriad of recent compositionality results. In this paper we show that the added expressivity brought by non-determinism also has its drawbacks, and in particular that the problem of determining whether two given finite non-deterministic matrices are equivalent, in the sense that they induce the same logic, becomes undecidable. We also discuss some workable sufficient conditions and particular cases, namely regarding rexpansion homomorphisms and bridges to calculi.
Gradient Descent (GD) and Conjugate Gradient (CG) methods are among the most effective iterative algorithms for solving unconstrained optimization problems, particularly in machine learning and statistical modeling, where they are employed to minimize cost functions. In these algorithms, tunable parameters, such as step sizes or conjugate parameters, play a crucial role in determining key performance metrics, like runtime and solution quality. In this work, we introduce a framework that models algorithm selection as a statistical learning problem, so that the learning complexity can be estimated by the pseudo-dimension of the algorithm group. We first propose a new cost measure for unconstrained optimization algorithms, inspired by the concept of primal-dual integral in mixed-integer linear programming. Based on the new cost measure, we derive an improved upper bound for the pseudo-dimension of the gradient descent algorithm group by discretizing the set of step size configurations. Moreover, we generalize our findings from the gradient descent algorithm group to the conjugate gradient algorithm group for the first time, and prove the existence of a learning algorithm capable of probabilistically identifying the optimal algorithm with a sufficiently large sample size.
We prove, for stably computably enumerable formal systems, direct analogues of the first and second incompleteness theorems of G\"odel. A typical stably computably enumerable set is the set of Diophantine equations with no integer solutions, and in particular such sets are generally not computably enumerable. This gives the first extension of the second incompleteness theorem to non-classically computable formal systems. Let us motivate this with a somewhat physical application. Let $\mathcal{H}$ be the suitable infinite time limit (stabilization in the sense of the paper) of the mathematical output of humanity, specializing to first-order sentences in the language of arithmetic (for simplicity), and understood as a formal system. Suppose that all the relevant physical processes in the formation of $\mathcal{H}$ are Turing computable. Then as defined $\mathcal{H}$ may \emph{not} be computably enumerable, but it is stably computably enumerable. Thus, the classical G\"odel disjunction applied to $\mathcal{H}$ is meaningless, but applying our incompleteness theorems to $\mathcal{H}$ we get a sharper version of G\"odel's disjunction: assume $\mathcal{H} \vdash PA$; then either $\mathcal{H}$ is not stably computably enumerable, or $\mathcal{H}$ is not 1-consistent (in particular is not sound), or $\mathcal{H}$ cannot prove a certain true statement of arithmetic (and cannot disprove it if in addition $\mathcal{H}$ is 2-consistent).
This manuscript studies the numerical solution of the time-fractional Burgers-Huxley equation in a reproducing kernel Hilbert space. The analytical solution of the equation is obtained in terms of a convergent series with easily computable components. It is observed that the approximate solution converges uniformly to the exact solution of the equation. The convergence of the proposed method is also investigated. Numerical examples are given to demonstrate the validity and applicability of the presented method. The numerical results indicate that the proposed method is powerful and effective with a small computational overhead.
This paper presents an analysis of properties of two hybrid discretization methods for Gaussian derivatives, based on convolutions with either the normalized sampled Gaussian kernel or the integrated Gaussian kernel followed by central differences. The motivation for studying these discretization methods is that in situations when multiple spatial derivatives of different order are needed at the same scale level, they can be computed significantly more efficiently compared to more direct derivative approximations based on explicit convolutions with either sampled Gaussian kernels or integrated Gaussian kernels. While these computational benefits do also hold for the genuinely discrete approach for computing discrete analogues of Gaussian derivatives, based on convolution with the discrete analogue of the Gaussian kernel followed by central differences, the underlying mathematical primitives for the discrete analogue of the Gaussian kernel, in terms of modified Bessel functions of integer order, may not be available in certain frameworks for image processing, such as when performing deep learning based on scale-parameterized filters in terms of Gaussian derivatives, with learning of the scale levels. In this paper, we present a characterization of the properties of these hybrid discretization methods, in terms of quantitative performance measures concerning the amount of spatial smoothing that they imply, as well as the relative consistency of scale estimates obtained from scale-invariant feature detectors with automatic scale selection, with an emphasis on the behaviour for very small values of the scale parameter, which may differ significantly from corresponding results obtained from the fully continuous scale-space theory, as well as between different types of discretization methods.
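One of the two hybrid schemes described above, smoothing with the normalized sampled Gaussian kernel and then applying central differences, can be sketched in one dimension as follows. This is an illustrative sketch only: the truncation radius, the zero-padded boundary handling, and the function names are assumptions, and the integrated Gaussian variant is not shown.

```python
import math


def normalized_sampled_gaussian(sigma, radius):
    """Gaussian sampled at the integers -radius..radius, rescaled to sum to 1."""
    w = [math.exp(-(n * n) / (2.0 * sigma * sigma)) for n in range(-radius, radius + 1)]
    total = sum(w)
    return [v / total for v in w]


def kernel_variance(kernel):
    """Spatial variance of a symmetric, odd-length kernel that sums to 1."""
    r = len(kernel) // 2
    return sum(((k - r) ** 2) * v for k, v in enumerate(kernel))


def convolve_same(signal, kernel):
    """Zero-padded convolution returning an output as long as the input."""
    r = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, kv in enumerate(kernel):
            j = i - (k - r)
            if 0 <= j < len(signal):
                acc += kv * signal[j]
        out.append(acc)
    return out


def central_difference(signal, order):
    """First- or second-order central difference; boundary samples are set to 0."""
    out = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        if order == 1:
            out[i] = (signal[i + 1] - signal[i - 1]) / 2.0
        elif order == 2:
            out[i] = signal[i + 1] - 2.0 * signal[i] + signal[i - 1]
        else:
            raise ValueError("only orders 1 and 2 are sketched here")
    return out


def hybrid_derivatives(signal, sigma, radius):
    """Smooth once, then reuse the smoothed signal for each derivative order."""
    smoothed = convolve_same(signal, normalized_sampled_gaussian(sigma, radius))
    return central_difference(smoothed, 1), central_difference(smoothed, 2)


if __name__ == "__main__":
    quadratic = [float(i * i) for i in range(61)]
    d1, d2 = hybrid_derivatives(quadratic, sigma=2.0, radius=8)
    print(d1[30], d2[30])
    print(kernel_variance(normalized_sampled_gaussian(0.3, 8)), 0.3 ** 2)
```

The single smoothing pass is shared by both derivative orders, which is where the efficiency gain comes from. The last line compares the spatial variance of the sampled kernel with the nominal value for a small scale parameter; in this sketch the two differ substantially, which is the kind of fine-scale deviation the abstract highlights.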