This paper investigates the strong convergence and stability of the truncated Euler-Maruyama (EM) method for stochastic differential delay equations (SDDEs) with variable delay whose coefficients may grow super-linearly. By constructing appropriate truncation functions to control the super-linear growth of the original coefficients, we present a truncated EM method for such SDDEs in which the delayed argument is approximated by the value taken at the nearest grid point on its left. The strong convergence (without order) of the method is established under local Lipschitz plus generalized Khasminskii-type conditions, and the optimal strong convergence order $1/2$ is obtained when a global monotonicity condition with a $U$ function and a polynomial growth condition are added to the assumptions. Moreover, the partially truncated EM method is proved to preserve the mean-square and $H_\infty$ stabilities of the true solutions. Compared with the known results on the truncated EM method for SDDEs, a better strong convergence order is obtained under more relaxed conditions on the coefficients, and more refined technical estimates are developed to overcome the challenges arising from the variable delay. Finally, numerical examples are provided to confirm the effectiveness of the theoretical results.
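The scheme just described can be sketched for a scalar toy SDDE. The coefficients, the variable delay $\tau(t)$, and the fixed truncation radius `R` below are all illustrative choices (in the actual method the truncation radius grows as the step size shrinks); only the overall structure, truncating both arguments and evaluating the delayed term at the nearest grid point on the left, follows the abstract.

```python
import numpy as np

def truncate(x, R):
    # project x onto the ball of radius R to control super-linear growth
    return x if abs(x) <= R else R * x / abs(x)

def truncated_em(T=1.0, dt=1e-3, R=2.0, seed=0):
    """Truncated EM for the toy scalar SDDE
        dX(t) = (-X(t)^3 + X(t - tau(t))) dt + 0.5 * X(t - tau(t)) dW(t),
    tau(t) = 0.5 * (1 + 0.5*sin(t)), with history X(t) = 1 for t <= 0.
    The delayed argument is taken at the nearest grid point on the left."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    m = int(round(1.0 / dt))            # buffer covering the maximal delay (<= 1)
    X = np.ones(m + n + 1)              # X[m + k] approximates X(k * dt)
    for k in range(n):
        t = k * dt
        tau = 0.5 * (1 + 0.5 * np.sin(t))
        j = m + int(np.floor((t - tau) / dt))     # left grid point of t - tau
        x, y = truncate(X[m + k], R), truncate(X[j], R)
        X[m + k + 1] = X[m + k] + (-x**3 + y) * dt \
            + 0.5 * y * rng.normal(0.0, np.sqrt(dt))
    return X[m:]
```

For an empirical convergence check one would compare such paths across several step sizes against a fine-grid reference driven by the same Brownian increments.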
We consider a generic and explicit tamed Euler--Maruyama scheme for multidimensional time-inhomogeneous stochastic differential equations with multiplicative Brownian noise. The diffusion coefficient is uniformly elliptic, H\"older continuous and weakly differentiable in the spatial variables, while the drift satisfies the Ladyzhenskaya--Prodi--Serrin condition, as considered by Krylov and R\"ockner (2005). In the discrete scheme, the drift is tamed by replacing it with an approximation. A strong rate of convergence of the scheme is provided in terms of the approximation error of the drift in a suitable and possibly very weak topology. A few examples of approximating drifts are discussed in detail. The parameters of the approximating drifts can vary and be fine-tuned to achieve the standard $1/2$-strong convergence rate with a logarithmic factor.
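As a minimal illustration of taming-by-approximation, the sketch below replaces a discontinuous drift $b(x) = \operatorname{sign}(x)$ (bounded, hence within the Ladyzhenskaya--Prodi--Serrin class) with the smooth approximation $b_n(x) = \tanh(nx)$ inside a standard EM loop with unit, uniformly elliptic diffusion. The drift, its approximation, and all parameters are toy choices, not those of the paper.

```python
import numpy as np

def tamed_em(n_approx=10.0, T=1.0, dt=1e-3, x0=0.0, seed=0):
    """EM scheme in which the irregular drift b(x) = sign(x) is replaced by
    the smooth approximation b_n(x) = tanh(n_approx * x); the diffusion is
    constant equal to 1, so the noise is uniformly elliptic. Increasing
    n_approx sharpens the approximation of the original drift."""
    rng = np.random.default_rng(seed)
    x = x0
    for _ in range(int(round(T / dt))):
        x += np.tanh(n_approx * x) * dt + np.sqrt(dt) * rng.normal()
    return x
```

In the paper's setting the approximation parameter would be tied to the step size to balance the discretization error against the drift-approximation error.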
We present an energy-preserving mechanical formulation for dynamic quasi-brittle fracture in an Eulerian-Lagrangian setting, where a second-order phase-field equation controls the damage evolution. The numerical formulation adapts in space and time to bound the errors, solving the mesh-bias issues these models typically suffer from. The time-step adaptivity estimates the temporal truncation error of the partial differential equation that governs the solid equilibrium. The second-order generalized-$\alpha$ time-marching scheme evolves the dynamic system. We estimate the temporal error by extrapolating a first-order approximation of the present time-step solution from previous ones with backward difference formulas; the estimate compares the extrapolation with the time-marching solution. In space, we use an adaptive scheme built on a residual minimization formulation. We estimate the spatial error by enriching the discretization with elemental bubbles; we then localize an error-indicator norm to guide the mesh refinement as the fracture propagates. The combined space and time adaptivity allows us to use low-order linear elements efficiently and robustly in problems involving complex stress paths, while avoiding mesh bias in structured and unstructured meshes. We demonstrate the method's efficiency with numerical experiments that feature dynamic crack branching, where the capacity of the adaptive space-time scheme is apparent. The adaptive method delivers accurate and reproducible crack paths on meshes with fewer elements.
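The time-step adaptivity described above can be illustrated on a scalar ODE: march with a second-order scheme (the trapezoidal rule below stands in for generalized-$\alpha$), build a lower-order backward-difference extrapolation from previous solutions, and use the discrepancy as a temporal error estimate that drives the step size. The controller exponent and safety factors are hypothetical choices.

```python
import numpy as np

def adaptive_trapezoid(f, y0, t_end, dt0=0.1, tol=1e-4):
    """Adaptive time marching for y' = f(y). The trapezoidal rule (second
    order) stands in for generalized-alpha; the temporal error is estimated
    by comparing each step against a first-order backward-difference
    extrapolation built from the two previous solutions."""
    t, dt = 0.0, dt0
    y, y_prev, dt_prev = y0, None, dt0
    ts, ys = [0.0], [y0]
    while t < t_end - 1e-12:
        dt = min(dt, t_end - t)
        y_new = y
        for _ in range(50):                 # fixed-point solve of the implicit step
            y_new = y + 0.5 * dt * (f(y) + f(y_new))
        if y_prev is not None:
            y_pred = y + dt * (y - y_prev) / dt_prev   # first-order extrapolation
            err = abs(y_new - y_pred)                  # temporal error estimate
            dt_next = dt * min(2.0, max(0.5, (tol / max(err, 1e-14)) ** 0.5))
        else:
            dt_next = dt
        y_prev, dt_prev, y = y, dt, y_new
        t += dt
        dt = dt_next
        ts.append(t)
        ys.append(y)
    return np.array(ts), np.array(ys)
```

The estimate shrinks the step where the extrapolation disagrees with the marched solution and grows it where they agree, mirroring the role of the estimator in the fracture solver.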
We develop a new method of online inference for a vector of parameters estimated by the Polyak-Ruppert averaging procedure of stochastic gradient descent (SGD) algorithms. We leverage insights from time-series regression in econometrics and construct asymptotically pivotal statistics via random scaling. Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem. The proposed inference method has two key advantages over existing methods. First, the test statistic is computed in an online fashion using only SGD iterates, and the critical values can be obtained without any resampling, thereby allowing for efficient implementation suitable for massive online data. Second, there is no need to estimate the asymptotic variance, and our inference method is shown in simulation experiments with synthetic data to be robust to changes in the tuning parameters of the SGD algorithms.
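For a scalar toy problem (estimating a mean), a random-scaling statistic of the kind described can be computed fully online: both the Polyak-Ruppert average and the scaling quantity $V_n = n^{-2}\sum_{s\le n} s^2(\bar\theta_s - \bar\theta_n)^2$ are updated from running sums, with no resampling and no variance estimation. The step-size exponent and the null value are illustrative choices, not the paper's prescriptions.

```python
import numpy as np

def sgd_random_scaling(data, theta0=0.0):
    """Online t-statistic via random scaling for a mean fitted by SGD with
    Polyak-Ruppert averaging. V_n = n^{-2} * sum_s s^2 (bar_s - bar_n)^2 is
    accumulated online through the expansion
        sum s^2 bar_s^2 - 2 bar_n sum s^2 bar_s + bar_n^2 sum s^2."""
    theta = bar = 0.0
    A = B = C = 0.0                      # running sums for the expansion above
    for s, x in enumerate(data, start=1):
        theta += (x - theta) / s**0.75   # SGD step with step size s^(-3/4)
        bar += (theta - bar) / s         # Polyak-Ruppert average bar_s
        A += s**2 * bar**2
        B += s**2 * bar
        C += s**2
    n = len(data)
    V = (A - 2 * bar * B + bar**2 * C) / n**2
    return np.sqrt(n) * (bar - theta0) / np.sqrt(V)
```

The resulting statistic is compared against the nonstandard critical values tabulated for random-scaling inference rather than normal quantiles.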
An explicit numerical method is developed for a class of time-changed stochastic differential equations whose coefficients are H\"older continuous in the time variable and are allowed to grow super-linearly in the state variable. The strong convergence of the method on a finite time interval is proved and the convergence rate is obtained. Numerical simulations are provided, which agree with the theoretical results.
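One common simulation route for time-changed SDEs, sketched below under toy assumptions, is a two-step procedure: run a tamed EM scheme for the underlying classical SDE, then compose with the inverse of a simulated subordinator. The Gamma subordinator, the coefficients, and the taming factor are illustrative stand-ins, not the construction of the paper.

```python
import numpy as np

def time_changed_em(T=1.0, dt=1e-3, seed=0):
    """Two-step sketch: (i) tamed EM for the classical SDE
        dY(s) = -Y(s)^3 ds + Y(s) dB(s),
    whose drift grows super-linearly; (ii) composition with the inverse of a
    Gamma subordinator S, giving X_t = Y_{E_t}, E_t = inf{s : S_s > t}."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    Y = np.empty(n + 1)
    Y[0] = 1.0
    for k in range(n):                   # (i) tamed EM on [0, T]
        drift = -Y[k] ** 3
        Y[k + 1] = (Y[k] + drift * dt / (1 + dt * abs(drift))
                    + Y[k] * rng.normal(0.0, np.sqrt(dt)))
    # (ii) Gamma subordinator on the grid and its right-continuous inverse
    S = np.concatenate([[0.0], np.cumsum(rng.gamma(shape=dt, scale=1.0, size=n))])
    idx = np.minimum(np.searchsorted(S, np.linspace(0.0, S[-1], n + 1)), n)
    return Y[idx]                        # X_t = Y_{E_t} on an even time grid
```

The inner time change freezes the path on the subordinator's jump intervals, which is the qualitative signature of time-changed dynamics.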
We study nonparametric estimation of the density $\pi$ of the stationary distribution of a $d$-dimensional stochastic differential equation $(X_t)_{t \in [0, T]}$. From continuous observation of the sample path on $[0, T]$, we study the rate of estimation of $\pi(x)$ as $T$ goes to infinity. One finding is that, for $d \ge 3$, the rate of estimation depends on the smoothness $\beta = (\beta_1, \dots, \beta_d)$ of $\pi$. In particular, having ordered the smoothness so that $\beta_1 \le \dots \le \beta_d$, the rate depends on whether $\beta_2 < \beta_3$ or $\beta_2 = \beta_3$. We show that kernel density estimators achieve the rate $(\frac{\log T}{T})^\gamma$ in the first case and $(\frac{1}{T})^\gamma$ in the second, for an explicit exponent $\gamma$ depending on the dimension and on $\bar{\beta}_3$, the harmonic mean of the smoothness over the $d$ directions after removing $\beta_1$ and $\beta_2$, the two smallest. Moreover, we obtain a minimax lower bound on the $\mathbf{L}^2$-risk for pointwise estimation with the same rates $(\frac{\log T}{T})^\gamma$ or $(\frac{1}{T})^\gamma$, depending on the values of $\beta_2$ and $\beta_3$.
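A minimal version of the estimator in question is an anisotropic product-kernel density estimator, sketched here from discrete samples standing in for the continuous observation; the Gaussian kernel and the bandwidth values are illustrative (in the analysis each $h_i$ would be tuned to the directional smoothness $\beta_i$).

```python
import numpy as np

def kernel_density(X, x0, h):
    """Anisotropic product-kernel estimate of the stationary density at x0:
    pi_hat(x0) = (n * h_1 ... h_d)^{-1} * sum_k prod_i K((X_{k,i} - x0_i)/h_i)
    with a Gaussian kernel K. X has shape (n, d); h holds one bandwidth per
    coordinate direction."""
    Z = (X - x0) / h                                  # (n, d) standardized
    K = np.exp(-0.5 * Z**2) / np.sqrt(2 * np.pi)      # Gaussian kernel per axis
    return np.mean(np.prod(K, axis=1)) / np.prod(h)
```

Allowing the bandwidths to differ across directions is what lets the estimator exploit the smoothness vector $\beta$ rather than only its minimum.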
In this work we consider a class of non-linear eigenvalue problems that admit a spectrum similar to that of a Hamiltonian matrix, in the sense that the spectrum is symmetric with respect to both the real and the imaginary axis. More precisely, we present a method to iteratively approximate the eigenvalues of such non-linear eigenvalue problems closest to a given purely real or imaginary shift, while preserving the symmetries of the spectrum. To this end, the presented method exploits the equivalence between the considered non-linear eigenvalue problem and the eigenvalue problem associated with a linear but infinite-dimensional operator. To compute the eigenvalues closest to the given shift, we apply a specifically chosen shift-invert transformation to this linear operator and compute the eigenvalues of largest modulus of the shifted and inverted operator using an (infinite) Arnoldi procedure. The advantage of the chosen shift-invert transformation is that the spectrum of the transformed operator has a "real skew-Hamiltonian"-like structure. Furthermore, it is proven that the Krylov space constructed by applying this operator satisfies an orthogonality property with respect to a specifically chosen bilinear form. By taking this property into account in the orthogonalization process, it is ensured that, even in the presence of rounding errors, the obtained approximation for, e.g., a simple, purely imaginary eigenvalue is itself simple and purely imaginary. The presented work can thus be seen as an extension of [V. Mehrmann and D. Watkins, "Structure-Preserving Methods for Computing Eigenpairs of Large Sparse Skew-Hamiltonian/Hamiltonian Pencils", SIAM J. Sci. Comput. (22.6), 2001] to the considered class of non-linear eigenvalue problems. Although the presented method is initially defined on function spaces, it can be implemented using finite-dimensional linear algebra operations.
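The core shift-invert mechanism, stripped of the structure-preserving bilinear form and the infinite-dimensional setting, looks as follows for an ordinary matrix: Arnoldi applied to $(A - \sigma I)^{-1}$ yields Ritz values whose largest-modulus members map back, via $\lambda = \sigma + 1/\theta$, to the eigenvalues of $A$ closest to the shift $\sigma$. This is a generic sketch, not the structured variant of the paper.

```python
import numpy as np

def shift_invert_arnoldi(A, sigma, m=20, seed=0):
    """Arnoldi on (A - sigma*I)^{-1}: the largest-modulus Ritz values of the
    shifted-and-inverted matrix correspond to the eigenvalues of A closest
    to the shift sigma."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    Ainv = np.linalg.inv(A - sigma * np.eye(n))     # dense inverse; a sketch only
    V = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    v = rng.normal(size=n).astype(complex)
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = Ainv @ V[:, j]
        for i in range(j + 1):                      # modified Gram-Schmidt
            H[i, j] = np.vdot(V[:, i], w)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]
    theta = np.linalg.eigvals(H[:m, :m])            # Ritz values of the inverse
    return sigma + 1.0 / theta
```

In the structured method of the paper, the Euclidean inner product in the Gram-Schmidt loop would be replaced by the specifically chosen bilinear form so that the spectral symmetries survive rounding errors.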
The dynamic behaviour of periodic thermodiffusive multi-layered media excited by harmonic oscillations is studied. In the framework of linear thermodiffusive elasticity, periodic laminates whose elementary cell is composed of an arbitrary number of layers are considered. The generalized Floquet-Bloch conditions are imposed, and the universal dispersion relation of the composite is obtained by means of an approach based on the formal solution for a single layer together with the transfer matrix method. The eigenvalue problem associated with the dispersion equation is solved by means of an analytical procedure based on the symplecticity properties of the transfer matrix, to which a palindromic characteristic polynomial corresponds, and the frequency band structure associated with waves propagating inside the medium is finally derived. The proposed approach is tested through illustrative examples in which thermodiffusive multi-layered structures of interest for the fabrication of renewable energy devices are analyzed. The effects of thermodiffusive coupling on both the propagation and the attenuation of Bloch waves in these systems are investigated in detail.
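The transfer-matrix step can be illustrated on the simplest scalar analogue: longitudinal waves in a 1D periodic laminate, where each layer contributes a $2\times2$ matrix acting on the state $(u, \sigma)$ and the Bloch wavenumber follows from the trace of the unit-cell monodromy matrix via $\cos(kL) = \operatorname{tr}(M)/2$. The thermodiffusive coupling of the paper, which enlarges the state vector, is omitted; the layer parameters are illustrative.

```python
import numpy as np

def bloch_wavenumber(omega, layers):
    """Bloch wavenumber of a 1D periodic laminate at angular frequency omega.
    Each layer is (h, c, E): thickness, wave speed, stiffness. The 2x2 layer
    matrix maps (u, sigma) across the layer; a complex result signals a stop
    band (attenuating Bloch wave)."""
    M, L = np.eye(2), 0.0
    for h, c, E in layers:
        q = omega / c                          # layer wavenumber
        T = np.array([[np.cos(q * h), np.sin(q * h) / (E * q)],
                      [-E * q * np.sin(q * h), np.cos(q * h)]])
        M, L = T @ M, L + h
    return np.arccos(complex(np.trace(M)) / 2) / L
```

Each layer matrix has unit determinant, the scalar trace of the symplecticity that, in the full thermodiffusive setting, yields the palindromic characteristic polynomial.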
We propose a verified computation method for eigenvalues in a region and the corresponding eigenvectors of generalized Hermitian eigenvalue problems. The proposed method uses complex moments to extract the eigencomponents of interest from a random matrix and uses the Rayleigh--Ritz procedure to project the given eigenvalue problem onto a reduced eigenvalue problem. The complex moment is given by a contour integral and approximated by numerical quadrature. We split the error in the complex moment into the truncation error of the quadrature and the rounding errors, and evaluate each. This error evaluation inherits from our previous Hankel-matrix approach, but the proposed method requires half as many quadrature points as the previous approach to reduce the truncation error to the same order. Moreover, the Rayleigh--Ritz approach forms a transformation matrix that enables verification of the eigenvectors. Numerical experiments show that the proposed method is faster than previous methods while maintaining verification performance.
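A non-verified floating-point sketch of the complex-moment extraction plus Rayleigh--Ritz projection: the contour integral $\frac{1}{2\pi i}\oint (zB - A)^{-1} B V\, dz$ is approximated by the trapezoidal rule on a circle, and the resulting subspace is used to project the pencil $(A, B)$. Circle parameters, quadrature count, and subspace dimension are illustrative; the rigorous error bounds of the actual method are omitted.

```python
import numpy as np

def contour_eigensolve(A, B, center, radius, n_quad=64, m=4, seed=0):
    """Approximate eigenvalues of the Hermitian pencil (A, B) inside the
    circle |z - center| < radius. S accumulates the quadrature approximation
    of the complex moment (a spectral projector applied to a random block V);
    the Rayleigh-Ritz step projects the pencil onto an orthonormal basis of S."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    V = rng.normal(size=(n, m))
    S = np.zeros((n, m), dtype=complex)
    for j in range(n_quad):                 # trapezoidal rule on the circle
        w = np.exp(2j * np.pi * (j + 0.5) / n_quad)
        S += radius * w / n_quad * np.linalg.solve((center + radius * w) * B - A,
                                                   B @ V)
    Q, _ = np.linalg.qr(S)                  # orthonormal basis of the subspace
    Am, Bm = Q.conj().T @ A @ Q, Q.conj().T @ B @ Q
    return np.linalg.eigvals(np.linalg.solve(Bm, Am))   # Ritz values
```

Only Ritz values lying inside the contour are meaningful; the rest are filtered-out residue and would be discarded in practice.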
Federated learning (FL) is an attractive paradigm for making use of rich distributed data while protecting data privacy. Nonetheless, nonideal communication links and limited transmission resources may hinder the implementation of fast and accurate FL. In this paper, we study the joint optimization of communications and FL based on analog aggregation transmission in realistic wireless networks. We first derive closed-form expressions for the expected convergence rate of FL over the air, which theoretically quantify the impact of analog aggregation on FL. Based on the analytical results, we develop a joint optimization model for accurate FL implementation, which allows a parameter server to select a subset of workers and determine an appropriate power scaling factor. Since the practical setting of FL over the air involves unobservable parameters, we reformulate the joint optimization of worker selection and power allocation using controlled approximation. Finally, we efficiently solve the resulting mixed-integer programming problem via a simple yet optimal finite-set search method that reduces the search space. Simulation results show that the proposed solutions developed for realistic wireless analog channels outperform a benchmark method, and achieve performance comparable to the ideal case where FL is implemented over noise-free wireless channels.
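The analog-aggregation primitive underlying this formulation can be sketched in a few lines: each selected worker pre-scales its update to invert its channel gain, the scaled signals superpose additively in the air together with receiver noise, and the server rescales the sum. Worker selection and the power-scaling factors, which the paper optimizes, are fixed inputs here; the channel model is a toy choice.

```python
import numpy as np

def ota_aggregate(grads, h, p, noise_std, rng):
    """Toy analog over-the-air aggregation. grads: list of local gradient
    vectors; h: channel gains; p: power-scaling factors. The received signal
    is the superposition of the pre-scaled transmissions plus additive noise;
    the server divides by the total power to recover a noisy weighted average."""
    # each worker pre-scales by p_k / h_k; the channel then multiplies by h_k
    tx = sum((p_k / h_k) * h_k * g for g, h_k, p_k in zip(grads, h, p))
    rx = tx + rng.normal(scale=noise_std, size=grads[0].shape)
    return rx / sum(p)
```

The trade-off the paper optimizes is visible even here: larger power factors suppress the relative noise but are capped by per-worker power budgets, and workers with weak channels inflate the required transmit power.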
We study the problem of training deep neural networks with the Rectified Linear Unit (ReLU) activation function using gradient descent and stochastic gradient descent. In particular, we study the binary classification problem and show that, for a broad family of loss functions, with proper random weight initialization, both gradient descent and stochastic gradient descent can find the global minima of the training loss for an over-parameterized deep ReLU network, under a mild assumption on the training data. The key idea of our proof is that Gaussian random initialization followed by (stochastic) gradient descent produces a sequence of iterates that stay inside a small perturbation region centered around the initial weights, in which the empirical loss function of deep ReLU networks enjoys nice local curvature properties that ensure the global convergence of (stochastic) gradient descent. Our theoretical results shed light on the optimization of deep learning and pave the way for studying the optimization dynamics of training modern deep neural networks.
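The training setup in its simplest form, as a sketch: a two-layer (rather than deep) ReLU network with Gaussian initialization, trained by full-batch gradient descent on the logistic loss for binary labels. The width, learning rate, and the choice to train only the first layer are illustrative simplifications of the over-parameterized regime.

```python
import numpy as np

def train_relu_net(X, y, width=200, lr=0.1, steps=200, seed=0):
    """Two-layer ReLU network f(x) = sum_r a_r * relu(w_r . x) with Gaussian
    initialization, trained by full-batch gradient descent on the logistic
    loss. Only the first layer is trained; the output layer a is fixed at
    initialization, a common simplification in over-parameterization analyses."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W = rng.normal(scale=1.0 / np.sqrt(d), size=(width, d))
    a = rng.choice([-1.0, 1.0], size=width) / np.sqrt(width)
    losses = []
    for _ in range(steps):
        H = np.maximum(X @ W.T, 0.0)                  # hidden activations (n, width)
        f = H @ a                                     # network outputs (n,)
        losses.append(np.mean(np.log1p(np.exp(-y * f))))
        g = -y / (1.0 + np.exp(y * f)) / n            # d loss / d f
        W -= lr * ((g[:, None] * (X @ W.T > 0)) * a).T @ X
    return losses
```

In line with the theory, at large width the iterates stay close to initialization while the training loss decreases monotonically.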