This paper is devoted to the development and convergence analysis of greedy reconstruction algorithms based on the strategy presented in [Y. Maday and J. Salomon, Joint Proceedings of the 48th IEEE Conference on Decision and Control and the 28th Chinese Control Conference, 2009, pp. 375--379]. These procedures allow the design of a sequence of control functions that ease the identification of unknown operators in nonlinear dynamical systems. The original strategy of greedy reconstruction algorithms is based on an offline/online decomposition of the reconstruction process and an ansatz for the unknown operator obtained by an a priori chosen set of linearly independent matrices. In the previous work [S. Buchwald, G. Ciaramella and J. Salomon, SIAM J. Control Optim., 59(6), 2021, pp. 4511--4537], convergence results were obtained in the case of linear identification problems. We tackle here the more general case of nonlinear systems. More precisely, we introduce a new greedy algorithm based on the linearized system. Then, we show that the controls obtained with this new algorithm lead to the local convergence of the classical Gauss-Newton method applied to the online nonlinear identification problem. We then extend this result to the controls obtained by running the greedy algorithm directly on the nonlinear system, for which a local convergence result is also proved. The main convergence results are obtained for the reconstruction of drift operators in dynamical systems with linear and bilinear control structures.
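As a toy illustration of the online identification step discussed above, the sketch below applies Gauss-Newton to recover the drift parameters of a small bilinear system, with the unknown operator expanded in an a priori chosen basis of linearly independent matrices. The basis, the control, and all numerical values are hypothetical, and the control here is fixed rather than greedily designed; this is not the paper's algorithm itself.

```python
# Toy sketch of the online identification step: Gauss-Newton recovery of
# drift parameters for a bilinear system dx/dt = (A(alpha) + u(t) B) x,
# with A(alpha) = sum_k alpha_k A_k for a fixed basis of linearly
# independent matrices.  All names and values are hypothetical.
import numpy as np
from scipy.integrate import solve_ivp

basis = [np.array([[0., 1.], [-1., 0.]]),    # a priori chosen ansatz basis
         np.array([[1., 0.], [0., -1.]])]
B = np.array([[0., 1.], [1., 0.]])           # known control operator
alpha_true = np.array([0.7, -0.3])           # parameters to be recovered
x0 = np.array([1.0, 0.0])
t_obs = np.linspace(0.2, 2.0, 10)            # observation times

def u(t):                                    # fixed (not greedily designed) control
    return np.sin(3.0 * t)

def simulate(alpha):
    A = sum(a * Ak for a, Ak in zip(alpha, basis))
    f = lambda t, x: (A + u(t) * B) @ x      # bilinear dynamics
    sol = solve_ivp(f, (0.0, t_obs[-1]), x0, t_eval=t_obs, rtol=1e-9, atol=1e-12)
    return sol.y.T.ravel()                   # stacked state observations

y_obs = simulate(alpha_true)                 # synthetic data

def gauss_newton(alpha, iters=10, h=1e-6):
    for _ in range(iters):
        r0 = simulate(alpha) - y_obs         # residual
        J = np.column_stack([                # finite-difference Jacobian
            (simulate(alpha + h * e) - y_obs - r0) / h
            for e in np.eye(len(alpha))])
        alpha = alpha - np.linalg.lstsq(J, r0, rcond=None)[0]
    return alpha

print(gauss_newton(np.zeros(2)))             # approaches alpha_true
```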
In this paper we consider the numerical solution of fractional differential equations. In particular, we study a step-by-step graded mesh procedure based on an expansion of the vector field using orthonormal Jacobi polynomials. Under mild hypotheses, the proposed procedure is capable of achieving spectral accuracy. A few numerical examples are reported to confirm the theoretical findings.
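The abstract names two concrete ingredients, a graded mesh and orthonormal Jacobi polynomials. Below is a minimal Python sketch of both, illustrative only and not the paper's step-by-step procedure; the grading exponent and the Jacobi parameters are arbitrary choices.

```python
# Sketch (illustrative only): a graded mesh on [0, T], which clusters points
# near t = 0 where solutions of fractional problems typically lose smoothness,
# and orthonormal Jacobi polynomials evaluated with SciPy.
import numpy as np
from scipy.special import eval_jacobi, gamma, roots_jacobi

def graded_mesh(T, N, r):
    """Mesh points t_j = T * (j / N)**r; r > 1 clusters them near t = 0."""
    return T * (np.arange(N + 1) / N) ** r

def jacobi_orthonormal(n, a, b, x):
    """P_n^{(a,b)}(x), normalized w.r.t. the weight (1 - x)^a (1 + x)^b."""
    h2 = (2.0 ** (a + b + 1) / (2 * n + a + b + 1)
          * gamma(n + a + 1) * gamma(n + b + 1)
          / (gamma(n + a + b + 1) * gamma(n + 1)))
    return eval_jacobi(n, a, b, x) / np.sqrt(h2)

print(graded_mesh(1.0, 8, 2.0))
# Orthonormality check by Gauss-Jacobi quadrature (should print ~1.0):
nodes, weights = roots_jacobi(20, 0.5, 0.0)
p3 = jacobi_orthonormal(3, 0.5, 0.0, nodes)
print(np.sum(weights * p3 * p3))
```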
In this study, we present a precise anisotropic interpolation error estimate for the Morley finite element method (FEM) and apply it to fourth-order elliptic equations. The analysis does not impose a shape-regularity condition on the mesh; therefore, anisotropic meshes can be used. The main contribution of this study is a new proof for the consistency term, which enables an anisotropic consistency error estimate. The core idea of the proof involves using the relationship between the Raviart--Thomas and Morley finite element spaces. Our results show optimal convergence rates and imply that the modified Morley FEM may be effective for controlling errors on anisotropic meshes.
We introduce a lower bounding technique for the min-max correlation clustering problem and, based on this technique, a combinatorial 4-approximation algorithm for complete graphs. This improves upon the previous best known approximation guarantees of 5, using a linear program formulation (Kalhan et al., 2019), and 4, for a combinatorial algorithm (Davies et al., 2023). We extend this algorithm with a greedy joining heuristic and show empirically that it improves the state of the art in solution quality and runtime on several benchmark datasets.
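To make the objective concrete: in min-max correlation clustering, the cost of a vertex is the number of its incident edges that disagree with the clustering (positive edges cut, negative edges kept inside a cluster), and one minimizes the maximum vertex cost. The sketch below implements this objective together with a simple greedy joining heuristic in that spirit; it is illustrative only and is not the paper's certified 4-approximation algorithm.

```python
# Sketch: the min-max correlation clustering objective on a complete graph
# with +/- edge labels, plus a simple greedy joining heuristic (illustrative
# only; not the paper's certified 4-approximation algorithm).
import numpy as np
from itertools import combinations

def minmax_objective(labels, clusters):
    """Max over vertices of the number of edges disagreeing with the clustering."""
    n = len(labels)
    dis = np.zeros(n, dtype=int)
    for i, j in combinations(range(n), 2):
        same = clusters[i] == clusters[j]
        if (labels[i][j] == 1) != same:      # + edge cut, or - edge kept inside
            dis[i] += 1
            dis[j] += 1
    return dis.max()

def greedy_joining(labels):
    clusters = list(range(len(labels)))      # start from singletons
    while True:
        best_val, best_pair = minmax_objective(labels, clusters), None
        for a, b in combinations(sorted(set(clusters)), 2):
            trial = [b if c == a else c for c in clusters]
            val = minmax_objective(labels, trial)
            if val < best_val:               # best merge found in this pass
                best_val, best_pair = val, (a, b)
        if best_pair is None:                # no merge improves the objective
            return clusters
        a, b = best_pair
        clusters = [b if c == a else c for c in clusters]

rng = np.random.default_rng(1)
L = np.sign(rng.standard_normal((8, 8))).astype(int)
L = np.triu(L, 1); L = L + L.T               # random symmetric +/- labels
C = greedy_joining(L)
print(C, minmax_objective(L, C))
```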
Linear logic has provided new perspectives on proof theory, denotational semantics, and the study of programming languages. One of its main successes is proof-nets, canonical representations of proofs that lie at the intersection of logic and graph theory. In the case of the minimalist proof system of multiplicative linear logic without units (MLL), these two aspects are completely fused: proof-nets for this system are graphs satisfying a correctness criterion that can be fully expressed in the language of graphs. For more expressive logical systems (containing logical constants, quantifiers, and exponential modalities), this is not completely the case. The purely graphical approach of proof-nets deprives them of any sequential structure that is crucial to represent the order in which arguments are presented, which is necessary for these extensions. Rebuilding this order of presentation, that is, sequentializing the graph, is thus a requirement for a graph to be logical. The presentation and study of the artifacts ensuring that sequentialization can be done, such as boxes or jumps, are an integral part of research on linear logic. Jumps, extensively studied by Faggian and di Giamberardino, can express intermediate degrees of sequentialization between a sequent calculus proof and a fully desequentialized proof-net. We propose to analyze the logical strength of jumps by internalizing them in an extension of MLL where axioms on a specific formula, the jumping formula, introduce constraints on the possible sequentializations. The jumping formula needs to be treated non-linearly, which we do either axiomatically or by embedding it in a very controlled fragment of multiplicative-exponential linear logic, uncovering the exponential logic of sequentialization.
The present work is concerned with the extension of modified potential operator splitting methods to specific classes of nonlinear evolution equations. The considered partial differential equations of Schr{\"o}dinger and parabolic type comprise the Laplacian, a potential acting as a multiplication operator, and a cubic nonlinearity. Moreover, an invariance principle is deduced that has a significant impact on the efficient realisation of the resulting modified operator splitting methods in the Schr{\"o}dinger case. Numerical illustrations for the time-dependent Gross--Pitaevskii equation in the physically most relevant case of three space dimensions, and for its parabolic counterpart related to ground state and excited state computations, confirm the benefits of the proposed fourth-order modified operator splitting method in comparison with standard splitting methods. The presented results are novel and of particular interest from both a theoretical perspective, to inspire future investigations of modified operator splitting methods for other classes of nonlinear evolution equations, and a practical perspective, to advance the reliable and efficient simulation of Gross--Pitaevskii systems in real and imaginary time.
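For orientation, the sketch below implements a standard second-order Strang splitting for the one-dimensional cubic Schr{\"o}dinger (Gross--Pitaevskii) equation with a harmonic trap, i.e., the kind of baseline against which modified higher-order splittings are compared; the paper's modified fourth-order method is not reproduced here. The potential/nonlinear substep can be integrated exactly because it leaves |psi| pointwise invariant, a standard observation for this splitting.

```python
# Sketch: standard second-order Strang splitting for the 1D cubic
# Schroedinger (Gross-Pitaevskii) equation
#   i dpsi/dt = -1/2 psi_xx + V(x) psi + g |psi|^2 psi,
# as a baseline for the modified fourth-order splittings discussed above.
# The potential/nonlinear flow is exact pointwise since it preserves |psi|.
import numpy as np

N, L = 256, 16.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L / N)   # spectral wavenumbers
V = 0.5 * x**2                               # harmonic trap
g = 1.0                                      # interaction strength

psi = np.exp(-x**2).astype(complex)
psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * (L / N))   # normalize mass to 1

dt, steps = 1e-3, 1000
half_kinetic = np.exp(-0.25j * dt * k**2)    # exp(-i (k^2/2) dt/2)
for _ in range(steps):
    psi = np.fft.ifft(half_kinetic * np.fft.fft(psi))        # half kinetic step
    psi *= np.exp(-1j * dt * (V + g * np.abs(psi) ** 2))     # full potential step
    psi = np.fft.ifft(half_kinetic * np.fft.fft(psi))        # half kinetic step

print(np.sum(np.abs(psi) ** 2) * (L / N))    # mass stays ~1 (conserved)
```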
Stiff systems of ordinary differential equations (ODEs) and sparse training data are common in scientific problems. This paper describes efficient, implicit, vectorized methods for integrating stiff ODE systems through time and calculating parameter gradients with the adjoint method. The main innovation is to vectorize the problem both over the number of independent time series and over a batch or "chunk" of sequential time steps, effectively vectorizing the assembly of the implicit system of ODEs. The block-bidiagonal structure of the linearized implicit system for the backward Euler method allows for further vectorization using parallel cyclic reduction (PCR). Vectorizing over both axes of the input data provides a higher bandwidth of calculations to the computing device, allowing even problems with comparatively sparse data to fully utilize modern GPUs and achieve speed-ups greater than 100x compared to standard, sequential time integration. We demonstrate the advantages of implicit, vectorized time integration with several example problems, drawn from both analytical stiff and non-stiff ODE models as well as neural ODE models. We also describe and provide a freely available open-source implementation of the methods developed here.
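The sketch below illustrates the first vectorization axis described above: advancing a large batch of independent stiff linear ODE systems with backward Euler via a single batched linear solve per step (np.linalg.solve broadcasts over leading axes). The chunked-in-time axis and the parallel cyclic reduction solver are not shown; the model problem and sizes are arbitrary.

```python
# Sketch: vectorizing implicit (backward Euler) time steps over a batch of
# independent time series.  np.linalg.solve broadcasts over leading axes, so
# all series are advanced with one batched solve.  (The second vectorization
# axis -- chunked time steps with parallel cyclic reduction -- is not shown.)
import numpy as np

rng = np.random.default_rng(0)
n_series, n_dof = 1000, 4
# One stiff linear system dy/dt = A y per series (random stable matrices).
A = -np.stack([M @ M.T + 10 * np.eye(n_dof)
               for M in rng.standard_normal((n_series, n_dof, n_dof))])
y = rng.standard_normal((n_series, n_dof))

dt, steps = 0.01, 100
I = np.eye(n_dof)
lhs = I - dt * A                             # (n_series, n_dof, n_dof), constant here
for _ in range(steps):
    # Backward Euler: (I - dt A) y_{k+1} = y_k, solved for all series at once.
    y = np.linalg.solve(lhs, y[..., None])[..., 0]

print(y.shape)  # (1000, 4); all trajectories decay toward zero
```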
This paper begins with a study of both the exact distribution and the asymptotic distribution of the empirical correlation of two independent AR(1) processes with Gaussian innovations. We proceed to develop rates of convergence for the distribution of the scaled empirical correlation (i.e., the empirical correlation multiplied by the square root of the number of data points and by a normalizing constant) to the standard Gaussian distribution in both the Wasserstein distance and the Kolmogorov distance. Given $n$ data points, we prove that the convergence rate in the Wasserstein distance is $n^{-1/2}$ and the convergence rate in the Kolmogorov distance is $n^{-1/2} \sqrt{\ln n}$. We then compute rates of convergence of the scaled empirical correlation to the standard Gaussian distribution for two additional classes of AR(1) processes: (i) two AR(1) processes with correlated Gaussian increments and (ii) two independent AR(1) processes driven by white noise in the second Wiener chaos.
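A quick Monte Carlo sanity check of the asymptotics described above: simulate two independent Gaussian AR(1) processes, scale the empirical correlation by sqrt(n) and by the normalizing constant from Bartlett's formula (for coefficients a and b, the asymptotic variance of sqrt(n) times the empirical correlation is (1 + ab)/(1 - ab)), and verify that the result looks standard Gaussian. The coefficients and sample sizes below are arbitrary.

```python
# Sketch: Monte Carlo check of the CLT for the scaled empirical correlation
# of two independent Gaussian AR(1) processes.  By Bartlett's formula the
# asymptotic variance of sqrt(n)*rho_hat is (1 + a*b)/(1 - a*b), which gives
# the normalizing constant used below.
import numpy as np

def ar1(n, coef, rng):
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = coef * x[t - 1] + rng.standard_normal()
    return x

rng = np.random.default_rng(0)
a, b, n, reps = 0.6, 0.4, 2000, 500
scale = np.sqrt(n * (1 - a * b) / (1 + a * b))

stats = []
for _ in range(reps):
    x, y = ar1(n, a, rng), ar1(n, b, rng)
    rho = np.corrcoef(x, y)[0, 1]            # empirical correlation
    stats.append(scale * rho)

print(np.mean(stats), np.std(stats))         # ~0 and ~1 (standard Gaussian)
```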
We propose a novel surrogate modelling approach to efficiently and accurately approximate the response of complex dynamical systems driven by time-varying exogenous excitations over extended time periods. Our approach, termed manifold nonlinear autoregressive modelling with exogenous input (mNARX), involves constructing a problem-specific exogenous input manifold that is optimal for constructing autoregressive surrogates. The manifold, which forms the core of mNARX, is constructed incrementally by incorporating the physics of the system, as well as prior expert and domain knowledge. Because mNARX decomposes the full problem into a series of smaller sub-problems, each with a lower complexity than the original, it scales well with the complexity of the problem, both in terms of training and evaluation costs of the final surrogate. Furthermore, mNARX synergizes well with traditional dimensionality reduction techniques, making it highly suitable for modelling dynamical systems with high-dimensional exogenous inputs, a class of problems that is typically challenging to solve. Since domain knowledge is particularly abundant in physical systems, such as those found in civil and mechanical engineering, mNARX is well suited for these applications. We demonstrate that mNARX outperforms traditional autoregressive surrogates in predicting the response of a classical coupled spring-mass system excited by a one-dimensional random excitation. Additionally, we show that mNARX is well suited for emulating very high-dimensional time- and state-dependent systems, even when affected by active controllers, by constructing a surrogate of the dynamics of a realistic aero-servo-elastic onshore wind turbine simulator. In general, our results demonstrate that mNARX offers promising prospects for modelling complex dynamical systems, in terms of accuracy and efficiency.
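For context, the sketch below fits a plain NARX surrogate, the autoregressive backbone on which mNARX builds, by least squares on lagged outputs, lagged exogenous inputs, and one hand-picked nonlinear feature (a stand-in for the problem-specific input manifold, which is not constructed here). The data-generating system is a hypothetical toy model.

```python
# Sketch: a plain NARX surrogate fit by least squares.  The lagged features
# plus one hand-picked nonlinear term stand in for the problem-specific
# input manifold of mNARX, which is not constructed here.  The toy
# data-generating system below is hypothetical.
import numpy as np

rng = np.random.default_rng(0)
T = 2000
u = rng.standard_normal(T)                   # exogenous excitation
y = np.zeros(T)
for t in range(2, T):                        # toy nonlinear dynamical system
    y[t] = (0.7 * y[t - 1] - 0.2 * y[t - 2]
            + 0.5 * u[t - 1] - 0.1 * y[t - 1] ** 3)

lags = 2
idx = np.arange(lags, T)
X = np.column_stack([y[idx - 1], y[idx - 2], u[idx - 1], u[idx - 2],
                     y[idx - 1] ** 3])       # simple hand-crafted feature set
coef, *_ = np.linalg.lstsq(X, y[idx], rcond=None)

y_hat = np.zeros(T)                          # free-running prediction from u only
for t in range(lags, T):
    feats = np.array([y_hat[t - 1], y_hat[t - 2], u[t - 1], u[t - 2],
                      y_hat[t - 1] ** 3])
    y_hat[t] = feats @ coef

print(np.sqrt(np.mean((y - y_hat) ** 2)))    # near-zero RMSE on this toy
```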
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.
When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods, and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, the lottery ticket hypothesis, and infinite-width analysis.
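As a small illustration of the initialization issue mentioned above, the sketch below pushes a signal through a deep ReLU stack under a naive small initialization and under He (fan-in scaled) initialization, showing how the former drives activations toward zero while the latter keeps their scale stable. Sizes and seeds are arbitrary.

```python
# Sketch: why "careful initialization" matters for gradient explosion or
# vanishing -- activation magnitudes through a deep ReLU stack under naive
# vs. He (fan-in scaled) initialization.
import numpy as np

rng = np.random.default_rng(0)
width, depth = 256, 50
x = rng.standard_normal(width)

for scale, name in [(0.01, "naive small"), (np.sqrt(2.0 / width), "He")]:
    h = x.copy()
    for _ in range(depth):
        W = scale * rng.standard_normal((width, width))
        h = np.maximum(W @ h, 0.0)           # ReLU layer
    print(f"{name:12s} final activation norm: {np.linalg.norm(h):.3e}")
# Naive scaling collapses to ~0 (vanishing); He keeps the norm O(sqrt(width)).
```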