The use of numerical homogenization to obtain structure-property relations with the finite element method at both the micro and the macro scale has gained much interest in the research community. However, the computational cost of this so-called FE\textsuperscript{2} method is so high that algorithmic modifications and reduction methods are essential. Recently, the authors proposed a monolithic algorithm. Here, this algorithm is combined with reduced-order modeling (ROM) and empirical cubature method (ECM) hyperintegration, applied at finite deformations, and complemented by a clustered training strategy, which substantially lowers the training effort and the number of necessary ROM modes. An implementation is provided as an extension of the already established MonolithFE\textsuperscript{2} code. Numerical examples show the efficiency and accuracy of the monolithic hyper-ROM FE\textsuperscript{2} method and the advantages of the clustered training strategy. The applied methods can be combined modularly, as is the aim in finite element approaches.
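As a rough illustration of the hyperintegration ingredient (a sketch of the general ECM idea under assumed notation, not of the specific MonolithFE\textsuperscript{2} implementation): empirical cubature replaces the full microscale integration of the reduced internal forces by a weighted sum over a small subset of integration points,
\[
\int_{\Omega_\mu} \mathbf{P}(\mathbf{F}) : \delta\mathbf{F}\,\mathrm{d}V \;\approx\; \sum_{e \in \mathcal{E}} w_e\, \mathbf{P}\big(\mathbf{F}(\mathbf{x}_e)\big) : \delta\mathbf{F}(\mathbf{x}_e), \qquad |\mathcal{E}| \ll n_{\mathrm{gp}},
\]
where the points $\mathbf{x}_e$ and nonnegative weights $w_e$ are selected offline so that the reduced quantities seen during training are reproduced to a prescribed tolerance.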
The minimax theorem for zero-sum games is easily proved from the strong duality theorem of linear programming. For the converse direction, the standard proof by Dantzig (1951) is known to be incomplete. We explain and combine classical theorems about solving linear equations with nonnegative variables to give a correct alternative proof, more directly than Adler (2013). We also extend Dantzig's game so that any max-min strategy gives either an optimal LP solution or shows that none exists.
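For orientation, the LP construction referred to here can be sketched as follows (standard textbook notation, chosen for illustration): for a payoff matrix $A$ of the zero-sum game, the row player's max-min problem and the column player's min-max problem form the primal-dual pair
\[
\max_{v,\,\mathbf{x}} \; v \quad \text{s.t.} \quad A^{\top}\mathbf{x} \geq v\mathbf{1},\; \mathbf{1}^{\top}\mathbf{x} = 1,\; \mathbf{x} \geq \mathbf{0}
\qquad\text{and}\qquad
\min_{w,\,\mathbf{y}} \; w \quad \text{s.t.} \quad A\mathbf{y} \leq w\mathbf{1},\; \mathbf{1}^{\top}\mathbf{y} = 1,\; \mathbf{y} \geq \mathbf{0},
\]
and strong LP duality gives equal optimal values, which is the minimax theorem; the converse direction discussed above, deriving strong duality from the minimax theorem, is where Dantzig's original argument is incomplete.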
This paper describes a technique to avoid order reduction when integrating reaction-diffusion initial boundary value problems with explicit exponential Rosenbrock methods. The technique is valid for any Rosenbrock method, without having to impose any stiff order conditions, and for general time-dependent boundary values. A thorough analysis of the global error is performed, and numerical experiments are presented that corroborate the theoretical results and show a substantial gain in efficiency with respect to applying the standard method of lines.
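For context, the simplest member of this family is the exponential Rosenbrock-Euler method (shown here without any boundary correction, under assumed notation): for $u' = f(u)$ with Jacobian $J_n = f'(u_n)$ and step size $h$,
\[
u_{n+1} = u_n + h\,\varphi_1(hJ_n)\,f(u_n), \qquad \varphi_1(z) = \frac{e^{z}-1}{z},
\]
which is formally second-order accurate; with time-dependent boundary values, the standard method of lines typically exhibits order reduction, and it is this loss of order that the proposed technique is designed to avoid.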
The crossed random-effects model is widely used in applied statistics, finding applications in various fields such as longitudinal studies, e-commerce, and recommender systems, among others. However, these models encounter scalability challenges, as the computational time grows disproportionately with the number of data points, typically like $N^{3/2}$ or worse in the sample size $N$. Our inspiration for addressing this issue comes from observing the recommender system employed by an online clothing retailer. Our dataset comprises over 700,000 clients, 5,000 items, and 5,000,000 measurements. When applying the maximum likelihood approach to fit crossed random effects, computational inefficiency becomes a significant concern, limiting the applicability of this approach in large-scale settings. To tackle the scalability issues, previous research by Ghosh et al. (2022a) and Ghosh et al. (2022b) has explored linear and logistic regression models utilizing fixed-effect features based on client and item variables, while incorporating random intercept terms for clients and items. In this study, we present a more general version of the problem, allowing random effect sizes (slopes). This extension enables us to capture the variability in effect size among both clients and items. Importantly, we have developed a scalable solution to this problem and have empirically demonstrated the consistency of our estimates: as the number of data points increases, our estimates converge towards the true parameters. To validate our approach, we apply the proposed algorithm to the Stitch Fix data.
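A minimal sketch of the model class being fit (illustrative notation; the exact parameterization may differ): with clients $i$, items $j$, and features $x_{ij}$,
\[
y_{ij} = x_{ij}^{\top}\beta + a_i + b_j + x_{ij}^{\top} u_i + x_{ij}^{\top} v_j + e_{ij},
\]
where $a_i$ and $b_j$ are the random intercepts of the earlier work, the random slopes $u_i$ and $v_j$ capture client- and item-specific variation in effect sizes, and all random terms are independent with mean zero; the computational challenge is that naive (restricted) maximum likelihood for such crossed designs does not scale linearly in the number of observations.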
Summation-by-parts (SBP) operators allow us to systematically develop energy-stable and high-order accurate numerical methods for time-dependent differential equations. Until recently, the main idea behind existing SBP operators was that polynomials can accurately approximate the solution, and SBP operators should thus be exact for them. However, polynomials do not provide the best approximation for some problems, with other approximation spaces being more appropriate. We recently addressed this issue and developed a theory for one-dimensional SBP operators based on general function spaces, coined function-space SBP (FSBP) operators. In this paper, we extend the theory of FSBP operators to multiple dimensions. We focus on their existence, connection to quadratures, construction, and mimetic properties. A more exhaustive numerical demonstration of multi-dimensional FSBP (MFSBP) operators and their application will be provided in future works. Similar to the one-dimensional case, we demonstrate that most of the established results for polynomial-based multi-dimensional SBP (MSBP) operators carry over to the more general class of MFSBP operators. Our findings imply that the concept of SBP operators can be applied to a significantly larger class of methods than is currently done. This can increase the accuracy of the numerical solutions and/or provide stability to the methods.
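For reference, the defining property that carries over from the polynomial setting (stated here in the common matrix form; the multi-dimensional function-space construction generalizes it): a first-derivative SBP operator in the $x$-direction on a domain $\Omega$ is $D_x = P^{-1}Q_x$ with a symmetric positive-definite norm/quadrature matrix $P$ such that
\[
Q_x + Q_x^{\top} = E_x, \qquad \mathbf{f}^{\top} E_x\, \mathbf{g} \approx \oint_{\partial\Omega} f\, g\, n_x \,\mathrm{d}s,
\]
so that $\mathbf{f}^{\top} P D_x \mathbf{g} + (D_x\mathbf{f})^{\top} P \mathbf{g} = \mathbf{f}^{\top} E_x \mathbf{g}$ mimics integration by parts discretely; for FSBP operators, exactness is required on a chosen function space rather than on a polynomial space.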
This paper considers the Cauchy problem for the nonlinear dynamic string equation of Kirchhoff type with time-varying coefficients. The objective of this work is to develop a temporal discretization algorithm capable of approximating a solution to this initial-boundary value problem. To this end, a symmetric three-layer semi-discrete scheme is employed with respect to the temporal variable, wherein the nonlinear term is evaluated at the middle node. This approach allows the numerical solution at each time step to be obtained by inverting linear operators, i.e., by solving a system of second-order linear ordinary differential equations. Local convergence of the proposed scheme is established, with quadratic convergence with respect to the time step size on the local temporal interval. Several numerical experiments with the proposed algorithm are conducted for various test problems to validate its performance, and the obtained numerical results are in accordance with the theoretical findings.
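To fix ideas, a prototypical Kirchhoff string equation and a symmetric three-layer scheme of the kind described above may be sketched as (illustrative form only; the actual time-varying coefficients and evaluation points follow the paper)
\[
u_{tt} - \Big(a(t) + b(t)\!\int_0^L u_x^2\,\mathrm{d}x\Big) u_{xx} = f, \qquad
\frac{u^{n+1} - 2u^{n} + u^{n-1}}{\tau^{2}}
- \Big(a(t_n) + b(t_n)\!\int_0^L \big(u^{n}_x\big)^2\,\mathrm{d}x\Big)\, \frac{u^{n+1}_{xx} + u^{n-1}_{xx}}{2} = f^{\,n},
\]
so the nonlinear coefficient is frozen at the middle time level $t_n$ and each step only requires the solution of a linear two-point boundary value problem for $u^{n+1}$.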
Perfect synchronization in distributed machine learning problems is inefficient and even impossible due to latency, packet losses, and stragglers. We propose a Robust Fully-Asynchronous Stochastic Gradient Tracking method (R-FAST), where each device performs local computation and communication at its own pace without any form of synchronization. Different from existing asynchronous distributed algorithms, R-FAST can eliminate the impact of data heterogeneity across devices and allow for packet losses by employing a robust gradient tracking strategy that relies on properly designed auxiliary variables for tracking and buffering the overall gradient vector. More importantly, the proposed method utilizes two spanning-tree graphs for communication, requiring only that they share at least one common root, which enables flexible designs of the communication architecture. We show that R-FAST converges in expectation to a neighborhood of the optimum at a geometric rate for smooth and strongly convex objectives, and to a stationary point at a sublinear rate for general non-convex settings. Extensive experiments demonstrate that R-FAST runs 1.5-2 times faster than synchronous benchmark algorithms, such as Ring-AllReduce and D-PSGD, while still achieving comparable accuracy, and outperforms existing asynchronous state-of-the-art algorithms, such as AD-PSGD and OSGP, especially in the presence of stragglers.
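As a point of reference, the synchronous gradient-tracking recursion that R-FAST builds on can be written in a two-graph form (a standard sketch with assumed notation; the asynchronous, robust version with buffering is the contribution of the paper):
\[
x_i^{k+1} = \sum_{j} r_{ij}\, x_j^{k} - \alpha\, y_i^{k}, \qquad
y_i^{k+1} = \sum_{j} c_{ij}\, y_j^{k} + \nabla f_i\big(x_i^{k+1}\big) - \nabla f_i\big(x_i^{k}\big),
\]
where $R = [r_{ij}]$ and $C = [c_{ij}]$ are mixing matrices associated with the two communication graphs and the auxiliary variable $y_i$ tracks the network-average gradient, which is what removes the bias caused by data heterogeneity.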
A numerical procedure providing guaranteed two-sided bounds on the effective coefficients of elliptic partial differential operators is presented. The upper bounds are obtained in a standard manner through the variational formulation of the problem and by applying the finite element method. To obtain the lower bounds, we formulate the dual variational problem and introduce appropriate approximation spaces, again employing the finite element method. We deal with the 3D setting, which has rarely been considered in the literature so far. The theoretical justification of the procedure is presented and supported with illustrative examples.
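For a scalar diffusion problem on a unit cell $Y$ (normalized to $|Y| = 1$), the two bounding principles can be sketched as follows (the classical primal and dual cell problems, written in assumed notation):
\[
\xi^{\top} A_{\mathrm{eff}}\, \xi = \min_{u\ \text{periodic}} \int_Y (\nabla u + \xi)^{\top} A\, (\nabla u + \xi)\,\mathrm{d}y,
\qquad
\eta^{\top} A_{\mathrm{eff}}^{-1}\, \eta = \min_{\substack{\sigma\ \text{periodic},\ \nabla\cdot\sigma = 0,\\ \int_Y \sigma\,\mathrm{d}y = \eta}} \int_Y \sigma^{\top} A^{-1} \sigma\,\mathrm{d}y,
\]
so any conforming finite element field inserted into the first problem yields a guaranteed upper bound, while any admissible divergence-free approximation in the second (dual) problem bounds $A_{\mathrm{eff}}^{-1}$ from above and hence $A_{\mathrm{eff}}$ from below in the sense of quadratic forms.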
Consider the community detection problem in random hypergraphs under the non-uniform hypergraph stochastic block model (HSBM), where each hyperedge appears independently with some given probability depending only on the labels of its vertices. We establish, for the first time in the literature, a sharp threshold for exact recovery under this non-uniform case, subject to minor constraints; in particular, we consider the model with multiple communities ($K \geq 2$). One crucial point here is that, by aggregating information from all the uniform layers, we may obtain exact recovery even in cases where it would appear impossible if each layer were considered alone. Two efficient algorithms that successfully achieve exact recovery above the threshold are provided. The theoretical analysis of our algorithms relies on the concentration and regularization of the adjacency matrix for non-uniform random hypergraphs, which could be of independent interest. We also address some open problems regarding parameter knowledge and estimation.
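To illustrate the aggregation idea (a schematic only; the actual algorithms and weights are as specified in the paper): each $m$-uniform layer with hyperedge set $E_m$ contributes to a weighted pairwise adjacency matrix, for instance
\[
A_{uv} \;=\; \sum_{m \geq 2} w_m \,\#\{\, e \in E_m : \{u,v\} \subseteq e \,\},
\]
so a vertex pair is scored by the weighted number of hyperedges of any size containing both; signal that is too weak for exact recovery within any single layer can push the aggregated matrix above the recovery threshold.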
Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics such as the likelihood and average time of events (predictions). Here we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a data set of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.
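As a baseline for the estimation problem (a standard Galerkin/least-squares construction from trajectory pairs, not the inexact iterative schemes developed here): given basis functions $\phi_1,\dots,\phi_k$ and pairs $(x_t, x_{t+\tau})$ harvested from the short trajectories,
\[
C_{ij} = \frac{1}{T}\sum_{t} \phi_i(x_t)\,\phi_j(x_t), \qquad
K_{ij} = \frac{1}{T}\sum_{t} \phi_i(x_t)\,\phi_j(x_{t+\tau}), \qquad
K\,w = \lambda\, C\, w,
\]
where the leading generalized eigenvectors $w$ yield approximations $\sum_i w_i\phi_i$ to the leading eigenfunctions of the transition operator, and prediction quantities such as committors solve related linear systems built from the same correlation matrices.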
Accurate error estimation is crucial in model order reduction, both to obtain small reduced-order models (ROMs) and to certify their accuracy when deployed in downstream applications such as digital twins. In existing a posteriori error estimation approaches, such as the residual-based error estimators proposed for the reduced basis method, knowledge of the time integration scheme is mandatory. This poses a challenge when automatic ordinary differential equation solver libraries are used to perform the time integration. To address this, we present a data-enhanced approach for a posteriori error estimation. Our new formulation enables residual-based error estimators to be independent of any time integration method. To achieve this, we introduce a corrected reduced-order model which takes into account a data-driven closure term for improved accuracy. The closure term, subject to mild assumptions, is related to the local truncation error of the corresponding time integration scheme. We propose efficient computational schemes for approximating the closure term at the cost of a modest amount of training data. Furthermore, the new error estimator is incorporated within a greedy process to obtain parametric reduced-order models. Numerical results on three different systems show the accuracy of the proposed error estimation approach and its ability to produce ROMs that generalize well.
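Conceptually, the corrected ROM and the resulting estimator can be sketched as follows (schematic notation; the precise closure parameterization and the assumptions linking it to the local truncation error are as stated in the paper): with reduced basis matrix $V$ and a learned closure $\hat{c}$,
\[
\dot{\hat{x}}(t) = \hat{f}\big(\hat{x}(t), t\big) + \hat{c}(t), \qquad
r(t) = f\big(V\hat{x}(t), t\big) - V\dot{\hat{x}}(t), \qquad
\|x(t) - V\hat{x}(t)\| \;\lesssim\; \int_0^t \gamma(t,s)\,\|r(s)\|\,\mathrm{d}s,
\]
where $\hat{f}$ is the reduced right-hand side, the closure $\hat{c}$ absorbs what would otherwise be the solver-dependent local truncation error, and $\gamma$ collects problem-dependent stability factors; because the residual $r$ is formed at the time-continuous level, the estimator no longer depends on the particular ODE solver used.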