The widespread use of maximum Jeffreys'-prior penalized likelihood in binomial-response generalized linear models, and in logistic regression in particular, is supported by the results of Kosmidis and Firth (2021, Biometrika), who show that the resulting estimates are always finite-valued, even in cases where the maximum likelihood estimates are not, which is a practical issue regardless of the size of the data set. In logistic regression, the implied adjusted score equations are formally bias-reducing in asymptotic frameworks with a fixed number of parameters and appear to deliver a substantial reduction in the persistent bias of the maximum likelihood estimator in high-dimensional settings where the number of parameters grows asymptotically linearly with, but more slowly than, the number of observations. In this work, we develop and present two new variants of iteratively reweighted least squares (IWLS) for estimating generalized linear models with adjusted score equations for mean bias reduction and for maximization of the likelihood penalized by a positive power of the Jeffreys-prior penalty. Both variants eliminate the requirement of storing $O(n)$ quantities in memory and can operate with data sets that exceed computer memory or even hard drive capacity. We achieve that through incremental QR decompositions, which enable IWLS iterations to have access only to data chunks of predetermined size. We assess the procedures through a real-data application with millions of observations.
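As a rough illustration of the incremental QR idea, the following Python sketch solves a single least-squares problem while seeing only one chunk of rows at a time, folding each row into a triangular factor with Givens rotations. It is a minimal sketch under stated assumptions, not the authors' algorithm: the function names are hypothetical, the design matrix is assumed to have full column rank, and the working weights and score adjustments of IWLS are omitted (in a weighted step, each row and response would first be scaled by the square root of its working weight). Only the $p \times p$ factor and a length-$p$ vector are kept in memory.

```python
import math

def qr_update(R, z, x, y):
    """Fold one observation (row x, response y) into the upper-triangular
    factor R and the rotated response z, in place, via Givens rotations."""
    p = len(x)
    x = list(x)  # work on a copy so the caller's row is untouched
    for j in range(p):
        if x[j] == 0.0:
            continue  # nothing to eliminate in this column
        r = math.hypot(R[j][j], x[j])
        c, s = R[j][j] / r, x[j] / r
        for k in range(j, p):
            rjk, xk = R[j][k], x[k]
            R[j][k] = c * rjk + s * xk
            x[k] = -s * rjk + c * xk
        zj = z[j]
        z[j] = c * zj + s * y
        y = -s * zj + c * y

def back_substitute(R, z):
    """Solve the triangular system R beta = z (R assumed nonsingular)."""
    p = len(z)
    beta = [0.0] * p
    for j in reversed(range(p)):
        acc = z[j] - sum(R[j][k] * beta[k] for k in range(j + 1, p))
        beta[j] = acc / R[j][j]
    return beta

def chunked_least_squares(chunks, p):
    """Least-squares fit from an iterable of (rows, responses) chunks."""
    R = [[0.0] * p for _ in range(p)]
    z = [0.0] * p
    for rows, responses in chunks:
        for x, y in zip(rows, responses):
            qr_update(R, z, x, y)
    return back_substitute(R, z)
```

Because `chunks` can be any iterable, the chunks could in principle be read lazily from disk or a database, so that no more than one chunk is resident at a time.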
This paper introduces a new theoretical and computational framework for data-driven Koopman mode analysis of nonlinear dynamics. To alleviate the potential problem of ill-conditioned eigenvectors in the existing implementations of the Dynamic Mode Decomposition (DMD) and the Extended Dynamic Mode Decomposition (EDMD), the new method introduces a Koopman-Schur decomposition that is entirely based on unitary transformations. The analysis in terms of the eigenvectors as modes of a Koopman operator compression is replaced with a modal decomposition in terms of a flag of invariant subspaces that correspond to selected eigenvalues. The main computational tool from numerical linear algebra is the partial ordered Schur decomposition, which provides convenient orthonormal bases for these subspaces. In the case of real data, a real Schur form is used and the computation is based on real orthogonal transformations. The new computational scheme is presented in the framework of the Extended DMD, and the kernel trick is used.
This work develops two fast randomized algorithms for computing the generalized tensor singular value decomposition (GTSVD) based on the tubal product (t-product). The random projection method is used to capture the important actions of the underlying data tensors, which yield small sketches of the original data tensors that are easier to handle. Because the sketch tensors are small, deterministic approaches are applied to them to compute their GTSVDs. The GTSVD of the original large-scale data tensors is then recovered from the GTSVD of the small sketch tensors. Some experiments are conducted to show the effectiveness of the proposed approach.
In this paper, we present a stochastic method for the simulation of Laplace's equation with a mixed boundary condition in planar domains that are polygonal or bounded by circular arcs. We call this method the Reflected Walk on Spheres algorithm. The method combines a traditional Walk on Spheres algorithm with the use of reflections at the Neumann boundaries. We apply our algorithm to simulate numerical conformal mappings from certain quadrilaterals to the corresponding canonical domains, and to compute their conformal moduli. Finally, we give examples of the method on three-dimensional polyhedral domains, and use it to simulate the heat flow on an L-shaped insulated polyhedron.
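For orientation, the traditional Walk on Spheres ingredient can be sketched in a few lines of Python. This is a minimal sketch under simplifying assumptions, not the Reflected Walk on Spheres algorithm itself: the domain is the unit disk, the boundary condition is purely Dirichlet (no Neumann part, hence no reflections), and the function name, tolerance `eps`, and number of walks are hypothetical choices.

```python
import math
import random

def walk_on_spheres(x, y, boundary_value, n_walks=10000, eps=1e-3, seed=0):
    """Estimate the harmonic function u(x, y) in the unit disk with
    Dirichlet data boundary_value(bx, by) on the unit circle.
    The starting point (x, y) must lie strictly inside the disk."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        px, py = x, y
        while True:
            dist = 1.0 - math.hypot(px, py)  # distance to the unit circle
            if dist < eps:
                break  # close enough to the boundary: stop the walk
            # Jump to a uniformly distributed point on the largest circle
            # centred at the current position that fits inside the domain.
            theta = rng.uniform(0.0, 2.0 * math.pi)
            px += dist * math.cos(theta)
            py += dist * math.sin(theta)
        # Project the end point onto the boundary and read off the data.
        norm = math.hypot(px, py)
        total += boundary_value(px / norm, py / norm)
    return total / n_walks
```

With boundary data $g(x, y) = x$, which is itself harmonic, the estimate at an interior point should be close to the point's $x$-coordinate, up to Monte Carlo error and the bias introduced by `eps`.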
Busy-waiting is an important, low-level synchronization pattern that is used to implement higher-level abstractions for synchronization. Its termination depends on cooperation by other threads as well as a fair thread scheduler. We present a general approach for modularly verifying busy-waiting concurrent programs based on higher-order separation logic. The approach combines two strands of prior work: first, the higher-order-programming perspective of Jacobs and Piessens (2011) for verifying concurrent modules; second, the ghost signals approach of Reinhard and Jacobs (2021) for verifying busy-waiting. The latter uses classical specifications for synchronization constructs where the module creates and discharges obligations. Such specifications, however, fix particular client patterns and would in general require "obligation transfer" to handle more intricate wait dependencies. This precludes clients from performing lock handoffs, an important mechanism to control (un)fairness in the design of locks. Our contribution -- inspired by TaDA Live by D'Osualdo, Sutherland, Farzan and Gardner (2021) -- is to require the client to create and discharge obligations as necessary to satisfy the module's liveness requirements. However, instead of building these liveness requirements into the logic, we express them by having the module's operations take auxiliary code as arguments whose job it is to generate the call permissions the module needs for its busy-waiting. In the paper we present specifications and proofs in Iris. We validated our approach by developing a (non-foundational) machine-checked proof of a cohort lock -- to the best of our knowledge the first of its kind -- using an encoding of our approach in the VeriFast program verifier for C and Java. This fair lock is implemented on top of another fair lock module and involves lock handoff, thus exercising the asserted contributions.
This paper presents the first application of the direct parametrisation method for invariant manifolds to a fully coupled multiphysics problem involving the nonlinear vibrations of deformable structures subjected to an electrostatic field. The proposed formulation is intended for model order reduction of electrostatically actuated resonating Micro-Electro-Mechanical Systems (MEMS). The continuous problem is first rewritten in a manner that can be directly handled by the parametrisation method, which relies upon automated asymptotic expansions. A new mixed fully Lagrangian formulation containing only explicit polynomial nonlinearities is thus proposed, and is then discretised in the framework of finite element procedures. Validation is performed on the classical parallel plate configuration, where different formulations, using either the general framework or an approximation of the electrostatic field due to the geometric configuration selected, are compared. Reduced-order models built from these formulations are also compared to full-order simulations performed with a time integration approach. Numerical results show a remarkable performance both in terms of accuracy and wealth of nonlinear effects that can be accounted for. In particular, the transition from hardening to softening behaviour of the primary resonance as the constant voltage component of the electric actuation increases is recovered. Secondary resonances leading to superharmonic and parametric resonances are also investigated with the reduced-order model.
Many interesting physical problems described by systems of hyperbolic conservation laws are stiff, and thus impose a very small time-step because of the restrictive CFL stability condition. In this case, one can exploit the superior stability properties of implicit time integration, which allows the time-step to be chosen from accuracy requirements alone and thus avoids the use of small time-steps. We discuss an efficient framework to devise high order implicit schemes for stiff hyperbolic systems without tailoring it to a specific problem. The nonlinearity of high order schemes, due to space- and time-limiting procedures which control nonphysical oscillations, makes the implicit time integration difficult, e.g.~because the discrete system is nonlinear even for linear problems. This nonlinearity of the scheme is circumvented as proposed in (Puppo et al., Comm.~Appl.~Math.~\& Comput., 2023) for scalar conservation laws, where a first order implicit predictor is computed to freeze the nonlinear coefficients of the essentially non-oscillatory space reconstruction, and also to assist limiting in time. In addition, we propose a novel conservative flux-centered a-posteriori time-limiting procedure using numerical entropy indicators to detect troubled cells. The numerical tests involve classical and artificially devised stiff problems using the Euler system of gas dynamics.
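The stability advantage of implicit time integration can be seen on the simplest possible case. The Python sketch below compares one explicit and one implicit first order upwind step for the linear advection equation $u_t + a u_x = 0$ with $a > 0$ at a Courant number well above the CFL limit. It is only an illustration under stated assumptions, not the high order scheme of the paper: the problem is scalar and linear, the inflow boundary value is prescribed, and the function names are hypothetical.

```python
def explicit_upwind_step(u, nu, inflow):
    """Forward-Euler upwind step; nu = a*dt/dx. Stable only for nu <= 1."""
    new = []
    left = inflow
    for ui in u:
        new.append(ui - nu * (ui - left))
        left = ui  # old-time value of the left neighbour
    return new

def implicit_upwind_step(u, nu, inflow):
    """Backward-Euler upwind step, solved by forward substitution:
    (1 + nu) * new[i] - nu * new[i-1] = u[i], with new[-1] = inflow."""
    new = []
    left = inflow
    for ui in u:
        left = (ui + nu * left) / (1.0 + nu)  # new-time value of cell i
        new.append(left)
    return new

# A step profile advanced with Courant number 10.
u0 = [1.0] * 10 + [0.0] * 10
u_explicit = explicit_upwind_step(u0, 10.0, inflow=1.0)
u_implicit = u0
for _ in range(5):
    u_implicit = implicit_upwind_step(u_implicit, 10.0, inflow=1.0)
```

Each implicit update is a convex combination of the old cell value and the new left neighbour, so the implicit solution stays within the bounds of the data for any Courant number, whereas the explicit step overshoots immediately.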
As a crossover frontier of physics and mechanics, quantum computing is showing great potential in computational mechanics. However, quantum hardware noise remains a critical barrier to achieving accurate simulation results, owing to the limitations of current hardware. In this paper, we integrate error-mitigated quantum computing into data-driven computational mechanics, where the zero-noise extrapolation (ZNE) technique is employed to improve the accuracy of quantum computing. Numerical examples, including a multiscale simulation of a composite L-shaped beam, are conducted with the quantum computer simulator Qpanda, and the results validate the effectiveness of the proposed method. We believe this work presents a promising step towards using the power of quantum computing in computational mechanics.
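The idea behind zero-noise extrapolation can be sketched as follows: an expectation value is measured at several deliberately amplified noise levels, and a fitted curve is extrapolated back to the zero-noise limit. The Python sketch below uses Richardson (polynomial) extrapolation, which is one common choice; the abstract does not state which extrapolation is used, and the function name and the numbers in the usage example are hypothetical placeholders rather than results.

```python
def richardson_zero_noise(scale_factors, noisy_values):
    """Evaluate at zero the polynomial interpolating the points
    (scale_factors[i], noisy_values[i]), using Lagrange weights.
    The scale factors must be distinct."""
    if len(scale_factors) != len(noisy_values):
        raise ValueError("need one measured value per noise scale factor")
    estimate = 0.0
    for i, (li, vi) in enumerate(zip(scale_factors, noisy_values)):
        weight = 1.0
        for j, lj in enumerate(scale_factors):
            if j != i:
                weight *= lj / (lj - li)  # Lagrange basis i evaluated at 0
        estimate += weight * vi
    return estimate

# Synthetic example: expectation values that decay quadratically with the
# noise scale factor (placeholder numbers, not measured data).
scales = [1.0, 2.0, 3.0]
values = [1.0 - 0.2 * s + 0.05 * s * s for s in scales]
mitigated = richardson_zero_noise(scales, values)
```

With three scale factors the extrapolation is exact for any quadratic dependence on the noise level; on real hardware the measured values carry statistical error, which the extrapolation amplifies.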
Deep neural networks (DNNs) often fail silently with over-confident predictions on out-of-distribution (OOD) samples, posing risks in real-world deployments. Existing techniques predominantly emphasize either the feature representation space or the gradient norms computed with respect to DNN parameters, yet they overlook the intricate gradient distribution and the topology of classification regions. To address this gap, we introduce GRadient-aware Out-Of-Distribution detection in interpolated manifolds (GROOD), a novel framework that relies on the discriminative power of gradient space to distinguish between in-distribution (ID) and OOD samples. To build this space, GROOD relies on class prototypes together with a prototype that specifically captures OOD characteristics. Uniquely, our approach incorporates a targeted mix-up operation at an early intermediate layer of the DNN to refine the separation of gradient spaces between ID and OOD samples. We quantify OOD detection efficacy using the distance to the nearest neighbor gradients derived from the training set, yielding a robust OOD score. Experimental evaluations substantiate that the introduction of targeted input mix-up amplifies the separation between ID and OOD in the gradient space, yielding impressive results across diverse datasets. Notably, when benchmarked against ImageNet-1k, GROOD surpasses the established robustness of state-of-the-art baselines. Through this work, we establish the utility of leveraging gradient spaces and class prototypes for enhanced OOD detection for DNNs in image classification.
Partitioned neural network functions are used to approximate the solution of partial differential equations. The problem domain is partitioned into non-overlapping subdomains, and a neural network function is defined on each subdomain to approximate the solution there. To obtain a convergent neural network solution, certain continuity conditions on the partitioned neural network functions across the subdomain interface need to be included in the loss function, which is used to train the parameters of the neural network functions. In our work, by introducing suitable interface values, the loss function is reformulated into a sum of localized loss functions, and each localized loss function is used to train the corresponding local neural network parameters. In addition, to accelerate the convergence of the neural network solution, the localized loss function is enriched with an augmented Lagrangian term, where the interface condition and the boundary condition are enforced as constraints on the local solutions by using Lagrange multipliers. The local neural network parameters and Lagrange multipliers are then found by optimizing the localized loss function. To take advantage of the localized loss functions for parallel computation, an iterative algorithm is also proposed. The training performance and convergence of the proposed algorithms are studied numerically for various test examples.
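The role of the augmented Lagrangian term can be illustrated on a deliberately tiny toy problem. In the Python sketch below, two scalar unknowns `x` and `y` stand in for the local solutions on two subdomains, the quadratic terms stand in for the local losses, and the constraint `x = y` stands in for the interface continuity condition. All of this is an assumption made for illustration: there are no neural networks, the inner minimization is done in closed form rather than by training, and the function name and penalty parameter `mu` are hypothetical.

```python
def solve_two_subdomain_toy(mu=1.0, n_iters=60):
    """Method of multipliers for: minimize (x - 2)^2 + y^2 subject to x = y.

    Augmented Lagrangian:
        L(x, y, lam) = (x - 2)^2 + y^2 + lam*(x - y) + (mu/2)*(x - y)^2
    """
    lam = 0.0
    x = y = 0.0
    for _ in range(n_iters):
        # Inner step: minimize L over (x, y) for fixed lam. Setting the
        # gradient to zero gives a 2x2 linear system with determinant det.
        det = 4.0 + 4.0 * mu
        x = (8.0 + 4.0 * mu - 2.0 * lam) / det
        y = (2.0 * lam + 4.0 * mu) / det
        # Outer step: update the multiplier with the constraint violation.
        lam += mu * (x - y)
    return x, y, lam
```

For this toy problem the constraint violation shrinks by a factor of $1/(1+\mu)$ per outer iteration, and the iterates approach $x = y = 1$ with multiplier $\lambda = 2$.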
The conventional computing paradigm struggles to fulfill the rapidly growing demands of emerging applications, especially those for machine intelligence, because much of the power and energy is consumed by constant data transfers between logic and memory modules. A new paradigm, called "computational random-access memory (CRAM)", has emerged to address this fundamental limitation. CRAM performs logic operations directly using the memory cells themselves, without the data ever leaving the memory. The energy and performance benefits of CRAM for both conventional and emerging applications have been well established by prior numerical studies. However, an experimental demonstration and study of CRAM that evaluates its computation accuracy, a realistic and application-critical metric for its technological feasibility and competitiveness, has been lacking. In this work, a CRAM array based on magnetic tunnel junctions (MTJs) is experimentally demonstrated. First, basic memory operations as well as 2-, 3-, and 5-input logic operations are studied. Then, a 1-bit full adder with two different designs is demonstrated. Based on the experimental results, a suite of models has been developed to characterize the accuracy of CRAM computation. Further analysis of scalar addition, multiplication, and matrix multiplication shows promising results. These results are then applied to a complete application, a neural network based handwritten digit classifier, as an example to show the connection between the application performance and further MTJ development. The classifier achieved almost-perfect classification accuracy under reasonable projections of future MTJ development. With the confirmation of MTJ-based CRAM's accuracy, there is a strong case that this technology will have a significant impact on power- and energy-demanding applications of machine intelligence.
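To make the link between multi-input logic operations and a 1-bit full adder concrete, the Python sketch below builds a full adder in two textbook ways: from 3- and 5-input majority gates, and from 2-input NAND gates. The abstract does not say which two designs were demonstrated, so this pairing is an assumption, the function names are hypothetical, and the code models ideal, error-free gates rather than MTJ behaviour.

```python
def maj(*bits):
    """Majority vote over an odd number of input bits."""
    return int(sum(bits) > len(bits) // 2)

def nand(a, b):
    """2-input NAND on bits."""
    return 1 - (a & b)

def full_adder_majority(a, b, cin):
    """Full adder from one 3-input and one 5-input majority gate."""
    cout = maj(a, b, cin)
    not_cout = 1 - cout
    total = maj(a, b, cin, not_cout, not_cout)
    return total, cout

def full_adder_nand(a, b, cin):
    """Full adder from nine 2-input NAND gates."""
    n1 = nand(a, b)
    half = nand(nand(a, n1), nand(b, n1))          # a XOR b
    n4 = nand(half, cin)
    total = nand(nand(half, n4), nand(cin, n4))    # (a XOR b) XOR cin
    cout = nand(n1, n4)                            # ab OR ((a XOR b) AND cin)
    return total, cout
```

Both functions return the pair (sum bit, carry-out bit), and they agree with ordinary binary addition on all eight input combinations.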