The proximal Galerkin finite element method is a high-order, low iteration complexity, nonlinear numerical method that preserves the geometric and algebraic structure of pointwise bound constraints in infinite-dimensional function spaces. This paper introduces the proximal Galerkin method and applies it to solve free boundary problems, enforce discrete maximum principles, and develop a scalable, mesh-independent algorithm for optimal design problems with pointwise bound constraints. This paper also provides a derivation of the latent variable proximal point (LVPP) algorithm, an unconditionally stable alternative to the interior point method. LVPP is an infinite-dimensional optimization algorithm that may be viewed as having an adaptive barrier function that is updated with a new informative prior at each (outer loop) optimization iteration. One of its main benefits is witnessed when analyzing the classical obstacle problem. Therein, we find that the original variational inequality can be replaced by a sequence of partial differential equations (PDEs) that are readily discretized and solved with, e.g., high-order finite elements. Throughout this work, we arrive at several unexpected contributions that may be of independent interest. These include (1) a semilinear PDE we refer to as the entropic Poisson equation; (2) an algebraic/geometric connection between high-order positivity-preserving discretizations and certain infinite-dimensional Lie groups; and (3) a gradient-based, bound-preserving algorithm for two-field density-based topology optimization. The complete latent variable proximal Galerkin methodology combines ideas from nonlinear programming, functional analysis, tropical algebra, and differential geometry and can potentially lead to new synergies among these areas as well as within variational and numerical analysis.
Sparse attention is an efficient method that can significantly decrease computation cost, but current sparse attention methods tend to rely on window self-attention, which blocks the global information flow. To address this problem, we present Shifted Cross Chunk Attention (SCCA), which uses different KV shifting strategies to extend the receptive field in each attention layer. In addition, we combine Dilated Attention (DA) and Dilated Neighborhood Attention (DNA) to present Shifted Dilated Attention (SDA). Both SCCA and SDA can accumulate attention results across the heads of multi-head attention to approximate the receptive field of full attention. In this paper, we conduct language modeling experiments using different patterns of SCCA and combinations of SCCA and SDA. Combined with Positional Interpolation (PI) and LoRA, the proposed SCCA extends large language models (LLMs) to longer contexts more effectively than current sparse attention methods. Notably, SCCA adapts LLaMA2 7B from a 4k context to 8k on a single V100. This attention pattern provides a plug-and-play fine-tuning method that extends model context while retaining the original architecture, and it is compatible with most existing techniques.
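The effect of KV shifting on the receptive field can be illustrated at the level of index sets. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the cyclic shift by half a chunk, and the per-layer (rather than per-head) shift schedule are all hypothetical choices made for illustration, and the sequence length is assumed to be a multiple of the chunk size.

```python
def chunk_attention_keys(seq_len, chunk, kv_shift=0):
    """For each query position, return the set of key positions it may attend to.

    Queries are split into consecutive chunks of size `chunk`. Keys/values are
    cyclically shifted by `kv_shift` positions before chunking (an assumed
    shifting scheme), so query chunk c sees the original key positions
    (j + kv_shift) mod seq_len for every slot j in chunk c.
    """
    keys_per_query = []
    for q in range(seq_len):
        start = (q // chunk) * chunk
        keys = {(j + kv_shift) % seq_len for j in range(start, start + chunk)}
        keys_per_query.append(keys)
    return keys_per_query


def receptive_field(layers, query):
    """Input positions that can influence `query` after stacking `layers`.

    `layers` is ordered bottom to top. Each position also keeps its own
    information, which models the residual connection.
    """
    reach = {query}
    for keys_per_query in reversed(layers):
        reach = set().union(*({p} | keys_per_query[p] for p in reach))
    return reach


seq_len, chunk = 16, 4
plain = chunk_attention_keys(seq_len, chunk)
shifted = chunk_attention_keys(seq_len, chunk, kv_shift=chunk // 2)

# Three window-only layers versus three layers with one shifted layer.
window_only = receptive_field([plain, plain, plain], query=0)
with_shift = receptive_field([plain, shifted, plain], query=0)
print(sorted(window_only))
print(sorted(with_shift))
```

In this toy setting the window-only stack never leaves the first chunk, while inserting a single shifted layer lets position 0 reach two chunks. This is the kind of cross-chunk information flow that the abstract attributes to KV shifting.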
Calibration tests based on the probability integral transform (PIT) are routinely used to assess the quality of univariate distributional forecasts. However, PIT-based calibration tests for multivariate distributional forecasts face various challenges. We propose two new types of tests based on proper scoring rules, which overcome these challenges. They arise from a general framework for calibration testing in the multivariate case, introduced in this work. The new tests have good size and power properties in simulations and solve various problems of existing tests. We apply the tests to forecast distributions for macroeconomic and financial time series data.
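As background for the univariate case mentioned above, the PIT of an observation is the forecast CDF evaluated at that observation. For a calibrated forecast, the PIT values are uniformly distributed on [0, 1]. The sketch below is illustrative only and does not implement the proposed multivariate tests. It assumes Gaussian forecast distributions and synthetic data, and the helper names are hypothetical.

```python
import math
import random


def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def pit_values(observations, means, sds):
    """PIT of each observation under its Gaussian forecast distribution."""
    return [normal_cdf(y, m, s) for y, m, s in zip(observations, means, sds)]


def pit_histogram(pits, n_bins=10):
    """Counts of PIT values in equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for u in pits:
        counts[min(int(u * n_bins), n_bins - 1)] += 1
    return counts


# Synthetic data: observations drawn from N(mean_t, 1).
rng = random.Random(0)
n = 2000
means = [rng.uniform(-1.0, 1.0) for _ in range(n)]
obs = [rng.gauss(m, 1.0) for m in means]

# A calibrated forecast uses the true spread; an overconfident one halves it.
calibrated = pit_histogram(pit_values(obs, means, [1.0] * n))
overconfident = pit_histogram(pit_values(obs, means, [0.5] * n))
print(calibrated)
print(overconfident)
```

The calibrated forecast yields a roughly flat histogram, whereas the overconfident forecast piles PIT values into the outer bins. A PIT-based test formalizes this visual check as a test of uniformity.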
Linear complementary pairs (LCPs) of codes have been studied since they were introduced in the context of discussing mitigation measures against possible hardware attacks on integrated circuits. Since the security parameters for LCPs of codes are defined from the (Hamming) distance and the dual distance of the codes in the pair, and the additional algebraic structure of skew constacyclic codes provides tools for studying the dual and the distance of a code, we study the properties of LCPs of skew constacyclic codes. As a result, we give a characterization for those pairs, as well as multiple results that lead to constructing pairs with designed security parameters. We extend skew BCH codes to a constacyclic context and show that an LCP of codes can be immediately constructed from a skew BCH constacyclic code. Additionally, we describe a Hamming weight-preserving automorphism group in the set of skew constacyclic codes, which can be used for constructing LCPs of codes.
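To make the notion of an LCP concrete, the following brute-force sketch checks a small pair of ordinary binary linear codes. It does not involve skew constacyclic codes. It assumes the usual definitions: a pair (C, D) of linear codes of length n is an LCP when C and D intersect trivially and together span the whole space, and the security parameter is taken to be min(d(C), d(D^⊥)). The function names and the example codes are hypothetical.

```python
from itertools import product


def span(generators):
    """All codewords of the binary linear code spanned by `generators`."""
    n = len(generators[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        word = tuple(
            sum(c * g[i] for c, g in zip(coeffs, generators)) % 2
            for i in range(n)
        )
        words.add(word)
    return words


def dual(code, n):
    """Dual code: all binary vectors orthogonal to every codeword."""
    return {
        v
        for v in product((0, 1), repeat=n)
        if all(sum(a * b for a, b in zip(v, w)) % 2 == 0 for w in code)
    }


def min_distance(code):
    """Minimum Hamming weight of a nonzero codeword."""
    return min(sum(w) for w in code if any(w))


def is_lcp(c, d, n):
    """True if C and D intersect only in zero and |C| * |D| = 2^n."""
    zero = (0,) * n
    return (c & d) == {zero} and len(c) * len(d) == 2 ** n


n = 3
C = span([(1, 1, 1)])             # repetition code
D = span([(1, 1, 0), (0, 1, 1)])  # even-weight code

pair_ok = is_lcp(C, D, n)
security = min(min_distance(C), min_distance(dual(D, n)))
print(pair_ok, security)
```

Exhaustive enumeration is only feasible for tiny lengths. The algebraic structure studied in the abstract is what makes designing such pairs tractable at realistic sizes.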
In this work, an efficient and robust isogeometric three-dimensional solid-beam finite element is developed for large deformations and finite rotations with only displacements as degrees of freedom. Finite strain theory and hyperelastic constitutive models are considered, and B-splines and NURBS are employed for the finite element discretization. Like finite elements based on Lagrange polynomials, NURBS-based formulations are affected by the non-physical phenomenon of locking, which constrains the field variables, reduces solution accuracy, and deteriorates convergence behavior. To avoid this problem within the context of a solid-beam formulation, the Assumed Natural Strain (ANS) method is applied to alleviate membrane and transverse shear locking, and the Enhanced Assumed Strain (EAS) method is applied against Poisson thickness locking. Furthermore, the Mixed Integration Point (MIP) method is employed to make the formulation more efficient and robust. The proposed novel isogeometric solid-beam element is tested on several single-patch and multi-patch benchmark problems and validated against classical solid finite elements and isoparametric solid-beam elements. The results show that the proposed formulation alleviates the locking effects and significantly improves the performance of the isogeometric solid-beam element. With the developed element, efficient and accurate predictions of the mechanical properties of lattice-based structured materials can be achieved. The proposed solid-beam element inherits the merits of both solid elements (e.g., flexible boundary conditions) and beam elements (i.e., higher computational efficiency).
In survival analysis, complex machine learning algorithms have been increasingly used for predictive modeling. Given a collection of features available for inclusion in a predictive model, it may be of interest to quantify the relative importance of a subset of features for the prediction task at hand. In particular, in HIV vaccine trials, participant baseline characteristics are used to predict the probability of infection over the intended follow-up period, and investigators may wish to understand how much certain types of predictors, such as behavioral factors, contribute toward overall predictiveness. Time-to-event outcomes such as time to infection are often subject to right censoring, and existing methods for assessing variable importance are typically not intended to be used in this setting. We describe a broad class of algorithm-agnostic variable importance measures for prediction in the context of survival data. We propose a nonparametric efficient estimation procedure that incorporates flexible learning of nuisance parameters, yields asymptotically valid inference, and enjoys double-robustness. We assess the performance of our proposed procedure via numerical simulations and analyze data from the HVTN 702 study to inform enrollment strategies for future HIV vaccine trials.
Singularly perturbed boundary value problems pose a significant challenge for numerical approximation because of the presence of sharp boundary layers. These sharp boundary layers are responsible for the stiffness of solutions, which leads to large computational errors if not properly handled. It is well known that classical numerical methods, as well as Physics-Informed Neural Networks (PINNs), require special treatment near the boundary, e.g., extensive mesh refinement or finer collocation points, in order to obtain an accurate approximate solution, especially inside the stiff boundary layer. In this article, we modify the PINNs and construct new semi-analytic SL-PINNs suitable for singularly perturbed boundary value problems. Performing a boundary layer analysis, we first find the corrector functions describing the singular behavior of the stiff solutions inside the boundary layers. Then we obtain the SL-PINN approximations of the singularly perturbed problems by embedding the explicit correctors in the structure of the PINNs or by training the correctors together with the PINN approximations. Our numerical experiments confirm that the new SL-PINN methods produce stable and accurate approximations for stiff solutions.
In this work we consider the two-dimensional instationary Navier-Stokes equations with homogeneous Dirichlet/no-slip boundary conditions. We show error estimates for the fully discrete problem, where a discontinuous Galerkin method in time and inf-sup stable finite elements in space are used. Recently, best approximation type error estimates for the Stokes problem in the $L^\infty(I;L^2(\Omega))$, $L^2(I;H^1(\Omega))$ and $L^2(I;L^2(\Omega))$ norms have been shown. The main result of the present work extends the error estimate in the $L^\infty(I;L^2(\Omega))$ norm to the Navier-Stokes equations by pursuing an error splitting approach and an appropriate duality argument. In order to discuss the stability of solutions to the discrete primal and dual equations, a specially tailored discrete Gronwall lemma is presented. The techniques developed for showing the $L^\infty(I;L^2(\Omega))$ error estimate also allow us to show best approximation type error estimates in the $L^2(I;H^1(\Omega))$ and $L^2(I;L^2(\Omega))$ norms, which complement this work.
The accurate and efficient evaluation of Newtonian potentials over general 2-D domains is important for the numerical solution of Poisson's equation and volume integral equations. In this paper, we present a simple and efficient high-order algorithm for computing the Newtonian potential over a planar domain discretized by an unstructured mesh. The algorithm is based on the use of Green's third identity for transforming the Newtonian potential into a collection of layer potentials over the boundaries of the mesh elements, which can be easily evaluated by the Helsing-Ojala method. One important component of our algorithm is the use of high-order (up to order 20) bivariate polynomial interpolation in the monomial basis, for which we provide extensive justification. The performance of our algorithm is illustrated through several numerical experiments.
Finite-dimensional truncations are routinely used to approximate partial differential equations (PDEs), either to obtain numerical solutions or to derive reduced-order models. The resulting discretized equations are known to violate certain physical properties of the system. In particular, first integrals of the PDE may not remain invariant after discretization. Here, we use the method of reduced-order nonlinear solutions (RONS) to ensure that the conserved quantities of the PDE survive its finite-dimensional truncation. In particular, we develop two methods: Galerkin RONS and finite volume RONS. Galerkin RONS ensures the conservation of first integrals in Galerkin-type truncations, whether used for direct numerical simulations or reduced-order modeling. Similarly, finite volume RONS conserves any number of first integrals of the system, including its total energy, after finite volume discretization. Both methods are applicable to general time-dependent PDEs and can be easily incorporated in existing Galerkin-type or finite volume code. We demonstrate the efficacy of our methods on two examples: direct numerical simulations of the shallow water equation and a reduced-order model of the nonlinear Schrödinger equation. As a byproduct, we also generalize RONS to phenomena described by a system of PDEs.
We consider the numerical approximation of a continuum model of antiferromagnetic and ferrimagnetic materials. The state of the material is described in terms of two unit-length vector fields, which can be interpreted as the magnetizations averaging the spins of two sublattices. For the static setting, which requires the solution of a constrained energy minimization problem, we introduce a discretization based on first-order finite elements and prove its $\Gamma$-convergence. Then, we propose and analyze two iterative algorithms for the computation of low-energy stationary points. The algorithms are obtained from (semi-)implicit time discretizations of gradient flows of the energy. Finally, we extend the algorithms to the dynamic setting, which consists of a nonlinear system of two Landau-Lifshitz-Gilbert equations solved by the two fields, and we prove unconditional stability and convergence of the finite element approximations toward a weak solution of the problem. Numerical experiments assess the performance of the algorithms and demonstrate their applicability for the simulation of physical processes involving antiferromagnetic and ferrimagnetic materials.