This paper proposes a new parameterized enhanced shift-splitting (PESS) preconditioner to solve the three-by-three block saddle point problem (SPP). Additionally, we introduce a local PESS (LPESS) preconditioner by relaxing the PESS preconditioner. Necessary and sufficient criteria are established for the convergence of the proposed PESS iterative process for any initial guess. Furthermore, we investigate the spectral bounds of the PESS and LPESS preconditioned matrices. Moreover, an empirical sensitivity analysis of the proposed PESS preconditioner reveals its robustness. Numerical experiments are carried out to demonstrate the enhanced efficiency and robustness of the proposed PESS and LPESS preconditioners compared to the existing state-of-the-art preconditioners.
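To make the idea of a shift-splitting iteration concrete, the following is a minimal sketch of the classical, unparameterized shift-splitting scheme, not the PESS or LPESS preconditioner itself. The block layout in `block_saddle_matrix`, the function names, the shift `alpha`, and the small test data are all assumptions chosen for illustration.

```python
import numpy as np

def block_saddle_matrix(A, B, C):
    """Assemble K = [[A, B^T, 0], [-B, 0, -C^T], [0, C, 0]] (assumed block layout)."""
    n, m, l = A.shape[0], B.shape[0], C.shape[0]
    return np.block([
        [A, B.T, np.zeros((n, l))],
        [-B, np.zeros((m, m)), -C.T],
        [np.zeros((l, n)), C, np.zeros((l, l))],
    ])

def shift_splitting_solve(K, b, alpha=1.0, tol=1e-10, max_iter=500):
    """Iterate (alpha*I + K) x_new = (alpha*I - K) x_old + 2*b until the residual is small."""
    identity = np.eye(K.shape[0])
    lhs = alpha * identity + K
    rhs_op = alpha * identity - K
    x = np.zeros_like(b, dtype=float)
    for iteration in range(1, max_iter + 1):
        x = np.linalg.solve(lhs, rhs_op @ x + 2.0 * b)
        if np.linalg.norm(b - K @ x) <= tol * np.linalg.norm(b):
            break
    return x, iteration

# Small illustrative data: A is symmetric positive definite, B and C have full row rank.
A = np.diag([2.0, 3.0, 4.0, 5.0])
B = np.hstack([np.eye(3), np.zeros((3, 1))])
C = np.hstack([np.eye(2), np.zeros((2, 1))])
K = block_saddle_matrix(A, B, C)
b = np.ones(K.shape[0])

x, iterations = shift_splitting_solve(K, b)

# Spectral radius of the iteration matrix (alpha*I + K)^{-1} (alpha*I - K) with alpha = 1.
identity = np.eye(K.shape[0])
iteration_matrix = np.linalg.solve(identity + K, identity - K)
spectral_radius = max(abs(np.linalg.eigvals(iteration_matrix)))
residual_norm = np.linalg.norm(b - K @ x)
print(iterations, spectral_radius, residual_norm)
```

In this sketch the iteration matrix is a Cayley transform of `K`, so its spectral radius is below one whenever all eigenvalues of `K` have positive real part, which is the case for the data above.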
Zero-shot subject-driven image generation aims to produce images that incorporate a subject from a given example image. The challenge lies in preserving the subject's identity while aligning with the text prompt, which often requires modifying certain aspects of the subject's appearance. Despite advancements in diffusion-model-based methods, existing approaches still struggle to balance identity preservation with text prompt alignment. In this study, we conducted an in-depth investigation into this issue and uncovered key insights for achieving effective identity preservation while maintaining a strong balance. Our key findings include: (1) the design of the subject image encoder significantly impacts identity preservation quality, and (2) separating text and subject guidance is crucial for both text alignment and identity preservation. Building on these insights, we introduce a new approach called EZIGen, which employs two main strategies: a carefully crafted subject image encoder based on the pretrained UNet of the Stable Diffusion model to ensure high-quality identity transfer, followed by a process that decouples the guidance stages and iteratively refines the initial image layout. Through these strategies, EZIGen achieves state-of-the-art results on multiple subject-driven benchmarks with a unified model and 100 times less training data. The demo page is available at: //zichengduan.github.io/pages/EZIGen/index.html.
This paper introduces a novel approach that combines Proper Orthogonal Decomposition (POD) with Thermodynamics-based Artificial Neural Networks (TANN) to capture the macroscopic behavior of complex inelastic systems and derive macroelements in geomechanics. The methodology leverages POD to extract macroscopic Internal State Variables from microscopic state information, thereby enriching the macroscopic state description used to train an energy potential network within the TANN framework. The thermodynamic consistency provided by TANN, combined with the hierarchical nature of POD, allows the reproduction of complex, non-linear inelastic material behaviors as well as the responses of macroscopic geomechanical systems. The approach is validated through applications of increasing complexity, demonstrating its capability to reproduce high-fidelity simulation data. The applications proposed include the homogenization of continuous inelastic representative unit cells and the derivation of a macroelement for a geotechnical system involving a monopile in a clay layer subjected to horizontal loading. Finally, the projection operators directly obtained via POD are exploited to easily reconstruct the microscopic fields. The results indicate that the POD-TANN approach not only offers accuracy in reproducing the studied constitutive responses, but also reduces computational costs, making it a practical tool for the multiscale modeling of heterogeneous inelastic geomechanical systems.
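The POD step described above can be sketched with a singular value decomposition of a snapshot matrix. This is only an illustration under assumptions: the synthetic field, the helper `pod_basis`, and the choice of two modes are hypothetical, and the TANN training stage is not shown.

```python
import numpy as np

def pod_basis(snapshots, num_modes):
    """Return the leading POD modes (as columns) and all singular values of a snapshot matrix."""
    modes, singular_values, _ = np.linalg.svd(snapshots, full_matrices=False)
    return modes[:, :num_modes], singular_values

# Synthetic "microscopic" field sampled at 200 points and 50 load steps (rank 2 by construction).
x = np.linspace(0.0, 1.0, 200)[:, None]
t = np.linspace(0.0, 1.0, 50)[None, :]
snapshots = (np.sin(np.pi * x) * np.cos(2.0 * t)
             + 0.5 * np.sin(2.0 * np.pi * x) * np.sin(3.0 * t))

modes, singular_values = pod_basis(snapshots, num_modes=2)

# Projection onto the modes gives reduced coordinates (candidate macroscopic state variables).
reduced_coordinates = modes.T @ snapshots
# The same projection operator maps the reduced coordinates back to the microscopic field.
reconstruction = modes @ reduced_coordinates

relative_error = np.linalg.norm(snapshots - reconstruction) / np.linalg.norm(snapshots)
print(reduced_coordinates.shape, relative_error)
```

Because the synthetic field has exactly two independent spatial patterns, two modes reconstruct it up to rounding error; real microscopic data would need a truncation criterion, for example one based on the decay of `singular_values`.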
We propose a CPU-GPU heterogeneous computing method for solving time-evolution partial differential equation problems many times with guaranteed accuracy, in short time-to-solution and low energy-to-solution. On a single-GH200 node, the proposed method improved the computation speed by 86.4 and 8.67 times compared to the conventional method run only on CPU and only on GPU, respectively. Furthermore, the energy-to-solution was reduced by 32.2-fold (from 9944 J to 309 J) and 7.01-fold (from 2163 J to 309 J) when compared to using only the CPU and GPU, respectively. Using the proposed method on the Alps supercomputer, a 51.6-fold and 6.98-fold speedup was attained when compared to using only the CPU and GPU, respectively, and a high weak scaling efficiency of 94.3% was obtained up to 1,920 compute nodes. These implementations were realized using directive-based parallel programming models while enabling portability, indicating that directives are highly effective in analyses in heterogeneous computing environments.
In this article, we propose and study a stochastic and relaxed preconditioned Douglas--Rachford splitting method to solve saddle-point problems that have separable dual variables. We prove the almost sure convergence of the iteration sequences in Hilbert spaces for a class of convex-concave and nonsmooth saddle-point problems. We also provide the sublinear convergence rate for the ergodic sequence concerning the expectation of the restricted primal-dual gap functions. Numerical experiments show the high efficiency of the proposed stochastic and relaxed preconditioned Douglas--Rachford splitting methods.
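For orientation, the sketch below shows a plain relaxed Douglas-Rachford iteration on a toy constrained least-squares problem. It is deterministic and unpreconditioned, so it is not the stochastic method proposed above; the problem, the function name, and the parameters `step` and `relaxation` are assumptions made for illustration.

```python
import numpy as np

def relaxed_douglas_rachford(a, step=1.0, relaxation=1.0, num_iters=200):
    """Minimize 0.5*||x - a||^2 subject to 0 <= x <= 1 with a relaxed Douglas-Rachford iteration."""
    a = np.asarray(a, dtype=float)
    z = np.zeros_like(a)
    for _ in range(num_iters):
        x = np.clip(z, 0.0, 1.0)                      # prox of the box indicator (projection)
        y = (2.0 * x - z + step * a) / (1.0 + step)   # prox of step * 0.5*||. - a||^2 at 2x - z
        z = z + relaxation * (y - x)                  # relaxed update, relaxation in (0, 2)
    return np.clip(z, 0.0, 1.0)

target = np.array([-0.5, 0.3, 1.7])
solution = relaxed_douglas_rachford(target)
print(solution)
```

The minimizer of this toy problem is the projection of `target` onto the box, which gives a simple reference against which the iterates can be checked.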
Finding vertex-to-vertex correspondences in real-world graphs is a challenging task with applications in a wide variety of domains. Structural matching based on graph connectivity has attracted considerable attention, while the integration of all the other information stemming from vertex and edge attributes has been mostly left aside. Here we present the Graph Attributes and Structure Matching (GASM) algorithm, which provides high-quality solutions by integrating all the available information in a unified framework. Parameters quantifying the reliability of the attributes can tune how much the solutions should rely on the structure or on the attributes. We further show that even without attributes GASM consistently finds solutions as good as or better than those of state-of-the-art algorithms, with similar processing times.
Matrix perturbation bounds (such as Weyl and Davis-Kahan) are frequently used in many branches of mathematics. Most of the classical results in this area are optimal in the worst-case analysis. However, in modern applications, both the ground and the noise matrices frequently have extra structural properties. For instance, it is often assumed that the ground matrix is essentially low rank, and the noise matrix is random or pseudo-random. We aim to rebuild a part of perturbation theory, adapting to these modern assumptions. We will do this using a contour expansion argument, which enables us to exploit the skewness among the leading eigenvectors of the ground and the noise matrix (which is significant when the two are uncorrelated) to our advantage. In the current paper, we focus on the perturbation of eigenspaces. This helps us to introduce the arguments in the cleanest way, avoiding the more technical consideration of the general case. In applications, this case is also one of the most useful. More general results appear in a subsequent paper. Our method has led to several improvements, which have direct applications in central problems. Among others, we derive a sharp result for the perturbation of a low rank matrix with random perturbation, answering an open question in this area. Next, we derive new results concerning the spike model, an important model in statistics, bridging two different directions of current research. Finally, we use our results on the perturbation of eigenspaces to derive new results concerning eigenvalues of deterministic and random matrices. In particular, we obtain new results concerning the outliers in the deformed Wigner model and the least singular value of random matrices with non-zero mean.
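As a small numerical illustration of the setting, the sketch below perturbs a low-rank symmetric ground matrix by symmetric Gaussian noise, checks Weyl's inequality (every eigenvalue moves by at most the spectral norm of the noise), and measures the largest principal angle between the leading eigenspaces. The dimensions, eigenvalues, seed, and noise scaling are arbitrary assumptions, and no bound from the paper is reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
n, rank = 200, 3

# Low-rank symmetric "ground" matrix with eigenvalues 30, 20, 10 (illustrative values).
basis, _ = np.linalg.qr(rng.standard_normal((n, rank)))
ground = basis @ np.diag([30.0, 20.0, 10.0]) @ basis.T

# Symmetric Gaussian noise, scaled so that its spectral norm is roughly 2.
gaussian = rng.standard_normal((n, n))
noise = (gaussian + gaussian.T) / np.sqrt(2.0 * n)
noise_norm = np.linalg.norm(noise, 2)

# Weyl: sorted eigenvalues of ground and ground + noise differ by at most ||noise||_2.
eig_ground = np.linalg.eigvalsh(ground)
eig_perturbed, vectors = np.linalg.eigh(ground + noise)
max_eigenvalue_shift = np.max(np.abs(eig_perturbed - eig_ground))

# Largest principal angle between the leading eigenspaces (sin-theta distance).
leading = vectors[:, -rank:]
cosines = np.linalg.svd(basis.T @ leading, compute_uv=False)
sin_theta = np.sqrt(max(0.0, 1.0 - cosines.min() ** 2))

print(noise_norm, max_eigenvalue_shift, sin_theta)
```

Comparing `sin_theta` with the ratio of `noise_norm` to the smallest eigenvalue gap gives a rough sense of how conservative a worst-case estimate is for a particular random draw.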
In this paper, we apply quasi-Monte Carlo (QMC) methods with an initial preintegration step to estimate cumulative distribution functions and probability density functions in uncertainty quantification (UQ). The distribution and density functions correspond to a quantity of interest involving the solution to an elliptic partial differential equation (PDE) with a lognormally distributed coefficient and a normally distributed source term. There is extensive previous work on using QMC to compute expected values in UQ, which has proven very successful in tackling a range of different PDE problems. However, the use of QMC for density estimation applied to UQ problems is explored here for the first time. Density estimation presents a more difficult challenge than computing the expected value because of discontinuities present in the integral formulations of both the distribution and density. Our strategy is to use preintegration to eliminate the discontinuity by integrating out a carefully selected random parameter, so that QMC can be used to approximate the remaining integral. First, we establish regularity results for the PDE quantity of interest that are required for smoothing by preintegration to be effective. We then show that an $N$-point lattice rule can be constructed for the integrands corresponding to the distribution and density, such that after preintegration the QMC error is of order $\mathcal{O}(N^{-1+\epsilon})$ for arbitrarily small $\epsilon>0$. This is the same rate achieved for computing the expected value of the quantity of interest. Numerical results are presented to confirm our theory.
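To show what an $N$-point rank-1 lattice rule looks like in practice, here is a minimal two-dimensional sketch. The Fibonacci generating vector $(1, 55)$ with $N = 89$ and the smooth periodic test integrand are assumptions chosen for illustration; the preintegration step and the lattice construction analysed in the paper are not shown.

```python
import numpy as np

def rank1_lattice_points(num_points, generating_vector):
    """Points {i * z / N}, i = 0, ..., N-1, of a rank-1 lattice rule on the unit cube."""
    indices = np.arange(num_points)[:, None]
    z = np.asarray(generating_vector)[None, :]
    # Reduce modulo N in integer arithmetic before dividing, to avoid rounding in the fractional part.
    return ((indices * z) % num_points) / num_points

def integrand(points):
    """Smooth periodic test function on the unit square whose exact integral is 1."""
    x, y = points[:, 0], points[:, 1]
    return (1.0 + np.sin(2.0 * np.pi * x)) * (1.0 + np.cos(2.0 * np.pi * y))

points = rank1_lattice_points(89, (1, 55))   # Fibonacci lattice: N = 89, z = (1, 55)
estimate = integrand(points).mean()          # equal-weight QMC estimate of the integral
print(estimate)
```

For this particular trigonometric integrand the lattice rule is exact up to rounding, because none of its nonzero frequencies is orthogonal to the generating vector modulo 89.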
The main purpose of this paper is to design a local discontinuous Galerkin (LDG) method for the Benjamin-Ono equation. We analyze the stability and error estimates for the semi-discrete LDG scheme. We prove that the scheme is $L^2$-stable and that it converges at a rate $\mathcal{O}(h^{k+1/2})$ for a general nonlinear flux. Furthermore, we develop a fully discrete LDG scheme using the four-stage fourth-order Runge-Kutta method and show that the devised scheme is strongly stable in the case of linear flux, using a two-step and three-step stability approach under an appropriate time step constraint. Numerical examples are provided to validate the efficiency and accuracy of the method.
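The time integrator named above can be sketched as the classical four-stage fourth-order Runge-Kutta step. The test problem below, a two-dimensional rotation with a skew-symmetric operator, is an assumption standing in for the LDG spatial discretization, which is not shown.

```python
import numpy as np

def rk4_step(rhs, u, dt):
    """One step of the classical four-stage fourth-order Runge-Kutta method."""
    k1 = rhs(u)
    k2 = rhs(u + 0.5 * dt * k1)
    k3 = rhs(u + 0.5 * dt * k2)
    k4 = rhs(u + dt * k3)
    return u + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def integrate(rhs, u0, final_time, num_steps):
    """Advance u' = rhs(u) from t = 0 to final_time with num_steps equal RK4 steps."""
    u = np.array(u0, dtype=float)
    dt = final_time / num_steps
    for _ in range(num_steps):
        u = rk4_step(rhs, u, dt)
    return u

# Linear test problem u' = L u with skew-symmetric L; exact solution is (cos t, -sin t).
L = np.array([[0.0, 1.0], [-1.0, 0.0]])

def linear_rhs(u):
    return L @ u

exact = np.array([np.cos(1.0), -np.sin(1.0)])
error_coarse = np.linalg.norm(integrate(linear_rhs, [1.0, 0.0], 1.0, 10) - exact)
error_fine = np.linalg.norm(integrate(linear_rhs, [1.0, 0.0], 1.0, 20) - exact)
observed_ratio = error_coarse / error_fine   # close to 2**4 for a fourth-order method
print(error_coarse, error_fine, observed_ratio)
```

Halving the step size should reduce the error by a factor near sixteen, which is a quick sanity check of the fourth-order accuracy.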
We present a fully discrete Crank-Nicolson Fourier-spectral-Galerkin (FSG) scheme for approximating solutions of the fractional Korteweg-de Vries (KdV) equation, which involves a fractional Laplacian with exponent $\alpha \in [1,2]$ and a small dispersion coefficient of order $\varepsilon^2$. The solution in the limit as $\varepsilon \to 0$ is known as the zero dispersion limit. We demonstrate that the semi-discrete FSG scheme conserves the first three integral invariants and is therefore structure-preserving, and that the fully discrete FSG scheme is $L^2$-conservative, ensuring stability. Using a compactness argument, we constructively prove the convergence of the approximate solution to the unique solution of the fractional KdV equation in $C([0,T]; H_p^{1+\alpha}(\mathbb{R}))$ for periodic initial data in $H_p^{1+\alpha}(\mathbb{R})$. The devised scheme achieves spectral accuracy for initial data in $H_p^r$, $r \geq 1+\alpha$, and exponential accuracy for analytic initial data. Additionally, we establish that the approximation of the zero dispersion limit obtained from the fully discrete FSG scheme converges to the solution of the Hopf equation in $L^2$ as $\varepsilon \to 0$, up to the gradient catastrophe time $t_c$. Beyond $t_c$, numerical investigations reveal that the approximation converges to the asymptotic solution, which is weakly described by Whitham's averaged equation within the oscillatory zone for $\alpha = 2$. Numerical results are provided to demonstrate the convergence of the scheme and to validate the theoretical findings.
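The $L^2$-conservation property can be illustrated on the linear dispersive part alone. The sketch below applies a Crank-Nicolson step in Fourier space to $\hat{u}_t = i\,\varepsilon^2 k |k|^{\alpha} \hat{u}$ on a $2\pi$-periodic domain; the sign convention of the symbol, the omission of the nonlinear term, and all parameter values are assumptions, so this is not the full FSG scheme.

```python
import numpy as np

def crank_nicolson_dispersive_step(u_hat, wavenumbers, dt, eps, alpha):
    """Crank-Nicolson step for u_hat' = i * eps^2 * k * |k|^alpha * u_hat (linear part only)."""
    omega = eps**2 * wavenumbers * np.abs(wavenumbers) ** alpha
    # The multiplier (1 + i*dt*omega/2) / (1 - i*dt*omega/2) has modulus one for real omega.
    return u_hat * (1.0 + 0.5j * dt * omega) / (1.0 - 0.5j * dt * omega)

num_modes = 64
x = 2.0 * np.pi * np.arange(num_modes) / num_modes
wavenumbers = np.fft.fftfreq(num_modes, d=1.0 / num_modes)   # integer wavenumbers
u_hat = np.fft.fft(np.sin(x) + 0.5 * np.cos(2.0 * x))

# By Parseval's identity, the norm of the Fourier coefficients tracks the discrete L2 norm.
norm_start = np.linalg.norm(u_hat)
for _ in range(100):
    u_hat = crank_nicolson_dispersive_step(u_hat, wavenumbers, dt=0.01, eps=0.1, alpha=2.0)
norm_end = np.linalg.norm(u_hat)
print(norm_start, norm_end)
```

Each Fourier coefficient is multiplied by a unit-modulus factor at every step, so the discrete $L^2$ norm is preserved up to rounding for any time step in this linear setting.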
This paper addresses a special Perspective-n-Point (PnP) problem: estimating the optimal pose to align 3D and 2D shapes in real-time without correspondences, termed correspondence-free PnP. While several studies have focused on 3D and 2D shape registration, achieving both real-time and accurate performance remains challenging. This study specifically targets 3D-2D geometric shape registration tasks, applying the recently developed Reproducing Kernel Hilbert Space (RKHS) formulation to address the "big-to-small" issue. An iterative reweighted least squares method is employed to solve the RKHS-based formulation efficiently. Moreover, our work identifies a unique and interesting observability issue in correspondence-free PnP: the numerical ambiguity between rotation and translation. To address this, we propose DynaWeightPnP, introducing a dynamic weighting sub-problem and an alternative searching algorithm designed to enhance pose estimation and alignment accuracy. Experiments were conducted on a typical case, that is, a 3D-2D vascular centerline registration task within Endovascular Image-Guided Interventions (EIGIs). Results demonstrated that the proposed algorithm achieves registration processing rates of 60 Hz (without post-refinement) and 31 Hz (with post-refinement) on modern single-core CPUs, with accuracy competitive with existing methods. These results underscore the suitability of DynaWeightPnP for future robot navigation tasks like EIGIs.