In this paper, we develop a domain-decomposition method for the generalized Poisson-Boltzmann equation based on a solvent-excluded surface which is widely used in computational chemistry. The solver requires to solve a generalized screened Poisson (GSP) equation defined in $\mathbb{R}^3$ with a space-dependent dielectric permittivity and an ion-exclusion function that accounts for Steric effects. Potential theory arguments transform the GSP equation into two-coupled equations defined in a bounded domain. Then, the Schwarz decomposition method is used to formulate local problems by decomposing the cavity into overlapping balls and only solving a set of coupled sub-equations in each ball in which, the spherical harmonics and the Legendre polynomials are used as basis functions in the angular and radial directions. A series of numerical experiments are presented to test the method.
Causal representation learning algorithms discover lower-dimensional representations of data that admit a decipherable interpretation of cause and effect; as achieving such interpretable representations is challenging, many causal learning algorithms utilize elements indicating prior information, such as (linear) structural causal models, interventional data, or weak supervision. Unfortunately, in exploratory causal representation learning, such elements and prior information may not be available or warranted. Alternatively, scientific datasets often have multiple modalities or physics-based constraints, and the use of such scientific, multimodal data has been shown to improve disentanglement in fully unsupervised settings. Consequently, we introduce a causal representation learning algorithm (causalPIMA) that can use multimodal data and known physics to discover important features with causal relationships. Our innovative algorithm utilizes a new differentiable parametrization to learn a directed acyclic graph (DAG) together with a latent space of a variational autoencoder in an end-to-end differentiable framework via a single, tractable evidence lower bound loss function. We place a Gaussian mixture prior on the latent space and identify each of the mixtures with an outcome of the DAG nodes; this novel identification enables feature discovery with causal relationships. Tested against a synthetic and a scientific dataset, our results demonstrate the capability of learning an interpretable causal structure while simultaneously discovering key features in a fully unsupervised setting.
We introduce a novel structure-preserving method in order to approximate the compressible ideal Magnetohydrodynamics (MHD) equations. This technique addresses the MHD equations using a non-divergence formulation, where the contributions of the magnetic field to the momentum and total mechanical energy are treated as source terms. Our approach uses the Marchuk-Strang splitting technique and involves three distinct components: a compressible Euler solver, a source-system solver, and an update procedure for the total mechanical energy. The scheme allows for significant freedom on the choice of Euler's equation solver, while the magnetic field is discretized using a curl-conforming finite element space, yielding exact preservation of the involution constraints. We prove that the method preserves invariant domain properties, including positivity of density, positivity of internal energy, and the minimum principle of the specific entropy. If the scheme used to solve Euler's equation conserves total energy, then the resulting MHD scheme can be proven to preserve total energy. Similarly, if the scheme used to solve Euler's equation is entropy-stable, then the resulting MHD scheme is entropy stable as well. In our approach, the CFL condition does not depend on magnetosonic wave-speeds, but only on the usual maximum wave speed from Euler's system. To validate the effectiveness of our method, we solve a variety of ideal MHD problems, showing that the method is capable of delivering high-order accuracy in space for smooth problems, while also offering unconditional robustness in the shock hydrodynamics regime as well.
This work presents a comparative study to numerically compute impulse approximate controls for parabolic equations with various boundary conditions. Theoretical controllability results have been recently investigated using a logarithmic convexity estimate at a single time based on a Carleman commutator approach. We propose a numerical algorithm for computing the impulse controls with minimal $L^2$-norms by adapting a penalized Hilbert Uniqueness Method (HUM) combined with a Conjugate Gradient (CG) method. We consider static boundary conditions (Dirichlet and Neumann) and dynamic boundary conditions. Some numerical experiments based on our developed algorithm are given to validate and compare the theoretical impulse controllability results.
Conformal inference is a fundamental and versatile tool that provides distribution-free guarantees for many machine learning tasks. We consider the transductive setting, where decisions are made on a test sample of $m$ new points, giving rise to $m$ conformal $p$-values. {While classical results only concern their marginal distribution, we show that their joint distribution follows a P\'olya urn model, and establish a concentration inequality for their empirical distribution function.} The results hold for arbitrary exchangeable scores, including {\it adaptive} ones that can use the covariates of the test+calibration samples at training stage for increased accuracy. We demonstrate the usefulness of these theoretical results through uniform, in-probability guarantees for two machine learning tasks of current interest: interval prediction for transductive transfer learning and novelty detection based on two-class classification.
Deep learning algorithms have been widely used to solve linear Kolmogorov partial differential equations~(PDEs) in high dimensions, where the loss function is defined as a mathematical expectation. We propose to use the randomized quasi-Monte Carlo (RQMC) method instead of the Monte Carlo (MC) method for computing the loss function. In theory, we decompose the error from empirical risk minimization~(ERM) into the generalization error and the approximation error. Notably, the approximation error is independent of the sampling methods. We prove that the convergence order of the mean generalization error for the RQMC method is $O(n^{-1+\epsilon})$ for arbitrarily small $\epsilon>0$, while for the MC method it is $O(n^{-1/2+\epsilon})$ for arbitrarily small $\epsilon>0$. Consequently, we find that the overall error for the RQMC method is asymptotically smaller than that for the MC method as $n$ increases. Our numerical experiments show that the algorithm based on the RQMC method consistently achieves smaller relative $L^{2}$ error than that based on the MC method.
In this paper we consider the finite element approximation of Maxwell's problem and analyse the prescription of essential boundary conditions in a weak sense using Nitsche's method. To avoid indefiniteness of the problem, the original equations are augmented with the gradient of a scalar field that allows one to impose the zero divergence of the magnetic induction, even if the exact solution for this scalar field is zero. Two finite element approximations are considered, namely, one in which the approximation spaces are assumed to satisfy the appropriate inf-sup condition that render the standard Galerkin method stable, and another augmented and stabilised one that permits the use of finite element interpolations of arbitrary order. Stability and convergence results are provided for the two finite element formulations considered.
In this paper, we propose a modification of an acoustic-transport operator splitting Lagrange-projection method for simulating compressible flows with gravity. The original method involves two steps that respectively account for acoustic and transport effects. Our work proposes a simple modification of the transport step, and the resulting modified scheme turns out to be a flux-splitting method. This new numerical method is less computationally expensive, more memory efficient, and easier to implement than the original one. We prove stability properties for this new scheme by showing that under classical CFL conditions, the method is positivity preserving for mass, energy and entropy satisfying. The flexible flux-splitting structure of the method enables straightforward extensions of the method to multi-dimensional problems (with respect to space) and high-order discretizations that are presented in this work. We also propose an interpretation of the flux-splitting solver as a relaxation approximation. Both the stability and the accuracy of the new method are tested against one-dimensional and two-dimensional numerical experiments that involve highly compressible flows and low-Mach regimes.
In this study, we propose a novel multi-objective Bayesian optimization (MOBO) method to efficiently identify the Pareto front (PF) defined by risk measures for black-box functions under the presence of input uncertainty (IU). Existing BO methods for Pareto optimization in the presence of IU are risk-specific or without theoretical guarantees, whereas our proposed method addresses general risk measures and has theoretical guarantees. The basic idea of the proposed method is to assume a Gaussian process (GP) model for the black-box function and to construct high-probability bounding boxes for the risk measures using the GP model. Furthermore, in order to reduce the uncertainty of non-dominated bounding boxes, we propose a method of selecting the next evaluation point using a maximin distance defined by the maximum value of a quasi distance based on bounding boxes. As theoretical analysis, we prove that the algorithm can return an arbitrary-accurate solution in a finite number of iterations with high probability, for various risk measures such as Bayes risk, worst-case risk, and value-at-risk. We also give a theoretical analysis that takes into account approximation errors because there exist non-negligible approximation errors (e.g., finite approximation of PFs and sampling-based approximation of bounding boxes) in practice. We confirm that the proposed method outperforms compared with existing methods not only in the setting with IU but also in the setting of ordinary MOBO through numerical experiments.
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of NLP. We find that the distribution shift settings in previous studies commonly lack adequate challenges, hindering the accurate evaluation of OOD robustness. To address these issues, we propose a benchmark construction protocol that ensures clear differentiation and challenging distribution shifts. Then we introduce BOSS, a Benchmark suite for Out-of-distribution robustneSS evaluation covering 5 tasks and 20 datasets. Based on BOSS, we conduct a series of experiments on pre-trained language models for analysis and evaluation of OOD robustness. First, for vanilla fine-tuning, we examine the relationship between in-distribution (ID) and OOD performance. We identify three typical types that unveil the inner learning mechanism, which could potentially facilitate the forecasting of OOD robustness, correlating with the advancements on ID datasets. Then, we evaluate 5 classic methods on BOSS and find that, despite exhibiting some effectiveness in specific cases, they do not offer significant improvement compared to vanilla fine-tuning. Further, we evaluate 5 LLMs with various adaptation paradigms and find that when sufficient ID data is available, fine-tuning domain-specific models outperform LLMs on ID examples significantly. However, in the case of OOD instances, prioritizing LLMs with in-context learning yields better results. We identify that both fine-tuned small models and LLMs face challenges in effectively addressing downstream tasks. The code is public at \url{//github.com/lifan-yuan/OOD_NLP}.
Recently, ensemble has been applied to deep metric learning to yield state-of-the-art results. Deep metric learning aims to learn deep neural networks for feature embeddings, distances of which satisfy given constraint. In deep metric learning, ensemble takes average of distances learned by multiple learners. As one important aspect of ensemble, the learners should be diverse in their feature embeddings. To this end, we propose an attention-based ensemble, which uses multiple attention masks, so that each learner can attend to different parts of the object. We also propose a divergence loss, which encourages diversity among the learners. The proposed method is applied to the standard benchmarks of deep metric learning and experimental results show that it outperforms the state-of-the-art methods by a significant margin on image retrieval tasks.