Generalized Additive Runge-Kutta (GARK) schemes have been shown to be a suitable tool for solving ordinary differential equations with additively partitioned right-hand sides. This work develops symplectic GARK schemes for additively partitioned Hamiltonian systems. In a general setting, we derive conditions for symplecticness, as well as for symmetry and time-reversibility. We show how symplectic and symmetric schemes can be constructed from schemes that are only symplectic or only symmetric. Particular attention is given to partitioned schemes for Hamiltonians split into multiple potential and kinetic energies. Finally, we show how symplectic GARK schemes can leverage different time scales and evaluation costs of the different potentials, and provide efficient numerical solutions by using different orders for these parts.
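The classical Störmer-Verlet (leapfrog) method is the simplest symplectic partitioned scheme for a Hamiltonian split into a kinetic and a potential part, and it illustrates the long-time energy behavior that motivates such constructions. A minimal sketch (the pendulum Hamiltonian, step size, and step count are illustrative choices, not taken from the paper):

```python
import math

# Pendulum Hamiltonian H(q, p) = p**2 / 2 - cos(q), split into
# kinetic T(p) = p**2 / 2 and potential V(q) = -cos(q).
def grad_V(q):
    return math.sin(q)

def leapfrog(q, p, dt, steps):
    """Stormer-Verlet: a symplectic partitioned scheme of order 2."""
    for _ in range(steps):
        p -= 0.5 * dt * grad_V(q)   # half kick (potential part)
        q += dt * p                 # full drift (kinetic part)
        p -= 0.5 * dt * grad_V(q)   # half kick
    return q, p

def energy(q, p):
    return 0.5 * p * p - math.cos(q)

q0, p0 = 1.0, 0.0
E0 = energy(q0, p0)
q, p = leapfrog(q0, p0, dt=0.01, steps=10_000)
# For a symplectic scheme the energy error stays bounded (O(dt**2))
# over long times, instead of drifting as for explicit Euler.
print(abs(energy(q, p) - E0))
```

The bounded energy error printed at the end is the hallmark of symplectic integration that the GARK constructions in the abstract generalize to multiple additive parts.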
Solving high-dimensional partial differential equations (PDEs) has historically posed a considerable challenge for conventional numerical methods, such as those involving domain meshes. Recent advances have seen the emergence of neural PDE solvers, which leverage deep networks to tackle high-dimensional PDE problems effectively. This study introduces Inf-SupNet, a model-based unsupervised learning approach designed to learn solutions of a specific class of elliptic PDEs. The fundamental idea behind Inf-SupNet is to incorporate the inf-sup formulation of the underlying PDE into the loss function. The analysis reveals that the global solution error is bounded by the sum of three distinct errors: the numerical integration error, the duality gap of the loss function (training error), and the neural network approximation error for functions in Sobolev spaces. To validate the efficacy of the proposed method, numerical experiments in high dimensions demonstrate its stability and accuracy across various boundary conditions, as well as for both semi-linear and nonlinear PDEs.
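The duality-gap term in the error bound comes from treating the loss as a saddle-point (inf-sup) problem. As a schematic illustration only (not the Inf-SupNet loss itself), the extragradient method below drives the toy bilinear saddle problem min_x max_y xy to its saddle point, whereas plain simultaneous gradient descent-ascent would spiral outward on this problem:

```python
# Extragradient on the bilinear saddle problem min_x max_y f(x, y) = x * y.
# The unique saddle point is (0, 0).
def extragradient(x, y, eta=0.2, steps=500):
    for _ in range(steps):
        # Extrapolation (look-ahead) step using current gradients:
        # df/dx = y (descend), df/dy = x (ascend).
        xh = x - eta * y
        yh = y + eta * x
        # Update using gradients evaluated at the look-ahead point.
        x, y = x - eta * yh, y + eta * xh
    return x, y

x, y = extragradient(1.0, 1.0)
print(abs(x) + abs(y))  # close to 0: the iterates reach the saddle point
```

For the bilinear problem the extragradient iteration contracts the distance to the saddle point by a constant factor per step, which is why the printed residual is tiny after 500 iterations.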
Modern machine learning applications have witnessed the remarkable success of optimization algorithms that are designed to find flat minima. Motivated by this design choice, we undertake a formal study that (i) formulates the notion of flat minima, and (ii) studies the complexity of finding them. Specifically, we adopt the trace of the Hessian of the cost function as a measure of flatness, and use it to formally define the notion of approximate flat minima. Under this notion, we then analyze algorithms that find approximate flat minima efficiently. For general cost functions, we discuss a gradient-based algorithm that finds an approximate flat local minimum efficiently. The main component of the algorithm is to use gradients computed from randomly perturbed iterates to estimate a direction that leads to flatter minima. For the setting where the cost function is an empirical risk over training data, we present a faster algorithm that is inspired by a recently proposed practical algorithm called sharpness-aware minimization, supporting its success in practice.
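The trace of the Hessian can be estimated purely from gradients at randomly perturbed points, which is the mechanism the abstract alludes to. A minimal sketch (the quadratic test function and Rademacher probes are illustrative choices, not the paper's construction):

```python
import random

# Test function f(x1, x2) = x1**2 + 3 * x2**2, so the Hessian is
# diag(2, 6) and its trace (the flatness measure) is 8.
def grad(x):
    return [2.0 * x[0], 6.0 * x[1]]

def hessian_trace_estimate(x, grad_fn, n_probes=20, eps=1e-4):
    """Hutchinson-style estimator: E[u^T H u] = tr(H) for Rademacher u,
    with H u approximated by a central difference of gradients."""
    total = 0.0
    for _ in range(n_probes):
        u = [random.choice((-1.0, 1.0)) for _ in x]
        g_plus = grad_fn([xi + eps * ui for xi, ui in zip(x, u)])
        g_minus = grad_fn([xi - eps * ui for xi, ui in zip(x, u)])
        # (g(x + eps*u) - g(x - eps*u)) / (2*eps) ~ H u, then dot with u.
        total += sum((gp - gm) * ui
                     for gp, gm, ui in zip(g_plus, g_minus, u)) / (2 * eps)
    return total / n_probes

est = hessian_trace_estimate([0.5, -1.0], grad)
print(est)  # ~8 (exact up to rounding here, since f is quadratic)
```

Only gradient evaluations at perturbed iterates are needed, so the estimate comes at the cost of a few extra backpropagations rather than any second-order computation.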
We consider an atomic congestion game in which each player $i$ either participates in the game with an exogenous and known probability $p_{i}\in(0,1]$, independently of everybody else, or stays out and incurs no cost. We compute the parameterized price of anarchy to characterize the impact of demand uncertainty on the efficiency of selfish behavior, considering two different notions of a social planner. A prophet planner knows the realization of the random participation in the game; the ordinary planner does not. As a consequence, a prophet planner can compute an adaptive social optimum that selects different solutions depending on the players that turn out to be active, whereas an ordinary planner faces the same uncertainty as the players and can only compute social optima with respect to the player participation distribution. For both planners, we derive the precise price of anarchy, which arises from an optimization problem parameterized by the maximum participation probability $q=\max_{i} p_{i}$. In the case of affine costs, we provide an analytic expression for the ordinary and prophet price of anarchy, parameterized as a function of $q$.
Neural operators (NOs) are discretization-invariant deep learning methods with functional outputs that can approximate any continuous operator. NOs have demonstrated superiority over other deep learning methods in solving partial differential equations (PDEs). However, the spatial domain of an NO's input function must be identical to that of its output, which limits its applicability. For instance, the widely used Fourier neural operator (FNO) fails to approximate the operator that maps a boundary condition to the PDE solution. To address this issue, we propose a novel framework, the resolution-invariant deep operator (RDO), that decouples the spatial domains of the input and the output. RDO is motivated by the deep operator network (DeepONet) but, unlike DeepONet, does not require retraining the network when the input/output is changed. RDO takes functional input and its output is also functional, so it retains the resolution-invariance property of NOs. It can also resolve PDEs on complex geometries, where NOs fail. Various numerical experiments demonstrate the advantage of our method over DeepONet and FNO.
Partial differential equations (PDEs) are widely used to describe physical and engineering phenomena. Some key parameters involved in PDEs, which represent certain physical properties with important scientific interpretations, are difficult or even impossible to measure directly. Estimating these parameters from noisy and sparse experimental data of related physical quantities is an important task. Many methods for PDE parameter inference involve a large number of evaluations of the numerical solution of the PDE through algorithms such as the finite element method, which can be time-consuming, especially for nonlinear PDEs. In this paper, we propose a novel method for the inference of unknown parameters in PDEs, called the PDE-Informed Gaussian Process (PIGP) based parameter inference method. By modeling the PDE solution as a Gaussian process (GP), we derive the manifold constraints induced by the (linear) PDE structure, such that, under these constraints, the GP satisfies the PDE. For nonlinear PDEs, we propose an augmentation method that transforms the nonlinear PDE into an equivalent PDE system that is linear in all derivatives, which our PIGP-based method can handle. The proposed method can thus be applied to a broad spectrum of nonlinear PDEs, as well as to multi-dimensional PDE systems and PDE systems with unobserved components. Like conventional Bayesian approaches, the method provides uncertainty quantification for both the unknown parameters and the PDE solution, while completely bypassing numerical solvers for PDEs. The proposed method is demonstrated on several application examples from different areas.
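For a linear operator, the manifold constraint can be imposed by conditioning the GP jointly on the operator applied to it, since a linear operator acting on a GP is again Gaussian. A minimal sketch for the toy ODE u' = u with u(0) = 1 (a 1-D stand-in for the PDE setting; the squared-exponential kernel, collocation grid, and jitter are illustrative assumptions, not the paper's configuration):

```python
import numpy as np

# GP prior u ~ GP(0, k) with squared-exponential kernel; condition on
# u(0) = 1 and on the linear constraint L u = u' - u = 0 at collocation
# points, then read off the posterior mean -- no ODE solver involved.
ell = 0.5  # kernel length-scale (illustrative choice)

def k(x, y):
    r = x[:, None] - y[None, :]
    return np.exp(-r**2 / (2 * ell**2))

def k_uL(x, y):
    # Cov(u(x), (Lu)(y)) = dk/dy - k, with dk/dy = (r / ell**2) * k.
    r = x[:, None] - y[None, :]
    return (r / ell**2 - 1.0) * np.exp(-r**2 / (2 * ell**2))

def k_LL(x, y):
    # Cov((Lu)(x), (Lu)(y)); the u-u' cross terms cancel for this L.
    r = x[:, None] - y[None, :]
    return (1.0 + 1.0 / ell**2 - r**2 / ell**4) * np.exp(-r**2 / (2 * ell**2))

x0 = np.array([0.0])               # initial-condition location
z = np.linspace(0.0, 1.0, 11)      # collocation points for L u = 0
obs = np.concatenate([[1.0], np.zeros(len(z))])  # [u(0), (Lu)(z_i)]

# Joint covariance of the conditioned quantities [u(x0), (Lu)(z)].
K = np.block([[k(x0, x0), k_uL(x0, z)],
              [k_uL(x0, z).T, k_LL(z, z)]])
K += 1e-8 * np.eye(len(K))         # jitter for numerical stability

xs = np.array([1.0])               # prediction point
k_star = np.hstack([k(xs, x0), k_uL(xs, z)])
u_pred = (k_star @ np.linalg.solve(K, obs))[0]
print(u_pred, np.e)                # posterior mean vs the exact value e
```

The posterior mean approximates e^x without ever time-stepping the ODE, which is the "bypass the numerical solver" property the abstract highlights, here in its simplest linear form.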
Several physical problems modeled by second-order elliptic equations can be efficiently solved using mixed finite elements of the Raviart-Thomas family RTk for N-simplexes, introduced in the seventies. When Neumann conditions are prescribed on a curvilinear boundary, the normal component of the flux variable should preferably not take values at nodes shifted to the boundary of the approximating polytope in the corresponding normal direction, since the method's accuracy degrades, as shown in previous papers by the first author et al. In that work, an order-preserving technique was studied, based on a parametric version of these elements with curved simplexes. In this article, an alternative with straight-edged triangles for two-dimensional problems is proposed. The key point of this method is a Petrov-Galerkin formulation of the mixed problem, in which the test-flux space differs slightly from the shape-flux space. After describing the underlying variant of RTk, we show that it gives rise to a uniformly stable and optimally convergent method, taking the Poisson equation as a model problem.
We introduce a hull operator on Poisson point processes, the easiest example being the convex hull of the support of a point process in Euclidean space. Assuming that the intensity measure of the process is known on the set generated by the hull operator, we discuss estimation of an expected linear statistic built on the Poisson process. In special cases, our general scheme yields an estimator of the volume of a convex body or an estimator of an integral of a H\"older function. We show that the estimation error is given by the Kabanov--Skorohod integral with respect to the underlying Poisson process. A crucial ingredient of our approach is a spatial strong Markov property of the underlying Poisson process with respect to the hull. We derive the rate of normal convergence for the estimation error, and illustrate it on an application to estimators of integrals of a H\"older function. We also discuss estimation of higher order symmetric statistics.
The Na\"ive Bayes classifier has proven to be a tractable and efficient method for classification in multivariate analysis. However, features are usually correlated, which violates Na\"ive Bayes' assumption of conditional independence and may deteriorate the method's performance. Moreover, datasets are often characterized by a large number of features, which may complicate the interpretation of the results as well as slow down the method's execution. In this paper we propose a sparse version of the Na\"ive Bayes classifier that is characterized by three properties. First, sparsity is achieved while taking into account the correlation structure of the covariates. Second, different performance measures can be used to guide the selection of features. Third, performance constraints on groups of higher interest can be included. Our proposal leads to a smart search that yields competitive running times while retaining flexibility in the choice of performance measure for classification. Our findings show that, compared against well-referenced feature-selection approaches, the proposed sparse Na\"ive Bayes obtains competitive results in terms of accuracy, sparsity, and running times on balanced datasets. On datasets with unbalanced classes (or classes of different importance), a better compromise between the classification rates of the different classes is achieved.
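A plain Gaussian Na\"ive Bayes classifier (the standard baseline the sparse version builds on, not the proposed method itself) fits a per-class, per-feature Gaussian and classifies by the maximum log-posterior. The synthetic data below, with two informative and two pure-noise features, illustrates why restricting the model to a sparse feature subset is attractive:

```python
import random
import math

random.seed(0)

# Synthetic binary problem: features 0-1 are informative (class means
# differ by 2 standard deviations), features 2-3 are pure noise.
def sample(label, n):
    mu = [2.0, 2.0, 0.0, 0.0] if label == 1 else [0.0] * 4
    return [[random.gauss(m, 1.0) for m in mu] for _ in range(n)]

train = [(x, 0) for x in sample(0, 200)] + [(x, 1) for x in sample(1, 200)]
test = [(x, 0) for x in sample(0, 200)] + [(x, 1) for x in sample(1, 200)]

def fit_gaussian_nb(data, feats):
    """Per-class prior plus per-feature mean/variance (independence assumed)."""
    model = {}
    for c in (0, 1):
        xs = [x for x, y in data if y == c]
        model[c] = {
            "prior": len(xs) / len(data),
            "stats": [(mean := sum(x[f] for x in xs) / len(xs),
                       sum((x[f] - mean) ** 2 for x in xs) / len(xs))
                      for f in feats],
        }
    return model

def predict(model, feats, x):
    def log_post(c):
        lp = math.log(model[c]["prior"])
        for (m, v), f in zip(model[c]["stats"], feats):
            lp += -0.5 * math.log(2 * math.pi * v) - (x[f] - m) ** 2 / (2 * v)
        return lp
    return max((0, 1), key=log_post)

feats = [0, 1]  # a sparse model keeping only the informative features
model = fit_gaussian_nb(train, feats)
acc = sum(predict(model, feats, x) == y for x, y in test) / len(test)
print(acc)
```

The sparse model uses half the features with no loss of signal; the paper's contribution is choosing such subsets in a way that also accounts for the correlation structure the independence assumption ignores.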
Diffusion models have recently emerged as a promising framework for Image Restoration (IR), owing to their ability to produce high-quality reconstructions and their compatibility with established methods. Existing methods for solving noisy inverse problems in IR consider only pixel-wise data fidelity. In this paper, we propose SaFaRI, a spatial-and-frequency-aware diffusion model for IR with Gaussian noise. Our model encourages images to preserve data fidelity in both the spatial and frequency domains, resulting in enhanced reconstruction quality. We comprehensively evaluate the performance of our model on a variety of noisy inverse problems, including inpainting, denoising, and super-resolution. Our thorough evaluation demonstrates that SaFaRI achieves state-of-the-art performance on both the ImageNet and FFHQ datasets, outperforming existing zero-shot IR methods in terms of LPIPS and FID metrics.
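The core idea of penalizing discrepancy in both domains can be written in a few lines. The radial high-pass frequency weighting below is a hypothetical choice for illustration only (the paper's actual fidelity terms may differ); note that with an unweighted orthonormal FFT the two terms would coincide by Parseval's theorem:

```python
import numpy as np

def dual_domain_fidelity(x, y, lam=0.5):
    """Spatial L2 term plus a weighted frequency-domain L2 term.
    The radial high-pass weight `w` is an illustrative assumption."""
    spatial = np.mean((x - y) ** 2)
    fx, fy = np.fft.fft2(x, norm="ortho"), np.fft.fft2(y, norm="ortho")
    u = np.fft.fftfreq(x.shape[0])[:, None]
    v = np.fft.fftfreq(x.shape[1])[None, :]
    w = np.sqrt(u**2 + v**2)          # emphasize high frequencies
    freq = np.mean(w * np.abs(fx - fy) ** 2)
    return spatial + lam * freq

rng = np.random.default_rng(0)
clean = rng.standard_normal((32, 32))
noisy = clean + 0.1 * rng.standard_normal((32, 32))
print(dual_domain_fidelity(clean, clean))  # 0.0 for identical images
print(dual_domain_fidelity(clean, noisy))  # > 0 once noise is present
```

Weighting the frequency term differently from the spatial term is what makes the combined penalty sensitive to structured (e.g., high-frequency texture) errors that a pixel-wise loss underweights.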
Due to their inherent capability for semantic alignment of aspects with their context words, attention mechanisms and Convolutional Neural Networks (CNNs) are widely applied to aspect-based sentiment classification. However, these models lack a mechanism to account for relevant syntactical constraints and long-range word dependencies, and hence may mistakenly recognize syntactically irrelevant contextual words as clues for judging aspect sentiment. To tackle this problem, we propose to build a Graph Convolutional Network (GCN) over the dependency tree of a sentence to exploit syntactical information and word dependencies. Based on it, a novel aspect-specific sentiment classification framework is proposed. Experiments on three benchmark collections show that our proposed model has comparable effectiveness to a range of state-of-the-art models, and further demonstrate that both syntactical information and long-range word dependencies are properly captured by the graph convolution structure.
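A single graph-convolution layer over a dependency tree amounts to a normalized adjacency multiplication; the toy 4-token sentence, random embeddings, and weight matrix below are illustrative placeholders, not the paper's architecture:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN layer: H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Dependency edges of a 4-token sentence as a symmetric adjacency,
# e.g. "food(0) <- was(1) -> tasty(2)" and "was(1) -> .(3)".
A = np.zeros((4, 4))
for i, j in [(1, 0), (1, 2), (1, 3)]:
    A[i, j] = A[j, i] = 1.0

rng = np.random.default_rng(0)
H = rng.standard_normal((4, 8))   # token embeddings (8-dim, illustrative)
W = rng.standard_normal((8, 8))   # layer weights
H1 = gcn_layer(A, H, W)
print(H1.shape)  # each token now mixes features of its syntactic neighbors
```

Because messages flow only along dependency edges, stacking such layers lets an aspect word aggregate exactly the syntactically relevant context, which is the mechanism the abstract credits for filtering out irrelevant words.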