Polynomial spectral methods provide fast, accurate, and flexible solvers for broad ranges of PDEs with one bounded dimension, where the incorporation of general boundary conditions is well understood. However, automating extensions to domains with multiple bounded dimensions is challenging because of difficulties in implementing boundary conditions and imposing compatibility conditions at shared edges and corners. Past work has included various workarounds, such as the anisotropic inclusion of partial boundary data at shared edges or approaches that only work for specific boundary conditions. Here we present a general system for imposing boundary and compatibility conditions for elliptic equations on hypercubes. We take an approach based on the generalized tau method, which allows for a wide range of boundary conditions for many types of spectral methods. The generalized tau method has the distinct advantage that the specified polynomial residual determines the exact algebraic solution; afterwards, any stable numerical scheme will find the same result. We can, therefore, provide one-to-one comparisons to traditional collocation and Galerkin methods within the tau framework. As an essential requirement, we add specific tau corrections to the boundary conditions in addition to the bulk PDE. We then impose additional mutual compatibility conditions to ensure boundary conditions match at shared subsurfaces. Our approach works with general boundary conditions that commute on intersecting subsurfaces, including Dirichlet, Neumann, Robin, and any combination of these on all boundaries. The tau corrections and compatibility conditions can be fully isotropic and easily incorporated into existing solvers. We present the method explicitly for the Poisson equation in two and three dimensions and describe its extension to arbitrary elliptic equations (e.g. biharmonic) in any dimension.
The field of Causality aims to find systematic methods for uncovering cause-effect relationships. Such methods can find applications in many research fields, which justifies the great interest in this domain. Machine Learning models have shown success in a large variety of tasks by extracting correlation patterns from high-dimensional data but still struggle when generalizing out of their initial distribution. As causal engines aim to learn mechanisms that are independent of a data distribution, combining Machine Learning with Causality has the potential to bring benefits to the two fields. In our work, we motivate this assumption and provide applications. We first perform an extensive overview of the theories and methods for Causality from different perspectives. We then provide a deeper look at the connections between Causality and Machine Learning and describe the challenges met by the two domains. We show the early attempts to bring the fields together and the possible perspectives for the future. We finish by providing a large variety of applications for techniques from Causality.
We expect the generalization error to improve with more samples from a similar task, and to deteriorate with more samples from an out-of-distribution (OOD) task. In this work, we show a counter-intuitive phenomenon: the generalization error of a task can be a non-monotonic function of the number of OOD samples. As the number of OOD samples increases, the generalization error on the target task improves before deteriorating beyond a threshold. In other words, there is value in training on small amounts of OOD data. We use Fisher's Linear Discriminant on synthetic datasets and deep networks on computer vision benchmarks such as MNIST, CIFAR-10, CINIC-10, PACS and DomainNet to demonstrate and analyze this phenomenon. In the idealized setting where we know which samples are OOD, we show that these non-monotonic trends can be exploited using an appropriately weighted objective of the target and OOD empirical risk. While its practical utility is limited, this does suggest that if we can detect OOD samples, then there may be ways to benefit from them. When we do not know which samples are OOD, we show how a number of go-to strategies such as data augmentation, hyperparameter optimization, and pre-training are not enough to ensure that the target generalization error does not deteriorate with the number of OOD samples in the dataset.
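The weighted objective mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact form of the weighting, so the convex combination below, the function name weighted_risk, and the parameter alpha are assumptions made for illustration only, not the authors' formulation.

```python
import numpy as np

def weighted_risk(target_losses, ood_losses, alpha):
    """Hypothetical weighted objective: a convex combination of the
    empirical risk on target samples and on OOD samples.

    alpha = 1 ignores the OOD samples; alpha = 0 ignores the target samples.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    target_risk = float(np.mean(target_losses))  # empirical risk on target data
    ood_risk = float(np.mean(ood_losses))        # empirical risk on OOD data
    return alpha * target_risk + (1.0 - alpha) * ood_risk

# Per-sample losses here are made-up numbers, used only to show the call.
example = weighted_risk([1.0, 3.0], [10.0], alpha=0.75)
```

In a setting like this one, alpha would presumably be tuned on held-out target data, which requires knowing which samples are OOD.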
In the dominant paradigm for designing equitable machine learning systems, one works to ensure that model predictions satisfy various fairness criteria, such as parity in error rates across race, gender, and other legally protected traits. That approach, however, typically ignores the downstream decisions and outcomes that predictions affect, and, as a result, can induce unexpected harms. Here we present an alternative framework for fairness that directly anticipates the consequences of decisions. Stakeholders first specify preferences over the possible outcomes of an algorithmically informed decision-making process. For example, lenders may prefer extending credit to those most likely to repay a loan, while also preferring similar lending rates across neighborhoods. One then searches the space of decision policies to maximize the specified utility. We develop and describe a method for efficiently learning these optimal policies from data for a large family of expressive utility functions, facilitating a more holistic approach to equitable decision-making.
The well-known discrete Fourier transform (DFT) can easily be generalized to arbitrary nodes in the spatial domain. The fast procedure for this generalization is referred to as the nonequispaced fast Fourier transform (NFFT). Various applications, such as MRI and the solution of PDEs, are interested in the inverse problem, i.e., computing Fourier coefficients from given nonequispaced data. In this paper we survey different kinds of approaches to tackle this problem. In contrast to iterative procedures, where multiple iteration steps are needed for computing a solution, we focus especially on so-called direct inversion methods. We review density compensation techniques and introduce a new scheme that leads to an exact reconstruction for trigonometric polynomials. In addition, we consider a matrix optimization approach using Frobenius norm minimization to obtain an inverse NFFT.
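To make the inverse problem concrete, the following sketch builds the dense nonequispaced Fourier matrix and recovers the coefficients by ordinary least squares. This is neither the fast NFFT nor the density compensation scheme of the paper; it is a plain O(MN) baseline, and the sign and frequency-range conventions, the node distribution, and the helper name ndft_matrix are assumptions.

```python
import numpy as np

def ndft_matrix(x, n_coeffs):
    """Dense nonequispaced Fourier matrix A[j, k] = exp(2*pi*i * k * x[j]),
    with frequencies k = -n_coeffs/2, ..., n_coeffs/2 - 1 (n_coeffs even)."""
    k = np.arange(-(n_coeffs // 2), n_coeffs // 2)
    return np.exp(2j * np.pi * np.outer(x, k))

rng = np.random.default_rng(0)
n_coeffs, n_nodes = 8, 32                       # overdetermined: more nodes than coefficients
x = rng.uniform(-0.5, 0.5, n_nodes)             # arbitrary (random) nodes in [-0.5, 0.5)
f_hat = rng.standard_normal(n_coeffs) + 1j * rng.standard_normal(n_coeffs)

A = ndft_matrix(x, n_coeffs)
f = A @ f_hat                                   # forward problem: evaluate at the nodes

# Inverse problem: recover the Fourier coefficients from the samples f.
f_hat_rec, *_ = np.linalg.lstsq(A, f, rcond=None)
max_err = np.max(np.abs(f_hat_rec - f_hat))     # should be near machine precision
```

Because the samples come from a trigonometric polynomial and the system is overdetermined, the least-squares solution reproduces the coefficients up to rounding; direct inversion methods aim for comparable accuracy at a much lower cost.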
Multi-dimensional direct numerical simulation (DNS) of the Schrödinger equation is needed for the design and analysis of quantum nanostructures that offer numerous applications in biology, medicine, materials, electronic/photonic devices, etc. In large-scale nanostructures, the extensive computational effort needed in DNS may become prohibitive due to the high number of degrees of freedom (DoF). This study employs a reduced-order learning algorithm, enabled by first principles, for simulation of the Schrödinger equation to achieve high accuracy and efficiency. The proposed simulation methodology is applied to investigate two quantum-dot structures: one operates under an external electric field, and the other is influenced by internal potential variation with periodic boundary conditions. The former is similar to typical operations of nanoelectronic devices, and the latter is of interest to simulation and design of nanostructures and materials, such as applications of density functional theory. Using the proposed methodology, a very accurate prediction can be realized with a reduction in the DoF by more than 3 orders of magnitude and in the computational time by 2 orders of magnitude, compared to DNS. The proposed physics-informed learning methodology is also able to offer an accurate prediction beyond the training conditions, including higher external field and larger internal potential in untrained quantum states.
Considering the large amount of content created online by the minute, slang-aware automatic tools are critically needed to promote social good, and assist policymakers and moderators in restricting the spread of offensive language, abuse, and hate speech. Despite the success of large language models and the spontaneous emergence of slang dictionaries, it is unclear how far their combination goes in terms of slang understanding for downstream social good tasks. In this paper, we provide a framework to study different combinations of representation learning models and knowledge resources for a variety of downstream tasks that rely on slang understanding. Our experiments show the superiority of models that have been pre-trained on social media data, while the impact of dictionaries is positive only for static word embeddings. Our error analysis identifies core challenges for slang representation learning, including out-of-vocabulary words, polysemy, variance, and annotation disagreements, which can be traced to characteristics of slang as a quickly evolving and highly subjective language.
Traditional, numerical discretization-based solvers of partial differential equations (PDEs) are fundamentally agnostic to domains, boundary conditions and coefficients. In contrast, machine-learnt solvers have limited generalizability across these elements of boundary value problems. This is especially true of surrogate models that are typically trained on direct numerical simulations of PDEs applied to one specific boundary value problem. In a departure from this direct approach, the label-free machine learning of solvers is centered on a loss function that incorporates the PDE and boundary conditions in residual form. However, the generalization of such solvers across boundary conditions is limited and they remain strongly domain-dependent. Here, we present a framework that generalizes across domains, boundary conditions and coefficients simultaneously with learning the PDE in weak form. Our work explores the ability of simple, convolutional neural network (CNN)-based encoder-decoder architectures to learn to solve a PDE in greater generality than its restriction to a particular boundary value problem. In this first communication, we consider the elliptic PDEs of Fickian diffusion, linear elasticity, and nonlinear elasticity. Importantly, the learning happens independently of any labelled field data from either experiments or direct numerical solutions. Extensive results for these problem classes demonstrate the framework's ability to learn PDE solvers that generalize across hundreds of thousands of domains, boundary conditions and coefficients, including extrapolation beyond the learning regime. Once trained, the machine learning solvers are orders of magnitude faster than discretization-based solvers. We place our work in the context of recent continuous operator learning frameworks, and note extensions to transfer learning, active learning and reinforcement learning.
In this paper, we consider a semiconducting device with an active zone made of a single-layer material. The associated Poisson equation for the electrostatic potential (to be solved in order to perform self-consistent computations) is characterized by a surface particle density and an out-of-plane dielectric permittivity in the region surrounding the single layer. To avoid mesh refinements in such a region, we propose an interface problem based on the natural domain decomposition suggested by the physical device. Two different interface continuity conditions are discussed. Then, we write the corresponding variational formulations, adapting the so-called three-fields formulation for domain decomposition, and we approximate them using a suitable finite element method. Finally, numerical experiments are performed to illustrate some specific features of this interface approach.
We introduce a meta-learning algorithm for adversarially robust classification. The proposed method tries to be as model agnostic as possible and optimizes a dataset prior to its deployment in a machine learning system, aiming to effectively erase its non-robust features. Once the dataset has been created, in principle no specialized algorithm (besides standard gradient descent) is needed to train a robust model. We formulate the data optimization procedure as a bi-level optimization problem on kernel regression, with a class of kernels that describe infinitely wide neural nets (Neural Tangent Kernels). We present extensive experiments on standard computer vision benchmarks using a variety of different models, demonstrating the effectiveness of our method, while also pointing out its current shortcomings. In parallel, we revisit prior work that also focused on the problem of data optimization for robust classification \citep{Ily+19}, and show that being robust to adversarial attacks after standard (gradient descent) training on a suitable dataset is more challenging than previously thought.
Numerical approximations of partial differential equations (PDEs) are routinely employed to formulate the solution of physics, engineering and mathematical problems involving functions of several variables, such as the propagation of heat or sound, fluid flow, elasticity, electrostatics, electrodynamics, and more. While this has made it possible to model many complex phenomena, there are still significant limitations. Conventional approaches such as Finite Element Methods (FEMs) and Finite Difference Methods (FDMs) require considerable time and are computationally expensive. In contrast, machine learning-based methods such as neural networks are faster once trained, but tend to be restricted to a specific discretization. This article aims to provide a comprehensive summary of conventional methods and recent machine learning-based methods to approximate PDEs numerically. Furthermore, we highlight several key architectures centered around the neural operator, a novel and fast approach (1000x) to learning the solution operator of a PDE. We will note how these new computational approaches can bring immense advantages in tackling many problems in fundamental and applied physics.