Numerical solutions of partial differential equations (PDEs) require expensive simulations, limiting their application in design optimization routines, model-based control, or solution of large-scale inverse problems. Existing Convolutional Neural Network-based frameworks for surrogate modeling require lossy pixelization and data-preprocessing, which is not suitable for realistic engineering applications. Therefore, we propose non-linear independent dual system (NIDS), which is a deep learning surrogate model for discretization-independent, continuous representation of PDE solutions, and can be used for prediction over domains with complex, variable geometries and mesh topologies. NIDS leverages implicit neural representations to develop a non-linear mapping between problem parameters and spatial coordinates to state predictions by combining evaluations of a case-wise parameter network and a point-wise spatial network in a linear output layer. The input features of the spatial network include physical coordinates augmented by a minimum distance function evaluation to implicitly encode the problem geometry. The form of the overall output layer induces a dual system, where each term in the map is non-linear and independent. Further, we propose a minimum distance function-driven weighted sum of NIDS models using a shared parameter network to enforce boundary conditions by construction under certain restrictions. The framework is applied to predict solutions around complex, parametrically-defined geometries on non-parametrically-defined meshes with solutions obtained many orders of magnitude faster than the full order models. Test cases include a vehicle aerodynamics problem with complex geometry and data scarcity, enabled by a training method in which more cases are gradually added as training progresses.
Generalization to out-of-distribution (OOD) data is one of the central problems in modern machine learning. Recently, there is a surge of attempts to propose algorithms that mainly build upon the idea of extracting invariant features. Although intuitively reasonable, theoretical understanding of what kind of invariance can guarantee OOD generalization is still limited, and generalization to arbitrary out-of-distribution is clearly impossible. In this work, we take the first step towards rigorous and quantitative definitions of 1) what is OOD; and 2) what does it mean by saying an OOD problem is learnable. We also introduce a new concept of expansion function, which characterizes to what extent the variance is amplified in the test domains over the training domains, and therefore give a quantitative meaning of invariant features. Based on these, we prove OOD generalization error bounds. It turns out that OOD generalization largely depends on the expansion function. As recently pointed out by Gulrajani and Lopez-Paz (2020), any OOD learning algorithm without a model selection module is incomplete. Our theory naturally induces a model selection criterion. Extensive experiments on benchmark OOD datasets demonstrate that our model selection criterion has a significant advantage over baselines.
Diagnostic classification models (DCMs) offer statistical tools to inspect the fined-grained attribute of respondents' strengths and weaknesses. However, the diagnosis accuracy deteriorates when misspecification occurs in the predefined item-attribute relationship, which is encoded into a Q-matrix. To prevent such misspecification, methodologists have recently developed several Bayesian Q-matrix estimation methods for greater estimation flexibility. However, these methods become infeasible in the case of large-scale assessments with a large number of attributes and items. In this study, we focused on the deterministic inputs, noisy ``and'' gate (DINA) model and proposed a new framework for the Q-matrix estimation to find the Q-matrix with the maximum marginal likelihood. Based on this framework, we developed a scalable estimation algorithm for the DINA Q-matrix by constructing an iteration algorithm that utilizes stochastic optimization and variational inference. The simulation and empirical studies reveal that the proposed method achieves high-speed computation, good accuracy, and robustness to potential misspecifications, such as initial value's choices and hyperparameter settings. Thus, the proposed method can be a useful tool for estimating a Q-matrix in large-scale settings.
A version of the convexification globally convergent numerical method is constructed for a coefficient inverse problem for a wave-like partial differential equation. The presence of the Carleman Weight Function in the corresponding Tikhonov-like cost functional ensures the global strict convexity of this functional. Numerical results are presented to illustrate the effectiveness and efficiency of the proposed method.
The thermal radiative transfer (TRT) equations form an integro-differential system that describes the propagation and collisional interactions of photons. Computing accurate and efficient numerical solutions TRT are challenging for several reasons, the first of which is that TRT is defined on a high-dimensional phase. In order to reduce the dimensionality of the phase space, classical approaches such as the P$_N$ (spherical harmonics) or the S$_N$ (discrete ordinates) ansatz are often used in the literature. In this work, we introduce a novel approach: the hybrid discrete (H$^T_N$) approximation to the radiative thermal transfer equations. This approach acquires desirable properties of both P$_N$ and S$_N$, and indeed reduces to each of these approximations in various limits: H$^1_N$ $\equiv$ P$_N$ and H$^T_0$ $\equiv$ S$_T$. We prove that H$^T_N$ results in a system of hyperbolic equations for all $T\ge 1$ and $N\ge 0$. Another challenge in solving the TRT system is the inherent stiffness due to the large timescale separation between propagation and collisions, especially in the diffusive (i.e., highly collisional) regime. This stiffness challenge can be partially overcome via implicit time integration, although fully implicit methods may become computationally expensive due to the strong nonlinearity and system size. On the other hand, explicit time-stepping schemes that are not also asymptotic-preserving in the highly collisional limit require resolving the mean-free path between collisions, making such schemes prohibitively expensive. In this work we develop a numerical method that is based on a nodal discontinuous Galerkin discretization in space, coupled with a semi-implicit discretization in time. We conduct several numerical experiments to verify the accuracy, efficiency, and robustness of the H$^T_N$ ansatz and the numerical discretizations.
We study the problem of multi-compression and reconstructing a stochastic signal observed by several independent sensors (or compressors) that transmit compressed information to a fusion center. { The key aspect of this problem is to find models of the sensors and fusion center that are optimized in the sense of an error minimization under a certain criterion, such as the mean square error (MSE).} { A novel technique to solve this problem is developed. The novelty is as follows. First, the multi-compressors are non-linear and modeled using second degree polynomials. This may increase the accuracy of the signal estimation through the optimization in a higher dimensional parameter space compared to the linear case. Second, the required models are determined by a method based on a combination of the second degree transform (SDT) with the maximum block improvement (MBI) method and the generalized rank-constrained matrix approximation. It allows us to use the advantages of known methods to further increase the estimation accuracy of the source signal. Third, the proposed method is justified in terms of pseudo-inverse matrices. As a result, the models of compressors and fusion center always exist and are numerically stable.} In other words, the proposed models may provide compression, de-noising and reconstruction of distributed signals in cases when known methods either are not applicable or may produce larger associated errors.
As graph data collected from the real world is merely noise-free, a practical representation of graphs should be robust to noise. Existing research usually focuses on feature smoothing but leaves the geometric structure untouched. Furthermore, most work takes L2-norm that pursues a global smoothness, which limits the expressivity of graph neural networks. This paper tailors regularizers for graph data in terms of both feature and structure noises, where the objective function is efficiently solved with the alternating direction method of multipliers (ADMM). The proposed scheme allows to take multiple layers without the concern of over-smoothing, and it guarantees convergence to the optimal solutions. Empirical study proves that our model achieves significantly better performance compared with popular graph convolutions even when the graph is heavily contaminated.
We show that the mass matrix derived from finite elements can be effectively used as a preconditioner for iteratively solving the linear system arising from finite-difference discretization of the Poisson equation, using the conjugate gradient method. We derive analytically the condition number of the preconditioned operator. Theoretical analysis shows that the ratio of the condition number of the Laplacian to the preconditioned operator is $8/3$ in one dimension, $9/2$ in two dimensions, and $2^9/3^4 \approx 6.3$ in three dimensions. From this it follows that the expected iteration count for achieving a fixed reduction of the norm of the residual is smaller than a half of the number of the iterations of unpreconditioned CG in 2D and 3D. The scheme is easy to implement, and numerical experiments show its efficiency.
We propose a novel method for sampling and optimization tasks based on a stochastic interacting particle system. We explain how this method can be used for the following two goals: (i) generating approximate samples from a given target distribution; (ii) optimizing a given objective function. The approach is derivative-free and affine invariant, and is therefore well-suited for solving inverse problems defined by complex forward models: (i) allows generation of samples from the Bayesian posterior and (ii) allows determination of the maximum a posteriori estimator. We investigate the properties of the proposed family of methods in terms of various parameter choices, both analytically and by means of numerical simulations. The analysis and numerical simulation establish that the method has potential for general purpose optimization tasks over Euclidean space; contraction properties of the algorithm are established under suitable conditions, and computational experiments demonstrate wide basins of attraction for various specific problems. The analysis and experiments also demonstrate the potential for the sampling methodology in regimes in which the target distribution is unimodal and close to Gaussian; indeed we prove that the method recovers a Laplace approximation to the measure in certain parametric regimes and provide numerical evidence that this Laplace approximation attracts a large set of initial conditions in a number of examples.
Biomembranes adopt varying morphologies that are vital to cellular functions. Many studies use computational modeling to understand how various mechanochemical factors contribute to membrane shape transformations. Compared to approximation-based methods (e.g., finite element method), the class of discrete mesh models offers greater flexibility to simulate complex physics and shapes in three dimensions; its formulation produces an efficient algorithm while maintaining coordinate-free geometric descriptions. However, ambiguities in geometric definitions in the discrete context have led to a lack of consensus on which discrete mesh model is theoretically and numerically optimal; a bijective relationship between the terms contributing to both the energy and forces from the discrete and smooth geometric theories remains to be established. We address this and present an extensible framework, $\texttt{Mem3DG}$, for modeling 3D mechanochemical dynamics of membranes based on Discrete Differential Geometry (DDG) on triangulated meshes. The formalism of DDG resolves the inconsistency and provides a unifying perspective on how to relate the smooth and discrete energy and forces. To demonstrate, $\texttt{Mem3DG}$ is used to model a sequence of examples with increasing mechanochemical complexity: recovering classical shape transformations such as 1) biconcave disk, dumbbell, and unduloid and 2) spherical bud on spherical, flat-patch membrane; investigating how the coupling of membrane mechanics with protein mobility jointly affects phase and shape transformation. As high-resolution 3D imaging of membrane ultrastructure becomes more readily available, we envision Mem3DG to be applied as an end-to-end tool to simulate realistic cell geometry under user-specified mechanochemical conditions.
The aim of this work is to develop a fully-distributed algorithmic framework for training graph convolutional networks (GCNs). The proposed method is able to exploit the meaningful relational structure of the input data, which are collected by a set of agents that communicate over a sparse network topology. After formulating the centralized GCN training problem, we first show how to make inference in a distributed scenario where the underlying data graph is split among different agents. Then, we propose a distributed gradient descent procedure to solve the GCN training problem. The resulting model distributes computation along three lines: during inference, during back-propagation, and during optimization. Convergence to stationary solutions of the GCN training problem is also established under mild conditions. Finally, we propose an optimization criterion to design the communication topology between agents in order to match with the graph describing data relationships. A wide set of numerical results validate our proposal. To the best of our knowledge, this is the first work combining graph convolutional neural networks with distributed optimization.