The Cahn--Hilliard equations are a versatile model for describing the evolution of complex morphologies. In this paper we present a computational pipeline for the numerical solution of a ternary phase-field model describing the nanomorphology of donor--acceptor semiconductor blends used in organic photovoltaic devices. The model consists of two coupled fourth-order partial differential equations that are discretized using a finite element approach. In order to solve the resulting large-scale linear systems efficiently, we propose a preconditioning strategy based on efficient approximations of the Schur complement of a saddle point system. We show that this approach performs robustly with respect to variations in the discretization parameters. Finally, we outline how the computed morphologies can be used for the computation of charge generation, recombination, and transport in organic solar cells.
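To make the preconditioning idea concrete, the sketch below applies a block-triangular preconditioner with a diagonal-based Schur complement approximation to a generic sparse saddle-point system; the matrices are synthetic placeholders, not the discretized phase-field operators from the paper.

```python
# Minimal sketch (not the paper's solver): block lower-triangular preconditioner for
# a 2x2 saddle-point system  [[A, B.T], [B, -C]] [u; p] = [f; g], where the Schur
# complement S = C + B A^{-1} B.T is approximated by S_hat = C + B diag(A)^{-1} B.T.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n, m = 200, 80
rng = np.random.default_rng(0)
A = sp.random(n, n, density=0.01, random_state=0)
A = (A + A.T + n * sp.eye(n)).tocsc()            # symmetric positive definite test matrix
B = sp.random(m, n, density=0.05, random_state=1).tocsc()
C = sp.eye(m, format="csc")

K = sp.bmat([[A, B.T], [B, -C]]).tocsc()          # full saddle-point matrix
rhs = rng.standard_normal(n + m)

# Cheap Schur complement approximation; factorize both preconditioner blocks once.
S_hat = (C + B @ sp.diags(1.0 / A.diagonal()) @ B.T).tocsc()
A_solve = spla.factorized(A)
S_solve = spla.factorized(S_hat)

def apply_prec(r):
    """Block lower-triangular preconditioner: solve with A, then with S_hat."""
    ru, rp = r[:n], r[n:]
    u = A_solve(ru)
    p = S_solve(B @ u - rp)
    return np.concatenate([u, p])

M = spla.LinearOperator((n + m, n + m), matvec=apply_prec)
x, info = spla.gmres(K, rhs, M=M)
print("GMRES converged:", info == 0)
```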
Estimating the expectations of functionals applied to sums of random variables (RVs) is a well-known problem encountered in many challenging applications. Generally, closed-form expressions of these quantities are out of reach. A naive Monte Carlo simulation is an alternative approach; however, it requires a very large number of samples for rare event problems. Therefore, it is paramount to use variance reduction techniques to develop fast and efficient estimation methods. In this work, we use importance sampling (IS), known for its efficiency in requiring fewer samples to achieve the same accuracy. We propose a state-dependent IS scheme based on a stochastic optimal control formulation, where the control depends on state and time. We aim to calculate rare event quantities that can be written as the expectation of a functional of sums of independent RVs. The proposed algorithm is generic and can be applied without restrictions on the univariate distributions of the RVs or the functional applied to the sum. We apply this approach to the log-normal distribution to compute the left tail, as well as the cumulative distribution of the ratio of independent RVs. For each case, we numerically demonstrate that the proposed state-dependent IS algorithm compares favorably to the most well-known estimators dealing with similar problems.
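For orientation, here is a minimal sketch contrasting crude Monte Carlo with a simple fixed-shift IS estimator for the left tail of a sum of standard lognormals; the constant mean shift below is an arbitrary illustrative choice standing in for the state- and time-dependent control proposed in this work.

```python
# Minimal fixed-shift importance sampling sketch (NOT the state-dependent,
# optimal-control-based scheme of the paper) for the left tail
#   P( sum_{i=1}^d X_i <= gamma ),  X_i = exp(Z_i),  Z_i ~ N(0, 1) i.i.d.
# The proposal shifts each underlying normal to N(mu, 1) with mu < 0.
import numpy as np

def crude_mc(d, gamma, n, rng):
    z = rng.standard_normal((n, d))
    return np.mean(np.exp(z).sum(axis=1) <= gamma)

def shifted_is(d, gamma, n, mu, rng):
    z = rng.standard_normal((n, d)) + mu          # sample from the proposal N(mu, 1)
    hit = np.exp(z).sum(axis=1) <= gamma          # rare event indicator
    log_w = (-mu * z + 0.5 * mu**2).sum(axis=1)   # likelihood ratio dP/dQ per sample
    return np.mean(hit * np.exp(log_w))

rng = np.random.default_rng(42)
d, gamma = 4, 0.5          # P(sum of 4 standard lognormals <= 0.5) is a rare event
print("crude MC  :", crude_mc(d, gamma, 10**6, rng))      # typically returns 0.0
print("shifted IS:", shifted_is(d, gamma, 10**5, mu=-2.0, rng=rng))
```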
Dendrites are one of the most widely observed patterns in nature and occur across a wide spectrum of physical phenomena. In solidification and growth patterns in metals and crystals, the multi-level branching structures of dendrites pose a modeling challenge, and a full resolution of these structures is computationally demanding. Theoretical models of dendritic formation and evolution exist in the literature, essentially as extensions of the classical moving-boundary Stefan problem. Much of this understanding comes from the analysis of dendrites occurring during the solidification of metallic alloys. Motivated by the problem of modeling microstructure evolution from liquid melts of pure metals and alloys during metal additive manufacturing (MAM), we developed a comprehensive numerical framework for modeling a large variety of dendritic structures that are relevant to metal solidification. In this work, we present a numerical framework encompassing the modeling of Stefan problem formulations relevant to dendritic evolution using a phase-field approach and a finite element method implementation. Using this framework, we model numerous complex dendritic morphologies that are physically relevant to the solidification of pure melts and binary alloys. The distinguishing aspects of this work are: a unified treatment of both pure metals and alloys; novel numerical error estimates of dendritic tip velocity; and the convergence of the error for the primal fields of temperature and the order parameter with respect to the numerical discretization. To the best of our knowledge, this is a first-of-its-kind study of the numerical convergence of the phase-field equations of dendritic growth in a finite element method setting. Further, we model various types of physically relevant dendritic solidification patterns in 2D and 3D computational domains.
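For reference, the classical two-phase Stefan problem that these models extend can be stated as follows (standard notation; the phase-field approach used here replaces the sharp interface with a diffuse order parameter): heat diffusion in each phase,
\[
\frac{\partial T_s}{\partial t} = \alpha_s \nabla^2 T_s \quad \text{in } \Omega_s(t), \qquad
\frac{\partial T_\ell}{\partial t} = \alpha_\ell \nabla^2 T_\ell \quad \text{in } \Omega_\ell(t),
\]
together with the melting-temperature and Stefan conditions on the moving interface $\Gamma(t)$,
\[
T_s = T_\ell = T_m, \qquad
\rho L\, v_n = k_s \frac{\partial T_s}{\partial n} - k_\ell \frac{\partial T_\ell}{\partial n} \quad \text{on } \Gamma(t),
\]
where $v_n$ is the normal interface velocity, $L$ the latent heat, and $n$ the normal pointing from the solid into the liquid.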
Sampling from a target measure whose density is only known up to a normalization constant is a fundamental problem in computational statistics and machine learning. In this paper, we present a new optimization-based method for sampling called mollified interaction energy descent (MIED). MIED minimizes a new class of energies on probability measures called mollified interaction energies (MIEs). These energies rely on mollifier functions, i.e., smooth approximations of the Dirac delta originating from PDE theory. We show that as the mollifier approaches the Dirac delta, the MIE converges to the chi-square divergence with respect to the target measure, and the gradient flow of the MIE agrees with that of the chi-square divergence. Optimizing this energy with a proper discretization yields a practical first-order particle-based algorithm for sampling in both unconstrained and constrained domains. We show experimentally that for unconstrained sampling problems our algorithm performs on par with existing particle-based algorithms like SVGD, while for constrained sampling problems our method readily incorporates constrained optimization techniques to handle more flexible constraints, with strong performance compared to alternatives.
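The sketch below illustrates the particle-descent idea on a standard Gaussian target, using one plausible discrete energy chosen to be consistent with the chi-square limit described above; the exact energy, mollifier, and numerical stabilization used by the authors may differ.

```python
# Rough particle-descent sketch in the spirit of MIED (not the authors' implementation).
# Assumed discrete energy:
#   E_eps(x_1..x_n) = (1/n^2) * sum_{i,j} k_eps(x_i - x_j) * w(x_i) * w(x_j),
# with a Gaussian mollifier k_eps and w(x) = p(x)^(-1/2) for the unnormalized target p.
# Target here: standard 2-D Gaussian, so log p(x) = -|x|^2 / 2 up to a constant.
import numpy as np

def mie_energy(x, eps):
    n = len(x)
    diff = x[:, None, :] - x[None, :, :]
    k = np.exp(-np.sum(diff**2, axis=-1) / (2 * eps**2))
    w = np.exp(0.25 * np.sum(x**2, axis=1))        # w = p^(-1/2), target-specific
    return (w @ k @ w) / n**2

def mied_step(x, eps, lr):
    n = len(x)
    diff = x[:, None, :] - x[None, :, :]
    k = np.exp(-np.sum(diff**2, axis=-1) / (2 * eps**2))
    w = np.exp(0.25 * np.sum(x**2, axis=1))
    grad_k = -diff / eps**2 * k[:, :, None]        # grad of k_eps w.r.t. its first argument
    grad_w = 0.5 * x * w[:, None]                  # grad of w for the Gaussian target
    # dE/dx_l = (2/n^2) * sum_j [ grad_k(l,j) w_l w_j + k_lj w_j grad_w(l) ]
    grad = 2.0 / n**2 * (np.einsum('ljd,l,j->ld', grad_k, w, w) + (k @ w)[:, None] * grad_w)
    return x - lr * grad

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=0.5, size=(128, 2))   # particles start away from the target
print("initial energy:", mie_energy(x, eps=0.3))
for _ in range(2000):
    x = mied_step(x, eps=0.3, lr=0.1)
print("final energy  :", mie_energy(x, eps=0.3))
print("particle mean (target mean is 0):", x.mean(axis=0))
```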
The current best approximation algorithms for $k$-median rely on first obtaining a structured fractional solution known as a bi-point solution, and then rounding it to an integer solution. We improve this second step by unifying and refining previous approaches. We describe a hierarchy of increasingly complex partitioning schemes for the facilities, along with corresponding sets of algorithms and factor-revealing non-linear programs. We prove that the third layer of this hierarchy yields a $2.613$-approximation, improving upon the current best ratio of $2.675$, while no layer can be proved better than $2.588$ under the proposed analysis. On the negative side, we give a family of bi-point solutions which cannot be approximated better than twice the square root of the golden ratio, even if allowed to open $k+o(k)$ facilities. This gives a barrier to current approaches for obtaining an approximation better than $2 \sqrt{\phi} \approx 2.544$. Altogether we reduce the approximation gap of bi-point solutions by two thirds.
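For readers unfamiliar with the term, a bi-point solution for $k$-median can be described as follows (standard notation, not necessarily the paper's): it consists of two integral sets of facilities $\mathcal{F}_1, \mathcal{F}_2$ with $|\mathcal{F}_1| \le k \le |\mathcal{F}_2|$ and multipliers $a, b \ge 0$ satisfying
\[
a + b = 1, \qquad a\,|\mathcal{F}_1| + b\,|\mathcal{F}_2| = k,
\]
with fractional cost $a\,\mathrm{cost}(\mathcal{F}_1) + b\,\mathrm{cost}(\mathcal{F}_2)$; a $\rho$-approximate rounding must return an integral set $\mathcal{F}$ with $|\mathcal{F}| \le k$ and
\[
\mathrm{cost}(\mathcal{F}) \;\le\; \rho\,\bigl(a\,\mathrm{cost}(\mathcal{F}_1) + b\,\mathrm{cost}(\mathcal{F}_2)\bigr).
\]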
We investigate the approximation formulas proposed by Tanaka & Sugihara (2019) in weighted Hardy spaces, which are spaces of analytic functions with prescribed asymptotic decay. Under the criterion of the minimum worst error of $n$-point approximation formulas, we demonstrate that the formulas are nearly optimal. We also obtain upper bounds on the approximation errors that coincide with the existing heuristic bounds in asymptotic order, via a duality theorem for the minimization problem of potential energy.
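Schematically, the minimum worst error over $n$-point formulas referred to above can be written as (our notation; the precise weighted Hardy space $\mathcal{H}$ and class of formulas are as in Tanaka & Sugihara (2019))
\[
E_n^{\min}(\mathcal{H}) := \inf_{a_1,\dots,a_n,\ \psi_1,\dots,\psi_n}\ \sup_{\|f\|_{\mathcal{H}} \le 1}\ \sup_{x \in \mathbb{R}} \Bigl| f(x) - \sum_{j=1}^{n} f(a_j)\,\psi_j(x) \Bigr|,
\]
i.e., the best achievable worst-case error of formulas that use $n$ sampled values $f(a_j)$; near-optimality means the proposed formulas attain errors close to this quantity.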
Graphical models, which express conditional dependence between random variables observed at various nodes, are used extensively in many fields such as genetics, neuroscience, and social network analysis. While most current statistical methods for estimating graphical models focus on scalar data, there is interest in estimating analogous dependence structures when the data observed at each node are functional, such as signals or images. In this paper, we propose a fully Bayesian regularization scheme for estimating functional graphical models. We first consider a direct Bayesian analog of the functional graphical lasso proposed by Qiao et al. (2019). We then propose a regularization strategy via the graphical horseshoe. We compare these approaches via a simulation study and apply our proposed functional graphical horseshoe to two motivating applications: electroencephalography data for comparing brain activation between an alcoholic group and controls, and changes in structural connectivity in the presence of traumatic brain injury (TBI). Our results yield insight into how the brain attempts to compensate for disconnected networks after injury.
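As a rough frequentist point of comparison (not the Bayesian schemes proposed in the paper), the sketch below summarizes each node's curves by a few basis coefficients, runs an elementwise graphical lasso on the stacked coefficients, and reads node-level edges off the norms of the precision blocks; the data are synthetic.

```python
# Crude frequentist stand-in for a functional graphical model estimate.
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
n_subj, n_nodes, n_time, n_basis = 100, 5, 50, 3
t = np.linspace(0, 1, n_time)
basis = np.stack([np.ones_like(t), np.sin(2*np.pi*t), np.cos(2*np.pi*t)], axis=1)

# Synthetic functional data: node 0 and node 1 share a latent score (an "edge").
scores = rng.standard_normal((n_subj, n_nodes, n_basis))
scores[:, 1] += scores[:, 0]
curves = scores @ basis.T                                  # (n_subj, n_nodes, n_time)

# Project each node's curves onto the basis (least squares) and stack coefficients.
coef = curves @ basis @ np.linalg.inv(basis.T @ basis)     # (n_subj, n_nodes, n_basis)
X = coef.reshape(n_subj, n_nodes * n_basis)

prec = GraphicalLasso(alpha=0.1).fit(X).precision_
blocks = prec.reshape(n_nodes, n_basis, n_nodes, n_basis)
edge_strength = np.linalg.norm(blocks, axis=(1, 3))        # node-by-node block norms
np.fill_diagonal(edge_strength, 0)
print(np.round(edge_strength, 2))                          # the (0, 1) block should dominate
```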
Motion planning and control are crucial components of robotics applications. Here, spatio-temporal hard constraints such as system dynamics and safety boundaries (e.g., obstacles in automated driving) restrict the robot's motions. Direct methods from optimal control solve a constrained optimization problem. However, in many applications finding a proper cost function is inherently difficult because of the weighting of partially conflicting objectives. On the other hand, Imitation Learning (IL) methods such as Behavior Cloning (BC) provide an intuitive framework for learning decision-making from offline demonstrations and constitute a promising avenue for planning and control in complex robot applications. Prior work has primarily relied on soft-constraint approaches, which use additional auxiliary loss terms describing the constraints. However, catastrophic safety-critical failures might occur in out-of-distribution (OOD) scenarios. This work integrates the flexibility of IL with hard constraint handling in optimal control. Our approach constitutes a general framework for constrained robotic motion planning and control using offline IL. Hard constraints are integrated into the learning problem in a differentiable manner, via explicit completion and gradient-based correction. Simulated experiments on mobile robot navigation and automated driving provide evidence for the performance of the proposed method.
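The toy sketch below illustrates only the gradient-based correction idea on a made-up obstacle constraint; it is not the paper's full learning pipeline, nor its explicit completion step.

```python
# Toy gradient-based correction: nudge a candidate action toward the feasible set
# { u : g(u) <= 0 } by descending on the constraint violation.  The constraint below
# (stay outside a circular obstacle) is an illustrative example, not from the paper.
import numpy as np

def g(u, center=np.array([1.0, 1.0]), radius=0.5):
    """Violation function: g(u) <= 0 iff u lies outside the obstacle circle."""
    return radius**2 - np.sum((u - center)**2)

def grad_g(u, center=np.array([1.0, 1.0])):
    return -2.0 * (u - center)

def correct(u, step=0.1, iters=50):
    """Move an infeasible action toward feasibility by descending the violation."""
    for _ in range(iters):
        if g(u) <= 0:
            break
        u = u - step * grad_g(u)        # step along -grad g while still violated
    return u

u_raw = np.array([0.9, 1.1])            # e.g., a raw network output inside the obstacle
u_safe = correct(u_raw)
print("violation before:", g(u_raw), " after:", round(g(u_safe), 4))
```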
In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments of matrix decomposition, favoring the (block) LU decomposition: the factorization of a matrix into the product of a lower and an upper triangular matrix. Matrix decomposition has since become a core technology in machine learning, largely due to the development of the backpropagation algorithm for fitting neural networks. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. We realize, however, that we cannot cover all the useful and interesting results concerning matrix decomposition within the scope of this discussion, e.g., separate treatments of Euclidean spaces, Hermitian spaces, Hilbert spaces, and the complex domain. We refer the reader to the linear algebra literature for a more detailed introduction to these related fields.
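As a small illustration of the LU decomposition mentioned above, the following sketch factors a matrix into a unit lower triangular and an upper triangular factor via Doolittle elimination without pivoting (so it assumes nonzero leading minors); it is not code from the survey.

```python
# Doolittle LU factorization without pivoting: A = L U with unit lower-triangular L.
import numpy as np

def lu_nopivot(A):
    A = A.astype(float).copy()
    n = A.shape[0]
    L, U = np.eye(n), np.zeros((n, n))
    for k in range(n):
        U[k, k:] = A[k, k:]                              # k-th row of U
        L[k+1:, k] = A[k+1:, k] / U[k, k]                # multipliers below the pivot
        A[k+1:, k:] -= np.outer(L[k+1:, k], U[k, k:])    # eliminate column k
    return L, U

A = np.array([[4., 3., 2.], [6., 3., 1.], [8., 5., 9.]])
L, U = lu_nopivot(A)
print(np.allclose(L @ U, A))    # True: the triangular factors reproduce A
```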
The growing energy and performance costs of deep learning have driven the community to reduce the size of neural networks by selectively pruning components. Similarly to their biological counterparts, sparse networks generalize just as well as, if not better than, the original dense networks. Sparsity can reduce the memory footprint of regular networks to fit mobile devices, as well as shorten training time for ever-growing networks. In this paper, we survey prior work on sparsity in deep learning and provide an extensive tutorial of sparsification for both inference and training. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice. Our work distills ideas from more than 300 research papers and provides guidance to practitioners who wish to utilize sparsity today, as well as to researchers whose goal is to push the frontier forward. We include the necessary background on mathematical methods in sparsification, describe phenomena such as early structure adaptation, the intricate relations between sparsity and the training process, and show techniques for achieving acceleration on real hardware. We also define a metric of pruned parameter efficiency that could serve as a baseline for the comparison of different sparse networks. We close by speculating on how sparsity can improve future workloads and outline major open problems in the field.
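As one concrete instance of the pruning mechanisms such a survey covers, the sketch below performs unstructured global magnitude pruning with a binary mask; the layer shapes and sparsity level are arbitrary placeholders.

```python
# Unstructured global magnitude pruning: zero out the smallest-magnitude weights
# across all layers and keep binary masks so the zeros stay fixed during fine-tuning.
import numpy as np

def magnitude_prune(weights, sparsity):
    """Return (pruned weights, masks) keeping only the largest-magnitude entries."""
    flat = np.concatenate([w.ravel() for w in weights])
    k = int(sparsity * flat.size)                         # number of weights to remove
    threshold = np.sort(np.abs(flat))[k] if k > 0 else -np.inf
    masks = [np.abs(w) >= threshold for w in weights]
    return [w * m for w, m in zip(weights, masks)], masks

rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 32)), rng.standard_normal((32, 10))]
pruned, masks = magnitude_prune(layers, sparsity=0.9)
kept = sum(m.sum() for m in masks) / sum(m.size for m in masks)
print(f"fraction of weights kept: {kept:.2f}")            # ~0.10 at 90% sparsity
```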
Since deep neural networks were developed, they have made huge contributions to everyday life. Machine learning now provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks remain challenging and unpredictable procedures. To lower the technical threshold for non-expert users, automated hyper-parameter optimization (HPO) has become a popular topic in both academia and industry. This paper provides a review of the most essential topics of HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods for defining their value ranges. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy, especially for deep learning networks. The study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art search algorithms, their feasibility with major deep learning frameworks, and their extensibility for new modules designed by users. The paper concludes with problems that arise when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation under limited computational resources.
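As a minimal illustration of one baseline search strategy covered by such reviews, the sketch below runs random search over a small hyper-parameter space using a scikit-learn MLP on synthetic data; the search space and model are placeholders, not anything specific from the paper.

```python
# Random search over a small hyper-parameter space, scored by validation accuracy.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

rng = np.random.default_rng(0)
best = (None, -np.inf)
for _ in range(10):                                        # 10 random configurations
    cfg = {
        "learning_rate_init": 10 ** rng.uniform(-4, -1),   # log-uniform learning rate
        "hidden_layer_sizes": (int(rng.integers(16, 129)),),
        "alpha": 10 ** rng.uniform(-6, -2),                # L2 regularization strength
    }
    model = MLPClassifier(max_iter=300, random_state=0, **cfg).fit(X_tr, y_tr)
    score = model.score(X_val, y_val)
    if score > best[1]:
        best = (cfg, score)
print("best config:", best[0], "val accuracy:", round(best[1], 3))
```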