Dendrites are among the most widely observed patterns in nature and occur across a wide spectrum of physical phenomena. In the solidification and growth patterns of metals and crystals, the multi-level branching structures of dendrites pose a modeling challenge, and fully resolving these structures is computationally demanding. Theoretical models of dendritic formation and evolution, essentially extensions of the classical moving-boundary Stefan problem, exist in the literature; much of this understanding comes from the analysis of dendrites formed during the solidification of metallic alloys. Motivated by the problem of modeling microstructure evolution from liquid melts of pure metals and alloys during MAM, we developed a comprehensive numerical framework for modeling a large variety of dendritic structures relevant to metal solidification. In this work, we present a numerical framework encompassing Stefan problem formulations relevant to dendritic evolution, using a phase-field approach and a finite element method implementation. Using this framework, we model numerous complex dendritic morphologies that are physically relevant to the solidification of pure melts and binary alloys. The distinguishing aspects of this work are a unified treatment of both pure metals and alloys; novel numerical error estimates of dendritic tip velocity; and the convergence of error for the primal fields of temperature and the order parameter with respect to numerical discretization. To the best of our knowledge, this is a first-of-its-kind study of the numerical convergence of the phase-field equations of dendritic growth in a finite element method setting. Further, we model various types of physically relevant dendritic solidification patterns in 2D and 3D computational domains.
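To make the connection between the Stefan problem and the phase-field approach concrete, a minimal schematic for a pure melt is given below; the symbols (order parameter $\phi$, dimensionless undercooling $u$, relaxation time $\tau$, interface width $W$, coupling constant $\lambda$, thermal diffusivity $\alpha$) follow common phase-field conventions and are not taken from this abstract, so this is an illustrative form rather than the authors' exact model.
\begin{align}
  \tau\,\partial_t \phi &= W^2 \nabla^2 \phi + \phi - \phi^3 - \lambda\, u\,(1-\phi^2)^2, \\
  \partial_t u &= \alpha \nabla^2 u + \tfrac{1}{2}\,\partial_t \phi,
\end{align}
where $\phi = +1$ in the solid and $\phi = -1$ in the liquid; the second equation is the heat equation with latent-heat release at the moving interface, i.e., a diffuse-interface regularization of the classical Stefan problem.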
In this paper, we consider a fully discrete approximation of an abstract evolution equation, employing a non-conforming spatial approximation and finite differences in time (Rothe-Galerkin method). The main result is the convergence of the discrete solutions to a weak solution of the continuous problem. The result can therefore be interpreted either as a justification of the numerical method or as an alternative way of constructing weak solutions. We formulate the problem in the very general and abstract setting of so-called non-conforming Bochner pseudo-monotone operators, which allows for a unified treatment of several evolution problems. Our abstract results for non-conforming Bochner pseudo-monotone operators make it possible to establish (weak) convergence just by verifying a few natural assumptions on the operators at each fixed time and on the discretization spaces. Hence, applications and extensions to several other evolution problems can be performed easily. We exemplify the applicability of our approach on several DG schemes for the unsteady $p$-Navier-Stokes problem. The results of some numerical experiments are reported in the final section.
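As a minimal sketch of what such a Rothe-Galerkin discretization looks like, consider an abstract evolution equation $\mathrm{d}u/\mathrm{d}t + A(t,u) = f$; with time step $\tau = T/K$ and a (possibly non-conforming) finite-dimensional space $V_h$, the backward Euler/Galerkin scheme reads as below. The notation is generic and not taken from the paper.
\begin{equation*}
  \text{Find } u_h^k \in V_h,\ k = 1,\dots,K:\qquad
  \Big( \tfrac{u_h^k - u_h^{k-1}}{\tau},\, v_h \Big)
  + \big\langle A(t_k, u_h^k),\, v_h \big\rangle
  = \big\langle f(t_k),\, v_h \big\rangle
  \quad \forall\, v_h \in V_h,
\end{equation*}
where $t_k = k\tau$; the type of result established in the paper is the convergence of the piecewise constant/linear interpolants of $(u_h^k)$ to a weak solution as $\tau \to 0$ and $h \to 0$.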
Stochastic volatility often implies increasing risks that are difficult to capture given the dynamic nature of real-world applications. We propose using arc length, a mathematical concept, to quantify cumulative variations (the total variability over time) and thereby more fully characterize stochastic volatility. The hazard rate, as defined by the Cox proportional hazards model in survival analysis, is assumed to be impacted by the instantaneous value of a longitudinal variable. However, when cumulative variations have a significant impact on the hazard, this assumption is questionable. Our proposed Bayesian Arc Length Survival Analysis Model (BALSAM) infuses arc length into a unified statistical framework by synthesizing three parallel components (joint models, distributed lag models, and arc length). We illustrate the use of BALSAM in simulation studies and also apply it to an HIV/AIDS clinical trial to assess the impact of cumulative variations of CD4 count (a critical longitudinal biomarker) on mortality while accounting for measurement errors and relevant variables.
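For intuition, the quantity involved and its role in the hazard can be written schematically as follows; the symbols are generic and not the paper's exact specification. For a longitudinal trajectory $x(s)$ observed up to time $t$,
\begin{align*}
  L(t) &= \int_0^t \sqrt{1 + \big(x'(s)\big)^2}\,\mathrm{d}s,
  &
  h(t \mid \cdot) &= h_0(t)\,\exp\!\big\{\beta_1\, x(t) + \beta_2\, L(t) + \boldsymbol{\gamma}^{\top}\mathbf{z}\big\},
\end{align*}
where $h_0$ is the baseline hazard, $x(t)$ the instantaneous biomarker value (e.g., CD4 count), $L(t)$ its arc length capturing cumulative variation, and $\mathbf{z}$ other covariates; a hazard depending only on $x(t)$ corresponds to $\beta_2 = 0$.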
Wearable Cognitive Assistance (WCA) applications present a challenge to benchmark and characterize due to their human-in-the-loop nature. Employing user testing to optimize system parameters is generally not feasible, given the scope of the problem and the number of observations needed to detect small but important effects in controlled experiments. Considering the intended mass-scale deployment of WCA applications in the future, there is a need for tools enabling human-independent benchmarking. In this paper, we present the first model for the complete end-to-end emulation of humans in WCA. We build this model through statistical analysis of data collected in previous work in this field, and demonstrate its utility by studying application task durations. Compared to first-order approximations, our model shows a ~36% larger gap between step execution times at high versus low system impairment. We further introduce a novel framework for the stochastic optimization of resource consumption-responsiveness tradeoffs in WCA, and show that by combining this framework with our realistic model of human behavior, significant reductions of up to 50% in the number of processed frame samples and 20% in energy consumption can be achieved with respect to the state of the art.
Topological data analysis has emerged as a powerful tool for extracting the metric, geometric and topological features underlying the data as a multi-resolution summary statistic, and has found applications in several areas where data arises from complex sources. In this paper, we examine the use of topological summary statistics through the lens of statistical inference. We investigate necessary and sufficient conditions under which \textit{topological inference} is possible and provide examples of models which admit invariance to topological summaries.
We propose a principled way to define Gaussian process priors on various sets of unweighted graphs: directed or undirected, with or without loops. We endow each of these sets with a geometric structure, inducing the notions of closeness and symmetries, by turning them into a vertex set of an appropriate metagraph. Building on this, we describe the class of priors that respect this structure and are analogous to the Euclidean isotropic processes, like squared exponential or Mat\'ern. We propose an efficient computational technique for the ostensibly intractable problem of evaluating these priors' kernels, making such Gaussian processes usable within the usual toolboxes and downstream applications. We go further to consider sets of equivalence classes of unweighted graphs and define the appropriate versions of priors thereon. We prove a hardness result, showing that in this case, exact kernel computation cannot be performed efficiently. However, we propose a simple Monte Carlo approximation for handling moderately sized cases. Inspired by applications in chemistry, we illustrate the proposed techniques on a real molecular property prediction task in the small data regime.
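For intuition on what a Matérn-like prior on a discrete structure looks like computationally, here is a minimal Python sketch that builds a Matérn-type kernel on the vertices of a graph from its Laplacian eigendecomposition. In the paper, the role of this graph is played by the metagraph whose vertices are the unweighted graphs of interest, and the eigendecomposition is precisely the step that becomes intractable and motivates the proposed techniques; all names and parameter values below are illustrative assumptions.

```python
import numpy as np

def graph_matern_kernel(adjacency, nu=3.0, kappa=2.0):
    """Matern-type kernel on the vertex set of a graph (illustrative sketch).

    K = (2*nu/kappa**2 * I + L)^(-nu), computed via the eigendecomposition of
    the graph Laplacian L; this mirrors the Euclidean Matern family, with the
    Laplacian replacing the (negative) Laplace operator.
    """
    A = np.asarray(adjacency, dtype=float)
    L = np.diag(A.sum(axis=1)) - A              # combinatorial Laplacian
    evals, evecs = np.linalg.eigh(L)            # feasible only for small graphs
    spectrum = (2.0 * nu / kappa**2 + evals) ** (-nu)
    K = (evecs * spectrum) @ evecs.T            # V diag(spectrum) V^T
    return K / K.diagonal().mean()              # normalize average variance to 1

# Toy usage: kernel between the 4 vertices of a path graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
print(graph_matern_kernel(A).round(3))
```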
Optimal balance is a non-asymptotic numerical method for computing a point on the slow manifold of certain two-scale dynamical systems. It works by solving a modified version of the system as a boundary value problem in time, in which the nonlinear terms are adiabatically ramped up from zero to the fully nonlinear dynamics. A dedicated boundary value solver, however, is often not directly available. The most natural alternative is a nudging solver, in which the problem is repeatedly solved forward and backward in time and the respective boundary conditions are restored whenever one of the temporal end points is visited. In this paper, we show quasi-convergence of this scheme in the sense that the termination residual of the nudging iteration is as small as the asymptotic error of the method itself, i.e., under appropriate assumptions, exponentially small. This confirms that optimal balance in its nudging formulation is an effective algorithm. Further, it shows that the boundary value problem formulation of optimal balance is well posed up to a residual error as small as the asymptotic error of the method itself. The key step in our proof is a careful two-component Gronwall inequality.
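A minimal sketch of the nudging iteration described above, for a toy fast-slow system with one slow variable and a fast oscillating pair; the ramp function, the toy right-hand side, and the stopping criterion are illustrative assumptions, not the paper's specific setup.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy fast-slow system: slow variable x, fast oscillating pair (y1, y2).
# The nonlinear coupling is ramped up by rho(t/T) from 0 at t=0 to 1 at t=T.
eps = 0.05

def rho(s):
    """Smoothstep ramp (illustrative; smoother ramps give smaller residuals)."""
    s = np.clip(s, 0.0, 1.0)
    return s * s * (3.0 - 2.0 * s)

def rhs(t, z, T):
    x, y1, y2 = z
    r = rho(t / T)
    return [r * y1,                           # slow equation (toy coupling)
            -y2 / eps + r * x * x,            # fast oscillation + ramped forcing
             y1 / eps]

def optimal_balance_nudging(x_target, T=1.0, tol=1e-9, max_iter=200):
    """Forward-backward nudging for optimal balance (illustrative sketch)."""
    z_end = np.array([x_target, 0.0, 0.0])    # guess for the state at t = T
    residual = np.inf
    for _ in range(max_iter):
        # Backward sweep: integrate from T to 0, then restore the linear
        # balance condition (vanishing fast components) at the t = 0 end.
        back = solve_ivp(rhs, [T, 0.0], z_end, args=(T,), rtol=1e-10, atol=1e-12)
        z0 = back.y[:, -1]
        z0[1:] = 0.0
        # Forward sweep: integrate from 0 to T, then restore the boundary
        # condition on the slow variable at the t = T end.
        fwd = solve_ivp(rhs, [0.0, T], z0, args=(T,), rtol=1e-10, atol=1e-12)
        z_new = fwd.y[:, -1]
        residual = abs(z_new[0] - x_target)   # boundary mismatch before restoring
        z_new[0] = x_target
        if np.max(np.abs(z_new - z_end)) < tol:   # fixed point reached
            break
        z_end = z_new
    return z_end, residual

z_balanced, res = optimal_balance_nudging(x_target=0.5)
print("balanced state at t = T:", z_balanced, " residual:", res)
```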
Recent years have seen rapid progress at the intersection between causality and machine learning. Motivated by scientific applications involving high-dimensional data, in particular in biomedicine, we propose a deep neural architecture for learning causal relationships between variables from a combination of empirical data and prior causal knowledge. We combine convolutional and graph neural networks within a causal risk framework to provide a flexible and scalable approach. Empirical results include linear and nonlinear simulations (where the underlying causal structures are known and can be directly compared against), as well as a real biological example where the models are applied to high-dimensional molecular data and their output compared against entirely unseen validation experiments. These results demonstrate the feasibility of using deep learning approaches to learn causal networks in large-scale problems spanning thousands of variables.
We introduce a novel approach to inference on parameters that take values in a Riemannian manifold embedded in a Euclidean space. Parameter spaces of this form are ubiquitous across many fields, including chemistry, physics, computer graphics, and geology. This new approach uses generalized fiducial inference to obtain a posterior-like distribution on the manifold, without needing to know a parameterization that maps the constrained space to an unconstrained Euclidean space. The proposed methodology, called the constrained generalized fiducial distribution (CGFD), is obtained using mathematical tools from Riemannian geometry. We provide a Bernstein-von Mises-type result for the CGFD, which gives intuition for how the desirable asymptotic qualities of the unconstrained generalized fiducial distribution are inherited by the CGFD. To demonstrate the practical use of the CGFD, we provide three proof-of-concept examples: inference for data from a multivariate normal density with the mean parameters on a sphere, a linear logspline density estimation problem, and a reimagined approach to the AR(1) model, all of which exhibit desirable coverage in simulations. We discuss two Markov chain Monte Carlo algorithms for the exploration of these constrained parameter spaces and adapt them for the CGFD.
Class Incremental Learning (CIL) aims to learn a multi-class classifier in a phase-by-phase manner, in which only data from a subset of the classes are provided at each phase. Previous works mainly focus on mitigating forgetting in phases after the initial one. However, we find that improving CIL at its initial phase is also a promising direction. Specifically, we experimentally show that directly encouraging the CIL learner at the initial phase to output representations similar to those of the model jointly trained on all classes can greatly boost CIL performance. Motivated by this, we study the difference between a na\"ively-trained initial-phase model and the oracle model. Since one major difference between these two models is the number of training classes, we investigate how this difference affects the model representations. We find that, with fewer training classes, the data representations of each class lie in a long and narrow region; with more training classes, the representations of each class scatter more uniformly. Inspired by this observation, we propose Class-wise Decorrelation (CwD), which effectively regularizes the representations of each class to scatter more uniformly, thus mimicking the model jointly trained with all classes (i.e., the oracle model). Our CwD is simple to implement and easy to plug into existing methods. Extensive experiments on various benchmark datasets show that CwD consistently and significantly improves the performance of existing state-of-the-art methods by around 1\% to 3\%. Code will be released.
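For concreteness, one plausible instantiation of such a class-wise decorrelation regularizer, penalizing the off-diagonal entries of the per-class feature correlation matrix so that each class's representations spread more uniformly, is sketched below in PyTorch; this is an illustrative reading of the idea, not necessarily the exact loss used in the paper.

```python
import torch

def classwise_decorrelation_loss(features, labels, eps=1e-8):
    """Illustrative class-wise decorrelation penalty.

    For each class present in the batch, standardize its features and penalize
    the mean squared off-diagonal entry of the correlation matrix, encouraging
    the class's representations to scatter more uniformly.
    """
    loss, n_classes = features.new_zeros(()), 0
    for c in labels.unique():
        f = features[labels == c]                     # (n_c, d) features of class c
        if f.shape[0] < 2:
            continue
        f = (f - f.mean(dim=0)) / (f.std(dim=0) + eps)
        corr = (f.T @ f) / f.shape[0]                 # (d, d) correlation matrix
        off_diag = corr - torch.diag(torch.diagonal(corr))
        loss = loss + (off_diag ** 2).sum() / corr.shape[0] ** 2
        n_classes += 1
    return loss / max(n_classes, 1)

# Toy usage: add the penalty (with a small weight) to the classification loss.
feats = torch.randn(64, 128)                          # backbone features for a batch
labels = torch.randint(0, 10, (64,))
print(classwise_decorrelation_loss(feats, labels))
```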
Deep learning is usually described as an experiment-driven field and has been under continual criticism for lacking theoretical foundations. This problem has been partially addressed by a large volume of literature that has so far not been well organized. This paper reviews and organizes the recent advances in deep learning theory. The literature is categorized into six groups: (1) complexity- and capacity-based approaches for analyzing the generalizability of deep learning; (2) stochastic differential equations and their associated dynamical systems for modelling stochastic gradient descent and its variants, which characterize the optimization and generalization of deep learning, partially inspired by Bayesian inference; (3) the geometrical structures of the loss landscape that drive the trajectories of these dynamical systems; (4) the roles of over-parameterization of deep neural networks from both positive and negative perspectives; (5) theoretical foundations of several special structures in network architectures; and (6) the increasingly intensive concerns regarding ethics and security and their relationships with generalizability.