Model-free data-driven computational mechanics replaces phenomenological constitutive functions with numerical simulations based on data sets of representative samples in stress-strain space. The distance of strain and stress pairs from the data set is minimized, subject to equilibrium and compatibility constraints. Although this method works well for non-linear elastic problems, it faces challenges with history-dependent materials, since one and the same point in stress-strain space may correspond to different material behaviour. In recent literature, this issue has been addressed by including local histories in the data set. However, models for the evolution of specific internal variables must still be included, so that a mixed formulation of classical and data-driven modeling is obtained. In the approach presented here, the data set is augmented with directions in the tangent space of points in stress-strain space. Moreover, the data set is divided into subsets corresponding to different material behaviour. Based on this classification, transition rules map the modeling points to the various subsets. The approach is applied to non-linear elasticity and to elasto-plasticity with isotropic hardening.
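As a minimal illustration of the distance-minimizing idea (not the augmented, classified data sets proposed here), the following Python sketch alternates between projecting onto a synthetic material data set and enforcing a trivial equilibrium constraint for a single material point; the material law, metric weight `C`, and prescribed stress are all hypothetical:

```python
import numpy as np

# Hypothetical material data: samples of a nonlinear elastic law
eps_d = np.linspace(-0.05, 0.05, 201)
sig_d = 200e3 * eps_d - 4e5 * eps_d**3
data = np.stack([eps_d, sig_d], axis=1)

C = 200e3        # metric weight, an assumed numerical stiffness
sig_bar = 3.0e3  # stress prescribed by equilibrium (single material point)

def project_to_data(state):
    """Return the data point nearest to `state` in the weighted metric."""
    d2 = C * (data[:, 0] - state[0])**2 + (data[:, 1] - state[1])**2 / C
    return data[np.argmin(d2)]

state = np.array([0.0, 0.0])
for _ in range(50):
    z = project_to_data(state)         # projection onto the data set
    state = np.array([z[0], sig_bar])  # projection onto the constraint set
print("converged state (eps, sig):", state)
```

The fixed point of this alternating projection is the mechanical state closest, in the chosen metric, to the material data while satisfying the constraints.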
In this paper, high-order numerical integrators on homogeneous spaces are presented as an application of nonholonomic partitioned Runge-Kutta Munthe-Kaas (RKMK) methods on Lie groups. A homogeneous space $M$ is a manifold on which a group $G$ acts transitively. Such a space can be understood as a quotient $M \cong G/H$, where $H$, a closed Lie subgroup, is the isotropy group of each point of $M$. The Lie algebra of $G$ may be decomposed into $\mathfrak{g} = \mathfrak{m} \oplus \mathfrak{h}$, where $\mathfrak{h}$ is the subalgebra that generates $H$ and $\mathfrak{m}$ is a complementary subspace. Thus, variational problems on $M$ can be treated as nonholonomically constrained problems on $G$ by requiring variations to remain in $\mathfrak{m}$. Nonholonomic partitioned RKMK integrators are derived as a modification of those obtained by a discrete variational principle on Lie groups, and can be interpreted as obeying a discrete Chetaev principle. These integrators tend to preserve several properties of their purely variational counterparts.
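As a minimal, hypothetical illustration of integrating on a homogeneous space via the group action (a first-order Lie-Euler step rather than the high-order partitioned RKMK schemes presented here), the following Python sketch advances a point on $S^2 \cong SO(3)/SO(2)$; the angular-velocity field `omega` is made up:

```python
import numpy as np
from scipy.linalg import expm

def hat(w):
    """so(3) hat map: R^3 -> 3x3 skew-symmetric matrix."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def omega(t, y):
    """Hypothetical angular-velocity field driving the flow on S^2."""
    return np.array([0.0, 0.3, 1.0]) + 0.1 * y

y = np.array([1.0, 0.0, 0.0])   # point on the sphere S^2 = SO(3)/SO(2)
h = 0.01
for k in range(1000):
    # Lie-Euler (first-order RKMK): move along the group action by exp(h*hat(w))
    y = expm(h * hat(omega(k * h, y))) @ y

print(np.linalg.norm(y))  # stays 1 to machine precision: y remains on S^2
```

Because each update is a rotation, the constraint $\|y\| = 1$ is preserved exactly, which is the structure-preservation property the group-action viewpoint buys.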
Directed Acyclic Graphs (DAGs) provide a powerful framework for modeling causal relationships among variables in multivariate settings; in addition, through the do-calculus theory, they allow for the identification and estimation of causal effects between variables also from purely observational data. In this setting, the process of inferring the DAG structure from the data is referred to as causal structure learning or causal discovery. We introduce BCDAG, an R package for Bayesian causal discovery and causal effect estimation from Gaussian observational data, implementing the Markov chain Monte Carlo (MCMC) scheme proposed by Castelletti & Mascaro (2021). Our implementation scales efficiently with the number of observations and, whenever the DAGs are sufficiently sparse, with the number of variables in the dataset. The package also provides functions for convergence diagnostics and for visualizing and summarizing posterior inference. In this paper, we present the key features of the underlying methodology along with its implementation in BCDAG. We then illustrate the main functions and algorithms on both real and simulated datasets.
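BCDAG itself is an R package; as a language-agnostic illustration of the underlying Gaussian DAG model, the following Python sketch simulates observational data from a hypothetical linear structural equation model over the DAG $1 \to 2 \to 3$, $1 \to 3$ and recovers a causal effect by back-door adjustment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear Gaussian SEM consistent with the DAG 1 -> 2 -> 3, 1 -> 3 (hypothetical)
n = 5000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
x3 = 0.5 * x2 + 0.3 * x1 + rng.normal(size=n)

# Causal effect of x2 on x3 is identified by adjusting for the parent x1
X = np.column_stack([np.ones(n), x2, x1])
beta = np.linalg.lstsq(X, x3, rcond=None)[0]
print("estimated effect of x2 on x3:", beta[1])   # close to the true 0.5
```

In the Bayesian setting implemented by the package, both the DAG structure and such effects are inferred jointly via MCMC rather than assumed known.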
We revisit the Ravine method of Gelfand and Tsetlin from a dynamical systems perspective, study its convergence properties, and highlight its similarities to and differences from the Nesterov accelerated gradient method. The two methods are closely related: they can be deduced from each other by reversing the order of the extrapolation and gradient operations in their definitions, and they benefit from similar fast convergence of values and convergence of iterates for general convex objective functions. We also establish the high-resolution ODEs of the Ravine and Nesterov methods, which reveal an additional geometric damping term driven by the Hessian for both methods. This allows us to prove, for the first time, fast convergence towards zero of the gradients not only for the Ravine method but also for the Nesterov method. We also highlight connections to other algorithms stemming from more subtle discretization schemes, and finally describe a Ravine version of the proximal-gradient algorithms for general structured smooth + non-smooth convex optimization problems.
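The reversed order of the two operations can be made concrete. The following Python sketch, on a toy quadratic and with the standard $(k-1)/(k+2)$ extrapolation coefficient (an assumption, not necessarily the paper's exact parametrization), runs Nesterov (extrapolate, then gradient step) next to Ravine (gradient step, then extrapolate):

```python
import numpy as np

A = np.diag([1.0, 10.0, 100.0])   # toy quadratic f(x) = 0.5 x^T A x
s = 1.0 / 100.0                   # step size 1/L

def grad(x):
    return A @ x

# Nesterov: extrapolate first, then take the gradient step at that point
x, x_prev = np.ones(3), np.ones(3)
for k in range(1, 200):
    y = x + (k - 1) / (k + 2) * (x - x_prev)
    x_prev, x = x, y - s * grad(y)

# Ravine: gradient step first, then extrapolate the gradient-step points
w, w_prev, y = np.ones(3), np.ones(3), np.ones(3)
for k in range(1, 200):
    w_prev, w = w, y - s * grad(y)
    y = w + (k - 1) / (k + 2) * (w - w_prev)

print(np.linalg.norm(x), np.linalg.norm(w))  # both decay rapidly to ~0
```

On this example both sequences exhibit the same accelerated decay, consistent with the close relationship described above.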
Isogeometric Analysis generalizes classical finite element analysis and intends to integrate it with the field of Computer-Aided Design. A central problem in achieving this objective is the reconstruction of analysis-suitable models from Computer-Aided Design models, which is in general a non-trivial and time-consuming task. In this article, we present a novel spline construction that enables model reconstruction as well as simulation of high-order PDEs on the reconstructed models. The proposed almost-$C^1$ splines are biquadratic splines on fully unstructured quadrilateral meshes (without restrictions on the placement or number of extraordinary vertices). They are $C^1$ smooth almost everywhere, that is, at all vertices and across most edges, and in addition almost (i.e., approximately) $C^1$ smooth across all other edges. Thus, the splines form $H^2$-nonconforming analysis-suitable discretization spaces. This is the lowest-degree unstructured spline construction that can be used to solve fourth-order problems. The associated spline basis is non-singular and has several B-spline-like properties (e.g., partition of unity, non-negativity, local support). The almost-$C^1$ splines are described in an explicit B\'ezier-extraction-based framework that can be easily implemented. Numerical tests suggest that the basis is well-conditioned and exhibits optimal approximation behavior.
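As a small illustration of the biquadratic building blocks (univariate degree-2 B-splines on an open knot vector, not the unstructured almost-$C^1$ construction itself), the following Python sketch checks the partition-of-unity property; the knot vector is hypothetical:

```python
import numpy as np
from scipy.interpolate import BSpline

p = 2                                      # quadratic degree
kv = np.array([0, 0, 0, 0.5, 1, 1, 1.0])   # open knot vector, C^1 at 0.5
n_basis = len(kv) - p - 1                  # = 4 basis functions

def basis(i, x):
    """Evaluate the i-th B-spline basis function on the knot vector kv."""
    c = np.zeros(n_basis)
    c[i] = 1.0
    return BSpline(kv, c, p)(x)

x = np.linspace(0, 1, 5)
B = np.array([basis(i, x) for i in range(n_basis)])
print(B.sum(axis=0))   # partition of unity: all ones
# A biquadratic (tensor-product) basis is B[i](u) * B[j](v) per quad patch
```

The unstructured construction of the paper modifies such tensor-product pieces near extraordinary vertices via Bézier extraction, which is where the "almost" in almost-$C^1$ enters.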
An incremental approach for computing the convex hull of data points in two dimensions is presented. The algorithm is not output-sensitive and runs in time linear in the number of input points. Graham's scan is applied only to a subset of the data points, namely those lying near the extremes of the dataset. Points are classified as extremal according to their radial distance from an imaginary point interior to the region bounded by the convex hull of the dataset, which is taken as the origin (center) of a polar coordinate system. The subset is obtained by iterating over exponentially decreasing angular intervals and terminating when the maximal points per bin no longer change.
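For reference, a standard scan-based hull (Andrew's monotone chain, a close relative of Graham's scan) is sketched below in Python; in the proposed method such a scan would be applied only to the reduced candidate subset, whose binning procedure is not reproduced here:

```python
def convex_hull(points):
    """Andrew's monotone chain: O(n log n) scan over sorted points."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); > 0 means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:                      # build lower hull left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):            # build upper hull right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]     # counter-clockwise hull vertices

print(convex_hull([(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]))
# [(0, 0), (1, 0), (1, 1), (0, 1)] -- the interior point is discarded
```

The scan itself costs $O(k \log k)$ on $k$ candidates; the linear overall cost claimed above relies on the candidate subset being small relative to the input.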
Weighting methods are a common tool for de-biasing estimates of causal effects. Although there is an increasing number of seemingly disparate methods, many of them can be folded into one unifying framework: Causal Optimal Transport. This new method directly targets distributional balance by minimizing optimal transport distances between treatment and control groups or, more generally, between a source and target population. Our approach is semiparametrically efficient and model-free, but it can also incorporate moments or any other important functions of covariates that the researcher desires to balance. We find that Causal Optimal Transport outperforms competitor methods when both the propensity score and outcome models are misspecified, indicating that it is a robust alternative to common weighting methods. Finally, we demonstrate the utility of our method in an external control study examining the effect of misoprostol versus oxytocin for the treatment of post-partum hemorrhage.
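As a stylized illustration (not the semiparametrically efficient estimator proposed here): if the treated marginal is fixed and uniform and the control weights are left free, minimizing the unregularized OT distance sends each treated unit's mass to its nearest control unit. The Python sketch below uses made-up Gaussian covariates; in practice, regularization and moment constraints would smooth the resulting weights:

```python
import numpy as np

rng = np.random.default_rng(1)
X0 = rng.normal(0.0, 1.0, size=(300, 2))   # control covariates
X1 = rng.normal(0.5, 1.0, size=(200, 2))   # treated covariates (shifted mean)

# Pairwise squared-Euclidean costs: controls (rows) vs. treated (columns)
M = ((X0[:, None, :] - X1[None, :, :]) ** 2).sum(-1)

# Closed form of the free-source-marginal OT problem: each treated unit
# sends its 1/n1 mass to the nearest control unit.
nn = M.argmin(axis=0)
w = np.bincount(nn, minlength=len(X0)) / len(X1)   # weights sum to 1

# The weighted control mean is pulled toward the treated mean (~0.5, ~0.5)
print((w[:, None] * X0).sum(axis=0))
```

This shows the sense in which OT weighting targets the whole covariate distribution rather than individual moments.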
A Bayesian multivariate model with a structured covariance matrix for multi-way nested data is proposed. This flexible modeling framework allows for positive and negative associations among clustered observations, and generalizes the well-known dependence structure implied by random effects. A conjugate shifted-inverse-gamma prior is proposed for the covariance parameters, which ensures that the covariance matrix remains positive definite under posterior analysis. A numerically efficient Gibbs sampling procedure is defined for balanced nested designs and is validated in two simulation studies. For nested designs that are unbalanced at the top layer, the procedure requires an additional data augmentation step. The proposed data augmentation procedure facilitates sampling latent variables from (truncated) univariate normal distributions and avoids numerical computation of the inverse of the structured covariance matrix. The Bayesian multivariate (linear transformation) model is applied to two-way nested interval-censored event times to analyze differences in adverse events between three groups of patients who were randomly allocated to treatment with different stents (BIO-RESORT). The parameters of the structured covariance matrix represent unobserved heterogeneity in treatment effects and are examined to detect differential treatment effects.
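As a minimal illustration of the data augmentation step (sampling a latent event time from a univariate normal truncated to a censoring interval), the following Python sketch uses hypothetical conditional moments and bounds:

```python
import numpy as np
from scipy.stats import truncnorm

# One augmentation draw: a latent event time given interval censoring (l, u).
mu, sd = 2.0, 0.7    # conditional mean/sd from the Gibbs full conditional
l, u = 1.5, 3.0      # interval-censoring bounds for one observation

# scipy's truncnorm takes bounds standardized to the N(0,1) scale
a, b = (l - mu) / sd, (u - mu) / sd
z = truncnorm.rvs(a, b, loc=mu, scale=sd, size=1, random_state=0)
print(z)             # a latent time inside (1.5, 3.0)
```

In the full sampler, one such univariate draw per censored observation replaces any need to invert the structured covariance matrix numerically.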
Training datasets for machine learning often exhibit some form of missingness. For example, to learn a model for deciding whom to give a loan, the available training data includes individuals who were given a loan in the past, but not those who were denied one. This missingness, if ignored, nullifies any fairness guarantee of the training procedure when the model is deployed. Using causal graphs, we characterize the missingness mechanisms in different real-world scenarios. We show conditions under which various distributions, used in popular fairness algorithms, can or cannot be recovered from the training data. Our theoretical results imply that many of these algorithms cannot guarantee fairness in practice. Modeling missingness also helps to identify correct design principles for fair algorithms. For example, in multi-stage settings where decisions are made in multiple screening rounds, we use our framework to derive the minimal distributions required to design a fair algorithm. Our proposed algorithm decentralizes the decision-making process and still achieves performance similar to that of the optimal algorithm, which requires centralization and non-recoverable distributions.
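The loan example can be made concrete with a small simulation. The following Python sketch (with a made-up outcome model) shows that when labels are observed only for previously approved applicants, the observed label distribution is biased relative to the population:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(size=n)                  # applicant feature
y = x + rng.normal(size=n) > 0          # true repayment outcome

# Past policy: only applicants with x > 0 got a loan, so repayment labels
# exist only for them -- the missingness mechanism depends on x.
observed = x > 0

print("P(y=1) overall:   ", y.mean())             # ~0.5
print("P(y=1 | observed):", y[observed].mean())   # much higher: biased sample
```

Any fairness criterion evaluated only on the observed subsample is computed against the wrong distribution, which is the failure mode the causal-graph analysis formalizes.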
Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph structure is available. In practice, however, real-world graphs are often noisy and incomplete, or may not be available at all. In this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution over the edges of the graph. This makes it possible to apply GCNs not only in scenarios where the given graph is incomplete or corrupted, but also in those where no graph is available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
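As a sketch of the inner model only (a single GCN layer whose adjacency is parametrized by learnable Bernoulli edge probabilities; the outer bilevel optimization is omitted), the following PyTorch snippet shows that gradients flow back to the edge distribution; all dimensions are arbitrary:

```python
import torch

n, d, h = 6, 4, 8
X = torch.randn(n, d)

# Learnable logits of a Bernoulli distribution over every possible edge
edge_logits = torch.nn.Parameter(torch.zeros(n, n))
W = torch.nn.Parameter(torch.randn(d, h) * 0.1)

def gcn_layer(X, edge_logits, W):
    A = torch.sigmoid(edge_logits)          # expected (relaxed) adjacency
    A = A + torch.eye(A.shape[0])           # add self-loops
    deg = A.sum(1)
    A_hat = A / deg.sqrt().unsqueeze(1) / deg.sqrt().unsqueeze(0)
    return torch.relu(A_hat @ X @ W)        # symmetric-normalized GCN layer

out = gcn_layer(X, edge_logits, W)
out.sum().backward()                        # toy objective, for illustration
print(edge_logits.grad.abs().mean())        # nonzero: edges are learnable
```

The actual method samples discrete graphs from this distribution and optimizes the logits in the outer problem against a validation objective.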
The Alternating Direction Method of Multipliers (ADMM) is a widely used tool for machine learning in distributed settings, where a machine learning model is trained over distributed data sources through an iterative process of local computation and message passing. Such an iterative process can raise privacy concerns for data owners. The goal of this paper is to provide differential privacy for ADMM-based distributed machine learning. Prior approaches to differentially private ADMM exhibit low utility under high privacy guarantees and often assume the objective functions of the learning problems to be smooth and strongly convex. To address these concerns, we propose a novel differentially private ADMM-based distributed learning algorithm called DP-ADMM, which combines an approximate augmented Lagrangian function with time-varying Gaussian noise addition in the iterative process to achieve higher utility for general objective functions under the same differential privacy guarantee. We also apply the moments accountant method to bound the end-to-end privacy loss. The theoretical analysis shows that DP-ADMM can be applied to a wider class of distributed learning problems, is provably convergent, and offers an explicit utility-privacy tradeoff. To our knowledge, this is the first paper to provide explicit convergence and utility properties for differentially private ADMM-based distributed learning algorithms. The evaluation results demonstrate that our approach can achieve good convergence and model accuracy under a high end-to-end differential privacy guarantee.
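As a loose sketch of the main ingredient (a first-order update of an approximate augmented Lagrangian perturbed by time-decaying Gaussian noise; not the exact DP-ADMM recursion or its privacy accounting), consider the following Python snippet with made-up parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_local_update(theta, grad_fn, rho, dual, t, sigma0=1.0, eta=0.1):
    """One privacy-style local step: gradient of an approximate augmented
    Lagrangian, perturbed by Gaussian noise whose scale decays over time."""
    g = grad_fn(theta) + dual + rho * theta    # approx. augmented-Lagrangian grad
    noise = rng.normal(0.0, sigma0 / np.sqrt(t), size=theta.shape)
    return theta - eta * (g + noise)

theta = np.zeros(5)
dual = np.zeros(5)
grad_fn = lambda th: th - 1.0    # gradient of the toy loss 0.5*||th - 1||^2
for t in range(1, 201):
    theta = noisy_local_update(theta, grad_fn, rho=1.0, dual=dual, t=t)
print(theta)   # approaches the regularized optimum despite the injected noise
```

The first-order (approximate) update is what removes the smoothness and strong-convexity requirements of exact-minimization ADMM steps, while the noise schedule is what the moments accountant tracks to bound the end-to-end privacy loss.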