We propose a numerical method for solving parameter-dependent hyperbolic partial differential equations (PDEs) with a moment approach, building on previous work by Marx et al. (2020). This approach relies on a very weak notion of solution of nonlinear equations, namely parametric entropy measure-valued (MV) solutions, which satisfy linear equations in the space of Borel measures. The infinite-dimensional linear problem is approximated by a hierarchy of convex, finite-dimensional semidefinite programming problems, called Lasserre's hierarchy. This yields a sequence of approximations of the moments of the occupation measure associated with the parametric entropy MV solution, and this sequence is proved to converge. Finally, several post-processing steps can be performed on the approximate moment sequence. In particular, the graph of the solution can be reconstructed by optimizing the Christoffel-Darboux kernel associated with the approximate measure, a powerful approximation tool able to capture a large class of irregular functions. Moreover, for uncertainty quantification problems, several quantities of interest can be estimated, sometimes directly, such as the expectation of smooth functionals of the solution. The performance of our approach is evaluated through numerical experiments on the inviscid Burgers equation with a parametrised initial condition or a parametrised flux function.
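As a rough, self-contained illustration of the graph-reconstruction step (not the paper's implementation, which obtains approximate moments from Lasserre's hierarchy rather than from samples), the sketch below builds an empirical moment matrix of a measure concentrated on the graph of a toy discontinuous function and recovers that graph by minimizing the Christoffel polynomial $\Lambda_d(x,u) = v_d(x,u)^\top M_d^{-1} v_d(x,u)$ over $u$ for each $x$; the degree, regularization, and grids are arbitrary choices.

```python
# Hedged sketch: recovering the graph of a function from the moments of a measure
# concentrated on that graph, via the Christoffel-Darboux kernel. The moment matrix
# here is built empirically from samples of a toy discontinuous function, whereas the
# paper obtains approximate moments from Lasserre's hierarchy.
import numpy as np

def monomials(x, u, d):
    """Vector of all monomials x^i * u^j with total degree i + j <= d."""
    return np.array([x ** i * u ** j for i in range(d + 1) for j in range(d + 1 - i)])

f = lambda x: np.where(x < 0.5, 1.0, -0.5)    # toy shock-like "solution" on [0, 1]

d = 6
rng = np.random.default_rng(0)
xs = rng.random(5000)
V = np.stack([monomials(x, f(x), d) for x in xs])
M = V.T @ V / len(xs)                                  # empirical moment matrix
Minv = np.linalg.inv(M + 1e-8 * np.eye(M.shape[0]))    # regularize before inverting

# For each x, the minimizer over u of the Christoffel polynomial
# Lambda(x, u) = v(x, u)^T M^{-1} v(x, u) tracks the graph of f.
x_grid = np.linspace(0, 1, 50)
u_grid = np.linspace(-1.5, 1.5, 301)
recovered = np.array([
    u_grid[np.argmin([monomials(x, u, d) @ Minv @ monomials(x, u, d) for u in u_grid])]
    for x in x_grid
])
print(np.mean(np.abs(recovered - f(x_grid)) < 0.05))   # fraction of grid points recovered
```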
Several mixed-effects models for longitudinal data have been proposed to accommodate the non-linearity of late-life cognitive trajectories and to assess the putative influence of covariates on them. No prior research provides a side-by-side examination of these models to offer guidance on their proper application and interpretation. In this work, we examined five statistical approaches previously used to answer research questions related to non-linear changes in cognitive aging: the linear mixed model (LMM) with a quadratic term, the LMM with splines, the functional mixed model, the piecewise linear mixed model, and the sigmoidal mixed model. We first describe the models theoretically. Next, using data from two prospective cohorts with annual cognitive testing, we compared the interpretation of the models by investigating the association of education with cognitive change before death. Lastly, we performed a simulation study to empirically evaluate the models and provide practical recommendations. Except for the LMM with a quadratic term, the fit of all models was generally adequate to capture the non-linearity of cognitive change, and the models were relatively robust. Although spline-based models have no interpretable non-linearity parameters, their convergence was easier to achieve and they allow graphical interpretation. In contrast, the piecewise and sigmoidal models, with interpretable non-linear parameters, may require more data to achieve convergence.
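As a hedged sketch of two of the compared specifications (fit here with statsmodels rather than the software used in the paper, and on synthetic data with hypothetical variable names cognition, years_to_death, educ, id), one could write:

```python
# Hedged sketch: LMM with a quadratic time term vs. LMM with a spline basis,
# fit to synthetic long-format data standing in for the cohort data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per annual visit, years before death (negative),
# a binary education indicator, and a noisy cognitive score with a late decline.
rng = np.random.default_rng(0)
rows = []
for pid in range(150):
    educ = rng.integers(0, 2)
    intercept = rng.normal(0.0, 0.5)
    for t in np.arange(-10, 0.5, 1.0):
        cog = intercept + 0.05 * educ + 0.1 * t - 0.02 * t ** 2 + rng.normal(0, 0.3)
        rows.append(dict(id=pid, years_to_death=t, educ=educ, cognition=cog))
df = pd.DataFrame(rows)

# (1) LMM with a quadratic term in time before death and its interactions with education;
#     random intercept per participant.
lmm_quad = smf.mixedlm(
    "cognition ~ educ * (years_to_death + I(years_to_death ** 2))",
    data=df, groups=df["id"],
).fit()

# (2) LMM with a B-spline basis in time (patsy's bs()): flexible non-linearity,
#     but without directly interpretable shape parameters.
lmm_spline = smf.mixedlm(
    "cognition ~ educ * bs(years_to_death, df=4)",
    data=df, groups=df["id"],
).fit()

print(lmm_quad.summary())
print(lmm_spline.summary())
```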
Multiphysics simulations frequently require transferring solution fields between subproblems with non-matching spatial discretizations, typically using interpolation techniques. Standard methods measure the closeness between points by means of the Euclidean distance, which does not account for curvature, cuts, cavities or other non-trivial geometrical or topological features of the domain. This may lead to spurious oscillations in the interpolant in proximity to these features. To overcome this issue, we propose a modification of rescaled localized radial basis function (RL-RBF) interpolation that accounts for the geometry of the interpolation domain, ensuring conformity and fidelity to its geometrical and topological features. The proposed method, referred to as RL-RBF-G, relies on measuring the geodesic distance between data points. RL-RBF-G removes the spurious oscillations appearing in the RL-RBF interpolant, resulting in increased accuracy in domains with complex geometries. We demonstrate the effectiveness of RL-RBF-G interpolation through a convergence study in an idealized setting. Furthermore, we discuss the algorithmic aspects and the implementation of RL-RBF-G interpolation in a distributed-memory parallel framework, and present the results of a strong scalability test yielding nearly ideal results. Finally, we show the effectiveness of RL-RBF-G interpolation in multiphysics simulations by considering an application to a whole-heart cardiac electromechanics model.
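The sketch below illustrates only the core idea, under simplifying assumptions: a plain RBF interpolant in which the Euclidean distance is replaced by an approximate geodesic distance computed as shortest paths on a k-nearest-neighbour graph of the point cloud. It omits the rescaling and localization that define the actual RL-RBF-G method.

```python
# Hedged sketch: RBF interpolation with the Euclidean distance replaced by an
# approximate geodesic distance (shortest paths on a kNN graph of the point cloud).
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial import cKDTree

def geodesic_distance_matrix(points, k=8):
    """All-pairs geodesic distances approximated on a k-nearest-neighbour graph."""
    tree = cKDTree(points)
    dist, idx = tree.query(points, k=k + 1)    # first neighbour is the point itself
    n = len(points)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, idx[i, 1:]] = dist[i, 1:]
    return shortest_path(W, directed=False)

def rbf_interpolate(points, values, eval_points, eps=2.0, k=8):
    all_pts = np.vstack([points, eval_points])
    D = geodesic_distance_matrix(all_pts, k=k)
    n = len(points)
    phi = lambda r: np.exp(-(eps * r) ** 2)    # Gaussian basis function
    A = phi(D[:n, :n])                         # data-data distances (collocation matrix)
    B = phi(D[n:, :n])                         # evaluation-data distances
    weights = np.linalg.solve(A + 1e-10 * np.eye(n), values)
    return B @ weights

# Toy test on a curved 2D point cloud (noisy points along a quarter circle).
rng = np.random.default_rng(0)
theta = np.linspace(0, np.pi / 2, 200)
pts = np.column_stack([np.cos(theta), np.sin(theta)]) + 0.02 * rng.standard_normal((200, 2))
vals = np.sin(3 * theta)
print(rbf_interpolate(pts[::2], vals[::2], pts[1::2])[:5])
```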
This study addresses a class of mixed-integer linear programming (MILP) problems that involve uncertainty in the objective function parameters. The parameters are assumed to form a random vector whose probability distribution can only be observed through a finite training data set. Unlike most related studies in the literature, we also consider uncertainty in the underlying data set. The data uncertainty is described by a set of linear constraints for each random sample, and the uncertainty in the distribution (for a fixed realization of the data) is defined using a type-1 Wasserstein ball centered at the empirical distribution of the data. The overall problem is formulated as a three-level distributionally robust optimization (DRO) problem. First, we prove that the three-level problem admits a single-level MILP reformulation when the class of loss functions is restricted to biaffine functions. Second, it turns out that for several particular forms of data uncertainty the outlined problem can be solved reasonably fast by leveraging the nominal MILP problem. Finally, we conduct a computational study in which the out-of-sample performance of our model and the computational complexity of the proposed MILP reformulation are explored numerically for several application domains.
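For orientation only, and not as the paper's exact three-level model, recall the generic form of a type-1 Wasserstein DRO problem over an ambiguity ball of radius $\varepsilon$ around the empirical distribution of the training samples $\widehat{\xi}_1,\dots,\widehat{\xi}_N$:
\[
\min_{x \in X} \;\; \sup_{\mathbb{Q} \,:\, W_1(\mathbb{Q}, \widehat{\mathbb{P}}_N) \le \varepsilon} \;\; \mathbb{E}_{\xi \sim \mathbb{Q}}\big[\ell(x,\xi)\big],
\qquad
\widehat{\mathbb{P}}_N = \frac{1}{N}\sum_{i=1}^{N} \delta_{\widehat{\xi}_i},
\]
where $W_1$ denotes the type-1 Wasserstein distance and $\ell$ the loss. The model considered here adds a further level accounting for the linear constraints that describe the uncertainty in each sample $\widehat{\xi}_i$.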
For problems of time-harmonic scattering by rational polygonal obstacles, embedding formulae express the far-field pattern induced by any incident plane wave in terms of the far-field patterns for a relatively small (frequency-independent) set of canonical incident angles. Although these remarkable formulae are exact in theory, here we demonstrate that: (i) they are highly sensitive to numerical errors in practice, and (ii) direct calculation of the coefficients in these formulae may be impossible for particular sets of canonical incident angles, even in exact arithmetic. Only by overcoming these practical issues can embedding formulae provide a highly efficient approach to computing the far-field pattern induced by a large number of incident angles. Here we address challenges (i) and (ii), supporting our theory with numerical experiments. Challenge (i) is solved using techniques from computational complex analysis: we reformulate the embedding formula as a complex contour integral and prove that this is much less sensitive to numerical errors. In practice, this contour integral can be efficiently evaluated by residue calculus. Challenge (ii) is addressed using techniques from numerical linear algebra: we oversample, considering more canonical incident angles than are necessary, thus expanding the set of valid coefficient vectors. The coefficient vector can then be selected using either a least squares approach or column subset selection.
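As a purely synthetic illustration of the oversampling remedy for challenge (ii), the snippet below compares a minimum-norm least-squares choice of the coefficient vector with a choice based on column subset selection via pivoted QR; the random matrix merely stands in for the system assembled from far-field data at the canonical incident angles.

```python
# Hedged, synthetic illustration: with more canonical incident angles than strictly
# necessary, the system fixing the embedding-formula coefficients is underdetermined,
# so many coefficient vectors are valid and one must be selected.
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)
m, n = 8, 12                       # 8 conditions, 12 canonical angles (oversampled)
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
b = rng.standard_normal(m) + 1j * rng.standard_normal(m)

# Option 1: minimum-norm least-squares solution of the underdetermined system.
c_ls, *_ = np.linalg.lstsq(A, b, rcond=None)

# Option 2: column subset selection via pivoted QR: keep the m best-conditioned
# columns (canonical angles), solve the square subsystem, zero the remaining entries.
_, _, piv = qr(A, pivoting=True)
keep = piv[:m]
c_css = np.zeros(n, dtype=complex)
c_css[keep] = np.linalg.solve(A[:, keep], b)

# Both candidates satisfy the conditions (residuals near machine precision).
print(np.linalg.norm(A @ c_ls - b), np.linalg.norm(A @ c_css - b))
```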
Iterated conditional expectation (ICE) g-computation is an estimation approach for addressing time-varying confounding in both longitudinal and time-to-event data. Unlike other g-computation implementations, ICE avoids the need to specify models for each time-varying covariate. For variance estimation, previous work has suggested the bootstrap. However, bootstrapping can be computationally intensive and sensitive to the number of resamples used. Here, we present ICE g-computation as a set of stacked estimating equations. As a result, the variance of the ICE g-computation estimator can be consistently estimated using the empirical sandwich variance estimator. The performance of the variance estimator was evaluated empirically in a simulation study. The proposed approach is also demonstrated with an illustrative example on the effect of cigarette smoking on the prevalence of hypertension. In the simulation study, the empirical sandwich variance estimator appropriately estimated the variance. When comparing run times of the sandwich variance estimator and the bootstrap in the applied example, the sandwich estimator was substantially faster, even when bootstraps were run in parallel. The empirical sandwich variance estimator is a viable option for variance estimation with ICE g-computation.
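As a generic, hedged sketch of the empirical sandwich variance estimator, applied here to a toy pair of estimating equations for a mean and a variance rather than to the ICE g-computation equations themselves:

```python
# Hedged sketch of the empirical sandwich variance estimator for an M-estimator defined
# by stacked estimating equations. The toy estimating functions below merely stand in
# for the ICE g-computation estimating equations.
import numpy as np

def psi(theta, y):
    """Stacked estimating functions, one row per observation: E[psi(theta0, Y)] = 0."""
    mu, sigma2 = theta
    return np.column_stack([y - mu, (y - mu) ** 2 - sigma2])

def sandwich_variance(psi_fn, theta_hat, y, eps=1e-6):
    """Empirical sandwich: B^{-1} M B^{-T} / n, with B a numerical Jacobian ("bread")."""
    n, p = len(y), len(theta_hat)
    scores = psi_fn(theta_hat, y)
    M = scores.T @ scores / n                          # "meat"
    B = np.zeros((p, p))
    for j in range(p):
        step = np.zeros(p)
        step[j] = eps
        plus = psi_fn(theta_hat + step, y).mean(axis=0)
        minus = psi_fn(theta_hat - step, y).mean(axis=0)
        B[:, j] = -(plus - minus) / (2 * eps)          # -d/dtheta of the mean estimating function
    Binv = np.linalg.inv(B)
    return Binv @ M @ Binv.T / n

rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.5, size=500)
theta_hat = np.array([y.mean(), y.var()])              # roots of the stacked estimating equations
print(np.sqrt(np.diag(sandwich_variance(psi, theta_hat, y))))   # sandwich standard errors
```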
Statistical analysis and node clustering in hypergraphs constitute an emerging topic that suffers from a lack of standardization. In contrast to the case of graphs, the concept of a node community in hypergraphs is not unique and encompasses various distinct situations. In this work, we conduct a comparative analysis of the performance of modularity-based methods for clustering nodes in binary hypergraphs. To this end, we begin by presenting, within a unified framework, the various hypergraph modularity criteria proposed in the literature, emphasizing their differences and respective focuses. Subsequently, we provide an overview of the state-of-the-art codes available for maximizing hypergraph modularities to detect node communities in binary hypergraphs. Through exploration of various simulation settings with controlled ground-truth clustering, we compare these methods using different quality measures, including true clustering recovery, running time, (local) maximization of the objective, and the number of clusters detected. Our contribution marks the first attempt to clarify the advantages and drawbacks of these newly available methods. This effort lays the foundation for a better understanding of the primary objectives of modularity-based node clustering methods for binary hypergraphs.
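To fix ideas only, the snippet below evaluates and maximizes ordinary graph modularity on a weighted clique expansion of a toy binary hypergraph; this naive baseline is not one of the hypergraph modularity criteria compared here, but it indicates what those criteria generalize.

```python
# Hedged sketch: clique-expand a binary hypergraph and apply ordinary graph modularity
# with networkx, as a baseline that the hypergraph-specific criteria go beyond.
import itertools
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

# A toy binary hypergraph: each hyperedge is a set of node labels.
hyperedges = [{0, 1, 2}, {1, 2, 3}, {3, 4}, {4, 5, 6}, {5, 6, 7}, {0, 7}]

# Clique expansion: a hyperedge of size s contributes weight 1/(s-1) to every pair it
# contains (one common normalization; other choices exist).
G = nx.Graph()
for e in hyperedges:
    for u, v in itertools.combinations(sorted(e), 2):
        w = 1.0 / (len(e) - 1)
        if G.has_edge(u, v):
            G[u][v]["weight"] += w
        else:
            G.add_edge(u, v, weight=w)

communities = greedy_modularity_communities(G, weight="weight")
print([sorted(c) for c in communities])
print(modularity(G, communities, weight="weight"))
```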
Classical tests are available for the two-sample problem of testing the equality of distribution functions. Among these, the Kolmogorov-Smirnov test also provides a graphical interpretation of the test results, in different forms. Here, we propose modifications of the Kolmogorov-Smirnov test with higher power. The proposed tests are based on the so-called global envelope test, which allows for a graphical interpretation similar to that of the Kolmogorov-Smirnov test. The tests rely on rank statistics and are also suitable for the comparison of $n$ samples, with $n \geq 2$. We compare the alternatives for the two-sample case through an extensive simulation study and discuss their interpretation. Finally, we apply the tests to real data: using the proposed methodologies, we compare the height distributions of boys and girls at different ages, as well as the sepal length distributions of different flower species.
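A hedged sketch of the underlying idea, using a max-type simultaneous envelope obtained by permutation rather than the rank-based global envelope construction proposed here (toy Gaussian samples, arbitrary grid and number of permutations):

```python
# Hedged sketch: a permutation-based simultaneous envelope for the difference of two
# empirical distribution functions; simpler than the rank-based global envelope tests
# proposed in the paper, but with the same kind of graphical interpretation.
import numpy as np

def ecdf(sample, grid):
    return np.searchsorted(np.sort(sample), grid, side="right") / len(sample)

rng = np.random.default_rng(2)
x = rng.normal(0.0, 1.0, 120)          # toy sample 1
y = rng.normal(0.3, 1.0, 150)          # toy sample 2

grid = np.linspace(min(x.min(), y.min()), max(x.max(), y.max()), 200)
observed = ecdf(x, grid) - ecdf(y, grid)

pooled, n_x = np.concatenate([x, y]), len(x)
perm_curves = []
for _ in range(999):
    perm = rng.permutation(pooled)
    perm_curves.append(ecdf(perm[:n_x], grid) - ecdf(perm[n_x:], grid))
perm_curves = np.array(perm_curves)

# Max-type envelope: the 95th percentile of the permutation-wise maxima gives a band
# that the whole observed curve stays inside under H0 with roughly 95% probability;
# points where it exits indicate where the distributions differ.
maxima = np.abs(perm_curves).max(axis=1)
c = np.quantile(maxima, 0.95)
p_value = (1 + np.sum(maxima >= np.abs(observed).max())) / 1000
print(p_value, np.any(np.abs(observed) > c))
```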
Weakly modular graphs are defined as the class of graphs that satisfy the \emph{triangle condition ($TC$)} and the \emph{quadrangle condition ($QC$)}. We study an interesting subclass of weakly modular graphs that satisfies a stronger version of the triangle condition, known as the \emph{triangle-diamond condition ($TDC$)}, and term this subclass the \emph{diamond-weakly modular graphs}. We observe that this class contains the class of bridged graphs and the class of weakly bridged graphs. The interval function $I_G$ of a connected graph $G$ with vertex set $V$ is an important concept in metric graph theory and is one of the prime examples of a transit function: a function from the Cartesian product $V \times V$ to the power set of $V$ satisfying the expansive, symmetric and idempotent axioms. In this paper, we derive an interesting axiom, denoted $(J0')$, from a well-known axiom $(J0)$ introduced by Marlow Sholander in 1952. We prove that the axiom $(J0')$ characterizes the diamond-weakly modular graphs. We propose certain independent first-order betweenness axioms on an arbitrary transit function $R$ and prove that $R$ is the interval function of a diamond-weakly modular graph if and only if it satisfies these betweenness axioms. Similar characterizations are obtained for the interval functions of bridged graphs and weakly bridged graphs.
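For completeness, we recall the two standard definitions used above (these are well known and not specific to this paper). The interval function of a connected graph $G=(V,E)$ with shortest-path distance $d$ is
\[
I_G(u,v) = \{\, w \in V : d(u,w) + d(w,v) = d(u,v) \,\}, \qquad u,v \in V,
\]
and a transit function on $V$ is a map $R : V \times V \to 2^V$ satisfying, for all $u,v \in V$, the expansive axiom $u \in R(u,v)$, the symmetric axiom $R(u,v) = R(v,u)$, and the idempotent axiom $R(u,u) = \{u\}$.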
The main respiratory muscle, the diaphragm, is an example of a thin structure. We aim to perform detailed numerical simulations of the muscle mechanics based on individual patient data. This requires a representation of the diaphragm geometry extracted from medical image data. We design an adaptive reconstruction method based on least-squares radial basis function partition of unity approximation. The method is adapted to thin structures by subdividing the structure rather than the surrounding space, and by introducing an anisotropic scaling of the local subproblems. The resulting representation is an infinitely smooth level set function, which is stabilized such that there are no spurious zero level sets. We show reconstruction results for 2D cross sections of the diaphragm geometry as well as for the full 3D geometry. We also show solutions to basic PDE test problems in the reconstructed geometries.
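The sketch below illustrates one ingredient only, under simplifying assumptions: a single local least-squares Gaussian RBF subproblem with anisotropic scaling of the thin direction, fitted to synthetic samples from a flat strip. It does not include the partition of unity, the adaptivity, or the stabilization of the level set.

```python
# Hedged sketch: one local least-squares RBF subproblem with anisotropic scaling,
# using fewer centres than data points, fitted to samples from a synthetic thin strip.
import numpy as np

rng = np.random.default_rng(3)

# Toy "thin structure": scattered samples in a strip of thickness 0.04 in y, extent 1 in x.
pts = np.column_stack([rng.uniform(0, 1, 400), rng.uniform(-0.02, 0.02, 400)])
vals = pts[:, 1]                              # target: a smooth function vanishing mid-strip

scale = np.array([1.0, 50.0])                 # anisotropic scaling: stretch the thin direction
centres = pts[rng.choice(len(pts), 60, replace=False)]

def gaussian_rbf_matrix(x, c, eps=3.0):
    """Gaussian RBF matrix evaluated with anisotropically scaled coordinates."""
    d = np.linalg.norm((x[:, None, :] - c[None, :, :]) * scale, axis=2)
    return np.exp(-(eps * d) ** 2)

A = gaussian_rbf_matrix(pts, centres)         # 400 x 60 least-squares system
coeffs, *_ = np.linalg.lstsq(A, vals, rcond=None)

# Evaluate the local approximant at a few test points inside the strip.
test = np.array([[0.5, 0.0], [0.25, 0.01], [0.75, -0.015]])
print(gaussian_rbf_matrix(test, centres) @ coeffs)
```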