We propose a model to address the overlooked problem of node clustering in simple hypergraphs. Simple hypergraphs are suitable when a node may not appear multiple times in the same hyperedge, as in co-authorship datasets. Our model generalizes the stochastic blockmodel for graphs: it assumes the existence of latent node groups, and hyperedges are conditionally independent given these groups. We first establish the generic identifiability of the model parameters. We then develop a variational approximation Expectation-Maximization algorithm for parameter inference and node clustering, and derive a statistical criterion for model selection. To illustrate the performance of our R package HyperSBM, we compare it with other node clustering methods on synthetic data generated from the model, as well as on a line clustering experiment and a co-authorship dataset.
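As a hedged illustration of the generative assumption (latent node groups, with hyperedges conditionally independent given the groups), the following minimal Python sketch samples a 3-uniform simple hypergraph; the group count, proportions, and the probabilities p_same and p_diff are hypothetical and simpler than the model's full parametrization in HyperSBM.

# Toy sketch (not the paper's exact parametrization): sample a simple
# 3-uniform hypergraph from a hypergraph SBM with Q latent groups.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, Q = 30, 2
pi = np.array([0.5, 0.5])            # group proportions
Z = rng.choice(Q, size=n, p=pi)      # latent group of each node

p_same, p_diff = 0.05, 0.005         # hypothetical connection probabilities
edges = []
for trip in itertools.combinations(range(n), 3):  # simple: no repeated nodes
    p = p_same if len(set(Z[list(trip)])) == 1 else p_diff
    if rng.random() < p:
        edges.append(trip)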
We develop novel LASSO-based methods for coefficient testing and confidence interval construction in the Gaussian linear model with $n\ge d$. Our methods enjoy finite-sample guarantees identical to those of their ubiquitous ordinary least squares $t$-test based analogues, yet have substantially higher power when the true coefficient vector is sparse. In particular, our coefficient test, which we call the $\ell$-test, performs like the one-sided $t$-test (despite not being given any information about the sign) under sparsity, and the corresponding confidence intervals are more than 10% shorter than the standard $t$-test based intervals. The nature of the $\ell$-test directly provides a novel exact adjustment conditional on LASSO selection for post-selection inference, allowing for the construction of post-selection p-values and confidence intervals. None of our methods require resampling or Monte Carlo estimation. We perform a variety of simulations and a real data analysis on an HIV drug resistance data set to demonstrate the benefits of the $\ell$-test. We end with a discussion of how the $\ell$-test may asymptotically apply to a much more general class of parametric models.
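The construction of the $\ell$-test itself is not detailed here; for context, a minimal Python sketch of the classical OLS $t$-test baseline that the method is benchmarked against, on synthetic data with hypothetical dimensions:

# Classical OLS t-test baseline for a single coefficient (n >= d),
# the comparator for the l-test; the data below are synthetic.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, d, j = 100, 10, 0
X = rng.standard_normal((n, d))
beta = np.zeros(d); beta[j] = 0.5           # sparse truth
y = X @ beta + rng.standard_normal(n)

beta_hat, res, *_ = np.linalg.lstsq(X, y, rcond=None)
sigma2 = np.sum((y - X @ beta_hat)**2) / (n - d)
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[j, j])
t = beta_hat[j] / se
p = 2 * stats.t.sf(abs(t), df=n - d)        # two-sided p-value
ci = beta_hat[j] + se * stats.t.ppf([0.025, 0.975], df=n - d)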
Knockoffs is a recently proposed, powerful framework that effectively controls the false discovery rate (FDR) in variable selection. However, none of the existing knockoff solutions are directly suited to handling multivariate or high-dimensional functional data, which have become increasingly prevalent in various scientific applications. In this paper, we propose a novel functional model-X knockoffs selection framework tailored to sparse high-dimensional functional models, and show that our proposal achieves effective FDR control for any sample size. Furthermore, we illustrate the proposed functional model-X knockoffs selection procedure, along with the associated theoretical guarantees for both FDR control and asymptotic power, using examples of commonly adopted functional linear additive regression models and the functional graphical model. In the construction of the functional knockoffs, we integrate essential components including the correlation operator matrix, the Karhunen-Lo\`eve expansion, and semidefinite programming, and develop executable algorithms. We demonstrate the superiority of our proposed methods over the competitors through both extensive simulations and the analysis of two brain imaging datasets.
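As a hedged sketch of one ingredient only, the truncated Karhunen-Lo\`eve expansion of a discretized functional covariate can be computed from the sample covariance on a grid; the correlation operator matrix and the semidefinite program of the full knockoff construction are omitted, and the data below are illustrative.

# Truncated Karhunen-Loeve expansion for one functional covariate
# observed on a grid; one component of the knockoff construction.
import numpy as np

rng = np.random.default_rng(2)
n, T, K = 200, 50, 5                 # samples, grid points, truncation level
X = np.cumsum(rng.standard_normal((n, T)), axis=1) / np.sqrt(T)  # toy paths

Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / n                    # sample covariance on the grid
evals, evecs = np.linalg.eigh(C)     # ascending eigenvalues
phi = evecs[:, ::-1][:, :K]          # leading eigenfunctions (discretized)
scores = Xc @ phi                    # KL scores <X_i - mean, phi_k>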
This paper develops an arbitrary-positioned buffer for the smoothed particle hydrodynamics (SPH) method, whose generality and high efficiency are achieved through two techniques. First, with a local coordinate system established at each arbitrary-positioned in-/outlet, particle positions in the global coordinate system are transformed into the local frame via coordinate transformation. Since one local axis is perpendicular to the in-/outlet boundary, the position comparison between particles and the threshold line or surface reduces to this single coordinate dimension. Second, the candidate particles subject to the position comparison at one specific in-/outlet are restricted to those within the local cell-linked lists near the defined buffer zone, which significantly enhances computational efficiency since only a small fraction of particles is checked. With this buffer, particle generation and deletion at arbitrary-positioned in- and outlets of complex flows can be handled efficiently and straightforwardly. We validate the effectiveness and versatility of the buffer through 2-D and 3-D non-/orthogonal uni-/bidirectional flows with arbitrary-positioned in- and outlets, driven by either pressure or velocity boundary conditions.
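A minimal Python sketch of the first technique, assuming a hypothetical 2-D inlet position and orientation; the cell-linked-list restriction of the second technique is omitted.

# Sketch of the local-frame position check: transform particle positions
# into the in-/outlet's local frame and compare a single coordinate.
import numpy as np

origin = np.array([1.0, 2.0])                  # hypothetical inlet location
normal = np.array([np.cos(0.3), np.sin(0.3)])  # unit normal to the inlet
tangent = np.array([-normal[1], normal[0]])
R = np.stack([normal, tangent])                # rows: local axes

pos = np.random.default_rng(3).uniform(0, 5, size=(1000, 2))
local = (pos - origin) @ R.T                   # global -> local coordinates
in_buffer = local[:, 0] < 0.1                  # single-axis threshold test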
The stochastic block model is a canonical model of communities in random graphs. It was introduced in the social sciences and statistics as a model of communities, and in theoretical computer science as an average-case model for graph partitioning problems under the name of the ``planted partition model.'' Given a sparse stochastic block model, the two standard inference tasks are: (i) weak recovery: can we estimate the communities with non-trivial overlap with the true communities? (ii) detection/hypothesis testing: can we distinguish whether the sample was drawn from the block model or from a random graph with no community structure, with probability tending to $1$ as the graph size tends to infinity? In this work, we show that for sparse stochastic block models the two inference tasks are equivalent except at a critical point: weak recovery is information-theoretically possible if and only if detection is possible. We thus find a strong connection between these two notions of inference for the model. We further prove that when detection is impossible, an explicit hypothesis test based on low degree polynomials in the adjacency matrix of the observed graph achieves the optimal statistical power. This low degree test is efficient, in contrast to the likelihood ratio test, which is not known to be efficient. Moreover, we prove that the asymptotic mutual information between the observed network and the community structure exhibits a phase transition at the weak recovery threshold. Our results are proven in much broader settings, including hypergraph stochastic block models and general planted factor graphs; in these settings we prove that the impossibility of weak recovery implies contiguity, and we provide a condition which guarantees the equivalence of weak recovery and detection.
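As a concrete, hedged example of a low degree polynomial statistic (a classical one for block model detection, not necessarily the exact test used in this work), the signed triangle count is a degree-3 polynomial in the adjacency matrix:

# Signed triangle count: sum over i<j<k of (A_ij - p)(A_jk - p)(A_ik - p),
# where p is the average edge probability under the null.
import numpy as np

def signed_triangles(A, p):
    B = A - p
    np.fill_diagonal(B, 0.0)        # closed 3-walks then visit distinct nodes
    return np.trace(B @ B @ B) / 6  # each unordered triangle counted 6 times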
We propose a framework to perform Bayesian inference using conditional score-based diffusion models to solve a class of inverse problems in mechanics involving the inference of a specimen's spatially varying material properties from noisy measurements of its mechanical response to loading. Conditional score-based diffusion models are generative models that learn to approximate the score function of a conditional distribution using samples from the joint distribution. More specifically, the score functions corresponding to multiple realizations of the measurement are approximated using a single neural network, the so-called score network, which is subsequently used to sample the posterior distribution using an appropriate Markov chain Monte Carlo scheme based on Langevin dynamics. Training the score network only requires simulating the forward model. Hence, the proposed approach can accommodate black-box forward models and complex measurement noise. Moreover, once the score network has been trained, it can be re-used to solve the inverse problem for different realizations of the measurements. We demonstrate the efficacy of the proposed approach on a suite of high-dimensional inverse problems in mechanics that involve inferring heterogeneous material properties from noisy measurements. Some examples we consider involve synthetic data, while others include data collected from actual elastography experiments. Further, our applications demonstrate that the proposed approach can handle different measurement modalities, complex patterns in the inferred quantities, non-Gaussian and non-additive noise models, and nonlinear black-box forward models. The results show that the proposed framework can solve large-scale physics-based inverse problems efficiently.
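A minimal sketch of the sampling step, assuming a trained conditional score network score_net(x, y) approximating $\nabla_x \log p(x\,|\,y)$; the paper's actual scheme is a more elaborate Langevin-based MCMC.

# Unadjusted Langevin dynamics driven by a learned conditional score;
# score_net is assumed given (trained on samples from the joint distribution).
import numpy as np

def langevin_sample(score_net, y, x0, step=1e-4, n_steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(n_steps):
        # drift along the score plus Gaussian noise of matched scale
        x += step * score_net(x, y) + np.sqrt(2 * step) * rng.standard_normal(x.shape)
    return x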
Popular artificial neural networks (ANNs) optimize parameters for unidirectional value propagation, assuming some arbitrary parametrization type such as the Multi-Layer Perceptron (MLP) or Kolmogorov-Arnold Network (KAN). In contrast, for biological neurons "it is not uncommon for axonal propagation of action potentials to happen in both directions"~\cite{axon}, suggesting that they are optimized to continuously operate in a multidirectional way. Additionally, the statistical dependencies a single neuron could model are not just (expected) value dependence, but entire joint distributions, including higher moments. Such a more agnostic joint-distribution neuron would allow for multidirectional propagation (of distributions or values), e.g. $\rho(x|y,z)$ or $\rho(y,z|x)$, by substituting into $\rho(x,y,z)$ and normalizing. We discuss Hierarchical Correlation Reconstruction (HCR) for such a neuron model: assuming a $\rho(x,y,z)=\sum_{ijk} a_{ijk} f_i(x) f_j(y) f_k(z)$ type parametrization of the joint distribution in a polynomial basis $f_i$, which allows for flexible, inexpensive processing including nonlinearities, direct model estimation and updates, and training through standard backpropagation or through approaches novel for such a structure, up to tensor decomposition or the information bottleneck approach. Using only pairwise (input-output) dependencies, its expected value prediction becomes KAN-like, with trained activation functions as polynomials; it can be extended by adding higher-order dependencies through the included products, in a conscious, interpretable way, allowing for multidirectional propagation of both values and probability densities.
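A minimal Python sketch of HCR coefficient estimation under the stated parametrization, assuming variables normalized to $[0,1]$ and the first three orthonormal polynomial basis functions; in an orthonormal basis each $a_{ijk}$ is estimated as an empirical mean of basis products (the function name hcr_coeffs is illustrative).

# HCR sketch: rho(x,y,z) = sum_ijk a_ijk f_i(x) f_j(y) f_k(z) with an
# orthonormal polynomial basis on [0,1] (first three functions shown).
import numpy as np

f = [lambda u: np.ones_like(u),
     lambda u: np.sqrt(3.0) * (2*u - 1),
     lambda u: np.sqrt(5.0) * (6*u**2 - 6*u + 1)]

def hcr_coeffs(x, y, z, m=3):
    # a_ijk is simply the empirical mean of f_i(x) f_j(y) f_k(z)
    a = np.empty((m, m, m))
    for i in range(m):
        for j in range(m):
            for k in range(m):
                a[i, j, k] = np.mean(f[i](x) * f[j](y) * f[k](z))
    return a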
In this work, we develop a new hydrostatic reconstruction procedure to construct well-balanced schemes for one and multilayer shallow water flows, including wetting and drying. Initially, we derive the method for a path-conservative finite volume scheme and combine it with entropy conservative fluxes and suitable numerical dissipation to preserve an entropy inequality in the semi-discrete case. We then combine the novel hydrostatic reconstruction with a collocated nodal split-form discontinuous Galerkin spectral element method, extending the method to high-order and curvilinear meshes. The high-order method incorporates an additional positivity-limiter and is blended with a compatible subcell finite volume method to maintain well-balancedness at wet/dry fronts. We prove entropy stability, well-balancedness, and positivity-preservation for both methods. Numerical results for the high-order method validate the theoretical findings and demonstrate the robustness of the scheme.
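For context, a sketch of the classical hydrostatic reconstruction of Audusse et al. (2004) at a cell interface; the reconstruction developed in this work is a new variant whose details differ.

# Classical hydrostatic reconstruction at an interface with bottom
# topographies bL, bR and water depths hL, hR; the max(0, .) clipping
# handles dry states. This is the standard scheme, not the paper's new one.
def hydrostatic_reconstruction(hL, hR, bL, bR):
    b_star = max(bL, bR)
    hL_star = max(0.0, hL + bL - b_star)   # reconstructed left depth
    hR_star = max(0.0, hR + bR - b_star)   # reconstructed right depth
    return hL_star, hR_star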
We explore a simple approach to quantum logic based on hybrid and dynamic modal logic, where the set of states is given by some Hilbert space. In this setting, a notion of quantum clause is proposed in a way similar to how the notion of Horn clause is advanced in first-order logic, that is, to capture logical properties for use in logic programming and formal specification. We propose proof rules for reasoning about quantum clauses and investigate the soundness and compactness properties of the corresponding proof calculus. We then prove a Birkhoff completeness result for the fragment of hybrid-dynamic quantum logic determined by quantum clauses.
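For comparison, a first-order Horn clause has the form $\forall \bar{x}\,(\gamma_1 \wedge \cdots \wedge \gamma_n \Rightarrow \gamma)$ for atomic formulas $\gamma_1,\dots,\gamma_n,\gamma$; the quantum clauses proposed here play the analogous role for states of a Hilbert space.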
We present a method to generate contingency tables that follow loglinear models with prescribed marginal probabilities and dependence structures. We make use of (loglinear) Poisson regression, where the dependence structures, described using odds ratios, are implemented through an offset term. We apply this methodology to carry out simulation studies in the context of population size estimation using dual system and triple system estimators, which are popular in official statistics. These estimators use contingency tables that summarise the counts of elements enumerated or captured within lists that are linked. The simulation is used to investigate these estimators both when the model assumptions are fulfilled and when they are violated.
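The offset-based table generation is the paper's contribution and is not reproduced here; for context, a minimal sketch of the dual system (Petersen) estimator that the simulations target, computed from the observed cells of a $2\times 2$ capture table:

# Dual system (Petersen) estimator from a 2x2 capture table:
# n11 captured in both lists, n10 only in list 1, n01 only in list 2;
# the cell n00 (captured in neither) is unobserved.
def petersen(n11, n10, n01):
    n1 = n11 + n10          # total captured by list 1
    n2 = n11 + n01          # total captured by list 2
    return n1 * n2 / n11    # estimated total population size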
Efficient and accurate numerical evaluation of the variable-order fractional Laplacian in multiple dimensions is challenging due to the singular nature of the underlying integral. We propose a simple and easy-to-implement finite difference scheme for the multi-dimensional variable-order fractional Laplacian defined by a hypersingular integral. We prove that the scheme is of second-order convergence and apply it to solve various equations involving the variable-order fractional Laplacian. We also present a fast solver with quasi-linear complexity for the scheme, for computing the variable-order fractional Laplacian and the corresponding PDEs. Several numerical examples demonstrate the accuracy and efficiency of our algorithm and verify our theory.
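For reference, the variable-order fractional Laplacian is commonly defined through the hypersingular integral (the normalization used in the paper may differ):
\[
(-\Delta)^{\alpha(\mathbf{x})} u(\mathbf{x}) \;=\; C_{d,\alpha(\mathbf{x})}\;\mathrm{p.v.}\!\int_{\mathbb{R}^d} \frac{u(\mathbf{x})-u(\mathbf{y})}{|\mathbf{x}-\mathbf{y}|^{\,d+2\alpha(\mathbf{x})}}\,\mathrm{d}\mathbf{y},
\qquad
C_{d,\alpha}=\frac{2^{2\alpha}\,\alpha\,\Gamma\!\left(\alpha+\tfrac{d}{2}\right)}{\pi^{d/2}\,\Gamma(1-\alpha)}.
\]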