Histological examination is a crucial step in an autopsy; however, the traditional histochemical staining of post-mortem samples faces multiple challenges, including the inferior staining quality due to autolysis caused by delayed fixation of cadaver tissue, as well as the resource-intensive nature of chemical staining procedures covering large tissue areas, which demand substantial labor, cost, and time. These challenges can become more pronounced during global health crises when the availability of histopathology services is limited, resulting in further delays in tissue fixation and more severe staining artifacts. Here, we report the first demonstration of virtual staining of autopsy tissue and show that a trained neural network can rapidly transform autofluorescence images of label-free autopsy tissue sections into brightfield equivalent images that match hematoxylin and eosin (H&E) stained versions of the same samples, eliminating autolysis-induced severe staining artifacts inherent in traditional histochemical staining of autopsied tissue. Our virtual H&E model was trained using >0.7 TB of image data and a data-efficient collaboration scheme that integrates the virtual staining network with an image registration network. The trained model effectively accentuated nuclear, cytoplasmic and extracellular features in new autopsy tissue samples that experienced severe autolysis, such as COVID-19 samples never seen before, where the traditional histochemical staining failed to provide consistent staining quality. This virtual autopsy staining technique can also be extended to necrotic tissue, and can rapidly and cost-effectively generate artifact-free H&E stains despite severe autolysis and cell death, also reducing labor, cost and infrastructure requirements associated with the standard histochemical staining.
Probability measures on the sphere form an important class of statistical models and are used, for example, in modeling directional data or shapes. Because of their widespread use, and because such sampling also serves as an algorithmic building block, efficient sampling of distributions on the sphere is highly desirable. We propose a shrinkage-based and an idealized geodesic slice sampling Markov chain, designed to generate approximate samples from distributions on the sphere. In particular, the shrinkage-based algorithm works in any dimension, is straightforward to implement and has no tuning parameters. We verify reversibility and show that under weak regularity conditions geodesic slice sampling is uniformly ergodic. Numerical experiments show that the proposed slice samplers achieve excellent mixing on challenging targets including the Bingham distribution and mixtures of von Mises-Fisher distributions. In these settings our approach outperforms standard samplers such as random-walk Metropolis-Hastings and Hamiltonian Monte Carlo.
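As an illustration of the shrinkage idea, the following is a minimal sketch of one geodesic slice sampling step, not the authors' reference implementation. It assumes the usual construction: draw a slice level below the current density value, pick a uniformly random great circle through the current point, and shrink an angular bracket around the current point until a proposal lands inside the slice. The function names and the von Mises-Fisher parameters (`mu`, `kappa`) are hypothetical choices made for this example.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def random_unit_tangent(x, rng):
    """Draw a uniformly distributed unit vector orthogonal to x."""
    g = [rng.gauss(0.0, 1.0) for _ in x]
    dot = sum(gi * xi for gi, xi in zip(g, x))
    return normalize([gi - dot * xi for gi, xi in zip(g, x)])


def geodesic_slice_step(x, log_density, rng):
    """One transition: slice level, random great circle, shrinkage on the angle."""
    # Slice level in log space: log t = log p(x) + log u with u in (0, 1].
    log_t = log_density(x) + math.log(1.0 - rng.random())
    v = random_unit_tangent(x, rng)
    # Initial bracket of length 2*pi that contains the angle 0 (the current point).
    theta = rng.uniform(0.0, 2.0 * math.pi)
    theta_min, theta_max = theta - 2.0 * math.pi, theta
    while True:
        # Point on the great circle through x with direction v.
        y = normalize([math.cos(theta) * xi + math.sin(theta) * vi
                       for xi, vi in zip(x, v)])
        if log_density(y) >= log_t:
            return y
        # Rejected: shrink the bracket towards 0, keeping 0 inside it.
        if theta < 0.0:
            theta_min = theta
        else:
            theta_max = theta
        theta = rng.uniform(theta_min, theta_max)


def sample(x0, log_density, n_steps, seed=0):
    """Run the chain for n_steps and return the visited points."""
    rng = random.Random(seed)
    x, chain = normalize(x0), []
    for _ in range(n_steps):
        x = geodesic_slice_step(x, log_density, rng)
        chain.append(x)
    return chain


def vmf_log_density(x, mu=(0.0, 0.0, 1.0), kappa=10.0):
    """Unnormalized log density of a von Mises-Fisher target (example values)."""
    return kappa * sum(m * c for m, c in zip(mu, x))
```

Calling `sample([1.0, 0.0, 0.0], vmf_log_density, 2000)` returns a list of unit vectors that concentrate around `mu`. The step needs only pointwise evaluations of an unnormalized log density and has no step size to tune, which is the property the abstract highlights.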
The autologistic actor attribute model, or ALAAM, is the social influence counterpart of the better-known exponential-family random graph model (ERGM) for social selection. Extensive experience with ERGMs has shown that the problem of near-degeneracy, which often occurs with simple models, can be overcome by using "geometrically weighted" or "alternating" statistics. In the much more limited empirical applications of ALAAMs to date, the problem of near-degeneracy, although theoretically expected, appears to have been less of an issue. In this work I present a comprehensive survey of ALAAM applications, showing that this model has to date only been used with relatively small networks, in which near-degeneracy does not appear to be a problem. I show that near-degeneracy does occur in simple ALAAM models of larger empirical networks, define some geometrically weighted ALAAM statistics analogous to those for ERGM, and demonstrate that models with these statistics do not suffer from near-degeneracy and hence can be estimated where they could not be with the simple statistics.
A rigidity circuit (in 2D) is a minimal dependent set in the rigidity matroid, i.e. a minimal graph supporting a non-trivial stress in any generic placement of its vertices in $\mathbb R^2$. Any rigidity circuit on $n\geq 5$ vertices can be obtained from rigidity circuits on fewer vertices by applying the combinatorial resultant (CR) operation. The inverse operation is called a combinatorial resultant decomposition (CR-decomp). Any rigidity circuit on $n\geq 5$ vertices can be successively decomposed into smaller circuits, until the complete graphs $K_4$ are reached. This sequence of CR-decomps has the structure of a rooted binary tree called the combinatorial resultant tree (CR-tree). A CR-tree encodes an elimination strategy for computing circuit polynomials via Sylvester resultants. Different CR-trees lead to elimination strategies that can vary greatly in time and memory consumption. It is an open problem to establish criteria for optimal CR-trees, or at least to characterize those CR-trees that lead to good elimination strategies. In [12] we presented an algorithm for enumerating CR-trees, including algorithms for decomposing 3-connected rigidity circuits in polynomial time. In this paper we focus on those circuits that are not 3-connected, which we simply call 2-connected. In order to enumerate CR-decomps of a 2-connected circuit $G$, a brute-force exponential-time search has to be performed among the subgraphs induced by the subsets of $V(G)$. This exponential-time bottleneck is not present in the 3-connected case. In this paper we argue that we do not have to account for all possible CR-decomps of 2-connected rigidity circuits to find a good elimination strategy; we only have to account for those CR-decomps that are 2-splits, all of which can be enumerated in polynomial time. We present algorithms and computational evidence in support of this heuristic.
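The CR operation can be illustrated on the smallest case. The sketch below assumes the usual definition, which the abstract does not spell out: the combinatorial resultant of two graphs with a common edge $e$ is the union of the two graphs with $e$ removed. The helper names are hypothetical, and the edge-count test is only a necessary condition for being a rigidity circuit ($2n-2$ edges on $n$ vertices), not a full circuit check.

```python
from itertools import combinations


def complete_graph(vertices):
    """Edge set of the complete graph on the given vertices."""
    return {frozenset(e) for e in combinations(vertices, 2)}


def vertices_of(edges):
    """Vertex set spanned by an edge set."""
    return set().union(*edges)


def combinatorial_resultant(edges_a, edges_b, e):
    """Union of the two edge sets with the common edge e eliminated."""
    e = frozenset(e)
    if e not in edges_a or e not in edges_b:
        raise ValueError("the eliminated edge must belong to both graphs")
    return (edges_a | edges_b) - {e}


def has_circuit_edge_count(edges):
    """Necessary condition only: a rigidity circuit on n vertices has 2n - 2 edges."""
    return len(edges) == 2 * len(vertices_of(edges)) - 2


# Two copies of K_4 sharing the triangle {1, 2, 3}, eliminating the edge {1, 2}.
wheel = combinatorial_resultant(
    complete_graph([1, 2, 3, 4]), complete_graph([1, 2, 3, 5]), (1, 2)
)
```

Here `wheel` has 5 vertices and 8 edges: vertex 3 is adjacent to all others and the remaining four vertices form the cycle 1-4-2-5, i.e. the wheel on four spokes. Reading the construction backwards gives a CR-decomp of that wheel into two $K_4$ graphs, the leaves of its CR-tree.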
We prove axiomatic characterizations of several important multiwinner rules within the class of approval-based committee choice rules. These are voting rules that return a set of (fixed-size) committees. In particular, we provide axiomatic characterizations of Proportional Approval Voting, the Chamberlin--Courant rule, and other Thiele methods. These rules share the important property that they satisfy an axiom called consistency, which is crucial in our characterizations.
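For readers unfamiliar with Thiele methods, the following brute-force sketch shows how such rules select committees. It assumes the standard definition: a voter who approves $j$ members of a committee contributes $w(1)+\dots+w(j)$ to its score, with $w(j)=1/j$ for Proportional Approval Voting and $w=(1,0,0,\dots)$ for the Chamberlin--Courant rule. The function names and the example profile are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def pav_weight(j):
    """Marginal weight of the j-th approved committee member under PAV."""
    return Fraction(1, j)


def cc_weight(j):
    """Chamberlin-Courant: only the first approved member counts."""
    return Fraction(1) if j == 1 else Fraction(0)


def thiele_score(committee, ballots, weight):
    """Sum over voters of w(1) + ... + w(j), j = approved committee members."""
    total = Fraction(0)
    for approved in ballots:
        hits = len(committee & approved)
        total += sum((weight(j) for j in range(1, hits + 1)), Fraction(0))
    return total


def thiele_winners(candidates, ballots, k, weight):
    """Return the set of all size-k committees with maximal Thiele score."""
    scored = {
        frozenset(c): thiele_score(frozenset(c), ballots, weight)
        for c in combinations(sorted(candidates), k)
    }
    best = max(scored.values())
    return {c for c, s in scored.items() if s == best}


# Example profile: seven voters approve {a, b, c}, two voters approve {d}.
ballots = [frozenset("abc")] * 7 + [frozenset("d")] * 2
pav = thiele_winners("abcd", ballots, 3, pav_weight)
cc = thiele_winners("abcd", ballots, 3, cc_weight)
```

On this profile `pav` contains only the committee {a, b, c} (score 77/6 against 25/2 for committees that include d), whereas `cc` contains the three committees that include d, each covering all nine voters. Returning the full set of tied committees mirrors the set-valued rules considered in the text.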
Neuromorphic computing is one of the few current approaches that have the potential to significantly reduce power consumption in Machine Learning and Artificial Intelligence. Imam & Cleland presented an odour-learning algorithm that runs on a neuromorphic architecture and is inspired by circuits described in the mammalian olfactory bulb. They assess the algorithm's performance in "rapid online learning and identification" of gaseous odorants and odorless gases (for short, "gases") using a set of gas sensor recordings of different odour presentations and corrupting them with impulse noise. We replicated parts of the study and discovered limitations that affect some of the conclusions drawn. First, the dataset used suffers from sensor drift and a non-randomised measurement protocol, rendering it of limited use for odour identification benchmarks. Second, we found that the model is restricted in its ability to generalise over repeated presentations of the same gas. We demonstrate that the task the study refers to can be solved with a simple hash table approach, matching or exceeding the reported results in accuracy and runtime. Therefore, a validation of the model that goes beyond restoring a learned data sample remains to be shown, in particular its suitability for odour identification tasks.
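To make the hash table argument concrete, here is one way such a baseline could look. This is a hypothetical sketch, not the procedure used in the study: it assumes that impulse noise replaces some sensor readings with arbitrary values while leaving the others exactly as learned, so every untouched (sensor index, value) pair can be looked up and can vote for its label. The labels and readings are made up.

```python
from collections import Counter, defaultdict


def learn(samples):
    """Map every (sensor index, reading) pair to the labels it was seen with."""
    table = defaultdict(set)
    for label, readings in samples.items():
        for index, value in enumerate(readings):
            table[(index, value)].add(label)
    return table


def identify(table, readings):
    """Return the label with the most exactly matching readings, or None."""
    votes = Counter()
    for index, value in enumerate(readings):
        for label in table.get((index, value), ()):
            votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical learned samples, one per gas, with six sensor readings each.
learned = {
    "gas_A": [3, 7, 1, 9, 4, 12],
    "gas_B": [5, 2, 8, 9, 0, 6],
}
table = learn(learned)

# gas_A with four of six readings overwritten by impulse noise.
noisy = [3, 15, 1, 14, 13, 11]
```

`identify(table, noisy)` returns `"gas_A"`, because the two uncorrupted readings still match the stored sample exactly. The lookup succeeds only when the clean part of the input repeats a learned sample verbatim, which is the sense in which the task amounts to restoring a learned data sample rather than generalising across presentations.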
We make two contributions to the Isolation Forest method for anomaly and outlier detection. The first contribution is an information-theoretically motivated generalisation of the score function that is used to aggregate the scores across random tree estimators. This generalisation allows one to take into account not just the ensemble average across trees but the whole distribution. The second contribution is an alternative scoring function at the level of the individual tree estimator, in which we replace the depth-based scoring of the Isolation Forest with one based on hyper-volumes associated with an isolation tree's leaf nodes. We motivate the use of both of these methods on generated data and also evaluate them on 34 datasets from the recent and exhaustive "ADBench" benchmark, finding significant improvement over the standard isolation forest for both variants on some datasets and improvement on average across all datasets for one of the two variants. The code to reproduce our results is made available as part of the submission.
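For reference, the baseline that both contributions modify can be written in a few lines. The sketch below shows only the standard Isolation Forest score, $s = 2^{-\bar h / c(n)}$, where $\bar h$ is the mean isolation depth across trees and $c(n) = 2H(n-1) - 2(n-1)/n$ is the average path length for a subsample of size $n$. It does not implement the generalised aggregation or the hyper-volume score proposed here, and the function names are hypothetical.

```python
def harmonic(i):
    """Exact harmonic number H(i) = 1 + 1/2 + ... + 1/i."""
    return sum(1.0 / j for j in range(1, i + 1))


def average_path_length(n):
    """Normalising constant c(n) of the standard Isolation Forest score."""
    if n < 2:
        return 0.0
    return 2.0 * harmonic(n - 1) - 2.0 * (n - 1) / n


def anomaly_score(depths, n):
    """Standard score: depends on the per-tree depths only through their mean."""
    if n < 2:
        raise ValueError("subsample size n must be at least 2")
    mean_depth = sum(depths) / len(depths)
    return 2.0 ** (-mean_depth / average_path_length(n))
```

For example, `anomaly_score([2, 3, 2], 256)` is larger than `anomaly_score([10, 12, 11], 256)`, since points isolated at shallow depth are scored as more anomalous. Because only the mean of `depths` enters the formula, two points with very different depth distributions but equal means receive the same score, which is the limitation the first contribution targets.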
Bursting cells lead to ambient RNA that contaminates sequencing data. This process is especially problematic in perturbation experiments where transcription factors are implanted into cells to determine their effects. The presence of contaminants makes it difficult to determine whether a factor is truly expressed in the cell. This paper studies the properties of contaminant noise from an analytical perspective, showing that the cell bursting process constrains the form of the noise distribution across factors. These constraints can be leveraged to improve decontamination by removing counts that are more likely the result of noise than expression. In two biological replicates of a perturbation experiment, run across two sequencing protocols, decontaminated counts agree with bulk genomic measurements of the transduction rate and are automatically corrected for differences in sequencing.
Inference principles are postulated within statistics; they are not usually derived from any underlying physical constraints on real-world observers. An exception to this rule is that in the context of partially observable information engines decision making can be based solely on physical arguments. An inference principle can be derived from minimization of the lower bound on average dissipation [Phys. Rev. Lett., 124(5), 050601], which is achievable with a quasi-static process. Thermodynamically rational decision strategies can be computed algorithmically with the resulting approach. Here, we use this to study an example of binary decision making under uncertainty that is very simple, yet just interesting enough to be non-trivial: observations are either entirely uninformative, or they carry complete certainty about the variable that needs to be known for successful energy harvesting. Solutions found algorithmically can be expressed in terms of parameterized soft partitions of the observable space. This allows for their interpretation, as well as for the analytical calculation of all quantities that characterize the decision problem and the thermodynamically rational strategies.
We consider the problem of attaining either the maximal increase or the maximal reduction of the robustness of a complex network by means of a bounded modification of a subset of the edge weights. We propose two novel strategies combining Krylov subspace approximations with a greedy scheme and an interior point method employing either the Hessian or its approximation computed via the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm (L-BFGS). The paper discusses the computational and modeling aspects of our methodology and illustrates the various optimization problems on networks that can be addressed within the proposed framework. Finally, in the numerical experiments we compare the performance of our algorithms with state-of-the-art techniques on synthetic and real-world networks.
We propose an approach to compute inner- and outer-approximations of the sets of values satisfying constraints expressed as arbitrarily quantified formulas. Such formulas arise, for instance, when specifying important problems in control such as robustness, motion planning or controller comparison. We propose an interval-based method which allows for tractable but tight approximations. We demonstrate its applicability through a series of examples and benchmarks using a prototype implementation.
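The notion of inner- and outer-approximation can be illustrated on a one-dimensional toy problem. The sketch below is not the method proposed here: it substitutes a plain bisection paving with naive interval arithmetic, and the constraint, box bounds and function names are all assumptions made for the example. It approximates the set of $p \in [0,2]$ such that $p\,x - 1 \le 0$ for all $x \in [0,1]$, which is exactly $[0,1]$. Floating-point rounding is not directed outward, so the result is illustrative rather than rigorous.

```python
def interval_mul(a, b):
    """Product of two intervals given as (lo, hi) pairs."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def g(p, x):
    """Interval extension of the example constraint function g(p, x) = p*x - 1."""
    lo, hi = interval_mul(p, x)
    return (lo - 1.0, hi - 1.0)


def pave(p_box, x_box, eps=1e-3):
    """Classify sub-boxes of p_box for the formula: forall x in x_box, g(p, x) <= 0."""
    inner, boundary = [], []
    stack = [p_box]
    while stack:
        p = stack.pop()
        if g(p, x_box)[1] <= 0.0:
            # The upper bound over the whole box is non-positive: every p is feasible.
            inner.append(p)
        elif any(g(p, (x, x))[0] > 0.0 for x in x_box):
            # An endpoint of x_box is a witness violating the constraint for every p.
            continue
        elif p[1] - p[0] < eps:
            # Too small to split further: undecided, kept in the outer set only.
            boundary.append(p)
        else:
            mid = 0.5 * (p[0] + p[1])
            stack.extend([(p[0], mid), (mid, p[1])])
    return inner, boundary


inner, boundary = pave((0.0, 2.0), (0.0, 1.0))
outer = inner + boundary
```

The union of `inner` is contained in the true solution set and the union of `outer` contains it; the two differ only by undecided boxes of width below `eps` around $p = 1$. Using only the endpoints of `x_box` as witnesses works for this example because the constraint is monotone in $x$; a general quantified formula needs a more careful treatment of each quantifier.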