In toxicology research, experiments are often conducted to determine the effect of toxicant exposure on the behavior of mice, where mice are randomized to receive the toxicant or not. In particular, in fixed-interval experiments, one provides a mouse with reinforcers (e.g., a food pellet) contingent upon some action taken by the mouse (e.g., a press of a lever), but the reinforcers are only provided after fixed time intervals. Often, to analyze fixed-interval experiments, one specifies and then estimates the conditional state-action distribution (e.g., using an ANOVA). This existing approach, which in the reinforcement learning framework would be called modeling the mouse's "behavioral policy," is sensitive to misspecification. It is likely that any model for the behavioral policy is misspecified; a mapping from a mouse's exposure to its actions can be highly complex. In this work, we avoid specifying the behavioral policy by instead learning the mouse's reward function. Specifying a reward function is as challenging as specifying a behavioral policy, but we propose a novel approach that incorporates knowledge of the optimal behavior, which is often known to the experimenter, to avoid specifying the reward function itself. In particular, we define the reward as a divergence of the mouse's actions from optimality, where the representations of the action and of optimality can be arbitrarily complex. The parameters of the reward function then serve as a measure of the mouse's tolerance for divergence from optimality, which is a novel summary of the impact of the exposure. The parameter itself is a scalar, and the proposed objective function is differentiable, allowing us to benefit from standard results on the consistency of parametric estimators while making very few assumptions.
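As a rough illustration of this estimation idea (not the paper's exact estimator), the sketch below fits a scalar tolerance parameter by maximum likelihood under an assumed Boltzmann-style behavioral model pi(a) ∝ exp(-d(a, a*)/tau); the quadratic divergence, the discretized action space, and the simulated presses are all illustrative assumptions.

```python
# A minimal sketch, assuming a Boltzmann-style model pi(a) ∝ exp(-d(a, a*)/tau):
# the scalar tau plays the role of "tolerance for divergence from optimality".
# The quadratic divergence and simulated data are illustrative only.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import logsumexp

actions = np.linspace(0.0, 60.0, 121)   # e.g., discretized lever-press times (s)
a_opt = 30.0                            # optimal behavior, known to the experimenter
d = (actions - a_opt) ** 2              # divergence of each action from optimality

rng = np.random.default_rng(0)
p_true = np.exp(-d / 50.0); p_true /= p_true.sum()
observed = rng.choice(len(actions), size=200, p=p_true)   # indices of observed actions

def neg_log_lik(tau):
    log_p = -d / tau - logsumexp(-d / tau)   # log pi(a) under tolerance tau
    return -log_p[observed].sum()

fit = minimize_scalar(neg_log_lik, bounds=(1e-2, 1e3), method="bounded")
print(f"estimated tolerance tau = {fit.x:.1f}")   # ≈ 50 on this simulated data
```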
Closed-loop neuroscience experimentation, where recorded neural activity is used to modify the experiment on the fly, is critical for deducing causal connections and optimizing experimental time. A critical step in creating a closed-loop experiment is real-time inference of neural activity from streaming recordings. One challenging modality for real-time processing is multi-photon calcium imaging (CI). CI enables the recording of activity in large populations of neurons; however, it often requires batch processing of the video data to extract single-neuron activity from the fluorescence videos. We use the recently proposed robust time-trace estimator, the Sparse Emulation of Unused Dictionary Objects (SEUDO) algorithm, as the basis for a new online processing algorithm that simultaneously identifies neurons in the fluorescence video and infers their time traces in a way that is robust to as-yet unidentified neurons. To achieve real-time SEUDO (realSEUDO), we optimize the core estimator via both algorithmic improvements and a fast C-based implementation, and we create a new cell-finding loop that enables realSEUDO to identify new cells as well. We demonstrate performance comparable to offline algorithms (e.g., CNMF) and improved performance over the current online approach (OnACID), at speeds of 120 Hz on average.
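The core SEUDO estimator can be loosely pictured as a sparse regression of each frame onto known cell profiles plus an interference dictionary. The sketch below is a simplified stand-in, not the realSEUDO implementation: the random "blob" dictionary and the uniform nonnegative lasso penalty are assumptions for illustration.

```python
# Simplified SEUDO-style demixing: explain one fluorescence frame as a
# nonnegative combination of known cell profiles plus a sparse dictionary
# that absorbs fluorescence from as-yet unidentified cells.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n_pixels, n_cells, n_blobs = 400, 5, 50
profiles = rng.random((n_pixels, n_cells))    # known cell shapes (flattened)
blobs = rng.random((n_pixels, n_blobs))       # interference dictionary (toy)
true_traces = rng.random(n_cells)
frame = profiles @ true_traces + 0.1 * rng.standard_normal(n_pixels)

design = np.hstack([profiles, blobs])
model = Lasso(alpha=0.01, positive=True, max_iter=50_000).fit(design, frame)
traces = model.coef_[:n_cells]                # per-frame activity estimates
print(np.round(true_traces, 2), np.round(traces, 2))
```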
Summary: Medical researchers obtain knowledge about the prevention and treatment of disability and disease using physical measurements and image data. To assist in this endeavor, feature extraction packages are available that are designed to collect quantitative data from image structure. In this study, we aim to augment existing tools by extending the available set of shape-based features. The significance of shape-based features has been explored extensively in research for several decades, yet there is no single package in which all shape-related features can be extracted easily by the researcher. PyCellMech has been crafted to address this gap. The PyCellMech package extracts three classes of shape features, which are classified as one-dimensional, geometric, and polygonal. Future iterations will be expanded to include other feature classes, such as scale-space features. Availability and implementation: PyCellMech is freely available at https://github.com/icm-dac/pycellmech.
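For readers unfamiliar with shape-based features, the snippet below computes a few classic geometric descriptors with scikit-image; it is purely illustrative and does not use the PyCellMech API.

```python
# Illustrative only (not the PyCellMech API): a few classic geometric
# shape features computed from a binary mask with scikit-image.
import numpy as np
from skimage.draw import ellipse
from skimage.measure import label, regionprops

mask = np.zeros((100, 100), dtype=np.uint8)
rr, cc = ellipse(50, 50, 20, 35)      # a synthetic elliptical "cell"
mask[rr, cc] = 1

for region in regionprops(label(mask)):
    circularity = 4 * np.pi * region.area / region.perimeter ** 2
    print(f"area={region.area}, perimeter={region.perimeter:.1f}, "
          f"eccentricity={region.eccentricity:.2f}, circularity={circularity:.2f}")
```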
This study introduces and assesses several novel communication strategies for Federated Learning (FL) applied to remote sensing image classification. Our exploration includes feature-centric communication, pseudo-weight amalgamation, and a combined method utilizing both weights and features. Experiments conducted on two public scene classification datasets demonstrate the effectiveness of these strategies, showing accelerated convergence, heightened privacy, and reduced network information exchange. This research provides valuable insights into the implications of feature-centric communication in FL, with potential applications tailored to remote sensing scenarios.
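As a point of reference for the strategies above, the sketch below shows plain weight amalgamation (FedAvg-style averaging), the baseline that feature-centric and pseudo-weight schemes deviate from; the client models and the unweighted mean are illustrative assumptions.

```python
# A minimal sketch of weight amalgamation: the server element-wise averages
# client model weights. Client count, model, and plain mean are assumptions.
from typing import Dict, List
import torch

def average_weights(client_states: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    """Element-wise average of client state_dicts (FedAvg-style)."""
    return {
        key: torch.stack([s[key].float() for s in client_states]).mean(dim=0)
        for key in client_states[0]
    }

# usage: clients train locally, then the server aggregates their weights
clients = [torch.nn.Linear(8, 3).state_dict() for _ in range(4)]
global_state = average_weights(clients)
```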
This work proposes a novel variational approximation of partial differential equations on moving geometries determined by explicit boundary representations. The main benefit of the proposed formulation is the ability to handle large displacements of explicitly represented domain boundaries without generating body-fitted meshes or resorting to remeshing techniques. For the space discretization, we use a background mesh and an unfitted method that relies on integration on cut cells only. We perform this intersection using clipping algorithms. To deal with the mesh movement, we pull back the equations to a reference configuration (the spatial mesh at the initial time slab times the time interval) that is constant in time. This way, the geometrical intersection algorithm is only required in 3D space rather than in 4D space-time, another key property of the proposed scheme. At the end of each time slab, we compute the deformed mesh, intersect the deformed boundary with the background mesh, and consider an exact transfer operator between meshes to compute jump terms in the time-discontinuous Galerkin integration. The transfer is also computed using geometrical intersection algorithms. We demonstrate the applicability of the method to fluid problems around rotating (2D and 3D) geometries described by oriented boundary meshes. We also provide a set of numerical experiments that show the optimal convergence of the method.
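To give a concrete picture of the clipping step (in 2D rather than the 3D setting of the paper), the following sketch intersects a background cell with a moving boundary polygon via the standard Sutherland-Hodgman algorithm; the geometry is made up for illustration.

```python
# 2D illustration of the clipping used to integrate on cut cells:
# Sutherland-Hodgman clip of a background cell against a convex polygon.
def clip_polygon(subject, clipper):
    """Clip polygon `subject` by convex CCW polygon `clipper`."""
    def is_inside(p, a, b):  # p on the left of the directed edge a -> b
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersection(p, q, a, b):  # segment p-q with the infinite line a-b
        d1 = (q[0] - p[0], q[1] - p[1])
        d2 = (b[0] - a[0], b[1] - a[1])
        den = d1[0] * d2[1] - d1[1] * d2[0]
        t = ((a[0] - p[0]) * d2[1] - (a[1] - p[1]) * d2[0]) / den
        return (p[0] + t * d1[0], p[1] + t * d1[1])

    output = list(subject)
    for a, b in zip(clipper, clipper[1:] + clipper[:1]):
        polygon, output = output, []
        for p, q in zip(polygon, polygon[1:] + polygon[:1]):
            if is_inside(q, a, b):
                if not is_inside(p, a, b):
                    output.append(intersection(p, q, a, b))
                output.append(q)
            elif is_inside(p, a, b):
                output.append(intersection(p, q, a, b))
    return output

cell = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]   # background cell
moving = [(0.5, -0.5), (2.0, 0.5), (0.5, 1.5)]            # moving boundary (triangle)
print(clip_polygon(cell, moving))                          # vertices of the cut cell
```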
This paper addresses the approximation of the mean curvature flow of thin structures for which classical phase field methods are not suitable. By thin structures, we mean surfaces that are not domain boundaries, typically higher-codimension objects such as 1D curves in 3D, i.e., filaments, or soap films spanning a boundary curve. To approximate the mean curvature flow of such surfaces, we consider a small thickening and apply to the thickened set an evolution model that combines the classical Allen-Cahn equation with a penalty term that takes on larger values around the skeleton of the set. The novelty of our approach lies in the definition of this penalty term, which guarantees a minimal thickness of the evolving set and prevents it from disappearing unexpectedly. We prove a few theoretical properties of our model, provide examples showing the connection with higher-codimension mean curvature flow, and introduce a quasi-static numerical scheme with explicit integration of the penalty term. We illustrate the numerical efficiency of the model with accurate approximations of filament structures evolving by mean curvature flow, and we also illustrate its ability to find complex 3D approximations of solutions to the Steiner problem or the Plateau problem.
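A minimal sketch of such a scheme is given below: explicit time stepping of the Allen-Cahn equation plus a penalty that activates where the thickened set becomes too thin. The thickness heuristic and penalty form are crude stand-ins for the paper's skeleton-based term, chosen only to illustrate the structure of the iteration.

```python
# Explicit Allen-Cahn stepping on a thin slab, with a hypothetical penalty
# (a stand-in for the skeleton-based term) that props up thinning regions.
import numpy as np

n = 128
dx = 1.0 / n
eps = 0.03
dt = 0.2 * dx**2                     # explicit stability: dt <= dx^2 / 4
y = (np.arange(n) + 0.5) * dx
u = np.tile(np.tanh((0.06 - np.abs(y - 0.5)) / (np.sqrt(2) * eps)), (n, 1))

def lap(v):  # 5-point Laplacian with periodic boundaries
    return (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
            np.roll(v, 1, 1) + np.roll(v, -1, 1) - 4 * v) / dx**2

alpha = 50.0                         # penalty strength (an assumption)
for _ in range(5000):
    thickness = u.clip(0).sum(axis=1) * dx           # crude per-row slab thickness
    thin = (thickness < 2 * eps)[:, None] & (u > 0)  # regions about to vanish
    # Allen-Cahn (double well W(u) = (u^2 - 1)^2 / 4) plus explicit penalty:
    u += dt * (lap(u) - (u**3 - u) / eps**2 + alpha * thin)
```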
Pretrial risk assessment tools are used in jurisdictions across the country to assess the likelihood of "pretrial failure," the event where defendants either fail to appear in court or reoffend. Judicial officers, in turn, use these assessments to determine whether to release or detain defendants pending trial. While algorithmic risk assessment tools were designed to predict pretrial failure with greater accuracy relative to judges, there is still concern that both risk assessment recommendations and pretrial decisions are biased against minority groups. In this paper, we develop methods to investigate the association between risk factors and pretrial failure, while simultaneously estimating misclassification rates of pretrial risk assessments and of judicial decisions as a function of defendant race. This approach adds to a growing literature that makes use of outcome misclassification methods to answer questions about fairness in pretrial decision-making. We give a detailed simulation study for our proposed methodology and apply these methods to data from the Virginia Department of Criminal Justice Services. We estimate that the Virginia Pretrial Risk Assessment Instrument (VPRAI) has near-perfect specificity, but its sensitivity differs by defendant race. Judicial decisions also display evidence of bias; we estimate wrongful detention rates of 39.7% and 51.4% among white and Black defendants, respectively.
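The following toy simulation is in the spirit of the misclassification analysis: it generates true pretrial outcomes and an imperfect assessment whose sensitivity depends on group, then recovers group-specific error rates. All rates are invented for illustration and are unrelated to the VPRAI estimates reported above.

```python
# Toy outcome-misclassification simulation: an imperfect "assessment" flag
# with group-dependent sensitivity and near-perfect specificity.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
group = rng.integers(0, 2, n)                 # two defendant groups, 0 / 1
fail = rng.random(n) < 0.2                    # true pretrial failure
sens = np.where(group == 0, 0.65, 0.80)       # group-dependent sensitivity
spec = 0.98                                   # shared, near-perfect specificity
flag = np.where(fail, rng.random(n) < sens, rng.random(n) > spec)

for g in (0, 1):
    m = group == g
    est_sens = flag[m & fail].mean()
    est_spec = (~flag)[m & ~fail].mean()
    print(f"group {g}: sensitivity={est_sens:.3f}, specificity={est_spec:.3f}")
```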
For a set of robots (or agents) moving in a graph, two properties are highly desirable: confidentiality (i.e., a message between two agents must not pass through any intermediate agent) and efficiency (i.e., messages are delivered through shortest paths). These properties can be obtained if the \textsc{Geodesic Mutual Visibility} (GMV, for short) problem is solved: oblivious robots move along the edges of the graph, without collisions, to occupy some vertices that guarantee they become pairwise geodesic mutually visible. This means there is a shortest path (i.e., a ``geodesic'') between each pair of robots along which no other robots reside. In this work, we optimally solve GMV on finite hexagonal grids $G_k$. This, in turn, requires first solving a graph combinatorial problem, i.e., determining the maximum number of mutually visible vertices in $G_k$.
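The definition of geodesic mutual visibility translates directly into a brute-force check, sketched below with networkx; the small hexagonal lattice is a stand-in for the grids $G_k$ studied in the paper.

```python
# Brute-force GMV check: every pair of occupied vertices must admit at
# least one shortest path with no other robot in its interior.
import networkx as nx
from itertools import combinations

def is_gmv(G, robots):
    """True iff every robot pair has a robot-free shortest path."""
    occupied = set(robots)
    return all(
        any(occupied.isdisjoint(path[1:-1])          # interior vertices only
            for path in nx.all_shortest_paths(G, u, v))
        for u, v in combinations(robots, 2)
    )

G = nx.hexagonal_lattice_graph(2, 2)   # a stand-in for the paper's $G_k$
robots = list(G.nodes)[:4]             # an arbitrary placement to test
print(is_gmv(G, robots))
```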
Most tailored materials are heterogeneous at the ingredient level, and analysis of such heterogeneous structures requires knowledge of the microstructure. With this knowledge, multiscale analysis is carried out with homogenization at the micro level, and second-order homogenization is used whenever the ingredient size is comparable to the structure size. Knowledge of the microstructure and its size is therefore indispensable for analyzing these heterogeneous structures. Conversely, the structural response encodes information about the microstructure, such as its distribution, volume fraction, and ingredient size. Here, an inverse analysis is carried out to identify a heterogeneous microstructure from macroscopic measurements. The identification proceeds in two steps: in the first step, the macrostructure's length scale and effective properties are identified from the macroscopic measurements using gradient-based optimization. In the second step, these effective properties and length scales are used to determine the microstructure via inverse second-order homogenization.
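A hypothetical sketch of the first step is shown below: recovering an effective stiffness and a length-scale parameter from noisy macroscopic measurements by gradient-based least squares. The 1D toy forward model stands in for the paper's second-order homogenized model.

```python
# Step one as least squares: fit an effective modulus E and a length
# scale ell to macroscopic measurements. The 1D forward model is a toy.
import numpy as np
from scipy.optimize import least_squares

x = np.linspace(0, 1, 50)                  # measurement locations

def forward(params):
    E, ell = params                        # effective modulus, length scale
    return (x - ell * np.sinh(x / ell) / np.cosh(1 / ell)) / E

true = np.array([2.0, 0.1])
data = forward(true) + 1e-5 * np.random.default_rng(3).standard_normal(x.size)

fit = least_squares(lambda p: forward(p) - data, x0=[1.0, 0.5],
                    bounds=([0.1, 0.01], [10.0, 1.0]))
print(fit.x)                               # ≈ [2.0, 0.1]
```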
We hypothesize that due to the greedy nature of learning in multi-modal deep neural networks, these models tend to rely on just one modality while under-fitting the other modalities. Such behavior is counter-intuitive and hurts the models' generalization, as we observe empirically. To estimate a model's dependence on each modality, we compute the gain in accuracy when the model has access to that modality in addition to another one. We refer to this gain as the conditional utilization rate. In our experiments, we consistently observe an imbalance in conditional utilization rates between modalities, across multiple tasks and architectures. Since the conditional utilization rate cannot be computed efficiently during training, we introduce a proxy for it based on the pace at which the model learns from each modality, which we refer to as the conditional learning speed. We propose an algorithm to balance the conditional learning speeds between modalities during training and demonstrate that it indeed addresses the issue of greedy learning. The proposed algorithm improves the model's generalization on three datasets: Colored MNIST, Princeton ModelNet40, and NVIDIA Dynamic Hand Gesture.
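The conditional utilization rate as defined above reduces to a difference of accuracies, as the small sketch below makes explicit; the accuracy numbers are placeholders.

```python
# Conditional utilization rate: the accuracy gain from adding modality m1
# when modality m2 is already available. The numbers below are made up.
def conditional_utilization(acc_both: float, acc_single: float) -> float:
    """u(m1 | m2) = acc({m1, m2}) - acc({m2})."""
    return acc_both - acc_single

acc = {"rgb+depth": 0.91, "rgb": 0.90, "depth": 0.72}
print("u(depth | rgb) =", conditional_utilization(acc["rgb+depth"], acc["rgb"]))
print("u(rgb | depth) =", conditional_utilization(acc["rgb+depth"], acc["depth"]))
# An imbalance like this one (0.01 vs 0.19) signals greedy reliance on rgb.
```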
When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions, including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods, and distributed methods, together with theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, the lottery ticket hypothesis, and infinite-width analysis.
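As a small illustration of the initialization issue mentioned above, the sketch below compares input-gradient norms through a deep tanh stack under naive versus Xavier-style scaling; depth, width, and the nonlinearity are arbitrary choices.

```python
# Why careful initialization matters: with naive scaling, tanh saturates
# and gradients vanish through depth; Xavier-style scaling preserves them.
import torch

depth, width = 50, 256
for name, std in [("naive std=1", 1.0),
                  ("Xavier-style std=1/sqrt(width)", width ** -0.5)]:
    x = torch.randn(64, width, requires_grad=True)
    h = x
    for _ in range(depth):
        W = torch.randn(width, width) * std   # random layer weights
        h = torch.tanh(h @ W)
    h.sum().backward()
    print(f"{name}: input-gradient norm = {x.grad.norm().item():.3e}")
```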