Brain atrophy and white matter hyperintensity (WMH) are critical neuroimaging features for ascertaining brain injury in cerebrovascular disease and multiple sclerosis. Automated segmentation and quantification are desirable, but existing methods require high-resolution MRI with good signal-to-noise ratio (SNR). This precludes application to clinical and low-field portable MRI (pMRI) scans, thus hampering large-scale tracking of atrophy and WMH progression, especially in underserved areas where pMRI has huge potential. Here we present a method that segments WMH and 36 brain regions from scans of any resolution and contrast (including pMRI) without retraining. We show results on eight public datasets and on a private dataset with paired high- and low-field scans (3T and 64mT), where we attain strong correlation between the WMH ($\rho$=.85) and hippocampal volumes (r=.89) estimated at both fields. Our method is publicly available as part of FreeSurfer, at: //surfer.nmr.mgh.harvard.edu/fswiki/WMH-SynthSeg.
The problem of phase retrieval has many applications in the field of optical imaging. Motivated by imaging experiments with biological specimens, we primarily consider the setting of low-dose illumination where Poisson noise plays the dominant role. In this paper, we discuss gradient descent algorithms based on different loss functions adapted to data affected by Poisson noise, in particular in the low-dose regime. Starting from the maximum log-likelihood function for the Poisson distribution, we investigate different regularizations and approximations of the problem to design an algorithm that meets the requirements that are faced in applications. In the course of this, we focus on low-count measurements. For all suggested loss functions, we study the convergence of the respective gradient descent algorithms to stationary points and find constant step sizes that guarantee descent of the loss in each iteration. Numerical experiments in the low-dose regime are performed to corroborate the theoretical observations.
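To make the idea of gradient descent on a Poisson-adapted loss concrete, the following is a minimal sketch and not the algorithm analysed in the paper. It assumes a real-valued toy problem (phase retrieval is usually posed over complex vectors), a shifted Poisson negative log-likelihood with a hypothetical regularization constant `EPS`, and a constant step size chosen small enough for this particular instance. All names and data are illustrative.

```python
import math

EPS = 1.0  # hypothetical regularization shift that keeps the logarithm finite


def poisson_loss(vectors, counts, x):
    """Shifted Poisson negative log-likelihood: sum_i (m_i + EPS - y_i * log(m_i + EPS))."""
    total = 0.0
    for a, y in zip(vectors, counts):
        u = sum(a_j * x_j for a_j, x_j in zip(a, x))
        m = u * u  # modelled intensity |<a_i, x>|^2
        total += m + EPS - y * math.log(m + EPS)
    return total


def poisson_grad(vectors, counts, x):
    """Gradient of poisson_loss with respect to x (real-valued case)."""
    grad = [0.0] * len(x)
    for a, y in zip(vectors, counts):
        u = sum(a_j * x_j for a_j, x_j in zip(a, x))
        weight = 2.0 * u * (1.0 - y / (u * u + EPS))
        for j, a_j in enumerate(a):
            grad[j] += weight * a_j
    return grad


def gradient_descent(vectors, counts, x0, step=0.01, iterations=200):
    """Plain gradient descent with a constant step size; records the loss history."""
    x = list(x0)
    history = [poisson_loss(vectors, counts, x)]
    for _ in range(iterations):
        g = poisson_grad(vectors, counts, x)
        x = [x_j - step * g_j for x_j, g_j in zip(x, g)]
        history.append(poisson_loss(vectors, counts, x))
    return x, history


# Illustrative low-count data: four measurement vectors in R^2 and photon counts.
vectors = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
counts = [2, 3, 10, 1]
x_start = [0.5, 0.5]
x_hat, losses = gradient_descent(vectors, counts, x_start)
```

The recorded `losses` can be inspected to check that the loss does not increase from one iteration to the next for this step size.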
In many communication contexts, the capabilities of the involved actors cannot be known beforehand, whether it is a cell, a plant, an insect, or even a life form unknown to Earth. Regardless of the recipient, the message space and time scale could be too fast, too slow, too large, or too small and may never be decoded. Therefore, it pays to devise a way to encode messages agnostic of space and time scales. We propose the use of fractal functions as self-executable infinite-frequency carriers for sending messages, given their properties of structural self-similarity and scale invariance. We call it `fractal messaging'. Starting from a spatial embedding, we introduce a framework for a space-time scale-free messaging approach to this challenge. When considering a space and time-agnostic framework for message transmission, it would be interesting to encode a message such that it could be decoded at several spatio-temporal scales. Hence, the core idea of the framework proposed herein is to encode a binary message as waves along infinitely many frequencies (in power-like distributions) and amplitudes, transmit such a message, and then decode and reproduce it. To do so, the components of the Weierstrass function, a known fractal, are used as carriers of the message. Each component will have its amplitude modulated to embed the binary stream, allowing for a space-time-agnostic approach to messaging.
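The encode, transmit and decode cycle described above can be sketched as follows. This is an illustration under several assumptions that are not in the abstract: the Weierstrass series is truncated to one component per bit, each bit raises or lowers its component's amplitude by a hypothetical modulation depth `DEPTH`, and decoding projects the sampled signal back onto each cosine carrier. The parameter values `A` and `B` are likewise illustrative.

```python
import math

A, B = 0.5, 3   # Weierstrass parameters: 0 < A < 1, B an odd integer
DEPTH = 0.5     # hypothetical modulation depth applied to each component


def encode(bits, num_samples=512):
    """Sample sum_n gain_n * A**n * cos(B**n * pi * x) on x in [0, 2)."""
    signal = []
    for k in range(num_samples):
        x = 2.0 * k / num_samples
        value = 0.0
        for n, bit in enumerate(bits):
            gain = 1.0 + DEPTH if bit else 1.0 - DEPTH
            value += gain * A ** n * math.cos(B ** n * math.pi * x)
        signal.append(value)
    return signal


def decode(signal, num_bits):
    """Recover each bit by projecting the signal onto the n-th cosine carrier.

    The projection is exact only while B**(num_bits - 1) < len(signal) / 2,
    so that the sampled carriers stay mutually orthogonal.
    """
    m = len(signal)
    bits = []
    for n in range(num_bits):
        proj = sum(s * math.cos(B ** n * math.pi * 2.0 * k / m)
                   for k, s in enumerate(signal))
        amplitude = 2.0 * proj / m
        # An amplitude above the unmodulated value A**n is read as a 1.
        bits.append(1 if amplitude > A ** n else 0)
    return bits


message = [1, 0, 1, 1, 0, 0]
received = decode(encode(message), len(message))
```

With six bits and 512 samples the highest carrier frequency is 3^5 = 243, which is below the 256-cycle limit, so `received` reproduces `message`.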
The energy-efficient spikformer has been proposed by integrating the biologically plausible spiking neural network (SNN) and the artificial Transformer, whereby Spiking Self-Attention (SSA) is used to achieve both higher accuracy and lower computational cost. However, it seems that self-attention is not always necessary, especially in sparse spike-form computation. In this paper, we replace vanilla SSA (which uses dynamic bases calculated from Query and Key) with spike-form Fourier Transform, Wavelet Transform, and their combinations (which use fixed trigonometric or wavelet bases), based on the key hypothesis that both approaches use a set of basis functions for information transformation. Hence, the Fourier-or-Wavelet-based spikformer (FWformer) is proposed and verified in visual classification tasks, including both static image and event-based video datasets. The FWformer can achieve comparable or even higher accuracies ($0.4\%$-$1.5\%$), higher running speed ($9\%$-$51\%$ for training and $19\%$-$70\%$ for inference), reduced theoretical energy consumption ($20\%$-$25\%$), and reduced GPU memory usage ($4\%$-$26\%$), compared to the standard spikformer. Our results indicate that the continuous refinement of new Transformers, inspired either by biological discovery (spike-form) or by information theory (Fourier or Wavelet Transform), is promising.
The Shepp & Vardi (1982) implementation of the EM algorithm for PET scan tumor estimation provides a point estimate of the tumor. The current study presents a closed-form formula of the observed Fisher information for Shepp & Vardi PET scan tumor estimation. Keywords: PET scan, EM algorithm, Fisher information matrix, standard errors.
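For orientation, the observed information can be written down directly under the usual Poisson emission model. This is a sketch under stated assumptions and not necessarily the closed form derived in the study. The notation is assumed here: $n_d$ is the count in detector $d$, $\lambda_b$ the emission intensity in pixel $b$, and $p(b,d)$ the probability that an emission in $b$ is recorded in $d$.

$$
\ell(\lambda) = \sum_d \Big[ n_d \log \mu_d - \mu_d \Big] + \text{const}, \qquad \mu_d = \sum_b \lambda_b \, p(b,d),
$$

$$
\frac{\partial \ell}{\partial \lambda_b} = \sum_d p(b,d) \left( \frac{n_d}{\mu_d} - 1 \right), \qquad
-\frac{\partial^2 \ell}{\partial \lambda_b \, \partial \lambda_{b'}} = \sum_d \frac{n_d \, p(b,d) \, p(b',d)}{\mu_d^2}.
$$

Evaluating the second expression at the EM estimate gives an observed information matrix. When that matrix is invertible, approximate standard errors are the square roots of the diagonal of its inverse.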
Nowadays, botnets have become one of the major threats to cyber security. The characteristics of botnets are mainly reflected in the bots' network behavior and their intercommunication relationships. Existing botnet detection methods use flow features or topology features individually, overlooking the other type of feature, which degrades model performance. In this paper, we propose a botnet detection model that, for the first time, uses a graph convolutional network (GCN) to deeply fuse flow features and topology features. We construct communication graphs from network traffic and represent nodes with flow features. Because existing public traffic flow datasets are imbalanced, it is impossible to train a GCN model on them. Therefore, we use a balanced public communication graph dataset to pretrain a GCN model, thereby guaranteeing its capacity to identify topology features. We then feed the communication graph with flow features into the pretrained GCN. The output from the last hidden layer is treated as the fusion of flow and topology features. Additionally, by adjusting the number of layers in the GCN, the model can effectively detect botnets under both C2 and P2P structures. Validated on the public ISCX2014 dataset, our approach achieves a recall of 92.90% and an F1-score of 92.76% for C2 botnets, and a recall of 94.66% and an F1-score of 92.35% for P2P botnets. These results not only demonstrate the effectiveness of our method but also surpass those of the currently leading detection models.
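The way a graph convolution mixes per-node flow features with graph topology can be illustrated with a single propagation step. The sketch below assumes the widely used normalized-adjacency rule, ReLU(D^{-1/2}(A + I)D^{-1/2} H W). The abstract does not state which propagation rule the model uses, and the toy graph, feature values and weights are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, features, weights):
    """One propagation step: ReLU(A_norm @ H @ W)."""
    hidden = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


# Hypothetical toy graph: two hosts that exchanged traffic, two flow features each.
adjacency = [[0, 1], [1, 0]]
flow_features = [[1.0, 0.0], [0.0, 1.0]]
weights = [[1.0, -1.0], [2.0, 0.0]]  # would be learned during (pre)training
embedding = gcn_layer(adjacency, flow_features, weights)
```

Each row of `embedding` depends on both a node's own flow features and those of its neighbours. Stacking more layers widens the neighbourhood that contributes to each node.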
Asymptotic methods for hypothesis testing in high-dimensional data usually require the dimension of the observations to increase to infinity, often with an additional condition on its rate of increase compared to the sample size. On the other hand, multivariate asymptotic methods are valid for fixed dimension only, and their practical implementations in hypothesis testing methodology typically require the sample size to be large compared to the dimension to yield desirable results. However, in practical scenarios, it is usually not possible to determine whether the dimension of the data at hand conforms to the conditions required for the validity of the high-dimensional asymptotic methods, or whether the sample size is large enough compared to the dimension of the data. In this work, a theory of asymptotic convergence is proposed, which holds uniformly over the dimension of the random vectors. This theory attempts to unify the asymptotic results for fixed-dimensional multivariate data and high-dimensional data, and accounts for the effect of the dimension of the data on the performance of the hypothesis testing procedures. The methodology developed based on this asymptotic theory can be applied to data of any dimension. An application of this theory is demonstrated in the two-sample test for the equality of locations. The proposed test statistic is unscaled by the sample covariance, similar to the usual tests for high-dimensional data. Using simulated examples, it is demonstrated that the proposed test exhibits better performance compared to several popular tests in the literature for high-dimensional data. Further, it is demonstrated in simulated models that the proposed unscaled test performs better than the usual scaled two-sample tests for multivariate data, including Hotelling's $T^2$ test for multivariate Gaussian data.
We describe a family of iterative algorithms that involve the repeated execution of discrete and inverse discrete Fourier transforms. One interesting member of this family is motivated by the discrete Fourier transform uncertainty principle and involves the application of a sparsification operation to both the real domain and frequency domain data with convergence obtained when real domain sparsity hits a stable pattern. This sparsification variant has practical utility for signal denoising, in particular the recovery of a periodic spike signal in the presence of Gaussian noise. General convergence properties and denoising performance relative to existing methods are demonstrated using simulation studies. An R package implementing this technique and related resources can be found at //hrfrost.host.dartmouth.edu/IterativeFT.
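The sparsification variant can be sketched in a few lines. This is an illustration under assumptions, not the R package's implementation: the thresholding rule (zero every entry whose magnitude does not exceed the mean magnitude), the order of operations, and the use of the real part after the inverse transform are all choices made here for a real-valued input. A naive O(n^2) DFT keeps the example dependency-free.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for f in range(n):
        total = sum(v * cmath.exp(sign * 2j * math.pi * f * k / n)
                    for k, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def sparsify(values):
    """Zero every entry whose magnitude does not exceed the mean magnitude (assumed rule)."""
    mean_mag = sum(abs(v) for v in values) / len(values)
    return [v if abs(v) > mean_mag else 0 for v in values]


def iterative_ft(signal, max_iter=50):
    """Alternate sparsification in the frequency and real domains until the
    set of non-zero real-domain positions stops changing."""
    x = list(signal)
    previous_support = None
    for _ in range(max_iter):
        spectrum = sparsify(dft(x))
        x = sparsify([v.real for v in dft(spectrum, inverse=True)])
        support = [i for i, v in enumerate(x) if v != 0]
        if support == previous_support:
            break
        previous_support = support
    return x


# Periodic spike signal (period 4, length 16) with a small deterministic perturbation.
noisy = [(1.0 if k % 4 == 0 else 0.0) + 0.01 * math.sin(1.7 * k) for k in range(16)]
denoised = iterative_ft(noisy)
```

For this small perturbation the non-zero entries of `denoised` sit at positions 0, 4, 8 and 12. Behaviour under Gaussian noise of realistic magnitude depends on the thresholding rule and is not claimed here.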
The importance of considering contextual probabilities in shaping response patterns within psychological testing is underscored, despite the ubiquitous nature of order effects discussed extensively in the methodological literature. Drawing from concepts such as path-dependency, first-order autocorrelation, state-dependency, and hysteresis, the present study attempts to address how earlier responses serve as an anchor for subsequent answers in tests, surveys, and questionnaires. Introducing the notion of non-commuting observables derived from quantum physics, we highlight their role in characterizing psychological processes and the impact of measurement instruments on participants' responses. We advocate the use of first-order Markov chain modeling to capture and forecast sequential dependencies in survey and test responses. The rationale for employing the first-order Markov chain model lies in individuals' propensity to focus partially on preceding responses, with recent items most likely exerting a substantial influence on subsequent response selection. This study contributes to advancing our understanding of the dynamics inherent in sequential data within psychological research and provides a methodological framework for conducting longitudinal analyses of response patterns in tests and questionnaires.
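A first-order Markov chain over response categories can be estimated by counting transitions between consecutive answers. The sketch below is a minimal illustration with made-up response sequences. The function names and the uniform fallback for states never observed as a predecessor are assumptions introduced here.

```python
from collections import Counter, defaultdict


def fit_transition_matrix(sequences, states):
    """Estimate first-order transition probabilities P(next | previous)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    matrix = {}
    for s in states:
        total = sum(counts[s].values())
        # Fall back to a uniform row if s never appears as a predecessor.
        matrix[s] = {t: (counts[s][t] / total if total else 1 / len(states))
                     for t in states}
    return matrix


def predict_next(matrix, previous):
    """Return the most probable next response given the previous one."""
    row = matrix[previous]
    return max(row, key=row.get)


# Hypothetical answers of three respondents to five consecutive 3-point items.
responses = [
    [1, 1, 2, 2, 3],
    [2, 2, 2, 3, 3],
    [1, 2, 2, 3, 3],
]
P = fit_transition_matrix(responses, states=[1, 2, 3])
forecast = predict_next(P, previous=1)
```

Each row of `P` is a probability distribution over the next response. Here a response of 1 is followed by 2 in two of the three observed transitions, so `forecast` is 2.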
The timely detection of disease outbreaks through reliable early warning signals (EWSs) is indispensable for effective public health mitigation strategies. Nevertheless, the intricate dynamics of real-world disease spread, often influenced by diverse sources of noise and limited data in the early stages of outbreaks, pose a significant challenge in developing reliable EWSs, as the performance of existing indicators varies with extrinsic and intrinsic noise. Here, we address the challenge of modeling disease when the measurements are corrupted by noise, incorporating additive white noise, multiplicative environmental noise, and demographic noise into a standard epidemic mathematical model. To navigate the complexities introduced by these noise sources, we employ a deep learning algorithm that provides EWSs for infectious disease outbreaks by training on noise-induced disease-spreading models. The indicator's effectiveness is demonstrated through its application to real-world COVID-19 cases in Edmonton and to simulated time series derived from diverse disease spread models affected by noise. Notably, the indicator captures an impending transition in a time series of disease outbreaks and outperforms existing indicators. This study contributes to advancing early warning capabilities by addressing the intricate dynamics inherent in real-world disease spread, presenting a promising avenue for enhancing public health preparedness and response efforts.
We endeavour to estimate numerous multi-dimensional means of various probability distributions on a common space based on independent samples. Our approach involves forming estimators through convex combinations of empirical means derived from these samples. We introduce two strategies to find appropriate data-dependent convex combination weights: a first one employing a testing procedure to identify neighbouring means with low variance, which results in a closed-form plug-in formula for the weights, and a second one determining weights via minimization of an upper confidence bound on the quadratic risk. Through theoretical analysis, we evaluate the improvement in quadratic risk offered by our methods compared to the empirical means. Our analysis focuses on a dimensional asymptotics perspective, showing that our methods asymptotically approach an oracle (minimax) improvement as the effective dimension of the data increases. We demonstrate the efficacy of our methods in estimating multiple kernel mean embeddings through experiments on both simulated and real-world datasets.