This note proves that the nonparametric maximum likelihood estimator of a univariate log-concave probability density satisfies some consistency properties in the tail regions.
Logistic regression is a classical model for describing the probabilistic dependence of binary responses to multivariate covariates. We consider the predictive performance of the maximum likelihood estimator (MLE) for logistic regression, assessed in terms of logistic risk. We consider two questions: first, that of the existence of the MLE (which occurs when the dataset is not linearly separated), and second that of its accuracy when it exists. These properties depend on both the dimension of covariates and on the signal strength. In the case of Gaussian covariates and a well-specified logistic model, we obtain sharp non-asymptotic guarantees for the existence and excess logistic risk of the MLE. We then generalize these results in two ways: first, to non-Gaussian covariates satisfying a certain two-dimensional margin condition, and second to the general case of statistical learning with a possibly misspecified logistic model. Finally, we consider the case of a Bernoulli design, where the behavior of the MLE is highly sensitive to the parameter direction.
In real-world data, information is stored in extremely large feature vectors. These variables are typically correlated due to complex interactions involving many features simultaneously. Such correlations qualitatively correspond to semantic roles and are naturally recognized by both the human brain and artificial neural networks. This recognition enables, for instance, the prediction of missing parts of an image or text based on their context. We present a method to detect these correlations in high-dimensional data represented as binary numbers. We estimate the binary intrinsic dimension of a dataset, which quantifies the minimum number of independent coordinates needed to describe the data, and is therefore a proxy of semantic complexity. The proposed algorithm is largely insensitive to the so-called curse of dimensionality, and can therefore be used in big data analysis. We test this approach identifying phase transitions in model magnetic systems and we then apply it to the detection of semantic correlations of images and text inside deep neural networks.
This paper aims to devise an adaptive neural network basis method for numerically solving a second-order semilinear partial differential equation (PDE) with low-regular solutions in two/three dimensions. The method is obtained by combining basis functions from a class of shallow neural networks and the resulting multi-scale analogues, a residual strategy in adaptive methods and the non-overlapping domain decomposition method. At the beginning, in view of the solution residual, we partition the total domain $\Omega$ into $K+1$ non-overlapping subdomains, denoted respectively as $\{\Omega_k\}_{k=0}^K$, where the exact solution is smooth on subdomain $\Omega_{0}$ and low-regular on subdomain $\Omega_{k}$ ($1\le k\le K$). Secondly, the low-regular solutions on different subdomains \(\Omega_{k}\)~($1\le k\le K$) are approximated by neural networks with different scales, while the smooth solution on subdomain \(\Omega_0\) is approximated by the initialized neural network. Thirdly, we determine the undetermined coefficients by solving the linear least squares problems directly or the nonlinear least squares problem via the Gauss-Newton method. The proposed method can be extended to multi-level case naturally. Finally, we use this adaptive method for several peak problems in two/three dimensions to show its high-efficient computational performance.
Mobile devices and the Internet of Things (IoT) devices nowadays generate a large amount of heterogeneous spatial-temporal data. It remains a challenging problem to model the spatial-temporal dynamics under privacy concern. Federated learning (FL) has been proposed as a framework to enable model training across distributed devices without sharing original data which reduce privacy concern. Personalized federated learning (PFL) methods further address data heterogenous problem. However, these methods don't consider natural spatial relations among nodes. For the sake of modeling spatial relations, Graph Neural Netowork (GNN) based FL approach have been proposed. But dynamic spatial-temporal relations among edge nodes are not taken into account. Several approaches model spatial-temporal dynamics in a centralized environment, while less effort has been made under federated setting. To overcome these challeges, we propose a novel Federated Adaptive Spatial-Temporal Attention (FedASTA) framework to model the dynamic spatial-temporal relations. On the client node, FedASTA extracts temporal relations and trend patterns from the decomposed terms of original time series. Then, on the server node, FedASTA utilize trend patterns from clients to construct adaptive temporal-spatial aware graph which captures dynamic correlation between clients. Besides, we design a masked spatial attention module with both static graph and constructed adaptive graph to model spatial dependencies among clients. Extensive experiments on five real-world public traffic flow datasets demonstrate that our method achieves state-of-art performance in federated scenario. In addition, the experiments made in centralized setting show the effectiveness of our novel adaptive graph construction approach compared with other popular dynamic spatial-temporal aware methods.
We implement a simulation environment on top of NetSquid that is specifically designed for estimating the end-to-end fidelity across a path of quantum repeaters or quantum switches. The switch model includes several generalizations which are not currently available in other tools, and are useful for gaining insight into practical and realistic quantum network engineering problems: an arbitrary number of memory registers at the switches, simplicity in including entanglement distillation mechanisms, arbitrary switching topologies, and more accurate models for the depolarization noise. An illustrative case study is presented, namely a comparison in terms of performance between a repeater chain where repeaters can only swap sequentially, and a single switch equipped with multiple memory registers, able to handle multiple swapping requests.
This paper leverages various philosophical and ontological frameworks to explore the concept of embodied artificial general intelligence (AGI), its relationship to human consciousness, and the key role of the metaverse in facilitating this relationship. Several theoretical frameworks underpin this exploration, such as embodied cognition, Michael Levin's computational boundary of a "Self," Donald D. Hoffman's Interface Theory of Perception, and Bernardo Kastrup's analytical idealism, which lead to considering our perceived outer reality as a symbolic representation of alternate inner states of being, and where AGI could embody a higher consciousness with a larger computational boundary. The paper further discusses the developmental stages of AGI, the requirements for the emergence of an embodied AGI, the importance of a calibrated symbolic interface for AGI, and the key role played by the metaverse, decentralized systems, open-source blockchain technology, as well as open-source AI research. It also explores the idea of a feedback loop between AGI and human users in metaverse spaces as a tool for AGI calibration, as well as the role of local homeostasis and decentralized governance as preconditions for achieving a stable embodied AGI. The paper concludes by emphasizing the importance of achieving a certain degree of harmony in human relations and recognizing the interconnectedness of humanity at a global level, as key prerequisites for the emergence of a stable embodied AGI.
In epidemiology, understanding causal mechanisms across different populations is essential for designing effective public health interventions. Recently, difference graphs have been introduced as a tool to visually represent causal variations between two distinct populations. While there has been progress in inferring these graphs from data through causal discovery methods, there remains a gap in systematically leveraging their potential to enhance causal reasoning. This paper addresses that gap by establishing conditions for identifying causal changes and effects using difference graphs and observational data. It specifically focuses on identifying total causal changes and total effects in a nonparametric framework, as well as direct causal changes and direct effects in a linear context. In doing so, it provides a novel approach to causal reasoning that holds potential for various public health applications.
We give a streamlined short proof of Newman's theorem in communication complexity by applying the classical and the approximate Carath\'eodory's theorems.
Accurate computation of robust estimates for extremal quantiles of empirical distributions is an essential task for a wide range of applicative fields, including economic policymaking and the financial industry. Such estimates are particularly critical in calculating risk measures, such as Growth-at-Risk (GaR). % and Value-at-Risk (VaR). This work proposes a conformal framework to estimate calibrated quantiles, and presents an extensive simulation study and a real-world analysis of GaR to examine its benefits with respect to the state of the art. Our findings show that CP methods consistently improve the calibration and robustness of quantile estimates at all levels. The calibration gains are appreciated especially at extremal quantiles, which are critical for risk assessment and where traditional methods tend to fall short. In addition, we introduce a novel property that guarantees coverage under the exchangeability assumption, providing a valuable tool for managing risks by quantifying and controlling the likelihood of future extreme observations.
Methods for analyzing representations in neural systems are increasingly popular tools in neuroscience and mechanistic interpretability. Measures comparing neural activations across conditions, architectures, and species give scalable ways to understand information transformation within different neural networks. However, recent findings show that some metrics respond to spurious signals, leading to misleading results. Establishing benchmark test cases is thus essential for identifying the most reliable metric and potential improvements. We propose that compositional learning in recurrent neural networks (RNNs) can provide a test case for dynamical representation alignment metrics. Implementing this case allows us to evaluate if metrics can identify representations that develop throughout learning and determine if representations identified by metrics reflect the network's actual computations. Building both attractor and RNN based test cases, we show that the recently proposed Dynamical Similarity Analysis (DSA) is more noise robust and reliably identifies behaviorally relevant representations compared to prior metrics (Procrustes, CKA). We also demonstrate how such test cases can extend beyond metric evaluation to study new architectures. Specifically, testing DSA in modern (Mamba) state space models suggests that these models, unlike RNNs, may not require changes in recurrent dynamics due to their expressive hidden states. Overall, we develop test cases that showcase how DSA's enhanced ability to detect dynamical motifs makes it highly effective for identifying ongoing computations in RNNs and revealing how networks learn tasks.