This paper advances the understanding of how the size of a machine learning model affects its vulnerability to poisoning, despite state-of-the-art defenses. Given isotropic random honest feature vectors and the geometric median (or clipped mean) as the robust gradient aggregator rule, we essentially prove that, perhaps surprisingly, linear and logistic regressions with $D \geq 169 H^2/P^2$ parameters are subject to arbitrary model manipulation by poisoners, where $H$ and $P$ are the numbers of honestly labeled and poisoned data points used for training. Our experiments go on exposing a fundamental tradeoff between augmenting model expressivity and increasing the poisoners' attack surface, on both synthetic data, and on MNIST & FashionMNIST data for linear classifiers with random features. We also discuss potential implications for source-based learning and neural nets.
The training of modern machine learning models often consists in solving high-dimensional non-convex optimisation problems that are subject to large-scale data. In this context, momentum-based stochastic optimisation algorithms have become particularly widespread. The stochasticity arises from data subsampling which reduces computational cost. Both, momentum and stochasticity help the algorithm to converge globally. In this work, we propose and analyse a continuous-time model for stochastic gradient descent with momentum. This model is a piecewise-deterministic Markov process that represents the optimiser by an underdamped dynamical system and the data subsampling through a stochastic switching. We investigate longtime limits, the subsampling-to-no-subsampling limit, and the momentum-to-no-momentum limit. We are particularly interested in the case of reducing the momentum over time. Under convexity assumptions, we show convergence of our dynamical system to the global minimiser when reducing momentum over time and letting the subsampling rate go to infinity. We then propose a stable, symplectic discretisation scheme to construct an algorithm from our continuous-time dynamical system. In experiments, we study our scheme in convex and non-convex test problems. Additionally, we train a convolutional neural network in an image classification problem. Our algorithm {attains} competitive results compared to stochastic gradient descent with momentum.
The object of observation in present paper is statistical independence of real sequences and its description as independence with re spect to certain class of densities.
The majority of fault-tolerant distributed algorithms are designed assuming a nominal corruption model, in which at most a fraction $f_n$ of parties can be corrupted by the adversary. However, due to the infamous Sybil attack, nominal models are not sufficient to express the trust assumptions in open (i.e., permissionless) settings. Instead, permissionless systems typically operate in a weighted model, where each participant is associated with a weight and the adversary can corrupt a set of parties holding at most a fraction $f_w$ of the total weight. In this paper, we suggest a simple way to transform a large class of protocols designed for the nominal model into the weighted model. To this end, we formalize and solve three novel optimization problems, which we collectively call the weight reduction problems, that allow us to map large real weights into small integer weights while preserving the properties necessary for the correctness of the protocols. In all cases, we manage to keep the sum of the integer weights to be at most linear in the number of parties, resulting in extremely efficient protocols for the weighted model. Moreover, we demonstrate that, on weight distributions that emerge in practice, the sum of the integer weights tends to be far from the theoretical worst case and, sometimes, even smaller than the number of participants. While, for some protocols, our transformation requires an arbitrarily small reduction in resilience (i.e., $f_w = f_n - \epsilon$), surprisingly, for many important problems, we manage to obtain weighted solutions with the same resilience ($f_w = f_n$) as nominal ones. Notable examples include erasure-coded distributed storage and broadcast protocols, verifiable secret sharing, and asynchronous consensus.
Coupled decompositions are a widely used tool for data fusion. This paper studies the coupled matrix factorization (CMF) where two matrices $X$ and $Y$ are represented in a low-rank format sharing one common factor, as well as the coupled matrix and tensor factorization (CMTF) where a matrix $Y$ and a tensor $\mathcal{X}$ are represented in a low-rank format sharing a factor matrix. We show that these problems are equivalent to the low-rank approximation of the matrix $[X \ Y]$ for CMF, that is $[X_{(1)} \ Y]$ for CMTF. Then, in order to speed up computation process, we adapt several randomization techniques, namely, randomized SVD, randomized subspace iteration, and randomized block Krylov iteration to the algorithms for coupled decompositions. We present extensive results of the numerical tests. Furthermore, as a novel approach and with a high success rate, we apply our randomized algorithms to the face recognition problem.
Many models require integrals of high-dimensional functions: for instance, to obtain marginal likelihoods. Such integrals may be intractable, or too expensive to compute numerically. Instead, we can use the Laplace approximation (LA). The LA is exact if the function is proportional to a normal density; its effectiveness therefore depends on the function's true shape. Here, we propose the use of the probabilistic numerical framework to develop a diagnostic for the LA and its underlying shape assumptions, modelling the function and its integral as a Gaussian process and devising a "test" by conditioning on a finite number of function values. The test is decidedly non-asymptotic and is not intended as a full substitute for numerical integration - rather, it is simply intended to test the feasibility of the assumptions underpinning the LA with as minimal computation. We discuss approaches to optimize and design the test, apply it to known sample functions, and highlight the challenges of high dimensions.
We implement a simulation environment on top of NetSquid that is specifically designed for estimating the end-to-end fidelity across a path of quantum repeaters or quantum switches. The switch model includes several generalizations which are not currently available in other tools, and are useful for gaining insight into practical and realistic quantum network engineering problems: an arbitrary number of memory registers at the switches, simplicity in including entanglement distillation mechanisms, arbitrary switching topologies, and more accurate models for the depolarization noise. An illustrative case study is presented, namely a comparison in terms of performance between a repeater chain where repeaters can only swap sequentially, and a single switch equipped with multiple memory registers, able to handle multiple swapping requests.
This paper leverages various philosophical and ontological frameworks to explore the concept of embodied artificial general intelligence (AGI), its relationship to human consciousness, and the key role of the metaverse in facilitating this relationship. Several theoretical frameworks underpin this exploration, such as embodied cognition, Michael Levin's computational boundary of a "Self," Donald D. Hoffman's Interface Theory of Perception, and Bernardo Kastrup's analytical idealism, which lead to considering our perceived outer reality as a symbolic representation of alternate inner states of being, and where AGI could embody a higher consciousness with a larger computational boundary. The paper further discusses the developmental stages of AGI, the requirements for the emergence of an embodied AGI, the importance of a calibrated symbolic interface for AGI, and the key role played by the metaverse, decentralized systems, open-source blockchain technology, as well as open-source AI research. It also explores the idea of a feedback loop between AGI and human users in metaverse spaces as a tool for AGI calibration, as well as the role of local homeostasis and decentralized governance as preconditions for achieving a stable embodied AGI. The paper concludes by emphasizing the importance of achieving a certain degree of harmony in human relations and recognizing the interconnectedness of humanity at a global level, as key prerequisites for the emergence of a stable embodied AGI.
This paper deals with the application of probabilistic time integration methods to semi-explicit partial differential-algebraic equations of parabolic type and its semi-discrete counterparts, namely semi-explicit differential-algebraic equations of index 2. The proposed methods iteratively construct a probability distribution over the solution of deterministic problems, enhancing the information obtained from the numerical simulation. Within this paper, we examine the efficacy of the randomized versions of the implicit Euler method, the midpoint scheme, and exponential integrators of first and second order. By demonstrating the consistency and convergence properties of these solvers, we illustrate their utility in capturing the sensitivity of the solution to numerical errors. Our analysis establishes the theoretical validity of randomized time integration for constrained systems and offers insights into the calibration of probabilistic integrators for practical applications.
This paper addresses nonparametric estimation of nonlinear multivariate Hawkes processes, where the interaction functions are assumed to lie in a reproducing kernel Hilbert space (RKHS). Motivated by applications in neuroscience, the model allows complex interaction functions, in order to express exciting and inhibiting effects, but also a combination of both (which is particularly interesting to model the refractory period of neurons), and considers in return that conditional intensities are rectified by the ReLU function. The latter feature incurs several methodological challenges, for which workarounds are proposed in this paper. In particular, it is shown that a representer theorem can be obtained for approximated versions of the log-likelihood and the least-squares criteria. Based on it, we propose an estimation method, that relies on two simple approximations (of the ReLU function and of the integral operator). We provide an approximation bound, justifying the negligible statistical effect of these approximations. Numerical results on synthetic data confirm this fact as well as the good asymptotic behavior of the proposed estimator. It also shows that our method achieves a better performance compared to related nonparametric estimation techniques and suits neuronal applications.
Deep learning constitutes a recent, modern technique for image processing and data analysis, with promising results and large potential. As deep learning has been successfully applied in various domains, it has recently entered also the domain of agriculture. In this paper, we perform a survey of 40 research efforts that employ deep learning techniques, applied to various agricultural and food production challenges. We examine the particular agricultural problems under study, the specific models and frameworks employed, the sources, nature and pre-processing of data used, and the overall performance achieved according to the metrics used at each work under study. Moreover, we study comparisons of deep learning with other existing popular techniques, in respect to differences in classification or regression performance. Our findings indicate that deep learning provides high accuracy, outperforming existing commonly used image processing techniques.