The fundamental functional summary statistics used for studying spatial point patterns are developed for marked homogeneous and inhomogeneous point processes on the surface of a sphere. These are extended to point processes on the surface of three-dimensional convex shapes, provided the bijective mapping from the shape to the sphere is known. These functional summary statistics are used to test for independence between the marginals of multi-type spatial point processes, with methods for sampling the null distribution proposed and discussed. This is illustrated on both simulated data and the RNGC galaxy point pattern, revealing attractive dependencies between different galaxy types.
Fairness holds a pivotal role in the realm of machine learning, particularly when it comes to addressing groups categorised by protected attributes, e.g., gender or race. Prevailing algorithms in fair learning predominantly hinge on access to, or estimates of, these protected attributes, at least during the training process. We design a single group-blind projection map that aligns the feature distributions of both groups in the source data, achieving (demographic) group parity, without requiring values of the protected attribute for individual samples, either when computing the map or when applying it. Instead, our approach utilises the feature distributions of the privileged and unprivileged groups in a broader population, together with the essential assumption that the source data are an unbiased representation of that population. We present numerical results on synthetic data and real data.
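A hypothetical one-dimensional illustration of the group-blind idea (not the paper's actual construction): the map below is built only from the two groups' population-level samples, and is then applied to every source point through its pooled rank, without ever consulting an individual's group label.

```python
# Hypothetical sketch: a single group-blind map that pushes both group
# distributions toward a common (barycenter-like) distribution in 1-D.

def ecdf(sample, x):
    """Fraction of the sample that is <= x."""
    return sum(1 for s in sample if s <= x) / len(sample)

def quantile(sample, p):
    """Empirical quantile of a sample at level p in [0, 1]."""
    s = sorted(sample)
    return s[min(int(p * len(s)), len(s) - 1)]

def group_blind_map(pop_privileged, pop_unprivileged):
    """Build one map T from population-level group samples only.

    T is evaluated via the pooled rank of x, so applying it to a source
    point never requires knowing which group that point belongs to.
    """
    pooled = sorted(pop_privileged + pop_unprivileged)

    def T(x):
        p = ecdf(pooled, x)  # rank of x in the pooled population
        return 0.5 * (quantile(pop_privileged, p)
                      + quantile(pop_unprivileged, p))

    return T
```

Applied to two well-separated groups, the mapped samples of both groups land near the averaged quantile curve, shrinking the gap between group means.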
The unpredictability of random numbers is fundamental to both digital security and applications that fairly distribute resources. However, existing random number generators have limitations: the generation processes cannot be fully traced, audited, and certified to be unpredictable. The algorithmic steps used in pseudorandom number generators are auditable, but they cannot guarantee that their outputs were a priori unpredictable given knowledge of the initial seed. Device-independent quantum random number generators can ensure that the source of randomness was unknown beforehand, but the steps used to extract the randomness are vulnerable to tampering. Here, for the first time, we demonstrate a fully traceable random number generation protocol based on device-independent techniques. Our protocol extracts randomness from unpredictable non-local quantum correlations, and uses distributed intertwined hash chains to cryptographically trace and verify the extraction process. This protocol is at the heart of a public traceable and certifiable quantum randomness beacon that we have launched. Over the first 40 days of operation, we completed the protocol in 7434 out of 7454 attempts, a success rate of 99.7%. Each time the protocol succeeded, the beacon emitted a pulse of 512 bits of traceable randomness. The bits are certified to be uniform with error times actual success probability bounded by $2^{-64}$. The generation of certifiable and traceable randomness represents one of the first public services that operates with an entanglement-derived advantage over comparable classical approaches.
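The tracing mechanism can be illustrated with a minimal hash-chain sketch (a simplification, not the protocol's actual distributed intertwined construction): each link commits to its payload and to the previous digest, so tampering with any earlier record invalidates every later one under verification.

```python
import hashlib

# Minimal sketch of a hash chain for tracing a sequence of protocol
# records. GENESIS is a placeholder digest for the first link.
GENESIS = b"\x00" * 32

def extend_chain(chain, payload):
    """Append a record that commits to the previous entry's digest."""
    prev = chain[-1]["digest"] if chain else GENESIS
    digest = hashlib.sha256(prev + payload).digest()
    chain.append({"payload": payload, "digest": digest})

def verify_chain(chain):
    """Recompute every digest; any tampered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if hashlib.sha256(prev + entry["payload"]).digest() != entry["digest"]:
            return False
        prev = entry["digest"]
    return True
```

An auditor holding only the final digest can detect retroactive modification of any earlier record, which is the property the beacon relies on for traceability.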
Statistical learning under distribution shift is challenging when neither prior knowledge nor fully accessible data from the target distribution is available. Distributionally robust learning (DRL) aims to control the worst-case statistical performance within an uncertainty set of candidate distributions, but how to properly specify the set remains challenging. To enable distributional robustness without being overly conservative, in this paper, we propose a shape-constrained approach to DRL, which incorporates prior information about the way in which the unknown target distribution differs from its estimate. More specifically, we assume the unknown density ratio between the target distribution and its estimate is isotonic with respect to some partial order. At the population level, we provide a solution to the shape-constrained optimization problem that does not involve the isotonic constraint. At the sample level, we provide consistency results for an empirical estimator of the target in a range of different settings. Empirical studies on both synthetic and real data examples demonstrate the improved accuracy of the proposed shape-constrained approach.
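The isotonic constraint on the density ratio can be made concrete with the classical pool-adjacent-violators algorithm (PAVA), sketched here for a total order; the setting described above allows general partial orders, so this is only a simplified illustration of the monotone-projection step.

```python
def pava(y, w=None):
    """Least-squares projection of y onto nondecreasing sequences
    (pool adjacent violators), with optional positive weights w."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []  # each block: [weighted mean, total weight, count]
    for val, wt in zip(y, w):
        blocks.append([val, wt, 1])
        # merge backwards while the monotonicity constraint is violated
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            tot = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / tot, tot, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted
```

For example, projecting raw (order-violating) density-ratio estimates `[3, 1, 2]` onto the monotone cone yields the constant fit `[2, 2, 2]`.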
In decision-making, maxitive functions are used for worst-case and best-case evaluations. Maxitivity gives rise to a rich structure that is well-studied in the context of the pointwise order. In this article, we investigate maxitivity with respect to general preorders and provide a representation theorem for such functionals. The results are illustrated for different stochastic orders in the literature, including the usual stochastic order, the increasing convex/concave order, and the dispersive order.
We consider linear models with scalar responses and covariates from a separable Hilbert space. The aim is to detect change points in the error distribution, based on sequential residual empirical distribution functions. Expansions for those estimated functions are more challenging in models with infinite-dimensional covariates than in regression models with scalar or vector-valued covariates due to a slower rate of convergence of the parameter estimators. Yet the suggested change point test is asymptotically distribution-free and consistent for one-change point alternatives. In the latter case we also show consistency of a change point estimator.
In the present work, strong approximation errors are analyzed for both the spatial semi-discretization and the spatio-temporal full discretization of stochastic wave equations (SWEs) with cubic polynomial nonlinearities and additive noises. The full discretization is achieved by the standard Galerkin finite element method in space and a novel exponential time integrator combined with the averaged vector field approach. The newly proposed scheme is proved to exactly satisfy a trace formula based on an energy functional. Recovering the convergence rates of the scheme, however, meets essential difficulties, due to the lack of the global monotonicity condition. To overcome this issue, we derive the exponential integrability property of the considered numerical approximations via the energy functional. Armed with these properties, we obtain the strong convergence rates of the approximations in both the spatial and temporal directions. Finally, numerical results are presented to verify the theoretical findings.
Eigenvalue transformations, which include solving time-dependent differential equations as a special case, have a wide range of applications in scientific and engineering computation. While quantum algorithms for singular value transformations are well studied, eigenvalue transformations are distinct, especially for non-normal matrices. We propose an efficient quantum algorithm for performing a class of eigenvalue transformations that can be expressed as a certain type of matrix Laplace transformation. This allows us to significantly extend the recently developed linear combination of Hamiltonian simulation (LCHS) method [An, Liu, Lin, Phys. Rev. Lett. 131, 150603, 2023; An, Childs, Lin, arXiv:2312.03916] to represent a wider class of eigenvalue transformations, such as powers of the matrix inverse, $A^{-k}$, and the exponential of the matrix inverse, $e^{-A^{-1}}$. The latter can be interpreted as the solution of a mass-matrix differential equation of the form $A u'(t)=-u(t)$. We demonstrate that our eigenvalue transformation approach can solve this problem without explicitly inverting $A$, reducing the computational complexity.
This study presents a scalable Bayesian estimation algorithm for sparse estimation in exploratory item factor analysis based on a classical Bayesian estimation method, namely Bayesian joint modal estimation (BJME). BJME estimates the model parameters and factor scores that maximize the complete-data joint posterior density. Simulation studies show that the proposed algorithm has high computational efficiency and accuracy in variable selection over latent factors and in the recovery of the model parameters. Moreover, we conducted a real data analysis using large-scale data from a psychological assessment that targeted the Big Five personality traits. The results indicate that the proposed algorithm achieves computationally efficient parameter estimation and extracts an interpretable factor loading structure.
We propose a novel, highly efficient, second-order accurate, long-time unconditionally stable numerical scheme for a class of finite-dimensional nonlinear models that are of importance in geophysical fluid dynamics. The scheme is highly efficient in the sense that only a (fixed) symmetric positive definite linear problem (with varying right-hand sides) is involved at each time-step. The solutions to the scheme are uniformly bounded for all time. We show that the scheme is able to capture the long-time dynamics of the underlying geophysical model, with the global attractors and invariant measures of the scheme converging to those of the original model as the step size approaches zero. In our numerical experiments, we take an indirect approach, using long-term statistics to approximate the invariant measures. Our results suggest that the convergence rate of the long-term statistics, as a function of terminal time, is approximately first order under the Jensen-Shannon metric and half order under the L1 metric. This implies that very long time simulation is needed in order to capture a few significant digits of the long-time statistics (climate) correctly. Nevertheless, the second-order scheme's performance remains superior to that of the first-order one, requiring significantly less time to reach a small neighborhood of statistical equilibrium for a given step size.
We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.