The discrete distribution of the length of longest increasing subsequences in random permutations of order $n$ is deeply related to random matrix theory. In a seminal work, Baik, Deift and Johansson gave an asymptotics in terms of the distribution of the largest eigenvalue of GUE in the large-matrix limit. As a numerical approximation, however, this asymptotics is inaccurate for small lengths and has a slow convergence rate, conjectured to be just of order $n^{-1/3}$. Here, we suggest a different type of approximation, based on Hayman's generalization of Stirling's formula. Such a formula already gives a couple of correct digits of the length distribution for $n$ as small as $20$, but allows numerical evaluations, with a uniform error of apparent order $n^{-3/4}$, for $n$ as large as $10^{12}$; thus closing the gap between a table of exact values (recently compiled for up to $n=1000$) and the random matrix limit. Being much more efficient and accurate than Monte Carlo simulations for larger $n$, the Stirling-type formula allows for a precise numerical understanding of the first few finite-size correction terms to the random matrix limit, a study recently initiated by Forrester and Mays, who visualized the form of the first such term. We also display the second one, of order $n^{-2/3}$, and derive (heuristically) expansions of the expected value and variance of the length, exhibiting several more terms than previously known.
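For reference, the statistic in question — the length of a longest increasing subsequence — can be computed exactly for any given permutation in $O(n \log n)$ time by patience sorting; this is also the routine underlying Monte Carlo sampling of its distribution. A minimal sketch:

```python
import bisect

def lis_length(perm):
    """Length of a longest increasing subsequence, via patience sorting.

    piles[i] holds the smallest possible tail of an increasing
    subsequence of length i + 1; each element either extends the
    longest pile or tightens an existing one.
    """
    piles = []
    for x in perm:
        i = bisect.bisect_left(piles, x)
        if i == len(piles):
            piles.append(x)
        else:
            piles[i] = x
    return len(piles)
```

For example, `lis_length([4, 1, 3, 2, 5])` returns 3 (e.g., the subsequence 1, 2, 5).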
We propose a monitoring strategy for efficient and robust estimation of disease prevalence and case numbers within closed and enumerated populations such as schools, workplaces, or retirement communities. The proposed design relies largely on voluntary testing, which is notoriously biased (e.g., in the case of COVID-19) due to non-representative sampling. The approach yields unbiased and comparatively precise estimates with no assumptions about the factors driving selection of individuals for voluntary testing, building on the strength of what can be a small random sampling component. This component unlocks a previously proposed "anchor stream" estimator, a well-calibrated alternative to classical capture-recapture (CRC) estimators based on two data streams. We show here that this estimator is equivalent to a direct standardization based on "capture", i.e., selection (or not) by the voluntary testing program, made possible by a key parameter identified by design. This equivalence simultaneously allows for novel two-stream CRC-like estimation of general means (e.g., of continuous variables such as antibody levels or biomarkers). For inference, we propose adaptations of a Bayesian credible interval when estimating case counts and bootstrapping when estimating means of continuous variables. We use simulations to demonstrate significant precision benefits relative to random sampling alone.
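For orientation, the classical two-stream CRC baseline referenced above is the Lincoln-Petersen estimator, valid when at least one stream is a representative sample — exactly the assumption that purely voluntary testing violates, and that the anchor-stream design restores through its random sampling component. A toy computation with hypothetical counts (the anchor-stream estimator itself is not reproduced here):

```python
# Classical Lincoln-Petersen capture-recapture estimate of total case count.
# Counts are hypothetical: n1, n2 = cases detected by each stream,
# m = cases detected by both (the "recaptures").
n1, n2, m = 120, 80, 24
N_hat = n1 * n2 / m  # estimated total number of cases in the population
```

With these counts the estimate is 400 cases; bias arises precisely when capture by one stream is correlated with capture by the other, which is the failure mode voluntary testing induces.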
In a mixed generalized linear model, the objective is to learn multiple signals from unlabeled observations: each sample comes from exactly one signal, but it is not known which one. We consider the prototypical problem of estimating two statistically independent signals in a mixed generalized linear model with Gaussian covariates. Spectral methods are a popular class of estimators which output the top two eigenvectors of a suitable data-dependent matrix. However, despite their wide applicability, their design is still obtained via heuristic considerations, and the number of samples $n$ needed to guarantee recovery is super-linear in the signal dimension $d$. In this paper, we develop exact asymptotics for spectral methods in the challenging proportional regime in which $n$ and $d$ grow large with their ratio converging to a finite constant. This allows us to optimize the design of the spectral method and to combine it with a simple linear estimator, in order to minimize the estimation error. Our characterization exploits a mix of tools from random matrices, free probability and the theory of approximate message passing algorithms. Numerical simulations for mixed linear regression and phase retrieval display the advantage enabled by our analysis over existing designs of spectral methods.
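As an illustration of the kind of spectral method analyzed here (a generic textbook construction, not the paper's optimized design): for mixed linear regression, one can take the top two eigenvectors of the weighted second-moment matrix $D = \frac{1}{n}\sum_i y_i^2 x_i x_i^\top$, whose leading eigenspace approximates the span of the two signals.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 50, 5000
b1 = np.zeros(d); b1[0] = 1.0          # two orthogonal unit-norm signals
b2 = np.zeros(d); b2[1] = 1.0
X = rng.standard_normal((n, d))        # Gaussian covariates
labels = rng.integers(0, 2, n)         # latent component of each sample
B = np.where(labels[:, None] == 0, b1, b2)
y = np.einsum('ij,ij->i', X, B) + 0.1 * rng.standard_normal(n)

# Spectral matrix D = (1/n) sum_i y_i^2 x_i x_i^T; its top-2 eigenspace
# approximates span{b1, b2}.
D = (X * (y ** 2)[:, None]).T @ X / n
eigvals, eigvecs = np.linalg.eigh(D)   # ascending eigenvalue order
V = eigvecs[:, -2:]                    # top two eigenvectors
align = np.linalg.norm(V.T @ b1)       # norm of b1's projection onto the span
```

With $n/d = 100$ the alignment is close to 1; the proportional regime studied in the paper is the much harder setting where $n/d$ stays bounded.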
The minimum number of inputs needed to control a network is frequently used to quantify its controllability. Control of linear dynamics through a minimum set of inputs, however, often has prohibitively large energy requirements, and there is an inherent trade-off between minimizing the number of inputs and control energy. To better understand this trade-off, we study the problem of identifying a minimum set of input nodes such that controllability is ensured while the length of the longest control chain is restricted. The longest control chain is the maximum distance from the input nodes to any network node, and recent work found that reducing its length significantly reduces control energy. We map the longest-control-chain-constrained minimum input problem to finding a joint maximum matching and minimum dominating set. We show that this graph combinatorial problem is NP-complete, and we introduce and validate a heuristic approximation. Applying this algorithm to a collection of real and model networks, we investigate how network structure affects the minimum number of inputs, revealing, for example, that for many real networks reducing the longest control chain requires few or no additional inputs, only a rearrangement of the input nodes.
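For context, in the unconstrained baseline (the matching-based framework of structural controllability), the minimum number of inputs equals $\max(N - |M^*|, 1)$, where $|M^*|$ is the size of a maximum matching in the bipartite representation of the network; the paper's contribution is the harder variant with the additional longest-control-chain constraint. A sketch of the baseline computation on a toy network:

```python
def max_matching(adj, n):
    """Maximum bipartite matching via augmenting paths (Kuhn's algorithm).

    adj: dict mapping a left node u to the right nodes it can match;
    here, node u's "out-copy" is linked to the "in-copy" of each
    successor v for every directed edge (u, v).
    """
    match = {}  # right node -> matched left node

    def try_augment(u, seen):
        for v in adj.get(u, []):
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_augment(match[v], seen):
                match[v] = u
                return True
        return False

    return sum(try_augment(u, set()) for u in range(n))

# Toy 5-node directed network as an edge list.
edges = [(0, 1), (1, 2), (2, 3), (0, 4)]
n = 5
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)

m = max_matching(adj, n)
min_inputs = max(1, n - m)  # unconstrained minimum input count
```

Here the maximum matching has size 3, so 2 inputs suffice without any chain-length constraint; the constrained problem studied in the paper may require more.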
Collective decision-making is vital in modern information and communications technologies. In our previous research, we mathematically derived conflict-free joint decision-making that optimally satisfies players' probabilistic preference profiles. However, this optimal joint decision-making method has two problems. First, as the number of choices increases, the computational cost of calculating the optimal joint selection probability matrix explodes. Second, to derive the optimal joint selection probability matrix, all players must disclose their probabilistic preferences. Notably, explicit calculation of the joint probability distribution is not actually needed; what collective decisions require is sampling. This study examines several sampling methods that converge to heuristic joint selection probability matrices satisfying players' preferences, and shows that they can significantly mitigate the above problems of computational cost and confidentiality. We analyze the probability distribution to which each sampling method converges, as well as the computational cost required and the confidentiality secured. In particular, we introduce two conflict-free joint sampling methods based on quantum interference of photons. The first system allows the players to hide their choices while satisfying their preferences almost perfectly when those preferences coincide. The second system, in which the physical nature of light replaces an expensive computation, also conceals the players' choices, under the assumption of a trusted third party. This paper has been published in Phys. Rev. Applied 18, 064018 (2022) (DOI: 10.1103/PhysRevApplied.18.064018).
Mathematical models are essential for understanding and making predictions about systems arising in nature and engineering. Yet, mathematical models are simplifications of the true phenomena, making their predictions subject to uncertainty. Hence, the ability to quantify uncertainties is essential to any modelling framework, enabling the user to assess the influence of individual parameters on quantities of interest and to control the quality of the model output through a rigorous understanding of uncertainty. Peridynamic models are a particular class of mathematical models that have proven to be remarkably accurate and robust for a large class of material failure problems. However, the high computational expense of peridynamic models remains a major limitation, hindering outer-loop applications that require a large number of simulations, for example, uncertainty quantification. This contribution provides a framework to make such computations feasible. By employing a Multilevel Monte Carlo (MLMC) framework, in which the majority of simulations are performed on a coarse mesh and only relatively few on a fine mesh, a significant reduction in computational cost can be realised while statistics of structural failure are estimated. The results show a 16x speed-up over a standard Monte Carlo estimator, enabling the forward propagation of uncertain parameters in a computationally expensive peridynamic model. Furthermore, the multilevel method provides an estimate of both the discretisation error and the sampling error, thus improving the confidence in numerical predictions. The performance of the approach is demonstrated through an examination of the statistical size effect in quasi-brittle materials.
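The MLMC idea referred to can be sketched on a toy problem: estimate $\mathbb{E}[P]$ via the telescoping sum $\mathbb{E}[P_L] = \mathbb{E}[P_0] + \sum_{\ell=1}^{L}\mathbb{E}[P_\ell - P_{\ell-1}]$, spending many samples on the cheap coarse level and few on the expensive fine levels. Here $P_\ell$ is a quadrature of a random integrand at resolution $2^{\ell+2}$, a cheap stand-in for the peridynamic solver:

```python
import numpy as np

rng = np.random.default_rng(0)

def P(xi, level):
    """Level-l approximation of integral_0^1 exp(xi * x) dx per sample,
    using the midpoint rule with 2**(level + 2) cells."""
    m = 2 ** (level + 2)
    x = (np.arange(m) + 0.5) / m
    return np.exp(np.outer(xi, x)).mean(axis=1)

levels = 4
N = [4000, 1000, 250, 60]   # many coarse samples, few fine ones

est = 0.0
for l in range(levels):
    xi = rng.standard_normal(N[l])
    # Crucially, the same random inputs xi are used on both levels of a
    # correction term, so the variance of P_l - P_{l-1} is small.
    Yl = P(xi, l) if l == 0 else P(xi, l) - P(xi, l - 1)
    est += Yl.mean()
```

The exact value $\mathbb{E}[(e^{\xi}-1)/\xi] \approx 1.195$ for $\xi \sim N(0,1)$ is recovered at a fraction of the cost of running every sample on the finest grid, which is the same mechanism exploited with coarse and fine peridynamic meshes.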
The main purpose of this paper is to introduce random tensors with normal distribution, which generalize the matrix normal distribution to higher order. Some basic knowledge on tensors is introduced before we focus on random tensors whose entries follow a normal distribution. The random tensor with standard normal distribution (SND) is introduced as an extension of random normal matrices. The general normal random tensor is then defined as the multi-array obtained from an affine transformation of an SND tensor. We then investigate some equivalent definitions of a normal tensor and describe the density function, characteristic function, moments, and some other functions analogous to those of a random matrix. A general form of an even-order multi-variance tensor is also introduced to handle such random tensors. Finally, some equivalent definitions of the tensor normal distribution are described.
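As a purely illustrative sketch (the mode-wise transform below is one natural choice of affine map; the paper's general definition may differ): an SND tensor has iid standard normal entries, and a correlated normal tensor can be produced by multiplying it along each mode:

```python
import numpy as np

rng = np.random.default_rng(0)

# Order-3 tensor with iid N(0, 1) entries: the SND tensor.
T = rng.standard_normal((3, 4, 5))

# One illustrative affine transform: multiply along each mode by a
# square factor matrix (Tucker-style), inducing mode-wise correlations.
A = [rng.standard_normal((3, 3)),
     rng.standard_normal((4, 4)),
     rng.standard_normal((5, 5))]
Y = np.einsum('ia,jb,kc,abc->ijk', A[0], A[1], A[2], T)
```

Adding a constant mean tensor to `Y` would complete the affine map; the resulting covariance structure is an even-order (here order-6) multi-variance tensor of Kronecker type.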
Results on the spectral behavior of random matrices as the dimension increases are applied to the problem of detecting the number of sources impinging on an array of sensors. A common strategy for solving this problem is to estimate the multiplicity of the smallest eigenvalue of the spatial covariance matrix $R$ of the sensed data from the sample covariance matrix $\widehat{R}$. Existing approaches, such as those based on information-theoretic criteria, rely on the closeness of the noise eigenvalues of $\widehat R$ to each other; therefore, the sample size has to be quite large when the number of sources is large in order to obtain a good estimate. The analysis presented in this report focuses on the splitting of the spectrum of $\widehat{R}$ into noise and signal eigenvalues. It is shown that, when the number of sensors is large, the number of signals can be estimated with a sample size considerably smaller than that required by previous approaches. The practical significance of the main result is that detection can be achieved with a number of samples comparable to the number of sensors in large dimensional array processing.
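The spectrum-splitting idea can be illustrated numerically: with $p$ sensors and $n$ snapshots, the noise eigenvalues of $\widehat R$ concentrate below the Marchenko-Pastur upper edge $\sigma^2(1+\sqrt{p/n})^2$, while sufficiently strong signal eigenvalues separate from the bulk, so counting eigenvalues above the edge estimates the number of sources (a simplified threshold rule, not the report's exact procedure):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, k = 100, 500, 3          # sensors, snapshots, true number of sources
sigma2 = 1.0                    # noise power

# k sources with power 5 along orthonormal steering directions.
Q, _ = np.linalg.qr(rng.standard_normal((p, k)))
A = np.sqrt(5.0) * Q
S = rng.standard_normal((k, n))                 # source signals
X = A @ S + rng.standard_normal((p, n))         # noisy snapshots

R_hat = X @ X.T / n                             # sample covariance
w = np.sort(np.linalg.eigvalsh(R_hat))[::-1]    # eigenvalues, descending

# Marchenko-Pastur upper edge of the pure-noise spectrum; eigenvalues
# clearly above it are attributed to signals (10% safety margin).
edge = sigma2 * (1 + np.sqrt(p / n)) ** 2
k_hat = int(np.sum(w > 1.1 * edge))
```

Here $n = 5p$ snapshots suffice even though classical criteria would demand far more; note that this is only valid for sources above the detectability threshold.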
Autonomous Micro Aerial Vehicles are deployed for a variety of tasks including surveillance and monitoring. Perching and staring allow the vehicle to monitor targets without flying, saving battery power and increasing the overall mission time without the need to frequently replace batteries. This paper addresses the Active Visual Perching (AVP) control problem to autonomously perch on inclined surfaces up to $90^\circ$. Our approach generates dynamically feasible trajectories to navigate and perch on a desired target location, while taking into account actuator and Field of View (FoV) constraints. By replanning in mid-flight, we take advantage of more accurate target localization, increasing the perching maneuver's robustness to target localization or control errors. We leverage the Karush-Kuhn-Tucker (KKT) conditions to identify the compatibility between planning objectives and the visual sensing constraint during the planned maneuver. Furthermore, we experimentally identify the corresponding boundary conditions that maximize the spatio-temporal target visibility during the perching maneuver. The proposed approach runs on board in real time under significant computational constraints, relying exclusively on cameras and an Inertial Measurement Unit (IMU). Experimental results validate the proposed approach and show a higher success rate as well as increased target interception precision and accuracy with respect to a one-shot planning approach, while still retaining aggressive capabilities with flight envelopes that include large excursions from the hover position on inclined surfaces up to 90$^\circ$, angular speeds up to 750~deg/s, and accelerations up to 10~m/s$^2$.
Explicit time integration schemes coupled with Galerkin discretizations of time-dependent partial differential equations require solving a linear system with the mass matrix at each time step. For applications in structural dynamics, the solution of the linear system is frequently approximated through so-called mass lumping, which consists in replacing the mass matrix by some diagonal approximation. Mass lumping has been widely used in engineering practice for decades and has a sound mathematical theory supporting it for finite element methods using the classical Lagrange basis. However, the theory for more general basis functions is still missing, and our paper partly addresses this shortcoming. We prove some special and practically relevant properties of lumped mass matrices and discuss how these properties naturally extend to banded and Kronecker product matrices, whose structure allows linear systems to be solved very efficiently. Our theoretical results are applied to isogeometric discretizations but are not restricted to them.
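Row-sum lumping — the most common diagonal approximation — can be shown in a few lines for 1D linear Lagrange elements on a uniform mesh; note that it preserves the total mass (the sum of all entries):

```python
import numpy as np

# Consistent mass matrix for 1D P1 elements on a uniform mesh of [0, 1]
# with n interior nodes: tridiagonal (h/6) * [1, 4, 1].
n = 5
h = 1.0 / (n + 1)
M = h / 6 * (4 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1))

# Row-sum mass lumping: replace M by the diagonal of its row sums, so
# the linear system at each explicit time step is solved by a division.
M_lumped = np.diag(M.sum(axis=1))
```

Every interior diagonal entry of `M_lumped` equals $h$, the mass of one mesh cell; the same recipe applied to banded or Kronecker product mass matrices is what makes the structures discussed in the paper attractive.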
Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is, counterintuitively, to train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.
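The compression techniques mentioned can be sketched with magnitude pruning, which zeros out the smallest-magnitude weights (a toy random matrix stands in for trained Transformer weights):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256)).astype(np.float32)

def prune(W, sparsity):
    """Magnitude pruning: zero out the given fraction of the
    smallest-magnitude entries of W."""
    k = int(W.size * sparsity)
    thresh = np.partition(np.abs(W).ravel(), k)[k]
    return np.where(np.abs(W) < thresh, 0.0, W)

W_pruned = prune(W, 0.9)                 # keep only the top 10% of weights
sparsity_achieved = (W_pruned == 0).mean()
```

The paper's observation is that a large model tolerates aggressive settings of `sparsity` (or low-bit quantization) with little accuracy loss, so the compressed large model can dominate a small model of the same inference cost.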