This paper develops a notion of geometric quantiles on Hadamard spaces, also known as global non-positive curvature spaces. After providing some definitions and basic properties, including scaled isometry equivariance and a necessary condition on the gradient of the quantile loss function at quantiles on Hadamard manifolds, we investigate asymptotic properties of sample quantiles on Hadamard manifolds, such as strong consistency and joint asymptotic normality. We provide a detailed description of how to compute quantiles using a gradient descent algorithm in hyperbolic space and, in particular, an explicit formula for the gradient of the quantile loss function, along with experiments using simulated and real single-cell RNA sequencing data.
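A minimal sketch of the gradient descent in the Euclidean special case (Euclidean space is a flat Hadamard space), using Chaudhuri-style geometric quantiles indexed by a direction vector u with ||u|| < 1; the step size, iteration count, and stopping rule are illustrative choices, not the paper's:

```python
import numpy as np

def geometric_quantile(X, u, lr=0.1, n_iter=500, tol=1e-8):
    """Euclidean geometric quantile in direction u (||u|| < 1) by
    gradient descent on the loss  sum_i ( ||x_i - q|| + <u, x_i - q> );
    u = 0 recovers the geometric median."""
    q = X.mean(axis=0)                           # initialize at the mean
    for _ in range(n_iter):
        diff = X - q                             # (n, d) residuals
        norms = np.maximum(np.linalg.norm(diff, axis=1, keepdims=True), 1e-12)
        grad = -(diff / norms).mean(axis=0) - u  # gradient of the averaged loss
        q = q - lr * grad
        if np.linalg.norm(lr * grad) < tol:
            break
    return q

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
med = geometric_quantile(X, np.zeros(2))               # geometric median
q_right = geometric_quantile(X, np.array([0.5, 0.0]))  # shifted toward +x
```

On a Hadamard manifold, the Euclidean differences above would be replaced by Riemannian logarithms, which is where the paper's explicit gradient formula for hyperbolic space enters.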
We study a colored generalization of the famous simple-switch Markov chain for sampling the set of graphs with a fixed degree sequence. Here we consider the space of graphs with colored vertices, in which we fix the degree sequence and another statistic arising from the vertex coloring, and prove that the set can be connected with simple color-preserving switches or moves. These moves form a basis for defining an irreducible Markov chain necessary for testing statistical model fit to block-partitioned network data. Our methods further generalize well-known algebraic results from the 1990s: namely, that the corresponding moves can be used to construct a regular triangulation for a generalization of the second hypersimplex. On the other hand, in contrast to the monochromatic case, we show that for simple graphs, the 1-norm of the moves necessary to connect the space increases with the number of colors.
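A minimal sketch of one monochromatic simple-switch move (the classical chain this work generalizes; the color-preserving statistic is not modeled here), on a simple graph stored as a set of frozenset edges:

```python
import random

def switch_step(edges, rng=random.Random(0)):
    """One simple switch: pick two edges {a,b}, {c,d} on four distinct
    vertices and replace them by {a,d}, {c,b} when this keeps the graph
    simple.  Every vertex keeps its degree, so the chain stays inside
    the set of graphs with the fixed degree sequence."""
    (a, b), (c, d) = rng.sample(list(edges), 2)
    if len({a, b, c, d}) < 4:
        return edges                              # shared endpoint: stay put
    new1, new2 = frozenset((a, d)), frozenset((c, b))
    if new1 in edges or new2 in edges:
        return edges                              # would create a multi-edge
    return (edges - {frozenset((a, b)), frozenset((c, d))}) | {new1, new2}

# run a short chain from a 6-cycle; the degree sequence is invariant
edges = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]}
for _ in range(200):
    edges = switch_step(edges)
degrees = {v: sum(v in e for e in edges) for v in range(6)}
```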
We give a fully polynomial-time randomized approximation scheme (FPRAS) for two-terminal reliability in directed acyclic graphs (DAGs). In contrast, we also show that the complementary problem of approximating two-terminal unreliability in DAGs is #BIS-hard.
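For intuition, two-terminal reliability can be computed exactly on tiny instances by brute-force enumeration of edge-failure patterns; this exponential-time baseline (a hypothetical helper, not the paper's FPRAS) is exactly what an approximation scheme is designed to avoid:

```python
from itertools import product

def two_terminal_reliability(edges, p, s, t):
    """Exact two-terminal reliability of a small digraph by enumerating
    all 2^m edge-failure patterns: edge e survives independently with
    probability p[e]; returns P(a surviving directed s->t path exists).
    Exponential in the number of edges m."""
    total = 0.0
    for pattern in product([0, 1], repeat=len(edges)):
        prob, alive = 1.0, []
        for keep, e in zip(pattern, edges):
            prob *= p[e] if keep else 1.0 - p[e]
            if keep:
                alive.append(e)
        # DFS reachability over the surviving edges
        stack, seen = [s], {s}
        while stack:
            u = stack.pop()
            for (x, y) in alive:
                if x == u and y not in seen:
                    seen.add(y)
                    stack.append(y)
        if t in seen:
            total += prob
    return total

# two disjoint s->t paths, every edge present with probability 0.5
edges = [("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")]
p = {e: 0.5 for e in edges}
r = two_terminal_reliability(edges, p, "s", "t")  # 1 - (1 - 0.25)**2 = 0.4375
```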
In this paper, we study the effectiveness of Neural Network (NN) techniques for deconvolution inverse problems. We consider the asymptotic limits of NNs, corresponding to Gaussian Processes (GPs), in which parameter non-linearities are lost. Using these resulting GPs, we address the deconvolution inverse problem for a quantum harmonic oscillator simulated through Monte Carlo techniques on a lattice, a scenario with a known analytical solution. Our findings indicate that solving the deconvolution inverse problem with a fully connected NN yields worse results than those obtained using the GPs derived from the NNs' asymptotic limits. Furthermore, we observe that the accuracy of the trained NNs approaches that of the GPs as the layer width increases. Notably, one of these GPs defies interpretation as a probabilistic model, offering a novel perspective compared to established methods in the literature. Additionally, the NNs, in their asymptotic limit, provide cost-effective analytical solutions.
We investigate a linearised Calder\'on problem in a two-dimensional bounded simply connected $C^{1,\alpha}$ domain $\Omega$. After extending the linearised problem for $L^2(\Omega)$ perturbations, we orthogonally decompose $L^2(\Omega) = \oplus_{k=0}^\infty \mathcal{H}_k$ and prove Lipschitz stability on each of the infinite-dimensional $\mathcal{H}_k$ subspaces. In particular, $\mathcal{H}_0$ is the space of square-integrable harmonic perturbations. This appears to be the first Lipschitz stability result for infinite-dimensional spaces of perturbations in the context of the (linearised) Calder\'on problem. Previous optimal estimates with respect to the operator norm of the data map have been of logarithmic type in infinite-dimensional settings. This remarkable improvement is enabled by using the Hilbert-Schmidt norm for the Neumann-to-Dirichlet boundary map and its Fr\'echet derivative with respect to the conductivity coefficient. We also derive a direct reconstruction method that inductively yields the orthogonal projections of a general $L^2(\Omega)$ perturbation onto the $\mathcal{H}_k$ spaces, hence reconstructing any $L^2(\Omega)$ perturbation.
This paper is concerned with the problem of sampling and interpolation involving derivatives in shift-invariant spaces and the error analysis of the derivative sampling expansions for broad classes of functions. A new type of polynomials based on derivative samples is introduced, which differs from the Euler-Frobenius polynomials for multiplicity $r>1$. A complete characterization of uniform sampling with derivatives is given using Laurent operators. The rate of approximation of a signal (not necessarily continuous) by the derivative sampling expansions in shift-invariant spaces generated by compactly supported functions is established in terms of the $L^p$-average modulus of smoothness. Finally, several typical examples illustrating the various problems are discussed in detail.
We present an information-theoretic lower bound for the problem of parameter estimation with time-uniform coverage guarantees. Via a new reduction to sequential testing, we obtain stronger lower bounds that capture the hardness of the time-uniform setting. In the case of location model estimation, logistic regression, and exponential family models, our $\Omega(\sqrt{n^{-1}\log \log n})$ lower bound is sharp to within constant factors in typical settings.
The exponential growth in scientific publications poses a severe challenge for human researchers. It forces attention to narrower sub-fields, making it difficult to discover new impactful research ideas and collaborations outside one's own field. While there are methods to predict a scientific paper's future citation count, they require the research to be finished and the paper written, so they usually assess impact long after the idea was conceived. Here we show how to predict the impact of nascent research ideas that have never been published. For this, we developed a large evolving knowledge graph built from more than 21 million scientific papers. It combines a semantic network created from the content of the papers and an impact network created from their historic citations. Using machine learning, we can predict the dynamics of the evolving network into the future with high accuracy, and thereby the impact of new research directions. We envision that the ability to predict the impact of new ideas will be a crucial component of future artificial muses that can inspire new impactful and interesting scientific ideas.
In approximation of functions based on point values, least-squares methods provide more stability than interpolation, at the expense of increasing the sampling budget. We show that near-optimal approximation error can nevertheless be achieved, in an expected $L^2$ sense, as soon as the sample size $m$ is larger than the dimension $n$ of the approximation space by a constant ratio. On the other hand, for $m=n$, we obtain an interpolation strategy with a stability factor of order $n$. The proposed sampling algorithms are greedy procedures based on arXiv:0808.0163 and arXiv:1508.03261, with polynomial computational complexity.
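As a toy illustration of the oversampling regime (here m = 2n i.i.d. uniform points rather than the paper's greedy, weighted selection; the function and sizes below are illustrative), a least-squares fit in a Legendre polynomial space:

```python
import numpy as np

def ls_approx(f, n, m, rng=np.random.default_rng(0)):
    """Least-squares fit of f in the span of the first n Legendre
    polynomials from m >= n random points in [-1, 1].  With m a
    constant multiple of n, the fit is stable where interpolation
    (m = n) at the same points may not be."""
    x = rng.uniform(-1.0, 1.0, size=m)
    V = np.polynomial.legendre.legvander(x, n - 1)   # (m, n) design matrix
    coef, *_ = np.linalg.lstsq(V, f(x), rcond=None)
    return np.polynomial.legendre.Legendre(coef)

# smooth target: the error decays rapidly with the space dimension n
f = lambda x: np.exp(x)
p = ls_approx(f, n=8, m=16)
grid = np.linspace(-1.0, 1.0, 400)
err = np.max(np.abs(f(grid) - p(grid)))
```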
We study the modeling and forecasting of high-dimensional functional time series (HDFTS), which can be cross-sectionally correlated and temporally dependent. We introduce a decomposition of the HDFTS into two distinct components: a deterministic component and a residual component that varies over time. The decomposition is derived through the estimation of a two-way functional analysis of variance. A functional time series forecasting method, based on functional principal component analysis, is implemented to produce forecasts for the residual component. By combining the forecasts of the residual component with the deterministic component, we obtain forecast curves for multiple populations. We apply the model to age- and sex-specific mortality rates in the United States, France, and Japan, which comprise 51 states, 95 departments, and 47 prefectures, respectively. The proposed method delivers more accurate point and interval forecasts of multi-population mortality than the benchmark methods considered.
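A minimal discretized stand-in for the forecasting pipeline above (principal component scores extracted by SVD, each score series forecast with a simple drift model; the paper's two-way ANOVA decomposition and functional time series models are richer):

```python
import numpy as np

def fpc_forecast(Y, n_components=2, h=1):
    """h-step-ahead forecast of a curve time series Y (T x grid):
    center the curves, extract principal component scores via SVD,
    extrapolate each score series with its average increment, and
    reconstruct the forecast curve."""
    mu = Y.mean(axis=0)                               # mean curve
    U, s, Vt = np.linalg.svd(Y - mu, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # (T, k) score series
    basis = Vt[:n_components]                         # (k, grid) components
    drift = np.diff(scores, axis=0).mean(axis=0)      # average increment
    fc_scores = scores[-1] + h * drift
    return mu + fc_scores @ basis

# toy data: curves drifting linearly in time are forecast exactly
grid = np.linspace(0.0, 1.0, 20)
base = np.sin(2 * np.pi * grid)
Y = np.outer(np.arange(10.0), base)
fc = fpc_forecast(Y)
```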
Hashing has been widely used in approximate nearest neighbor search for large-scale database retrieval because of its computational and storage efficiency. Deep hashing, which devises convolutional neural network architectures to extract the semantic information or features of images, has received increasing attention recently. In this survey, several deep supervised hashing methods for image retrieval are evaluated, and I identify three main directions for deep supervised hashing methods, with comments at the end. Moreover, to break through the bottleneck of existing hashing methods, I propose a Shadow Recurrent Hashing (SRH) method as an attempt. Specifically, I devise a CNN architecture to extract the semantic features of images and design a loss function that encourages similar images to be projected close to one another. To this end, I propose a concept: the shadow of the CNN output. During optimization, the CNN output and its shadow guide each other so as to approach the optimal solution as closely as possible. Several experiments on the CIFAR-10 dataset show the satisfactory performance of SRH.
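A generic pairwise similarity-preserving objective of the kind such loss functions build on (a contrastive sketch on relaxed real-valued codes, not the actual SRH loss; the margin is an illustrative hyper-parameter):

```python
import numpy as np

def pairwise_hash_loss(codes, sim, margin=4.0):
    """Contrastive loss on relaxed hash codes: pull codes of similar
    pairs (sim=1) together, push codes of dissimilar pairs (sim=0)
    at least `margin` apart in squared distance."""
    d2 = ((codes[:, None, :] - codes[None, :, :]) ** 2).sum(-1)
    loss = sim * d2 + (1 - sim) * np.maximum(0.0, margin - d2)
    iu = np.triu_indices(len(codes), k=1)     # count each pair once
    return loss[iu].mean()

# items 0 and 1 are similar, item 2 is dissimilar to both
sim = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1]], dtype=float)
good = np.array([[1.0, 1.0], [1.0, 1.0], [-1.0, -1.0]])   # respects sim
bad = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0]])    # violates sim
loss_good = pairwise_hash_loss(good, sim)
loss_bad = pairwise_hash_loss(bad, sim)
```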