In this paper, we analyze and interpret the quantization problem related to the $C_{2,8}$ channel by determining the Fuchsian differential equations, the generators of the Fuchsian groups, and the tessellations associated with the hyperbolic cases $g=2$ and $g=3$. To obtain these results, we first determine the genus $g$ of each surface on which this channel may be embedded. We then determine the algebraic structure (the Fuchsian group generators) associated with the fundamental region of each surface. To achieve this goal, we devise an associated linear second-order Fuchsian differential equation whose linearly independent solutions provide the generators of this Fuchsian group. In addition, we identify the tessellations associated with each analyzed case. These structures are identified in four situations, divided into two cases ($g=2$ and $g=3$), thereby yielding both algebraic and geometric characterizations associated with quantizing the $C_{2,8}$ channel.
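For reference, and not specific to the construction in the paper, a linear second-order equation
\[
\frac{d^2 y}{dz^2} + p(z)\,\frac{dy}{dz} + q(z)\,y = 0
\]
is called Fuchsian when all of its singular points, including $z=\infty$, are regular singular points, i.e., $p$ has at most simple poles and $q$ at most double poles there; the hypergeometric equation, with regular singular points at $0$, $1$, and $\infty$, is the canonical example. In the classical uniformization picture, the monodromy of such an equation acts on the ratio of two linearly independent solutions by Möbius transformations, and these transformations generate the associated Fuchsian group.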
In this paper we investigate some properties of the Fiedler vector, the so-called first non-trivial eigenvector of the Laplacian matrix of a graph. Important results relate the Fiedler vector to spectral cuts in graphs, but far less is known about its extreme values and the points at which they are attained. We propose a few results and conjectures in this direction. We also make two concrete contributions: i) we define a new measure for graphs that can be interpreted in terms of extremality (the inverse of centrality); ii) we apply a small perturbation to the Fiedler vector of cerebral shapes, such as the corpus callosum, to robustify their parameterization.
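As a minimal concrete illustration (not code from the paper), the Fiedler vector of a small graph can be computed directly from its Laplacian; the path graph below is an arbitrary example chosen only for illustration.

# Minimal sketch: Fiedler vector of a small graph via its Laplacian (NumPy only).
import numpy as np

# Adjacency matrix of a path graph on 5 vertices (illustrative example).
A = np.array([[0, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A          # combinatorial Laplacian L = D - A
eigvals, eigvecs = np.linalg.eigh(L)    # eigenpairs in ascending order
fiedler_value = eigvals[1]              # algebraic connectivity
fiedler_vector = eigvecs[:, 1]          # first non-trivial eigenvector
cut = fiedler_vector >= 0               # sign pattern gives the standard spectral bipartition
print(fiedler_value, cut)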
Given the exponential growth of the volume of a ball with respect to its radius, hyperbolic space can embed trees with arbitrarily small distortion and has therefore received wide attention for representing hierarchical datasets. However, this exponential growth property comes at the price of numerical instability: training hyperbolic learning models can lead to catastrophic NaN problems when unrepresentable values are encountered in floating point arithmetic. In this work, we carefully analyze the limitations of two popular models for hyperbolic space, namely the Poincar\'e ball and the Lorentz model. We first show that, under 64-bit floating-point arithmetic, the Poincar\'e ball has a relatively larger capacity than the Lorentz model for correctly representing points. We then theoretically validate the superiority of the Lorentz model over the Poincar\'e ball from the perspective of optimization. Given the numerical limitations of both models, we identify a Euclidean parametrization of hyperbolic space that alleviates these limitations. We further extend this Euclidean parametrization to hyperbolic hyperplanes and exhibit its ability to improve the performance of hyperbolic SVM.
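As a rough sketch of where the numerical issues arise (illustrative only, not the paper's analysis), here are the curvature $-1$ distance formulas for the two models together with the standard lift from the ball to the hyperboloid; points near the boundary of the ball, or with large Lorentz coordinates, make the intermediate terms cancel badly in float64.

# Sketch: hyperbolic distance in the Poincare ball vs. the Lorentz model (curvature -1).
import numpy as np

def poincare_dist(x, y):
    # d(x, y) = arccosh(1 + 2*|x - y|^2 / ((1 - |x|^2) * (1 - |y|^2)))
    num = 2.0 * np.sum((x - y) ** 2)
    den = (1.0 - np.sum(x ** 2)) * (1.0 - np.sum(y ** 2))
    return np.arccosh(1.0 + num / den)

def lorentz_dist(x, y):
    # Points satisfy <x, x>_L = -1 with x0 > 0; d(x, y) = arccosh(-<x, y>_L).
    minkowski = -x[0] * y[0] + np.dot(x[1:], y[1:])
    return np.arccosh(-minkowski)

def to_lorentz(p):
    # Lift a Poincare point to the hyperboloid: x = ((1 + |p|^2), 2p) / (1 - |p|^2).
    s = np.sum(p ** 2)
    return np.concatenate(([1.0 + s], 2.0 * p)) / (1.0 - s)

p, q = np.array([0.1, 0.2]), np.array([0.95, 0.0])   # q sits close to the ball boundary
print(poincare_dist(p, q), lorentz_dist(to_lorentz(p), to_lorentz(q)))  # agree up to rounding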
This paper is concerned with function reconstruction from samples. The sampling points used in existing approaches are either (1) structured points that come with fast algorithms or (2) unstructured points, e.g., from an initial random draw, which achieve an improved information complexity. We connect both approaches and propose a subsampling of structured points in an offline step. In particular, we start with structured quadrature points (QMC), which provide stable $L_2$ reconstruction properties. The subsampling procedure consists of a computationally inexpensive random step followed by a deterministic procedure that further reduces the number of points while preserving their information. Functions belonging to an RKHS of bounded functions are then sampled at these points and reconstructed from the samples, while achieving state-of-the-art error decay. Our method is dimension-independent and applicable as soon as some initial quadrature points are known. We apply our general findings on the $d$-dimensional torus to subsample rank-1 lattices, where it is known that full rank-1 lattices lose half the optimal order of convergence (expressed in terms of the size of the lattice). In contrast, our subsampled version regains the optimal rate, since many of the lattice points are not needed. Moreover, we utilize fast and memory-efficient Fourier algorithms to compute the approximation. Numerical experiments in several dimensions support our findings.
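A minimal sketch of the starting point and of the random half of the procedure (the deterministic pruning step and the reconstruction itself are omitted; the generating vector below is an arbitrary illustrative choice, not one from the paper):

# Sketch: rank-1 lattice nodes on the d-torus and a random subsampling step.
import numpy as np

def rank1_lattice(z, n):
    # Nodes x_j = (j * z / n) mod 1, j = 0, ..., n-1, for generating vector z.
    j = np.arange(n)[:, None]
    return (j * np.asarray(z)[None, :] / n) % 1.0

z = [1, 33]                                   # illustrative generating vector
n = 1024
X = rank1_lattice(z, n)

rng = np.random.default_rng(0)
m = 256                                       # number of nodes to keep
idx = rng.choice(n, size=m, replace=False)    # computationally inexpensive random step;
X_sub = X[idx]                                # the deterministic pruning is omitted here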
Learned inverse problem solvers exhibit remarkable performance in applications such as image reconstruction tasks. These data-driven reconstruction methods often follow a two-step scheme: first, the reconstruction scheme, often based on a neural network, is trained on a dataset; second, the trained scheme is applied to new measurements to obtain reconstructions. We follow these steps but parameterize the reconstruction scheme with invertible residual networks (iResNets). We demonstrate that the invertibility enables investigating the influence of training and architecture choices on the resulting reconstruction scheme. For example, assuming local approximation properties of the network, we show that these schemes become convergent regularizations. In addition, the investigations reveal a formal link to the linear regularization theory of linear inverse problems and provide a nonlinear spectral regularization for particular architecture classes. On the numerical side, we investigate the local approximation property of selected trained architectures and present a series of experiments on the MNIST dataset that underpin and extend our theoretical findings.
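As a toy sketch of the invertibility being exploited (assuming nothing about the paper's specific architectures), an invertible residual layer composes $x \mapsto x + f(x)$ with a contractive $f$ and is inverted by a fixed-point iteration:

# Sketch: an invertible residual layer y = x + f(x) with Lip(f) < 1,
# inverted by the Banach fixed-point iteration x_{k+1} = y - f(x_k).
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
W *= 0.9 / np.linalg.norm(W, 2)           # rescale so the residual map is contractive

def f(x):
    return np.tanh(W @ x)                 # Lipschitz constant <= ||W||_2 = 0.9 < 1

def forward(x):
    return x + f(x)

def inverse(y, iters=200):
    x = y.copy()
    for _ in range(iters):
        x = y - f(x)                      # converges geometrically at rate 0.9
    return x

x = rng.standard_normal(8)
y = forward(x)
print(np.max(np.abs(inverse(y) - x)))     # tiny, up to fixed-point tolerance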
As a computational alternative to Markov chain Monte Carlo approaches, variational inference (VI) is becoming increasingly popular for approximating intractable posterior distributions in large-scale Bayesian models, owing to its comparable efficacy and superior efficiency. Several recent works provide theoretical justification for VI by proving its statistical optimality for parameter estimation under various settings; formal analysis of the algorithmic convergence of VI, however, is still largely lacking. In this paper, we consider the common coordinate ascent variational inference (CAVI) algorithm for implementing mean-field (MF) VI, which optimizes a Kullback--Leibler divergence objective functional over the space of all factorized distributions. Focusing on the two-block case, we analyze the convergence of CAVI by leveraging the extensive toolbox of functional analysis and optimization. We provide general conditions for certifying global or local exponential convergence of CAVI. Specifically, we introduce a new notion of generalized correlation that characterizes the interaction between the constituent blocks in influencing the VI objective functional and, according to the theory, quantifies the algorithmic contraction rate of two-block CAVI. As illustrations, we apply the developed theory to a number of examples and derive explicit problem-dependent upper bounds on the algorithmic contraction rate.
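A minimal sketch of two-block CAVI on the classical bivariate Gaussian example (not an example from the paper) makes the role of correlation in the contraction rate visible:

# Sketch: two-block CAVI for mean-field VI on a 2-D Gaussian target.
import numpy as np

rho = 0.8                                      # correlation couples the two blocks
Lam = np.linalg.inv(np.array([[1.0, rho],
                              [rho, 1.0]]))    # precision matrix of the target
mu = np.array([0.0, 0.0])

m = np.array([5.0, -3.0])                      # variational means, arbitrary initialization
for it in range(20):
    m[0] = mu[0] - (Lam[0, 1] / Lam[0, 0]) * (m[1] - mu[1])   # closed-form update for block 1
    m[1] = mu[1] - (Lam[1, 0] / Lam[1, 1]) * (m[0] - mu[0])   # closed-form update for block 2
print(m)   # converges to mu geometrically, at a rate governed by rho**2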
At the end of the 19th century the logician C.S. Peirce coined the term "fallibilism" for "... the doctrine that our knowledge is never absolute but always swims, as it were, in a continuum of uncertainty and of indeterminacy". In terms of scientific practice, this means we are obliged to reexamine the assumptions, the evidence, and the arguments for conclusions that subsequent experience has cast into doubt. In this paper we examine an assumption that underpinned the development of the Internet architecture, namely that a loosely synchronous point-to-point datagram delivery service could adequately meet the needs of all network applications, including those that deliver content and services to a mass audience at global scale. We examine how the inability of the networking community to provide a public and affordable mechanism to support such asynchronous point-to-multipoint applications led to the development of private overlay infrastructure, namely CDNs and Cloud networks, whose architecture stands at odds with the Open Data Networking goals of the early Internet advocates. We argue that the tension between those initial goals and the monopolistic commercial imperatives of hypergiant overlay infrastructure operators is an important reason for the apparent contradiction posed by the negative impact of their most profitable applications (e.g., social media) and strategies (e.g., targeted advertising). We propose that, following the prescription of Peirce, we can only resolve this contradiction by reconsidering some of our deeply held assumptions.
We consider the problem of binary string reconstruction from the multiset of its substring compositions, referred to as the substring composition multiset, a problem first introduced and studied by Acharya et al. We introduce a new algorithm for reconstructing a string from its substring composition multiset that relies on the algebraic properties of an equivalent bivariate polynomial formulation of the problem. We then characterize specific algebraic conditions on the binary string to be reconstructed that guarantee the algorithm does not require any backtracking during the reconstruction and, consequently, that its time complexity is bounded polynomially. More specifically, in the case of no backtracking, our algorithm has a time complexity of $O(n^2)$, compared to the algorithm by Acharya et al., which has a time complexity of $O(n^2\log(n))$, where $n$ is the length of the binary string. Furthermore, we show that larger sets of binary strings are uniquely reconstructable by the new algorithm without the need for backtracking, leading to codebooks of reconstruction codes that are larger, by a linear factor in size, than the previously known construction by Pattabiraman et al., while retaining $O(n^2)$ reconstruction complexity.
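For concreteness, the reconstruction input can be spelled out as follows; this is only the standard definition of the substring composition multiset, i.e., the multiset of (number of 0s, number of 1s) pairs over all contiguous substrings, not the reconstruction algorithm itself.

# Sketch: building the substring composition multiset of a binary string.
from collections import Counter

def composition_multiset(s):
    comps = Counter()
    n = len(s)
    for i in range(n):
        zeros = ones = 0
        for j in range(i, n):            # extend the substring s[i..j] one symbol at a time
            if s[j] == '0':
                zeros += 1
            else:
                ones += 1
            comps[(zeros, ones)] += 1    # record its composition, ignoring symbol order
    return comps

print(composition_multiset("0110"))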
In the study of cancer evolution and therapeutic strategies, scientific evidence shows that a key dynamic lies in the tumour-environment interaction. In particular, oxygen concentration plays a central role in determining the phenotypic heterogeneity of cancer cell populations, whose qualitative and geometric characteristics are predominant factors in the occurrence of relapses and the failure of eradication. We propose a mathematical model able to describe the eco-evolutionary spatial dynamics of tumour cells in their adaptation to hypoxic microenvironments. As a main novelty with respect to the existing literature, we combine a phenotypic indicator reflecting the experimentally observed metabolic trade-off between hypoxia-resistance ability and proliferative potential with a two-dimensional geometric domain, without the constraint of radial symmetry. The model is set in the mathematical framework of phenotype-structured population dynamics and is formulated in terms of a system of coupled non-linear integro-differential equations. The computational outcomes demonstrate that hypoxia-induced selection results in a geometric characterization of phenotypically defined tumour niches that impact tumour aggressiveness and invasive ability. Furthermore, the results show how knowledge of environmental characteristics provides a predictive advantage on tumour mass development in terms of size, shape, and composition.
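Schematically, and without reproducing the paper's exact system, phenotype-structured models of this kind take the form
\[
\partial_t n(t,x,y) = \beta\,\partial^2_y n(t,x,y) + R\big(y, s(t,x), \rho(t,x)\big)\, n(t,x,y), \qquad \rho(t,x) = \int n(t,x,y)\,\mathrm{d}y,
\]
where $n$ is the cell density structured by position $x$ in the two-dimensional domain and by the phenotypic trait $y$ (encoding the hypoxia-resistance versus proliferation trade-off), $s$ is the oxygen concentration governed by a coupled reaction-diffusion equation, $\rho$ is the local cell number, $\beta$ is a rate of phenotypic changes, and $R$ is a fitness functional encoding the metabolic trade-off.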
This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.
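As a small numerical companion to the layer-to-layer picture (a Monte Carlo sketch, not code from the book), the recursion for the preactivation second moment $K$ with a $\tanh$ activation can be iterated directly; the choice $(C_b, C_W) = (0, 1)$ is the critical initialization for $\tanh$:

# Sketch: layer-to-layer propagation of the preactivation second moment in a deep MLP.
import numpy as np

def kernel_recursion(K, C_W=1.0, C_b=0.0, n_mc=100_000, rng=np.random.default_rng(0)):
    # K_{l+1} = C_b + C_W * E[tanh(z)^2] with z ~ N(0, K_l), estimated by Monte Carlo.
    z = rng.standard_normal(n_mc) * np.sqrt(K)
    return C_b + C_W * np.mean(np.tanh(z) ** 2)

K = 0.5
for layer in range(20):
    K = kernel_recursion(K)
print(K)   # at criticality K drifts toward zero only slowly with depth,
           # rather than exploding or collapsing exponentially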
This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook confounding effects, and hence their estimation error can be substantial. We therefore propose an alternative approach to constructing the estimators so that the error is greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the classical and proposed estimators in terms of their power for estimating the causal quantities. The comparison is carried out across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under simulated datasets that exhibit different levels of causality, degrees of nonlinearity, and distributional properties. Most importantly, we apply our approach to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly large when the causal effects are accounted for correctly.
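The confounding issue can be made concrete with a generic simulation; the inverse-propensity-weighted estimator below is a standard textbook correction used purely for illustration, not the estimator proposed in the paper:

# Generic illustration (not the paper's estimators): a confounder that drives both the
# credit decision and repayment biases the naive estimate; inverse-propensity weighting
# with the true propensity recovers the causal effect on this simulated data.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
c = rng.standard_normal(n)                       # confounder, e.g. creditworthiness
p = 1.0 / (1.0 + np.exp(-2.0 * c))               # propensity of a favorable credit decision
t = rng.random(n) < p                            # lender's decision
tau = 1.0                                        # true causal effect on repayment
y = tau * t + 3.0 * c + rng.standard_normal(n)   # repayment depends on decision and confounder

naive = y[t].mean() - y[~t].mean()               # overlooks confounding, badly biased
ipw = np.mean(t * y / p) - np.mean((~t) * y / (1.0 - p))
print(naive, ipw)                                # naive is far above 1.0; ipw is close to 1.0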