Discretization of the uniform norm of functions from a given finite-dimensional subspace of continuous functions is studied. We pay special attention to the case of trigonometric polynomials with frequencies from an arbitrary finite set of fixed cardinality. We give two different proofs of the fact that for any $N$-dimensional subspace of the space of continuous functions it is sufficient to use $e^{CN}$ sample points for an accurate upper bound on the uniform norm. Previously known results show that one cannot improve on the exponential growth of the number of sampling points for a good discretization theorem in the uniform norm. We also prove a general result connecting the upper bound on the number of sampling points in the discretization theorem for the uniform norm with the best $m$-term bilinear approximation of the Dirichlet kernel associated with the given subspace. We illustrate the application of our technique with the example of trigonometric polynomials.
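To fix the setting, such a discretization theorem for the uniform norm asserts that for an $N$-dimensional subspace $X_N$ of the space of continuous functions on a domain $\Omega$ there exist points $\xi^1,\dots,\xi^m \in \Omega$ such that
\[
\|f\|_\infty \le C_1 \max_{1\le j\le m} |f(\xi^j)| \qquad \text{for all } f \in X_N,
\]
and the result above guarantees that $m \le e^{CN}$ points always suffice.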
The estimation of information measures of continuous distributions based on samples is a fundamental problem in statistics and machine learning. In this paper, we analyze estimates of differential entropy in $K$-dimensional Euclidean space, computed from a finite number of samples, when the probability density function belongs to a predetermined convex family $\mathcal{P}$. First, we show that estimating differential entropy to any accuracy is infeasible if the differential entropy of densities in $\mathcal{P}$ is unbounded, clearly demonstrating the necessity of additional assumptions. Subsequently, we investigate sufficient conditions that enable confidence bounds for the estimation of differential entropy. In particular, we provide confidence bounds for simple histogram-based estimation of differential entropy from a fixed number of samples, assuming that the probability density function is Lipschitz continuous with known Lipschitz constant and known, bounded support. Our focus is on differential entropy, but we provide examples showing that similar results hold for mutual information and relative entropy as well.
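For illustration, a minimal sketch of a histogram-based (plug-in) estimator is given below; the known support is taken to be $[0,1]^K$, the bin count is left as a free parameter, and the sketch does not reproduce the paper's confidence bounds.

import numpy as np

def histogram_entropy(samples, bins_per_dim):
    # samples: array of shape (n, K), assumed to lie in the known
    # support [0, 1]^K; returns a plug-in estimate in nats
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]
    n, K = samples.shape
    counts, _ = np.histogramdd(samples, bins=bins_per_dim,
                               range=[(0.0, 1.0)] * K)
    p = counts[counts > 0] / n              # empirical cell probabilities
    cell_volume = (1.0 / bins_per_dim) ** K
    # the density in a cell is approximately p_i / volume, so the estimate
    # is -sum_i p_i log(p_i / volume) = -sum_i p_i log p_i + log volume
    return float(-np.sum(p * np.log(p)) + np.log(cell_volume))

For instance, on $10^4$ uniform samples over $[0,1]^2$ this returns a value close to the true differential entropy $0$.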
We study the problem of approximating the eigenspectrum of a symmetric matrix $A \in \mathbb{R}^{n \times n}$ with bounded entries (i.e., $\|A\|_{\infty} \leq 1$). We present a simple sublinear time algorithm that approximates all eigenvalues of $A$ up to additive error $\pm \epsilon n$ using those of a randomly sampled $\tilde{O}(\frac{1}{\epsilon^4}) \times \tilde{O}(\frac{1}{\epsilon^4})$ principal submatrix. Our result can be viewed as a concentration bound on the full eigenspectrum of a random principal submatrix. It significantly extends existing work which shows concentration of just the spectral norm [Tro08]. It also extends work on sublinear time algorithms for testing the presence of large negative eigenvalues in the spectrum [BCJ20]. To complement our theoretical results, we provide numerical simulations, which demonstrate the effectiveness of our algorithm in approximating the eigenvalues of a wide range of matrices.
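A minimal sketch of the sampling step is given below; rescaling the submatrix eigenvalues by $n/k$ and padding the remaining estimates with zeros is the natural first-order choice consistent with an additive $\pm \epsilon n$ guarantee, though the paper's exact estimator and analysis are more refined.

import numpy as np

def approx_eigenvalues(A, k, rng=None):
    # A: symmetric n x n matrix with bounded entries; k: submatrix size
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]
    idx = rng.choice(n, size=k, replace=False)
    sub_eigs = np.linalg.eigvalsh(A[np.ix_(idx, idx)])
    estimates = (n / k) * sub_eigs          # rescale to the full matrix
    # the n - k remaining eigenvalues are estimated as 0, i.e. within
    # the additive error window around zero
    return np.sort(np.concatenate([estimates, np.zeros(n - k)]))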
We propose a monotone discretization for the integral fractional Laplace equation on bounded Lipschitz domains with the homogeneous Dirichlet boundary condition. The method is inspired by the quadrature-based finite difference method of Huang and Oberman, but is defined on unstructured grids in arbitrary dimensions, with a more flexible domain for approximating the singular integral. The scale of the singular integral domain depends not only on the local grid size but also on the distance to the boundary, since the H\"{o}lder coefficient of the solution deteriorates as the boundary is approached. By using a discrete barrier function that also reflects the distance to the boundary, we show optimal pointwise convergence rates in terms of the H\"{o}lder regularity of the data on both quasi-uniform and graded grids. Several numerical examples are provided to illustrate the sharpness of the theoretical results.
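For reference, the operator being discretized is the integral fractional Laplacian, with the usual normalization,
\[
(-\Delta)^s u(x) = C_{d,s}\,\mathrm{P.V.}\!\int_{\mathbb{R}^d} \frac{u(x)-u(y)}{|x-y|^{d+2s}}\,dy,
\qquad
C_{d,s} = \frac{2^{2s} s\,\Gamma(\frac{d}{2}+s)}{\pi^{d/2}\,\Gamma(1-s)},
\]
whose singular integral is approximated by quadrature over a neighborhood whose size is tied to both the local grid size and the distance to the boundary.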
In this work we analyse a PDE-ODE problem modelling the evolution of glioblastoma, which includes an anisotropic nonlinear diffusion term whose diffusion velocity increases with respect to the vasculature. First, we prove the existence of global-in-time weak-strong solutions using a regularization technique, via an artificial diffusion in the ODE system, and a fixed-point argument. In addition, stability results for the critical points are given under some constraints on the parameters. Finally, we design a fully discrete finite element scheme for the model which preserves the pointwise and energy estimates of the continuous problem.
The paper addresses the problem of sampling discretization of integral norms of elements of finite-dimensional subspaces satisfying some conditions. We prove sampling discretization results under two standard kinds of assumptions -- conditions on the entropy numbers and conditions in terms of the Nikol'skii-type inequalities. We prove some upper bounds on the number of sample points sufficient for good discretization and show that these upper bounds are sharp in a certain sense. Then we apply our general conditional results to subspaces with special structures, namely, subspaces with the tensor product structure. We demonstrate that applications of results based on the Nikol'skii-type inequalities provide somewhat better results than applications of results based on the entropy numbers conditions. Finally, we apply discretization results to the problem of sampling recovery.
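To make the two kinds of assumptions concrete: a sampling discretization (Marcinkiewicz-type) theorem for the $L_q$ norm on an $N$-dimensional subspace $X_N$ asserts the existence of points $\xi^1,\dots,\xi^m$ with
\[
C_1 \|f\|_q^q \le \frac{1}{m}\sum_{j=1}^m |f(\xi^j)|^q \le C_2 \|f\|_q^q
\qquad \text{for all } f \in X_N,
\]
while a Nikol'skii-type inequality for $X_N$ is a bound of the form $\|f\|_\infty \le H(N)\,\|f\|_q$ for all $f \in X_N$.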
In the first part of this work, we develop a novel scheme for solving nonparametric regression problems, that is, the approximation of possibly low-regularity, noisy functions from their approximate values at some random points. Our proposed scheme is based on the pseudo-inverse of a random projection matrix, combined with specific properties of the Jacobi polynomial system and of positive definite random matrices. This scheme has the advantages of being stable, robust, accurate, and fairly fast in terms of execution time. Moreover, unlike most existing nonparametric regression estimators, no extra regularization step is required by our proposed estimator. Although this estimator is initially designed to work with a random sampling set of univariate i.i.d. random variables following a Beta distribution, we show that it still works for a wide range of sampling distribution laws. We also briefly describe how our estimator can be adapted to handle the multivariate case of random sampling sets. In the second part of this work, we extend the random pseudo-inverse scheme to build a stable and accurate estimator for solving linear functional regression (LFR) problems. A dyadic decomposition approach is used to construct this last stable estimator for the LFR problem. The performance of the two proposed estimators is illustrated by various numerical simulations. In particular, a real dataset is used to illustrate the performance of our nonparametric regression estimator.
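A simplified sketch of the pseudo-inverse idea in a Jacobi basis is given below; it performs a plain minimum-norm least-squares fit on $[-1,1]$, whereas the paper's construction via a random projection matrix, and its stability analysis, are more involved, and the basis parameters here are merely illustrative.

import numpy as np
from scipy.special import eval_jacobi

def jacobi_pinv_estimator(x, y, degree, alpha=0.0, beta=0.0):
    # x: sample points in [-1, 1]; y: noisy function values at x
    # design matrix of Jacobi polynomials P_k^{(alpha, beta)}
    V = np.column_stack([eval_jacobi(k, alpha, beta, x)
                         for k in range(degree + 1)])
    coeffs = np.linalg.pinv(V) @ y          # minimum-norm least-squares fit
    def estimator(t):
        Vt = np.column_stack([eval_jacobi(k, alpha, beta, t)
                              for k in range(degree + 1)])
        return Vt @ coeffs
    return estimator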
List-decodable codes have been an active topic in theoretical computer science since the seminal papers of M. Sudan and V. Guruswami in 1997-1998. List-decodable codes are also considered in rank-metric, subspace-metric, cover-metric, pair-metric, and insdel-metric settings. In this paper we show that the rate, list-decoding radius, and list size are closely related to the classical topic of covering codes. We prove new, simple but strong general upper bounds for list-decodable codes in general finite metric spaces based on various covering codes of finite metric spaces. The general covering-code upper bounds apply even when the volumes of the balls depend on the centers, not only on the radius. Then any good upper bound on the covering radius or on the size of a covering code implies a good upper bound on the size of list-decodable codes. Hence list-decodability of codes is a strong constraint, from the viewpoint of covering codes, on general finite metric spaces. Our results give exponential improvements on the recent generalized Singleton upper bound of Shangguan and Tamo in STOC 2020 for Hamming-metric list-decodable codes when the code lengths are very large. The asymptotic forms of our covering-code bounds partially recover the Blinovsky bound and the combinatorial bound of Guruswami-H{\aa}stad-Sudan-Zuckerman in the Hamming-metric setting. We also suggest studying combinatorial covering list-decodable codes as a natural generalization of combinatorial list-decodable codes. We apply our general covering-code upper bounds to list-decodable rank-metric codes, list-decodable subspace codes, list-decodable insertion codes, and list-decodable deletion codes. Several new and improved results on the non-list-decodability of rank-metric codes and subspace codes are obtained.
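In its simplest form, the counting step behind such bounds is the following: if $D$ is a covering code of radius $R$, so that every point of the space lies in some ball $B(x, R)$ with $x \in D$, and $C$ is $(R, L)$ list-decodable, so that every ball of radius $R$ contains at most $L$ codewords, then
\[
|C| \le \sum_{x \in D} |C \cap B(x, R)| \le L\,|D|,
\]
so any upper bound on the size of a covering code immediately bounds the size of a list-decodable code.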
We provide a framework for high-order discretizations of nonlinear scalar convection-diffusion equations that satisfy a discrete maximum principle. The resulting schemes can have arbitrarily high order accuracy in time and space, and can be stable and maximum-principle-preserving (MPP) with no step size restriction. The schemes are based on a two-tiered limiting strategy, starting with a high-order limiter-based method that may have small oscillations or maximum-principle violations, followed by an additional limiting step that removes these violations while preserving high order accuracy. The desirable properties of the resulting schemes are demonstrated through several numerical examples.
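As an illustration of the second limiting step, the sketch below implements the classical linear scaling limiter (a standard MPP post-limiter in the spirit of Zhang and Shu, not necessarily the paper's exact construction): nodal values in each cell are pulled toward the cell mean just enough to enforce the bounds, while the mean itself is preserved.

import numpy as np

def linear_scaling_limiter(u, u_min, u_max):
    # u: nodal values per cell, shape (n_cells, n_nodes); the cell means
    # are assumed to satisfy the bounds already (the base scheme's job)
    mean = u.mean(axis=1, keepdims=True)
    eps = 1e-14                              # guards against division by zero
    theta_hi = (u_max - mean) / np.maximum(u.max(axis=1, keepdims=True) - mean, eps)
    theta_lo = (mean - u_min) / np.maximum(mean - u.min(axis=1, keepdims=True), eps)
    theta = np.clip(np.minimum(theta_hi, theta_lo), 0.0, 1.0)
    return mean + theta * (u - mean)         # limited values, mean unchanged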
A weakly admissible mesh (WAM) on a continuum real-valued domain is a sequence of discrete grids such that the discrete maximum norm of polynomials on the grid is comparable to the supremum norm of polynomials on the domain. The grid sizes and the comparability constants must grow in a controlled manner. In this paper, we generalize the notion of a WAM to hierarchical subspaces of not necessarily polynomial functions, and we analyze particular strategies for random sampling as a technique for generating WAMs. Our main results show that WAMs and their stronger variant, admissible meshes (AMs), can be generated by random sampling, and our analysis provides concrete estimates for the growth of both the meshes and the discrete-continuum comparability constants.
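A rough numerical probe of the discrete-continuum comparability is sketched below under illustrative assumptions: the domain is $[-1,1]$, a dense grid stands in for the continuum, and random coefficient vectors are sampled, so the returned value only lower-bounds the true comparability constant.

import numpy as np

def wam_constant_probe(basis_eval, grid, n_trials=2000, rng=None):
    # basis_eval(x): matrix of basis-function values, shape (len(x), N)
    rng = np.random.default_rng(0) if rng is None else rng
    dense = np.linspace(-1.0, 1.0, 20001)    # dense proxy for the continuum
    Vg, Vd = basis_eval(grid), basis_eval(dense)
    worst = 0.0
    for _ in range(n_trials):
        c = rng.standard_normal(Vg.shape[1]) # random element of the subspace
        worst = max(worst, np.abs(Vd @ c).max() / np.abs(Vg @ c).max())
    return worst                             # lower bound on the constant

For example, basis_eval can be taken as lambda x: np.polynomial.chebyshev.chebvander(x, 10) to probe a degree-10 polynomial space on a given grid.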
While Generative Adversarial Networks (GANs) have empirically produced impressive results on learning complex real-world distributions, recent work has shown that they suffer from a lack of diversity or mode collapse. The theoretical work of Arora et al.~\cite{AroraGeLiMaZh17} suggests a dilemma about GANs' statistical properties: powerful discriminators cause overfitting, whereas weak discriminators cannot detect mode collapse. In contrast, we show in this paper that GANs can in principle learn distributions in Wasserstein distance (or KL-divergence in many cases) with polynomial sample complexity, if the discriminator class has strong distinguishing power against the particular generator class (instead of against all possible generators). For various generator classes, such as mixtures of Gaussians, exponential families, and invertible neural network generators, we design corresponding discriminators (often neural nets of specific architectures) such that the Integral Probability Metric (IPM) induced by the discriminators provably approximates the Wasserstein distance and/or KL-divergence. This implies that if training is successful, then the learned distribution is close to the true distribution in Wasserstein distance or KL-divergence, and thus cannot drop modes. Our preliminary experiments show that on synthetic datasets the test IPM is well correlated with KL-divergence, indicating that the lack of diversity may be caused by sub-optimality in optimization rather than by statistical inefficiency.
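For a finite discriminator class, the test IPM reported in such experiments reduces to the largest mean discrepancy over the class (a minimal sketch; for neural discriminator classes the supremum is approximated by training rather than enumeration):

import numpy as np

def empirical_ipm(x_real, x_fake, discriminators):
    # sup over a finite class F of |E_real[f] - E_fake[f]|,
    # with expectations replaced by sample means
    return max(abs(float(np.mean(f(x_real)) - np.mean(f(x_fake))))
               for f in discriminators)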