This article implements a novel piecewise Maehly-based Pad\'e-Chebyshev approximation and studies its utility in minimizing the Gibbs phenomenon when approximating piecewise smooth functions in two dimensions. We first develop a piecewise Pad\'e-Chebyshev method (PiPC) for approximating univariate piecewise smooth functions and then extend it to two dimensions, leading to a piecewise bivariate Pad\'e-Chebyshev approximation (Pi2DPC) for bivariate piecewise smooth functions. The chief advantage of these methods lies in their independence from any a priori knowledge of the locations and types of singularities in the original function. Finally, we supplement the methods with numerical results validating their effectiveness in reducing the Gibbs phenomenon to negligible levels.
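To make the core building block concrete, here is a minimal NumPy sketch of a linearized (Maehly-type) Pad\'e-Chebyshev approximant on a single interval; the function names and the test function $|x|$ are illustrative assumptions, and the piecewise machinery of PiPC/Pi2DPC is omitted:

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def chebyshev_coeffs(f, deg):
    """Chebyshev interpolation coefficients of f on [-1, 1]."""
    k = np.arange(deg + 1)
    x = np.cos((2 * k + 1) * np.pi / (2 * (deg + 1)))  # Chebyshev nodes
    return C.chebfit(x, f(x), deg)

def maehly_pade_chebyshev(c, m, n):
    """Linearized (Maehly-type) Pade-Chebyshev [m/n] from Chebyshev coeffs c.

    Determines P (degree m) and Q (degree n, normalized q_0 = 1) so that the
    Chebyshev coefficients of Q*f - P vanish through order m + n; c should
    carry at least m + 2n + 1 coefficients so the products below are exact.
    """
    A = np.zeros((n, n))
    for j in range(1, n + 1):
        tj = np.zeros(j + 1); tj[j] = 1.0              # basis polynomial T_j
        A[:, j - 1] = C.chebmul(tj, c)[m + 1:m + n + 1]
    q = np.concatenate(([1.0], np.linalg.solve(A, -c[m + 1:m + n + 1])))
    p = C.chebmul(q, c)[:m + 1]                        # numerator coefficients
    return p, q

# Example: rational approximation of |x| (kink at 0) on [-1, 1]
c = chebyshev_coeffs(np.abs, 40)
p, q = maehly_pade_chebyshev(c, m=10, n=10)
x = np.linspace(-1, 1, 5)
print(C.chebval(x, p) / C.chebval(x, q))
```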
We show that isogeometric Galerkin discretizations of eigenvalue problems related to the Laplace operator, subject to any standard type of homogeneous boundary conditions, have no outliers in certain optimal spline subspaces. Roughly speaking, these optimal subspaces are obtained from the full spline space defined on certain uniform knot sequences by imposing specific additional boundary conditions. The spline subspaces of interest were introduced in the literature some years ago, when their optimality with respect to Kolmogorov $n$-widths in the $L^2$-norm was proved for certain function classes. The eigenfunctions of the Laplacian -- with any standard type of homogeneous boundary conditions -- belong to such classes. Here we complete the analysis of the approximation properties of these optimal spline subspaces. In particular, we provide explicit $L^2$ and $H^1$ error estimates with full approximation order for Ritz projectors in the univariate and in the multivariate tensor-product setting. Besides their intrinsic interest, these estimates imply that, for a fixed number of degrees of freedom, all the eigenfunctions and the corresponding eigenvalues are well approximated, without loss of accuracy over the whole spectrum when compared with the full spline space. Moreover, there are no spurious values in the approximated spectrum. In other words, the considered subspaces provide accurate outlier-free discretizations in the univariate and in the multivariate tensor-product case. This main contribution is complemented by an explicit construction of B-spline-like bases for the considered spline subspaces. The role of such spaces as accurate discretization spaces for addressing general problems with non-homogeneous boundary behavior is discussed as well.
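As a rough illustration of the setting (not the optimal-subspace construction of the paper), the following SciPy sketch assembles the isogeometric Galerkin eigenproblem for $-u''=\lambda u$ on $(0,1)$ with homogeneous Dirichlet conditions in the full spline space; comparing the computed frequencies $\sqrt{\lambda_k}$ with $k\pi$ exposes the outliers at the top of the spectrum for degree $p\geq 2$:

```python
import numpy as np
from scipy.interpolate import BSpline
from scipy.linalg import eigh

def spline_laplace_spectrum(n_el=40, p=3):
    """Galerkin eigenvalues of -u'' = lam*u on (0,1), u(0) = u(1) = 0,
    with maximally smooth B-splines on a uniform open knot vector."""
    kv = np.concatenate((np.zeros(p), np.linspace(0, 1, n_el + 1), np.ones(p)))
    n_basis = len(kv) - p - 1
    xg, wg = np.polynomial.legendre.leggauss(p + 1)   # per-element quadrature
    K = np.zeros((n_basis, n_basis)); M = np.zeros((n_basis, n_basis))
    breaks = np.linspace(0, 1, n_el + 1)
    for a, b in zip(breaks[:-1], breaks[1:]):
        x = 0.5 * (b - a) * xg + 0.5 * (a + b); w = 0.5 * (b - a) * wg
        # evaluate all basis functions and derivatives at quadrature points
        B = np.empty((n_basis, len(x))); dB = np.empty_like(B)
        for i in range(n_basis):
            c = np.zeros(n_basis); c[i] = 1.0
            s = BSpline(kv, c, p)
            B[i] = s(x); dB[i] = s.derivative()(x)
        K += (dB * w) @ dB.T            # stiffness contributions
        M += (B * w) @ B.T              # mass contributions
    K, M = K[1:-1, 1:-1], M[1:-1, 1:-1]  # homogeneous Dirichlet conditions
    return np.sqrt(eigh(K, M, eigvals_only=True))

freqs = spline_laplace_spectrum()
print(freqs[-4:] / (np.arange(len(freqs) - 3, len(freqs) + 1) * np.pi))
```

For $p\geq 2$ the last few ratios exceed one noticeably: these are the outlier frequencies that the optimal subspaces remove.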
Large health care data repositories such as electronic health records (EHR) open new opportunities to derive individualized treatment strategies that improve disease outcomes. We study the problem of estimating sequential treatment rules tailored to a patient's individual characteristics, often referred to as dynamic treatment regimes (DTRs). We seek the optimal DTR that maximizes the discontinuous value function, through direct maximization of a Fisher consistent surrogate loss function. We show that a large class of concave surrogates fails to be Fisher consistent, in contrast to the classic setting of binary classification. We further characterize a non-concave family of Fisher consistent smooth surrogate functions, which can be optimized with gradient descent using off-the-shelf machine learning algorithms. Compared to the existing direct search approach under the support vector machine framework (Zhao et al., 2015), our proposed DTR estimation via surrogate loss optimization (DTRESLO) method is more computationally scalable to large sample sizes and allows a broader functional class for the predictor effects. We establish theoretical properties of our proposed DTR estimator and obtain a sharp upper bound on the regret of our DTRESLO method. The finite sample performance of our estimator is evaluated through extensive simulations and an application to deriving an optimal DTR for the treatment of sepsis, using EHR data from patients admitted to intensive care units.
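A toy single-stage illustration of the surrogate idea (not the authors' multistage DTRESLO implementation; the sigmoid surrogate, the data-generating model, and all names are assumptions for demonstration): the discontinuous indicator in the value function is replaced by a smooth function and maximized by plain gradient ascent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy single-stage data: covariate X, randomized treatment A in {-1, +1}
# with propensity 1/2, outcome larger when treatment sign matches sign(X_1).
n = 2000
X = rng.normal(size=(n, 2))
A = rng.choice([-1.0, 1.0], size=n)
Y = 1.0 + A * np.sign(X[:, 0]) + 0.5 * rng.normal(size=n)
prop = 0.5                                   # known randomization probability

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def smoothed_value_and_grad(beta):
    """Surrogate value: indicator 1{A f(X) > 0} replaced by a smooth sigmoid."""
    s = sigmoid(A * (X @ beta))
    value = np.mean(Y / prop * s)
    grad = X.T @ (Y / prop * s * (1 - s) * A) / n
    return value, grad

# Plain gradient ascent on the (non-concave) smoothed objective
beta = np.zeros(2)
for _ in range(500):
    v, g = smoothed_value_and_grad(beta)
    beta += 0.5 * g
print("estimated rule: treat iff", beta, "@ x > 0; surrogate value", round(v, 3))
```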
In this work we obtain results on the approximation of $h$-dimensional dominant subspaces and low-rank approximations of matrices $A\in\mathbb K^{m\times n}$ (where $\mathbb K=\mathbb R$ or $\mathbb C$) in the case where there is no singular gap at index $h$, i.e. $\sigma_h=\sigma_{h+1}$ (where $\sigma_1\geq \ldots\geq \sigma_p\geq 0$ denote the singular values of $A$, and $p=\min\{m,n\}$). To this end, we develop a novel perspective on the convergence analysis of the classical deterministic block Krylov methods in this context. Indeed, starting with a matrix $X\in\mathbb K^{n\times r}$ with $r\geq h$ satisfying a compatibility assumption with some $h$-dimensional right dominant subspace, we show that block Krylov methods produce arbitrarily good approximations for both problems mentioned above. Our approach builds on recent work by Drineas, Ipsen, Kontopoulou and Magdon-Ismail on the approximation of structural left dominant subspaces. The main difference between our work and previous work on this topic is that, instead of exploiting a singular gap at $h$ (which is zero in this case), we exploit the nearest existing singular gaps.
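A minimal sketch of the block Krylov approach on a matrix with $\sigma_h=\sigma_{h+1}$, assuming a random start block (which generically satisfies a compatibility assumption of this kind); this is illustrative only and not the paper's analysis:

```python
import numpy as np

rng = np.random.default_rng(0)

def block_krylov_lowrank(A, h, r, q):
    """Rank-h approximation of A from a depth-q block Krylov space."""
    m, n = A.shape
    X = rng.normal(size=(n, r))               # random start block, r >= h
    blocks, Y = [], A @ X
    for _ in range(q):                        # blocks (A A^*)^i A X
        blocks.append(Y)
        Y = A @ (A.conj().T @ Y)
    Q, _ = np.linalg.qr(np.hstack(blocks))    # orthonormal basis of the space
    # Rayleigh-Ritz: truncated SVD of the projected matrix
    U, s, Vt = np.linalg.svd(Q.conj().T @ A, full_matrices=False)
    return Q @ U[:, :h], s[:h], Vt[:h]

# Test matrix with a repeated singular value at the cut-off: sigma_2 = sigma_3
U0, _ = np.linalg.qr(rng.normal(size=(60, 5)))
V0, _ = np.linalg.qr(rng.normal(size=(80, 5)))
A = (U0 * [3.0, 2.0, 2.0, 1.0, 0.5]) @ V0.T
Uh, s, Vt = block_krylov_lowrank(A, h=2, r=3, q=4)
err = np.linalg.norm(A - (Uh * s) @ Vt, 2)
print(err, "vs optimal error sigma_3 =", 2.0)
```

Despite the zero gap at $h=2$, the computed error approaches the optimal value $\sigma_3$.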
We present a novel method for reducing the computational complexity of rigorously estimating the partition functions (normalizing constants) of Gibbs (Boltzmann) distributions, which arise ubiquitously in probabilistic graphical models. A major obstacle to practical applications of Gibbs distributions is the need to estimate their partition functions. The state of the art in addressing this problem is multi-stage algorithms consisting of a cooling schedule and a mean estimator at each step of the schedule. While the cooling schedule in these algorithms is adaptive, the mean estimation computations use MCMC as a black box to draw approximate samples. We develop a doubly adaptive approach, combining the adaptive cooling schedule with an adaptive MCMC mean estimator whose number of Markov chain steps adapts dynamically to the underlying chain. Through rigorous theoretical analysis, we prove that our method outperforms state-of-the-art algorithms in several respects: (1) the computational complexity of our method is smaller; (2) our method is less sensitive to loose bounds on mixing times, an inherent component of these algorithms; and (3) the improvement obtained by our method is particularly significant in the most challenging regime of high-precision estimation. We demonstrate the advantage of our method in experiments on classic factor graphs, such as voting models and Ising models.
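For orientation, here is a bare-bones, non-adaptive version of the multi-stage product estimator on a small Ising model (the paper's contribution is precisely to make both the schedule and the mean estimator adaptive, which this sketch does not do; all parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
L = 10                                    # Ising model on a cycle of length L

def H(x):                                 # energy: minus neighbor agreements
    return -np.sum(x * np.roll(x, 1))

def metropolis(x, beta, steps):
    for _ in range(steps):
        i = rng.integers(L)
        dH = 2 * x[i] * (x[i - 1] + x[(i + 1) % L])   # energy change of a flip
        if rng.random() < np.exp(-beta * dH):
            x[i] = -x[i]
    return x

def product_estimator(beta_target=1.0, stages=20, samples=200, burn=50):
    """Estimate log Z(beta_target) by telescoping over a fixed cooling schedule:
    Z(b_{i+1}) / Z(b_i) = E_{b_i}[exp(-(b_{i+1} - b_i) * H)], with Z(0) = 2**L."""
    betas = np.linspace(0.0, beta_target, stages + 1)
    logZ = L * np.log(2.0)
    x = rng.choice([-1, 1], size=L)
    for b0, b1 in zip(betas[:-1], betas[1:]):
        x = metropolis(x, b0, burn * L)   # burn-in at the current temperature
        ratios = []
        for _ in range(samples):
            x = metropolis(x, b0, L)      # one sweep between recorded samples
            ratios.append(np.exp(-(b1 - b0) * H(x)))
        logZ += np.log(np.mean(ratios))
    return logZ

# Exact log Z for the cycle via the transfer matrix, eigenvalues 2cosh(b), 2sinh(b)
b = 1.0
exact = np.log((2 * np.cosh(b)) ** L + (2 * np.sinh(b)) ** L)
print(product_estimator(b), "vs exact", exact)
```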
In this paper we derive error bounds for fully discrete approximations of infinite horizon problems via the dynamic programming approach. It is well known that, for a time discretization with positive step size $h$, an error bound of size $h$ can be proved for the difference between the value function (the viscosity solution of the Hamilton-Jacobi-Bellman equation corresponding to the infinite horizon problem) and the value function of the discrete-time problem. However, when a spatial discretization based on elements of size $k$ is also included, the error bound found in the literature for the difference between the value functions of the continuous problem and the fully discrete problem is of size $O(k/h)$. In this paper we revisit the error bound of the fully discrete method and prove, under assumptions similar to those of the time-discrete case, that the error of the fully discrete method is in fact $O(h+k)$, which gives first order in time and space. This error bound matches the numerical experiments of many papers in the literature, in which the behaviour $1/h$ suggested by the bound $O(k/h)$ has not been observed.
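A minimal sketch of the fully discrete scheme on a toy one-dimensional problem, with time step $h$, mesh size $k$, and piecewise linear interpolation (the dynamics, cost, and control set are illustrative assumptions):

```python
import numpy as np

def fully_discrete_value(h=0.05, k=0.05, lam=1.0, xmax=2.0, tol=1e-10):
    """Fully discrete semi-Lagrangian scheme for the infinite horizon problem:
    V(x_i) = min_a { h*cost(x_i) + (1 - lam*h) * I[V](x_i + h*f(x_i, a)) },
    with time step h, uniform mesh of size k, and P1 interpolation I."""
    x = np.arange(-xmax, xmax + k / 2, k)
    controls = np.array([-1.0, 0.0, 1.0])   # dynamics f(x, a) = a
    cost = x ** 2
    V = np.zeros_like(x)
    while True:
        # interpolate V at the foot points x + h*a for every control
        # (np.interp extends V by its boundary values outside the mesh)
        cand = [h * cost + (1 - lam * h) * np.interp(x + h * a, x, V)
                for a in controls]
        V_new = np.min(cand, axis=0)
        if np.max(np.abs(V_new - V)) < tol:   # fixed point of the contraction
            return x, V_new
        V = V_new

x, V = fully_discrete_value()
print(V[np.abs(x) < 1e-9])   # value at the origin
```

The iteration is a contraction with factor $(1-\lambda h)$, so it converges for any $h<1/\lambda$; refining $h$ and $k$ together at the same rate is consistent with the $O(h+k)$ bound.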
For artificial deep neural networks, we prove expression rates for analytic functions $f:\mathbb{R}^d\to\mathbb{R}$ in the norm of $L^2(\mathbb{R}^d,\gamma_d)$, where $d\in {\mathbb{N}}\cup\{ \infty \}$. Here $\gamma_d$ denotes the Gaussian product probability measure on $\mathbb{R}^d$. We consider in particular ReLU and ReLU${}^k$ activations for integer $k\geq 2$. For $d\in\mathbb{N}$, we show exponential convergence rates in $L^2(\mathbb{R}^d,\gamma_d)$. In the case $d=\infty$, under suitable smoothness and sparsity assumptions on $f:\mathbb{R}^{\mathbb{N}}\to\mathbb{R}$, with $\gamma_\infty$ denoting the infinite (Gaussian) product measure on $\mathbb{R}^{\mathbb{N}}$, we prove dimension-independent expression rate bounds in the norm of $L^2(\mathbb{R}^{\mathbb{N}},\gamma_\infty)$. The rates depend only on the quantified holomorphy of (an analytic continuation of) the map $f$ to a product of strips in $\mathbb{C}^d$. As an application, we prove expression rate bounds for deep ReLU-NNs approximating response surfaces of elliptic PDEs with log-Gaussian random field inputs.
Some properties of generalized convexity of sets and of functions are identified in the case of the reliability polynomials of two dual minimal networks. A method for approximating the reliability polynomials of two dual minimal networks is developed, based on their mutual complementarity properties. The approximating objects are quadratic spline functions, constructed using both interpolation conditions and shape knowledge. It is proved that the approximating objects preserve the shape properties of the exact reliability polynomials. Numerical examples and simulations show the performance of the algorithm in terms of low complexity, small error, and shape preservation. Possibilities for increasing the accuracy of the approximation are discussed.
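A toy sketch of the ingredients (an illustrative three-link network pair and names; the paper's construction additionally enforces shape constraints, which a plain interpolating spline need not satisfy, as the last line checks):

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# Dual pair of three-link minimal networks: series R1(p) = p^3 and parallel
# R2(p) = 1 - (1 - p)^3, related by the complementarity R2(p) = 1 - R1(1 - p).
R1 = lambda p: p ** 3
nodes = np.linspace(0.0, 1.0, 9)                   # interpolation conditions
s1 = make_interp_spline(nodes, R1(nodes), k=2)     # quadratic spline approximant
s2 = lambda p: 1.0 - s1(1.0 - p)                   # induced approximant of the dual

p = np.linspace(0.0, 1.0, 201)
print("error on dual   :", np.max(np.abs(s2(p) - (1 - (1 - p) ** 3))))
print("min slope of s1 :", float(np.min(np.diff(s1(p)))))  # monotone if >= 0
```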
A key advantage of isogeometric discretizations is their accurate and well-behaved eigenfrequencies and eigenmodes. For degree two and higher, however, optical branches of spurious outlier frequencies and modes may appear due to boundaries or reduced continuity at patch interfaces. In this paper, we introduce a variational approach based on perturbed eigenvalue analysis that eliminates outlier frequencies without negatively affecting the accuracy in the remainder of the spectrum and modes. We then propose a pragmatic iterative procedure that estimates the perturbation parameters in such a way that the outlier frequencies are effectively reduced. We demonstrate that our approach allows for a much larger critical time-step size in explicit dynamics calculations. In addition, we show that the critical time-step size obtained with the proposed approach does not depend on the polynomial degree of spline basis functions.
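The link between outliers and the critical time step can be sketched directly: for undamped central-difference time stepping the stability limit is $\Delta t \leq 2/\omega_{\max}$, so reducing the outlier frequencies enlarges the admissible step. The demo matrices below are a generic stand-in, not the paper's discretization:

```python
import numpy as np
from scipy.linalg import eigh

def critical_time_step(K, M):
    """Explicit central-difference dynamics is stable for dt <= 2 / omega_max,
    so removing outlier frequencies directly enlarges the critical step."""
    omega_max = np.sqrt(eigh(K, M, eigvals_only=True)[-1])
    return 2.0 / omega_max

# Tiny stand-in system: if outliers inflate omega_max by a factor c > 1,
# the critical time step shrinks by exactly that factor.
K = np.diag([2.0] * 5) - np.diag([1.0] * 4, 1) - np.diag([1.0] * 4, -1)
M = np.eye(5)
print(critical_time_step(K, M))
```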
We introduce a new multi-dimensional nonlinear embedding -- Piecewise Flat Embedding (PFE) -- for image segmentation. Based on the theory of sparse signal recovery, piecewise flat embedding with diverse channels attempts to recover a piecewise constant image representation with sparse region boundaries and sparse cluster value scattering. The resulting piecewise flat embedding exhibits interesting properties, such as suppressing slowly varying signals, and offers an image representation with higher region identifiability, which is desirable for image segmentation and high-level semantic analysis tasks. We formulate the embedding as a variant of the Laplacian eigenmap embedding with an $L_{1,p}$ $(0<p\leq 1)$ regularization term to promote sparse solutions. First, we devise a two-stage numerical algorithm based on Bregman iterations to compute $L_{1,1}$-regularized piecewise flat embeddings. We then generalize this algorithm through iterative reweighting to solve the general $L_{1,p}$-regularized problem. To demonstrate its efficacy, we integrate PFE into two existing image segmentation frameworks: segmentation based on clustering and hierarchical segmentation based on contour detection. Experiments on four major benchmark datasets, BSDS500, MSRC, the Stanford Background Dataset, and PASCAL Context, show that segmentation algorithms incorporating our embedding achieve significantly improved results.
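A simplified stand-in for the reweighting stage (an IRLS-style heuristic on a toy graph, not the paper's Bregman scheme; all names and parameters are illustrative):

```python
import numpy as np

def piecewise_flat_embedding_irls(W, dim=2, iters=20, eps=1e-6):
    """Toy iteratively reweighted variant of an L1-flavored Laplacian embedding.

    Repeatedly solves a weighted graph Laplacian eigenproblem, reweighting each
    edge by 1/(|y_i - y_j| + eps) so that embedding differences are penalized
    roughly in an L1 rather than L2 sense, which drives the embedding toward
    piecewise constant values. A simplified stand-in for the paper's scheme.
    """
    Wt = W.copy()
    for _ in range(iters):
        L = np.diag(Wt.sum(axis=1)) - Wt
        vals, vecs = np.linalg.eigh(L)
        Y = vecs[:, 1:dim + 1]            # skip the trivial constant eigenvector
        diff = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=2)
        Wt = W / (diff + eps)             # reweight edges, IRLS style
    return Y

# Two noisy clusters on a line; the embedding becomes nearly piecewise constant
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 0.1, 15), rng.normal(3, 0.1, 15)])
W = np.exp(-(x[:, None] - x[None, :]) ** 2)
Y = piecewise_flat_embedding_irls(W, dim=1)
print(np.round(Y[:5].ravel(), 3), np.round(Y[-5:].ravel(), 3))
```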
Many problems in signal processing reduce to nonparametric function estimation. We propose a new methodology, piecewise convex fitting (PCF), and give a two-stage adaptive estimate. In the first stage, the number and locations of the change points are estimated using strong smoothing. In the second stage, a constrained smoothing spline fit is performed, with the smoothing level chosen to minimize the MSE. The imposed constraint is that a single change point occurs in a region about each empirical change point of the first-stage estimate. This constraint is equivalent to requiring that the third derivative of the second-stage estimate have a single sign in a small neighborhood about each first-stage change point. We sketch how PCF may be applied to signal recovery, instantaneous frequency estimation, surface reconstruction, image segmentation, spectral estimation, and multivariate adaptive regression.
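A sketch of the first stage only (illustrative names and data; the constrained second-stage spline fit is omitted): a strongly smoothed spline is fit and the empirical change points between convex and concave pieces are read off from sign changes of its second derivative.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def first_stage_changepoints(x, y, smooth):
    """Stage one of the PCF idea (a sketch): fit a strongly smoothed quintic
    spline and return the sign changes of its second derivative, i.e. the
    empirical change points between convex and concave pieces."""
    s = UnivariateSpline(x, y, k=5, s=smooth)
    curv = s.derivative(2)(x)
    idx = np.where(np.diff(np.sign(curv)) != 0)[0]
    return x[idx]

# Noisy signal that switches from convex to concave at x = 0
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 400)
y = np.where(x < 0, x ** 2, -(x ** 2)) + 0.01 * rng.normal(size=x.size)
print(first_stage_changepoints(x, y, smooth=1.0))
```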