We study the computational complexity of deep ReLU (Rectified Linear Unit) neural networks for the approximation of functions from the H\"older-Zygmund space of mixed smoothness defined on the $d$-dimensional unit cube, when the dimension $d$ may be very large. The approximation error is measured in the norm of an isotropic Sobolev space. For every function $f$ from the H\"older-Zygmund space of mixed smoothness, we explicitly construct a deep ReLU neural network whose output approximates $f$ with a prescribed accuracy $\varepsilon$, and we prove tight, dimension-dependent upper and lower bounds on the computational complexity of this approximation, characterized by the size and depth of the network, explicitly in $d$ and $\varepsilon$. The proofs of these results rely, in particular, on sparse-grid sampling recovery based on the Faber series.
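The Faber series enters through the univariate hat function and its dyadic dilates and shifts, which a one-hidden-layer ReLU network represents exactly. The numpy sketch below illustrates only this identity, not the network construction from the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat_relu(x):
    """Univariate hat function on [0, 1] written as a 3-neuron ReLU layer:
    hat(x) = 2 relu(x) - 4 relu(x - 1/2) + 2 relu(x - 1)."""
    return 2 * relu(x) - 4 * relu(x - 0.5) + 2 * relu(x - 1.0)

def faber_basis(x, j, k):
    """Dyadic dilate/shift hat(2^j x - k), the (j, k) Faber basis function."""
    return hat_relu(2.0**j * x - k)

x = np.linspace(0, 1, 7)
# hat equals the piecewise-linear tent: 0 at the endpoints, peak 1 at x = 1/2
print(hat_relu(x))           # [0, 1/3, 2/3, 1, 2/3, 1/3, 0]
print(faber_basis(x, 1, 0))  # tent supported on [0, 1/2]
```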
This paper establishes the (nearly) optimal approximation error characterization of deep rectified linear unit (ReLU) networks for smooth functions in terms of both width and depth simultaneously. To that end, we first prove that multivariate polynomials can be approximated by deep ReLU networks of width $\mathcal{O}(N)$ and depth $\mathcal{O}(L)$ with an approximation error $\mathcal{O}(N^{-L})$. Through local Taylor expansions and their deep ReLU network approximations, we show that deep ReLU networks of width $\mathcal{O}(N\ln N)$ and depth $\mathcal{O}(L\ln L)$ can approximate $f\in C^s([0,1]^d)$ with a nearly optimal approximation error $\mathcal{O}(\|f\|_{C^s([0,1]^d)}N^{-2s/d}L^{-2s/d})$. Our estimate is non-asymptotic in the sense that it is valid for arbitrary width and depth specified by $N\in\mathbb{N}^+$ and $L\in\mathbb{N}^+$, respectively.
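One standard way to see how depth buys accuracy for polynomials is the sawtooth construction (Yarotsky-type): composing the ReLU hat function with itself $m$ times and telescoping approximates $x^2$ on $[0,1]$ with error about $4^{-(m+1)}$. The numpy sketch below illustrates that construction only; it is not the network built in the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def hat(x):
    # hat(x) = 2 relu(x) - 4 relu(x - 1/2) + 2 relu(x - 1): one hidden ReLU layer
    return 2 * relu(x) - 4 * relu(x - 0.5) + 2 * relu(x - 1.0)

def square_approx(x, m):
    """Depth-m ReLU approximation of x^2 on [0, 1]:
    x^2 ~ x - sum_{s=1}^m g_s(x) / 4^s, where g_s is hat composed s times."""
    g, out = x.copy(), x.copy()
    for s in range(1, m + 1):
        g = hat(g)
        out -= g / 4.0**s
    return out

x = np.linspace(0, 1, 10001)
for m in (2, 4, 6, 8):
    err = np.max(np.abs(square_approx(x, m) - x**2))
    print(m, err)  # error shrinks roughly like 4^{-(m+1)} as depth grows
```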
We consider a sparse deep ReLU network (SDRN) estimator obtained from empirical risk minimization with a Lipschitz loss function in the presence of a large number of features. Our framework can be applied to a variety of regression and classification problems. The unknown target function to estimate is assumed to be in a Sobolev space with mixed derivatives. Functions in this space only need to satisfy a smoothness condition rather than having a compositional structure. We develop non-asymptotic excess risk bounds for our SDRN estimator. We further derive that the SDRN estimator can achieve the same minimax rate of estimation (up to logarithmic factors) as one-dimensional nonparametric regression when the dimension of the features is fixed, and that the estimator has a suboptimal rate when the dimension grows with the sample size. We show that the depth and the total number of nodes and weights of the ReLU network need to grow as the sample size increases to ensure good performance, and we investigate how fast they should increase with the sample size. These results provide important theoretical guidance and a basis for empirical studies with deep neural networks.
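For orientation, a minimal PyTorch sketch of this kind of empirical risk minimization, with the Huber loss standing in for the Lipschitz loss and an $\ell_1$ penalty on the weights to encourage sparsity (illustrative choices and tuning values, not the exact SDRN construction):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data: d features, smooth target plus noise
n, d = 2000, 10
X = torch.rand(n, d)
y = torch.sin(X.sum(dim=1, keepdim=True)) + 0.1 * torch.randn(n, 1)

# A small deep ReLU network; in the theory, width and depth grow with n
net = nn.Sequential(
    nn.Linear(d, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1),
)

loss_fn = nn.HuberLoss()   # a Lipschitz loss (illustrative choice)
lam = 1e-4                 # sparsity level (placeholder tuning value)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for epoch in range(200):
    opt.zero_grad()
    fit = loss_fn(net(X), y)
    l1 = sum(p.abs().sum() for p in net.parameters())  # drives weights toward zero
    (fit + lam * l1).backward()
    opt.step()

print(float(loss_fn(net(X), y)))  # training risk after fitting
```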
Modern high-dimensional point process data, especially those from neuroscience experiments, often involve observations from multiple conditions and/or experiments. Networks of interactions corresponding to these conditions are expected to share many edges, but also exhibit unique, condition-specific ones. However, the degree of similarity among the networks from different conditions is generally unknown. Existing approaches for multivariate point processes do not take these structures into account and do not provide inference for jointly estimated networks. To address these needs, we propose a joint estimation procedure for networks of high-dimensional point processes that incorporates easy-to-compute weights in order to data-adaptively encourage similarity between the estimated networks. We also propose a powerful hierarchical multiple testing procedure for edges of all estimated networks, which takes into account the data-driven similarity structure of the multi-experiment networks. Compared to conventional multiple testing procedures, our proposed procedure greatly reduces the number of tests and results in improved power, while tightly controlling the family-wise error rate. Unlike existing procedures, our method is also free of assumptions on dependency between tests, offers flexibility in the p-values calculated along the hierarchy, and is robust to misspecification of the hierarchical structure. We verify our theoretical results via simulation studies and demonstrate the application of the proposed procedure using neuronal spike train data.
In this paper, we consider the multi-armed bandit problem with high-dimensional features. First, we prove a minimax lower bound, $\mathcal{O}\big((\log d)^{\frac{\alpha+1}{2}}T^{\frac{1-\alpha}{2}}+\log T\big)$, for the cumulative regret, in terms of the horizon $T$, the dimension $d$, and a margin parameter $\alpha\in[0,1]$ that controls the separation between the optimal and the sub-optimal arms. This new lower bound unifies existing regret-bound results whose different dependencies on $T$ stem from the different values of the margin parameter $\alpha$ implied by their assumptions. Second, we propose a simple and computationally efficient algorithm, inspired by the general Upper Confidence Bound (UCB) strategy, that achieves a regret upper bound matching the lower bound. The proposed algorithm uses a properly centered $\ell_1$-ball as its confidence set, in contrast to the commonly used ellipsoid confidence set. In addition, the algorithm does not require any forced-sampling step and is therefore adaptive to the practically unknown margin parameter. Simulations and a real data analysis are conducted to compare the proposed method with existing ones in the literature.
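One reason an $\ell_1$-ball confidence set is convenient in high dimensions is that the optimistic index has a closed form: $\sup_{\|u-\hat\theta\|_1\le r}\, x^\top u = x^\top\hat\theta + r\,\|x\|_\infty$. The sketch below is a schematic simulation built on that identity, with a Lasso-centered ball and a placeholder radius schedule; it is not the authors' algorithm.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
d, K, T, s = 100, 20, 1000, 5            # dimension, arms, horizon, sparsity
theta = np.zeros(d); theta[:s] = 1.0     # sparse true parameter

X_hist, y_hist, regret = [], [], 0.0
for t in range(1, T + 1):
    arms = rng.normal(size=(K, d)) / np.sqrt(d)    # fresh contexts each round
    if t <= 10:                                    # short warm-up with random pulls
        a = int(rng.integers(K))
    else:
        lasso = Lasso(alpha=0.01, max_iter=5000)
        lasso.fit(np.asarray(X_hist), np.asarray(y_hist))
        theta_hat = lasso.coef_
        r_t = 2.0 / np.sqrt(t)                     # placeholder radius schedule
        # optimism over the l1 ball:
        #   sup_{||u - theta_hat||_1 <= r_t} x @ u = x @ theta_hat + r_t * ||x||_inf
        ucb = arms @ theta_hat + r_t * np.abs(arms).max(axis=1)
        a = int(np.argmax(ucb))
    reward = arms[a] @ theta + 0.1 * rng.normal()
    X_hist.append(arms[a]); y_hist.append(reward)
    regret += (arms @ theta).max() - arms[a] @ theta

print("cumulative regret:", round(regret, 2))
```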
Metric multidimensional scaling (MDS) is a classical method for generating meaningful (non-linear) low-dimensional embeddings of high-dimensional data. MDS has a long history in the statistics, machine learning, and graph drawing communities. In particular, the Kamada-Kawai force-directed graph drawing method is equivalent to MDS and is one of the most popular ways in practice to embed graphs into low dimensions. Despite its ubiquity, our theoretical understanding of MDS remains limited, as its objective function is highly non-convex. In this paper, we prove that minimizing the Kamada-Kawai objective is NP-hard and give a provable approximation algorithm for optimizing it, which in particular is a PTAS on low-diameter graphs. We supplement this result with experiments suggesting possible connections between our greedy approximation algorithm and gradient-based methods.
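For reference, the Kamada-Kawai objective is the weighted MDS stress $\sum_{i<j} (\|x_i - x_j\| - d_{ij})^2 / d_{ij}^2$, where $d_{ij}$ are graph-theoretic distances. A short sketch evaluating it on a networkx layout (illustrative only; the paper's approximation algorithm is not reproduced here):

```python
import itertools
import networkx as nx
import numpy as np

def kamada_kawai_stress(G, pos):
    """Weighted MDS stress: sum_{i<j} (||x_i - x_j|| - d_ij)^2 / d_ij^2."""
    dist = dict(nx.all_pairs_shortest_path_length(G))
    total = 0.0
    for u, v in itertools.combinations(G.nodes, 2):
        d_uv = dist[u][v]
        gap = np.linalg.norm(np.asarray(pos[u]) - np.asarray(pos[v])) - d_uv
        total += gap**2 / d_uv**2
    return total

G = nx.cycle_graph(20)
pos = nx.kamada_kawai_layout(G)   # force-directed embedding into the plane
print(kamada_kawai_stress(G, pos))
```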
This article implements a novel piecewise Maehly-based Pad\'e-Chebyshev approximation and studies its utility in minimizing the Gibbs phenomenon when approximating piecewise smooth functions in two dimensions. We first develop a piecewise Pad\'e-Chebyshev method (PiPC) to approximate univariate piecewise smooth functions and then extend it to two dimensions, leading to a piecewise bivariate Pad\'e-Chebyshev approximation (Pi2DPC) for approximating bivariate piecewise smooth functions. The chief advantage of these methods lies in their independence from any a priori knowledge of the locations and types of singularities present in the original function. Finally, we supplement the methods with numerical results that validate their effectiveness in reducing the Gibbs phenomenon to negligible levels.
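As a point of reference for the single-interval building block, here is a minimal numpy sketch of a linearized, Maehly-type Chebyshev-Pad\'e approximant $P_m/Q_n$: the denominator is chosen so that Chebyshev coefficients $m+1,\dots,m+n$ of $Q_n f - P_m$ vanish. The piecewise PiPC/Pi2DPC construction and the handling of unknown singularities are not reproduced here.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def pade_chebyshev(c, m, n):
    """Linearized (Maehly-type) Chebyshev-Pade approximant P_m / Q_n.

    c: Chebyshev coefficients of f on [-1, 1].
    Returns (p, q): Chebyshev coefficients of numerator and denominator (q_0 = 1).
    """
    # A[:, j] holds the Chebyshev coefficients of T_j(x) * f(x)
    A = np.zeros((len(c) + n, n + 1))
    for j in range(n + 1):
        e_j = np.zeros(j + 1); e_j[j] = 1.0
        prod = C.chebmul(e_j, c)
        A[:len(prod), j] = prod
    # Require coefficients m+1 .. m+n of Q*f - P to vanish, with q_0 = 1
    rows = slice(m + 1, m + n + 1)
    q_tail = np.linalg.solve(A[rows, 1:], -A[rows, 0])
    q = np.concatenate(([1.0], q_tail))
    # The numerator collects coefficients 0 .. m of Q*f
    p = A[:m + 1, :] @ q
    return p, q

# Example on a single interval: a steep but smooth front on [-1, 1]
f = lambda x: np.tanh(5 * x)
c = C.chebinterpolate(f, 30)          # degree-30 Chebyshev interpolant of f
m, n = 10, 4
p, q = pade_chebyshev(c, m, n)
x = np.linspace(-1, 1, 2001)
err_poly = np.max(np.abs(C.chebval(x, c[:m + n + 1]) - f(x)))   # degree m+n truncation
err_rat = np.max(np.abs(C.chebval(x, p) / C.chebval(x, q) - f(x)))
print(err_poly, err_rat)
```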
The standard Universal Approximation Theorem for operator neural networks (NNs) holds for arbitrary width and bounded depth. Here, we prove that operator NNs of bounded width and arbitrary depth are universal approximators for continuous nonlinear operators. In our main result, we prove that for non-polynomial activation functions that are continuously differentiable at a point with a nonzero derivative, one can construct an operator NN of width five, whose inputs are real numbers with finite decimal representations, that is arbitrarily close to any given continuous nonlinear operator. We derive an analogous result for non-affine polynomial activation functions. We also show that depth has theoretical advantages by constructing operator ReLU NNs of depth $2k^3+8$ and constant width that cannot be well-approximated by any operator ReLU NN of depth $k$, unless its width is exponential in $k$.
Learning vector autoregressive models from multivariate time series is conventionally approached through least squares or maximum likelihood estimation. These methods typically assume a fully connected model, which provides no direct insight into the model structure and may lead to highly noisy estimates of the parameters. Because of these limitations, there has been increasing interest in methods that produce sparse estimates through penalized regression. However, such methods are computationally intensive and may become prohibitively time-consuming when the number of variables in the model increases. In this paper, we adopt an approximate Bayesian approach to the learning problem by combining fractional marginal likelihood and pseudo-likelihood. We propose a novel method, PLVAR, that is both faster and produces more accurate estimates than state-of-the-art methods based on penalized regression. We prove the consistency of the PLVAR estimator and demonstrate the attractive performance of the method on both simulated and real-world data.
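For orientation only, a schematic numpy sketch of the node-wise decomposition that pseudo-likelihood approaches exploit, with BIC used as a stand-in score and greedy forward selection as a stand-in search; the actual PLVAR method uses a fractional marginal likelihood and a different search, which are not reproduced here.

```python
import numpy as np

def bic_ols(y, Z):
    """BIC of an OLS regression of y on Z (with intercept)."""
    n = len(y)
    if Z.shape[1] == 0:
        rss, k = np.sum((y - y.mean()) ** 2), 1
    else:
        Zc = np.column_stack([np.ones(n), Z])
        beta, *_ = np.linalg.lstsq(Zc, y, rcond=None)
        rss, k = np.sum((y - Zc @ beta) ** 2), Zc.shape[1]
    return n * np.log(rss / n) + k * np.log(n)

def learn_var1_parents(X, max_parents=3):
    """Greedy forward selection of lag-1 parents for each node, scored node-wise."""
    Y, Z = X[1:], X[:-1]                 # VAR(1): regress X_t on X_{t-1}
    d = X.shape[1]
    parents = {}
    for j in range(d):
        chosen, score = [], bic_ols(Y[:, j], Z[:, []])
        improved = True
        while improved and len(chosen) < max_parents:
            improved = False
            for cand in range(d):
                if cand in chosen:
                    continue
                new = bic_ols(Y[:, j], Z[:, chosen + [cand]])
                if new < score - 1e-9:
                    score, best, improved = new, cand, True
            if improved:
                chosen.append(best)
        parents[j] = sorted(chosen)
    return parents

rng = np.random.default_rng(0)
d, T = 6, 500
A = np.zeros((d, d)); A[np.arange(d), (np.arange(d) + 1) % d] = 0.6  # sparse ring
X = np.zeros((T, d))
for t in range(1, T):
    X[t] = X[t - 1] @ A.T + 0.5 * rng.normal(size=d)
print(learn_var1_parents(X))   # recovered lag-1 parent sets per node
```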
We present two methods to reduce the complexity of Bayesian network (BN) classifiers. First, we introduce quantization-aware training using the straight-through gradient estimator to quantize the parameters of BNs to few bits. Second, we extend a recently proposed differentiable tree-augmented naive Bayes (TAN) structure learning approach by also considering the model size. Both methods are motivated by recent developments in the deep learning community, and they provide effective means to trade off between model size and prediction accuracy, which is demonstrated in extensive experiments. Furthermore, we contrast quantized BN classifiers with quantized deep neural networks (DNNs) in small-scale scenarios, which have hardly been investigated in the literature. We show Pareto-optimal models with respect to model size, number of operations, and test error, and find that both model classes are viable options.
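The straight-through estimator can be written compactly: quantize in the forward pass and let gradients pass through the rounding unchanged in the backward pass. A minimal PyTorch sketch of a k-bit uniform quantizer of this kind (an illustration of the idea, not the paper's BN-specific scheme):

```python
import torch

def ste_quantize(w, bits=4, w_max=1.0):
    """Uniform k-bit quantization with a straight-through gradient.

    Forward: round w to one of 2^bits evenly spaced levels on [-w_max, w_max].
    Backward: the rounding is treated as the identity (straight-through estimator).
    """
    n = 2 ** bits - 1
    w_clipped = w.clamp(-w_max, w_max)
    # map to [0, n], round to the nearest level, map back to [-w_max, w_max]
    w_q = torch.round((w_clipped + w_max) / (2 * w_max) * n) / n * (2 * w_max) - w_max
    # (w_q - w).detach() + w has the value of w_q but the gradient of w
    return (w_q - w).detach() + w

w = torch.randn(5, requires_grad=True)
loss = (ste_quantize(w, bits=2) ** 2).sum()
loss.backward()
print(w.grad)   # nonzero gradients despite the non-differentiable rounding
```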
High-dimensional non-Gaussian time series data are increasingly encountered in a wide range of applications. Conventional estimation methods and technical tools are inadequate when it comes to ultra-high-dimensional and heavy-tailed data. We investigate robust estimation of high-dimensional autoregressive models with fat-tailed innovation vectors by solving a regularized regression problem using a convex robust loss function. As a significant improvement, the dimension is allowed to increase exponentially with the sample size, and consistency is ensured under very mild moment conditions. To develop the consistency theory, we establish a new Bernstein-type inequality for sums arising from autoregressive models. Numerical results indicate good performance of the robust estimates.
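A minimal numpy sketch of the kind of estimator described, with the Huber loss standing in for the convex robust loss and the lasso-penalized problem solved by proximal gradient descent (illustrative choices and tuning values, not the paper's exact procedure):

```python
import numpy as np

def huber_grad(r, delta=1.0):
    """Derivative of the Huber loss: r on |r| <= delta, delta*sign(r) otherwise."""
    return np.clip(r, -delta, delta)

def robust_lasso_var1(X, lam=0.05, delta=1.0, n_iter=500):
    """Estimate a VAR(1) transition matrix A by minimizing
    (1/n) * sum Huber(Y - Z A^T) + lam * ||A||_1 via proximal gradient (ISTA)."""
    Y, Z = X[1:], X[:-1]
    n, d = Y.shape
    step = 1.0 / (np.linalg.norm(Z, 2) ** 2 / n)   # 1 / Lipschitz constant of the gradient
    A = np.zeros((d, d))
    for _ in range(n_iter):
        R = Y - Z @ A.T                            # residuals, shape (n, d)
        grad = -huber_grad(R, delta).T @ Z / n     # gradient with respect to A
        A = A - step * grad
        A = np.sign(A) * np.maximum(np.abs(A) - step * lam, 0.0)  # soft-thresholding
    return A

rng = np.random.default_rng(0)
d, T = 10, 400
A_true = 0.5 * np.eye(d)                           # sparse, stable transition matrix
X = np.zeros((T, d))
for t in range(1, T):
    X[t] = X[t - 1] @ A_true.T + rng.standard_t(df=3, size=d)  # heavy-tailed innovations
A_hat = robust_lasso_var1(X)
print(np.round(A_hat[:3, :3], 2))                  # top-left block of the estimate
```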