The generalization error of a classifier is related to the complexity of the set of functions from which the classifier is chosen. Roughly speaking, the more complex the family, the greater the potential disparity between the training error and the population error of the classifier. This principle is captured informally by Occam's razor, which suggests favoring low-complexity hypotheses over complex ones. We study a family of low-complexity classifiers that threshold the one-dimensional feature obtained by projecting the data onto a random line after embedding it into a higher-dimensional space parametrized by monomials of order up to k. More specifically, the extended data is projected n times, and the best classifier among those n (based on its performance on the training data) is chosen. We obtain a bound on the generalization error of these low-complexity classifiers. The bound is less than that of any classifier with a non-trivial VC dimension, and thus less than that of a linear classifier. We also show that, given full knowledge of the class conditional densities, the error of the classifiers would converge to the optimal (Bayes) error as k and n go to infinity; if only a training dataset is given, we show that the classifiers will perfectly classify all the training points as k and n go to infinity.
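The procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`monomial_features`, `fit_threshold`, `fit_random_projection_classifier`, `predict`) are hypothetical, the random directions are drawn with i.i.d. Gaussian coordinates, and the threshold search is a simple exhaustive scan over midpoints.

```python
import itertools
import random

def monomial_features(x, k):
    """Embed x into the space of all monomials of its coordinates of degree 1..k."""
    feats = []
    for degree in range(1, k + 1):
        for combo in itertools.combinations_with_replacement(range(len(x)), degree):
            value = 1.0
            for i in combo:
                value *= x[i]
            feats.append(value)
    return feats

def fit_threshold(z, y):
    """Best 1-D rule 'predict 1 when sign * (z - t) > 0'; returns (errors, t, sign)."""
    order = sorted(set(z))
    # One threshold below all points, plus every midpoint between neighbours.
    candidates = [order[0] - 1.0] + [(a + b) / 2 for a, b in zip(order, order[1:])]
    best = None
    for t in candidates:
        for sign in (1, -1):
            errors = sum((sign * (zi - t) > 0) != bool(yi) for zi, yi in zip(z, y))
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    return best

def fit_random_projection_classifier(X, y, k, n, seed=0):
    """Project the embedded data on n random lines; keep the best thresholded one."""
    rng = random.Random(seed)
    Phi = [monomial_features(x, k) for x in X]
    dim = len(Phi[0])
    best = None
    for _ in range(n):
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # assumed Gaussian direction
        z = [sum(wi * fi for wi, fi in zip(w, phi)) for phi in Phi]
        errors, t, sign = fit_threshold(z, y)
        if best is None or errors < best["errors"]:
            best = {"errors": errors, "w": w, "t": t, "sign": sign, "k": k}
    return best

def predict(model, x):
    """Classify a new point with the selected projection and threshold."""
    phi = monomial_features(x, model["k"])
    z = sum(wi * fi for wi, fi in zip(model["w"], phi))
    return int(model["sign"] * (z - model["t"]) > 0)
```

On an XOR-style toy set such as `X = [(1, 1), (-1, -1), (1, -1), (-1, 1)]` with `y = [1, 1, 0, 0]`, no projection can separate the classes when `k=1`, whereas with `k=2` the cross term `x1 * x2` makes a separating projection possible, which illustrates why the embedding order matters.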

Related content

Generalization error

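The generalization ability of a learning method is the ability of the model it learns to predict unseen data, and it is an essential property of any learning method. In practice, the most common way to evaluate generalization ability is to measure the generalization error on test data. A generalization error bound characterizes the deviation between the empirical risk and the expected risk of a learning algorithm, together with its rate of convergence. The generalization error of a machine learner is a function that describes how far the student machine, after learning from sample data, remains from the teacher machine.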
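The quantities mentioned above can be written down in the standard textbook form. The notation here ($P$ for the data distribution, $\ell$ for the loss, $m$ for the sample size, $\varepsilon$ for the bound) is an assumption introduced for illustration and does not come from the page itself.

```latex
% Expected (population) risk and empirical risk of a hypothesis f
R(f) = \mathbb{E}_{(X,Y)\sim P}\bigl[\ell(f(X), Y)\bigr],
\qquad
\hat{R}_m(f) = \frac{1}{m}\sum_{i=1}^{m} \ell\bigl(f(x_i), y_i\bigr).

% Generalization gap of f
R(f) - \hat{R}_m(f).

% Generic shape of a generalization error bound over a family \mathcal{F}:
% with probability at least 1 - \delta over the sample,
\sup_{f \in \mathcal{F}} \bigl| R(f) - \hat{R}_m(f) \bigr| \le \varepsilon(m, \delta, \mathcal{F}).
```

The function $\varepsilon$ typically shrinks as $m$ grows and grows with the complexity of $\mathcal{F}$; its exact form depends on the complexity measure used.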

Performer · MoDELS · Integration · Orthogonal · Approximation

October 12, 2021

Computation of Eigenvalues for Nonlocal Models by Spectral Methods

Luciano Lopez, Sabrina Francesca Pellegrino

The purpose of this work is to study spectral methods to approximate the eigenvalues of nonlocal integral operators. Indeed, even if the spatial domain is an interval, it is very challenging to obtain closed analytical expressions for the eigenpairs of peridynamic operators. Our approach is based on the weak formulation of the eigenvalue problem, and as an orthogonal basis for computing the eigenvalues we consider a set of Fourier trigonometric or Chebyshev polynomials. We show the order of convergence for the eigenvalues and eigenfunctions in the $L^2$-norm, and finally, we perform some numerical simulations to compare the two proposed methods.

Functional · Linear · Linear model · Statistic · Reproducing kernel Hilbert space

October 12, 2021

A new RKHS-based global testing for functional linear model

Jianjun Xu, Wenquan Cui

from arXiv, there are some mistakes in the paper

This article studies global testing of the slope function in the functional linear regression model within the framework of reproducing kernel Hilbert spaces. We propose a new test statistic based on smoothness regularization estimators. The asymptotic distribution of the test statistic is established under the null hypothesis. It is shown that the null asymptotic distribution is determined jointly by the reproducing kernel and the covariance function. Our theoretical analysis shows that the proposed test is consistent over a class of smooth local alternatives. Despite the generality of the method of regularization, we show the procedure is easily implementable. Numerical examples are provided to demonstrate the empirical advantages over the competing methods.

Linear · Approximation · Functional · Reducible · Value function approximation

October 12, 2021

Self-guided Approximate Linear Programs

Parshan Pakiman, Selvaprabu Nadarajah, Negar Soheili, Qihang Lin

from arXiv, 52 pages

Approximate linear programs (ALPs) are well-known models based on value function approximations (VFAs) to obtain policies and lower bounds on the optimal policy cost of discounted-cost Markov decision processes (MDPs). Formulating an ALP requires (i) basis functions, the linear combination of which defines the VFA, and (ii) a state-relevance distribution, which determines the relative importance of different states in the ALP objective for the purpose of minimizing VFA error. Both these choices are typically heuristic: basis function selection relies on domain knowledge while the state-relevance distribution is specified using the frequency of states visited by a heuristic policy. We propose a self-guided sequence of ALPs that embeds random basis functions obtained via inexpensive sampling and uses the known VFA from the previous iteration to guide VFA computation in the current iteration. Self-guided ALPs mitigate the need for domain knowledge during basis function selection as well as the impact of the initial choice of the state-relevance distribution, thus significantly reducing the ALP implementation burden. We establish high probability error bounds on the VFAs from this sequence and show that a worst-case measure of policy performance is improved. We find that these favorable implementation and theoretical properties translate to encouraging numerical results on perishable inventory control and options pricing applications, where self-guided ALP policies improve upon policies from problem-specific methods. More broadly, our research takes a meaningful step toward application-agnostic policies and bounds for MDPs.

TOOLS · Estimation/Estimator · Continuity · Precision/Accuracy · Exact

October 11, 2021

On the Complexity of the Plantinga-Vegter Algorithm

Felipe Cucker, Alperen A. Ergür, Josué Tonelli-Cueto

from arXiv, 32 pages, 1 figure. This paper supersedes our earlier conference paper (arXiv:1901.09234). 2nd version: restructuring of the paper and correction of typos

We introduce tools from numerical analysis and high dimensional probability for precision control and complexity analysis of subdivision-based algorithms in computational geometry. We combine these tools with the continuous amortization framework from exact computation. We use these tools on a well-known example from the subdivision family: the adaptive subdivision algorithm due to Plantinga and Vegter. The only existing complexity estimate on this rather fast algorithm was an exponential worst-case upper bound for its interval arithmetic version. We go beyond the worst-case by considering both average and smoothed analysis, and prove polynomial time complexity estimates for both interval arithmetic and finite-precision versions of the Plantinga-Vegter algorithm.

Estimation/Estimator · Maximum likelihood estimation · Maximum likelihood · Likelihood

October 9, 2021

On the benefits of maximum likelihood estimation for Regression and Forecasting

Pranjal Awasthi, Abhimanyu Das, Rajat Sen, Ananda Theertha Suresh

We advocate for a practical Maximum Likelihood Estimation (MLE) approach towards designing loss functions for regression and forecasting, as an alternative to the typical approach of direct empirical risk minimization on a specific target metric. The MLE approach is better suited to capture inductive biases such as prior domain knowledge in datasets, and can output post-hoc estimators at inference time that can optimize different types of target metrics. We present theoretical results to demonstrate that our approach is competitive with any estimator for the target metric under some general conditions. In two example practical settings, Poisson and Pareto regression, we show that our competitive results can be used to prove that the MLE approach has better excess risk bounds than directly minimizing the target metric. We also demonstrate empirically that our method instantiated with a well-designed general purpose mixture likelihood family can obtain superior performance for a variety of tasks across time-series forecasting and regression datasets with different data distributions.
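The idea of fitting a likelihood once and then reading off different post-hoc estimators for different target metrics can be illustrated with a toy Poisson model. This is only a sketch under simplifying assumptions: a single constant rate with no covariates rather than the regression setting or the mixture likelihood family of the paper, and hypothetical function names. It relies on two standard facts: the Poisson MLE of the rate is the sample mean, and the predictive mean minimizes expected squared error while the predictive median minimizes expected absolute error.

```python
import math

def poisson_mle_rate(counts):
    """Maximum likelihood estimate of a constant Poisson rate: the sample mean."""
    return sum(counts) / len(counts)

def poisson_quantile(rate, q):
    """Smallest integer m with P(Y <= m) >= q for Y ~ Poisson(rate), 0 < q < 1."""
    pmf = math.exp(-rate)  # P(Y = 0)
    cdf = pmf
    m = 0
    while cdf < q:
        m += 1
        pmf *= rate / m    # P(Y = m) from P(Y = m - 1)
        cdf += pmf
    return m

def post_hoc_prediction(rate, metric):
    """Pick the point prediction suited to the target metric from one fitted model."""
    if metric == "squared_error":
        return rate                         # predictive mean
    if metric == "absolute_error":
        return poisson_quantile(rate, 0.5)  # predictive median
    raise ValueError(f"unsupported metric: {metric}")
```

For example, with a fitted rate of 0.5 the squared-error prediction is 0.5 while the absolute-error prediction is 0, so the same fitted model yields different outputs depending on the metric, without refitting.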

GROUP · Mutually independent · SimPLe · Optimizer · Cost

October 8, 2021

Optimal Group-Sequential Tests with Groups of Random Size

Andrey Novikov, Xóchitl Itxel Popoca-Jiménez

We consider sequential hypothesis testing based on observations which are received in groups of random size. The observations are assumed to be independent both within and between the groups. We assume that the group sizes are independent and their distributions are known, and that the groups are formed independently of the observations. We are concerned with the problem of testing a simple hypothesis against a simple alternative. For any (group-) sequential test, we take into account the following three characteristics: its type I and type II error probabilities and the average cost of observations. Under mild conditions, we characterize the structure of sequential tests minimizing the average cost of observations among all sequential tests whose type I and type II error probabilities do not exceed some prescribed levels.

Sample complexity · Policy search · Estimation/Estimator · Functional · Critic

October 7, 2021

On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation

Harshat Kumar, Alec Koppel, Alejandro Ribeiro

Reinforcement learning, mathematically described by Markov Decision Problems, may be approached either through dynamic programming or policy search. Actor-critic algorithms combine the merits of both approaches by alternating between steps to estimate the value function and policy gradient updates. Because the updates exhibit correlated noise and biased gradients, only the asymptotic behavior of actor-critic is known, obtained by connecting its behavior to dynamical systems. This work puts forth a new variant of actor-critic that employs Monte Carlo rollouts during the policy search updates, which results in controllable bias that depends on the number of critic evaluations. As a result, we are able to provide for the first time the convergence rate of actor-critic algorithms when the policy search step employs policy gradient, agnostic to the choice of policy evaluation technique. In particular, we establish conditions under which the sample complexity is comparable to that of the stochastic gradient method for non-convex problems, or slower as a result of the critic estimation error, which is the main complexity bottleneck. These results hold in continuous state and action spaces with linear function approximation for the value function. We then specialize these conceptual results to the case where the critic is estimated by Temporal Difference, Gradient Temporal Difference, and Accelerated Gradient Temporal Difference. These learning rates are then corroborated on a navigation problem involving an obstacle, providing insight into the interplay between optimization and generalization in reinforcement learning.

Manifold · Approximation · Data point · Linear · Curse of dimensionality

March 7, 2019

Manifold Approximation by Moving Least-Squares Projection (MMLS)

Barak Sober, David Levin

In order to avoid the curse of dimensionality, frequently encountered in Big Data analysis, there has been extensive development of linear and nonlinear dimension reduction techniques in recent years. These techniques (sometimes referred to as manifold learning) assume that the scattered input data lies on a lower dimensional manifold, so that the high dimensionality problem can be overcome by learning the lower dimensionality behavior. However, in real life applications, data is often very noisy. In this work, we propose a method to approximate $\mathcal{M}$, a $d$-dimensional $C^{m+1}$ smooth submanifold of $\mathbb{R}^n$ ($d \ll n$), based upon noisy scattered data points (i.e., a data cloud). We assume that the data points are located "near" the lower dimensional manifold and suggest a non-linear moving least-squares projection on an approximating $d$-dimensional manifold. Under some mild assumptions, the resulting approximant is shown to be infinitely smooth and of high approximation order (i.e., $O(h^{m+1})$, where $h$ is the fill distance and $m$ is the degree of the local polynomial approximation). The method presented here assumes no analytic knowledge of the approximated manifold and the approximation algorithm is linear in the large dimension $n$. Furthermore, the approximating manifold can serve as a framework to perform operations directly on the high dimensional data in a computationally efficient manner. This way, the preparatory step of dimension reduction, which induces distortions to the data, can be avoided altogether.

UniFormer · Manifold · Approximation · Manifold learning · Performance

December 6, 2018

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

Leland McInnes, John Healy, James Melville

from arXiv, reference implementation available at //github.com/lmcinnes/umap

UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning.

Estimation/Estimator · Overfitting · Unbiased · Empirical risk · Estimation error

November 4, 2017

Positive-Unlabeled Learning with Non-Negative Risk Estimator

Ryuichi Kiryo, Gang Niu, Marthinus C. du Plessis, Masashi Sugiyama

from arXiv, NIPS 2017 camera-ready version (this paper was selected for oral presentation)

From only positive (P) and unlabeled (U) data, a binary classifier can be trained with PU learning, in which the state of the art is unbiased PU learning. However, if its model is very flexible, empirical risks on training data will go negative, and we will suffer from serious overfitting. In this paper, we propose a non-negative risk estimator for PU learning: when minimized, it is more robust against overfitting, and thus we are able to use very flexible models (such as deep neural networks) given limited P data. Moreover, we analyze the bias, consistency, and mean-squared-error reduction of the proposed risk estimator, and bound the estimation error of the resulting empirical risk minimizer. Experiments demonstrate that our risk estimator fixes the overfitting problem of its unbiased counterparts.
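The difference between the unbiased and the non-negative risk estimators can be sketched numerically. This is an illustrative sketch, not the authors' code: the function names, the choice of the sigmoid loss, and the example scores are assumptions, and the class prior `prior` (the fraction of positives in the unlabeled data) is assumed to be known. The unbiased estimator is `prior * R_p_plus + (R_u_minus - prior * R_p_minus)`, and the non-negative version clips the bracketed term at zero.

```python
import math

def sigmoid_loss(score, label):
    """Sigmoid loss l(z, y) = 1 / (1 + exp(y * z)) for labels y in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))

def mean_loss(scores, label):
    """Average loss of the classifier scores when all are treated as `label`."""
    return sum(sigmoid_loss(s, label) for s in scores) / len(scores)

def pu_risks(pos_scores, unl_scores, prior):
    """Return (unbiased, non_negative) empirical PU risks for given scores."""
    risk_p_plus = mean_loss(pos_scores, +1)   # P data scored as positive
    risk_p_minus = mean_loss(pos_scores, -1)  # P data scored as negative
    risk_u_minus = mean_loss(unl_scores, -1)  # U data scored as negative
    # Estimate of the risk on the (unobserved) negative class; it can dip
    # below zero on finite samples when the model is very flexible.
    negative_part = risk_u_minus - prior * risk_p_minus
    unbiased = prior * risk_p_plus + negative_part
    non_negative = prior * risk_p_plus + max(0.0, negative_part)
    return unbiased, non_negative
```

With hypothetical scores from an overfitted model, for instance `pu_risks([5.0, 6.0], [-5.0, -6.0], prior=0.5)`, the unbiased estimate is negative even though a risk cannot be, while the clipped estimate stays at or above zero.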