
Uncertainty sampling in active learning is heavily used in practice to reduce the annotation cost. However, there has been no wide consensus on the function to be used for uncertainty estimation in binary classification tasks and convergence guarantees of the corresponding active learning algorithms are not well understood. The situation is even more challenging for multi-category classification. In this work, we propose an efficient uncertainty estimator for binary classification which we also extend to multiple classes, and provide a non-asymptotic rate of convergence for our uncertainty sampling-based active learning algorithm in both cases under no-noise conditions (i.e., linearly separable data). We also extend our analysis to the noisy case and provide theoretical guarantees for our algorithm under the influence of noise in the task of binary and multi-class classification.
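The abstract does not spell out its uncertainty estimator, but as a hedged illustration of uncertainty sampling in binary classification, a common margin-style choice scores each unlabeled point by its distance from the 0.5 decision boundary and queries the point closest to it (all function names below are hypothetical, not the paper's):

```python
import numpy as np

def margin_uncertainty(scores: np.ndarray) -> np.ndarray:
    """Uncertainty of binary classifier scores in [0, 1]:
    highest (1.0) when a score sits exactly on the 0.5 boundary."""
    return 1.0 - 2.0 * np.abs(scores - 0.5)

def select_query(scores: np.ndarray) -> int:
    """Index of the most uncertain unlabeled point to label next."""
    return int(np.argmax(margin_uncertainty(scores)))

scores = np.array([0.9, 0.52, 0.1, 0.35])
print(select_query(scores))  # 1: the score nearest the boundary
```

In a full active learning loop this selection step alternates with retraining the classifier on the growing labeled set.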

Related content

Active learning is a subfield of machine learning (and, more broadly, of artificial intelligence); in statistics it is also known as query learning or optimal experimental design. The "learning module" and the "selection strategy" are the two basic and essential components of an active learning algorithm. In education, active learning is "a method of learning in which students actively or experientially participate in the learning process, and there are different degrees of active learning depending on the extent of student involvement" (Bonwell & Eison 1991). Bonwell & Eison (1991) note that "students engage in activities beyond passively listening to lectures." In a report by the Association for the Study of Higher Education (ASHE), the authors discuss various methods of promoting active learning. They cite literature showing that to learn, students must do more than just listen: they must read, write, discuss, and engage in problem solving. This process involves three learning domains, namely knowledge, skills, and attitudes (KSA), and this taxonomy of learning behaviors can be viewed as "the goals of the learning process." In particular, students must engage in higher-order thinking tasks such as analysis, synthesis, and evaluation.

We study reinforcement learning (RL) with linear function approximation. Existing algorithms for this problem only have high-probability regret and/or Probably Approximately Correct (PAC) sample complexity guarantees, which cannot guarantee the convergence to the optimal policy. In this paper, in order to overcome the limitation of existing algorithms, we propose a new algorithm called FLUTE, which enjoys uniform-PAC convergence to the optimal policy with high probability. The uniform-PAC guarantee is the strongest possible guarantee for reinforcement learning in the literature, which can directly imply both PAC and high probability regret bounds, making our algorithm superior to all existing algorithms with linear function approximation. At the core of our algorithm is a novel minimax value function estimator and a multi-level partition scheme to select the training samples from historical observations. Both of these techniques are new and of independent interest.

In many practical settings, control decisions must be made under partial/imperfect information about the evolution of a relevant state variable. Partially Observable Markov Decision Processes (POMDPs) provide a relatively well-developed framework for modeling and analyzing such problems. In this paper we consider the structural estimation of the primitives of a POMDP model based upon the observable history of the process. We analyze the structural properties of the POMDP model with random rewards and specify conditions under which the model is identifiable without knowledge of the state dynamics. We consider a soft policy gradient algorithm to compute a maximum likelihood estimator and provide a finite-time characterization of convergence to a stationary point. We illustrate the estimation methodology with an application to optimal equipment replacement. In this context, replacement decisions must be made under partial/imperfect information on the true state (i.e. the condition of the equipment). We use synthetic and real data to highlight the robustness of the proposed methodology and characterize the potential for misspecification when partial state observability is ignored.

The deep-learning-based least squares method has shown successful results in solving high-dimensional non-linear partial differential equations (PDEs). However, this method usually converges slowly. To speed up the convergence of this approach, an active-learning-based sampling algorithm is proposed in this paper. This algorithm actively chooses the most informative training samples from a probability density function based on residual errors to facilitate error reduction. In particular, points with larger residual errors will have more chances of being selected for training. This algorithm imitates the human learning process: learners are likely to spend more time repeatedly studying mistakes than other tasks they have correctly finished. A series of numerical results are illustrated to demonstrate the effectiveness of our active-learning-based sampling in high dimensions to speed up the convergence of the deep-learning-based least squares method.
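As a rough sketch of the residual-based sampling idea described above (not the paper's exact algorithm), one can draw training points with probability proportional to their squared residual errors, so high-error regions are revisited more often; the function names and stand-in residuals below are hypothetical:

```python
import numpy as np

def sample_by_residual(candidates, residuals, n, seed=None):
    """Draw n training points without replacement, with probability
    proportional to each candidate's squared PDE residual error."""
    rng = np.random.default_rng(seed)
    p = residuals ** 2
    p = p / p.sum()                     # normalize to a density over candidates
    idx = rng.choice(len(candidates), size=n, replace=False, p=p)
    return candidates[idx]

rng = np.random.default_rng(0)
cand = rng.uniform(-1, 1, size=(1000, 2))   # candidate collocation points in 2-D
res = np.abs(cand[:, 0])                    # stand-in for PDE residual magnitudes
picked = sample_by_residual(cand, res, n=100, seed=1)
print(picked.shape)  # (100, 2)
```

Each training epoch would then refresh `res` from the current network's residuals and resample.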

This paper focuses on the expected difference in borrowers' repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects, and hence the estimation error can be substantial. As such, we propose another approach to construct the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly.

Modern online advertising systems inevitably rely on personalization methods, such as click-through rate (CTR) prediction. Recent progress in CTR prediction enjoys the rich representation capabilities of deep learning and achieves great success in large-scale industrial applications. However, these methods can suffer from a lack of exploration. Another line of prior work addresses the exploration-exploitation trade-off with contextual bandit methods, which have been less studied in industry recently due to the difficulty of extending their flexibility with deep models. In this paper, we propose a novel Deep Uncertainty-Aware Learning (DUAL) method to learn deep CTR models based on Gaussian processes, which can provide efficient uncertainty estimations along with the CTR predictions while maintaining the flexibility of deep neural networks. By linking the ability of DUAL to estimate predictive uncertainties to well-known bandit algorithms, we further present DUAL-based Ad-ranking strategies to boost long-term utilities such as the social welfare in advertising systems. Experimental results on several public datasets demonstrate the effectiveness of our methods. Remarkably, an online A/B test deployed in the Alibaba display advertising platform shows an $8.2\%$ social welfare improvement and an $8.0\%$ revenue lift.
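The DUAL models themselves are not reproduced here, but the kind of bandit-style ranking that per-ad uncertainty estimates enable can be sketched: a UCB-like rule adds an exploration bonus proportional to each ad's predictive uncertainty (the `alpha` weight and all names are illustrative assumptions, not the paper's method):

```python
import numpy as np

def ucb_rank(ctr_mean, ctr_std, alpha=1.0):
    """Rank ads by an optimistic CTR score: mean + alpha * std.
    Ads with higher uncertainty earn a larger exploration bonus."""
    score = ctr_mean + alpha * ctr_std
    return np.argsort(-score)           # indices, best ad first

mean = np.array([0.05, 0.04, 0.02])     # predicted CTRs
std = np.array([0.001, 0.02, 0.001])    # predictive uncertainties
print(ucb_rank(mean, std))  # [1 0 2]: ad 1 overtakes ad 0 via its bonus
```

Replacing the optimistic score with a draw from each ad's posterior would give a Thompson-sampling variant of the same idea.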

Learning to classify unseen class samples at test time is popularly referred to as zero-shot learning (ZSL). If test samples can be from training (seen) as well as unseen classes, it is a more challenging problem due to the existence of strong bias towards seen classes. This problem is generally known as \emph{generalized} zero-shot learning (GZSL). Thanks to the recent advances in generative models such as VAEs and GANs, sample synthesis based approaches have gained considerable attention for solving this problem. These approaches are able to handle the problem of class bias by synthesizing unseen class samples. However, these ZSL/GZSL models suffer due to the following key limitations: $(i)$ Their training stage learns a class-conditioned generator using only \emph{seen} class data and the training stage does not \emph{explicitly} learn to generate the unseen class samples; $(ii)$ They do not learn a generic optimal parameter which can easily generalize for both seen and unseen class generation; and $(iii)$ If we only have access to a very few samples per seen class, these models tend to perform poorly. In this paper, we propose a meta-learning based generative model that naturally handles these limitations. The proposed model is based on integrating model-agnostic meta learning with a Wasserstein GAN (WGAN) to handle $(i)$ and $(iii)$, and uses a novel task distribution to handle $(ii)$. Our proposed model yields significant improvements on standard ZSL as well as more challenging GZSL setting. In ZSL setting, our model yields 4.5\%, 6.0\%, 9.8\%, and 27.9\% relative improvements over the current state-of-the-art on CUB, AWA1, AWA2, and aPY datasets, respectively.

Alternating Direction Method of Multipliers (ADMM) is a widely used tool for machine learning in distributed settings, where a machine learning model is trained over distributed data sources through an interactive process of local computation and message passing. Such an iterative process could cause privacy concerns of data owners. The goal of this paper is to provide differential privacy for ADMM-based distributed machine learning. Prior approaches on differentially private ADMM exhibit low utility under high privacy guarantee and often assume the objective functions of the learning problems to be smooth and strongly convex. To address these concerns, we propose a novel differentially private ADMM-based distributed learning algorithm called DP-ADMM, which combines an approximate augmented Lagrangian function with time-varying Gaussian noise addition in the iterative process to achieve higher utility for general objective functions under the same differential privacy guarantee. We also apply the moments accountant method to bound the end-to-end privacy loss. The theoretical analysis shows that DP-ADMM can be applied to a wider class of distributed learning problems, is provably convergent, and offers an explicit utility-privacy tradeoff. To our knowledge, this is the first paper to provide explicit convergence and utility properties for differentially private ADMM-based distributed learning algorithms. The evaluation results demonstrate that our approach can achieve good convergence and model accuracy under high end-to-end differential privacy guarantee.
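DP-ADMM itself combines an approximate augmented Lagrangian with careful privacy accounting, which is beyond a short snippet; as a deliberately minimal, hypothetical illustration of one ingredient only, a first-order local update perturbed by time-varying Gaussian noise might look like the following (the decay schedule and all names are assumptions, not the paper's algorithm):

```python
import numpy as np

def noisy_local_update(grad, x, rho, t, sigma0=1.0, seed=None):
    """One noisy first-order step: gradient step plus Gaussian noise
    whose scale decays with the iteration counter t (time-varying noise)."""
    rng = np.random.default_rng(seed)
    sigma = sigma0 / np.sqrt(t)         # noise shrinks as iterations progress
    noise = rng.normal(0.0, sigma, size=x.shape)
    return x - rho * (grad + noise)

x_new = noisy_local_update(np.ones(3), np.zeros(3), rho=0.1, t=4, seed=0)
print(x_new.shape)  # (3,)
```

In the actual algorithm the noise scale would be derived from the desired $(\epsilon, \delta)$ guarantee via the moments accountant, not chosen ad hoc as here.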

Deep learning (DL) is a high-dimensional data reduction technique for constructing high-dimensional predictors in input-output models. DL is a form of machine learning that uses hierarchical layers of latent features. In this article, we review the state of the art of deep learning from a modeling and algorithmic perspective. We provide a list of successful areas of application in Artificial Intelligence (AI), Image Processing, Robotics and Automation. Deep learning is predictive in nature rather than inferential and can be viewed as a black-box methodology for high-dimensional function estimation.

From only positive (P) and unlabeled (U) data, a binary classifier could be trained with PU learning, in which the state of the art is unbiased PU learning. However, if its model is very flexible, empirical risks on training data will go negative, and we will suffer from serious overfitting. In this paper, we propose a non-negative risk estimator for PU learning: when getting minimized, it is more robust against overfitting, and thus we are able to use very flexible models (such as deep neural networks) given limited P data. Moreover, we analyze the bias, consistency, and mean-squared-error reduction of the proposed risk estimator, and bound the estimation error of the resulting empirical risk minimizer. Experiments demonstrate that our risk estimator fixes the overfitting problem of its unbiased counterparts.
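A minimal sketch of the clipped (non-negative) PU risk described above, using the sigmoid loss and a known class prior `pi_p` (the mini-batch handling and optimization details of the paper are omitted, and the variable names are ours):

```python
import numpy as np

def sigmoid_loss(z, y):
    """Sigmoid loss l(z, y) = 1 / (1 + exp(y * z)) for label y in {+1, -1}."""
    return 1.0 / (1.0 + np.exp(y * z))

def nn_pu_risk(scores_p, scores_u, pi_p):
    """Non-negative PU risk: the estimated negative-class risk
    (unlabeled risk minus the positive contribution) is clipped at zero,
    which is what prevents the empirical risk from going negative."""
    risk_p_pos = np.mean(sigmoid_loss(scores_p, +1))
    risk_p_neg = np.mean(sigmoid_loss(scores_p, -1))
    risk_u_neg = np.mean(sigmoid_loss(scores_u, -1))
    return pi_p * risk_p_pos + max(0.0, risk_u_neg - pi_p * risk_p_neg)

rng = np.random.default_rng(0)
scores_p = rng.normal(1.0, 1.0, size=50)    # classifier scores on P data
scores_u = rng.normal(0.0, 1.5, size=200)   # classifier scores on U data
print(nn_pu_risk(scores_p, scores_u, pi_p=0.4) >= 0.0)  # True
```

Dropping the `max(0, ...)` clip recovers the unbiased PU risk, whose negative excursions are exactly the overfitting symptom the abstract describes.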

In recent years, active learning has evolved into a popular paradigm for utilizing users' feedback to improve the accuracy of learning algorithms. Active learning works by selecting the most informative sample among unlabeled data and querying the label of that point from the user. Many different methods, such as uncertainty sampling and minimum risk sampling, have been used to select the most informative sample in active learning. Although many active learning algorithms have been proposed so far, most of them work with binary or multi-class classification problems and therefore cannot be applied to problems in which only samples from one class, together with a set of unlabeled data, are available. Such problems arise in many real-world situations and are known as the problem of learning from positive and unlabeled data. In this paper we propose an active learning algorithm that works when only samples of one class and a set of unlabeled data are available. Our method works by separately estimating the probability density of positive and unlabeled points and then computing the expected value of informativeness, which removes a hyper-parameter and gives a better measure of informativeness. Experiments and empirical analysis show promising results compared to other similar methods.
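The density-based idea can be sketched in one dimension as follows; this is a hedged illustration, not the paper's implementation (the kernel bandwidth `h`, the assumed prior `pi_p`, and all names are our assumptions): separate kernel density estimates of the positive and unlabeled samples yield a rough posterior, whose closeness to 0.5 could then drive an informativeness score:

```python
import numpy as np

def kde(points, x, h=0.5):
    """1-D Gaussian kernel density estimate of `points`, evaluated at `x`."""
    d = (x[:, None] - points[None, :]) / h
    return np.exp(-0.5 * d ** 2).mean(axis=1) / (h * np.sqrt(2 * np.pi))

def posterior_positive(p_points, u_points, x, pi_p=0.5, h=0.5):
    """Rough P(y=+1 | x): prior times the ratio of the positive-sample
    density to the unlabeled (mixture) density, clipped into [0, 1]."""
    f_p = kde(p_points, x, h)
    f_u = kde(u_points, x, h)
    return np.clip(pi_p * f_p / np.maximum(f_u, 1e-12), 0.0, 1.0)

p_pts = np.array([2.0, 2.5, 3.0])                     # positive samples
u_pts = np.array([-0.5, 0.0, 0.5, 2.0, 2.5, 3.0])     # unlabeled samples
x = np.array([2.5, 0.0])
post = posterior_positive(p_pts, u_pts, x)
print(post[0] > post[1])  # True: the point near the positives scores higher
```

An informativeness score such as `1 - 2 * np.abs(post - 0.5)` would then pick the query point, mirroring uncertainty sampling in the PU setting.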
