
We consider a batch active learning scenario where the learner adaptively issues batches of points to a labeling oracle. Sampling labels in batches is highly desirable in practice because it reduces the number of interactive rounds with the labeling oracle (often human beings). However, batch active learning typically pays the price of reduced adaptivity, leading to suboptimal results. In this paper we propose a solution that carefully trades off the informativeness of the queried points against their diversity. We theoretically investigate batch active learning in the practically relevant scenario where the unlabeled pool of data is available beforehand (pool-based active learning). We analyze a novel stage-wise greedy algorithm and show that, as a function of the label complexity, the excess risk of this algorithm in the realizable setting matches the known minimax rates of standard statistical learning settings. Our results also exhibit a mild dependence on the batch size. These are the first theoretical results that leverage a careful trade-off between informativeness and diversity to rigorously quantify the statistical performance of batch active learning in the pool-based scenario.
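
As a concrete (and deliberately simplified) illustration of the informativeness-diversity trade-off in pool-based batch selection, the sketch below greedily builds a batch by scoring each candidate with an uncertainty term plus a distance-to-batch term. The scoring rule, the weight `lam`, and the function names are illustrative choices, not the paper's algorithm.

```python
import numpy as np

def greedy_batch_select(pool_X, informativeness, batch_size, lam=1.0):
    """Greedily build a batch that balances informativeness and diversity.

    pool_X          : (n, d) array of unlabeled feature vectors
    informativeness : (n,) per-point score (e.g., model uncertainty)
    lam             : weight on the diversity term (illustrative choice)
    """
    selected = []
    # Distance from each pool point to its nearest already-selected point.
    min_dist = np.full(len(pool_X), np.inf)
    for _ in range(batch_size):
        # No diversity bonus on the first pick (min_dist is still infinite).
        diversity = np.where(np.isinf(min_dist), 0.0, min_dist)
        score = informativeness + lam * diversity
        score[selected] = -np.inf               # never re-pick a point
        i = int(np.argmax(score))
        selected.append(i)
        min_dist = np.minimum(min_dist, np.linalg.norm(pool_X - pool_X[i], axis=1))
    return selected
```

With `lam = 0` this degenerates to pure uncertainty sampling, which tends to fill a batch with near-duplicates; the diversity term is what prevents a non-adaptive batch from spending its whole budget on one region of the input space.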

Related content

Active learning is a subfield of machine learning (and, more broadly, artificial intelligence); in statistics it is also known as query learning or optimal experimental design. The "learning module" and the "selection strategy" are the two basic and essential components of an active learning algorithm. Active learning is also "a method of learning in which students actively or experientially participate in the learning process, with different degrees of active learning depending on the level of student involvement" (Bonwell & Eison 1991). Bonwell & Eison (1991) note that "students do something other than passively listen to a lecture." In a report by the Association for the Study of Higher Education (ASHE), the authors discuss a variety of methods for promoting active learning. They cite literature indicating that, to learn, students must do more than just listen: they must read, write, discuss, and engage in problem solving. This process involves the three learning domains of knowledge, skills, and attitudes (KSA), a taxonomy of learning behaviors that can be thought of as "the goals of the learning process." In particular, students must engage in higher-order thinking tasks such as analysis, synthesis, and evaluation.

The goal of co-salient object detection (CoSOD) is to discover salient objects that commonly appear in a query group containing two or more relevant images. How to effectively extract inter-image correspondence is therefore crucial for the CoSOD task. In this paper, we propose a global-and-local collaborative learning architecture, which includes a global correspondence modeling (GCM) module and a local correspondence modeling (LCM) module to capture comprehensive inter-image correspondence among different images from the global and local perspectives. Firstly, we treat the different images as different time slices and use 3D convolution to integrate all intra-image features, which more fully extracts the global group semantics. Secondly, we design a pairwise correlation transformation (PCT) to explore similarity correspondence between pairs of images, and combine the multiple local pairwise correspondences to generate the local inter-image relationship. Thirdly, the inter-image relationships of the GCM and LCM are integrated through a global-and-local correspondence aggregation (GLA) module to explore more comprehensive inter-image collaboration cues. Finally, the intra- and inter-features are adaptively integrated by an intra-and-inter weighting fusion (AEWF) module to learn co-saliency features and predict the co-saliency map. The proposed GLNet is evaluated on three prevailing CoSOD benchmark datasets, demonstrating that our model, trained on a small dataset (about 3k images), still outperforms eleven state-of-the-art competitors trained on much larger datasets (about 8k-200k images).
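
A minimal sketch of the GCM idea of mixing group members via 3D convolution, assuming pre-extracted per-image feature maps; the tensor shapes, channel width, and class name are illustrative assumptions, and the real GLNet module is more elaborate.

```python
import torch
import torch.nn as nn

class GlobalGroupSemantics(nn.Module):
    """Mix information across the N images of a query group by treating the
    group dimension as the 'depth' of a 3D convolution. Shapes and the
    channel width are illustrative, not GLNet's actual configuration."""

    def __init__(self, channels=64):
        super().__init__()
        self.conv3d = nn.Conv3d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feats):                    # feats: (B, N, C, H, W)
        x = feats.permute(0, 2, 1, 3, 4)         # (B, C, N, H, W): group as depth
        x = torch.relu(self.conv3d(x))           # fuse features across the group
        return x.permute(0, 2, 1, 3, 4)          # back to (B, N, C, H, W)

# Example: a group of 5 images with 64-channel, 32x32 feature maps.
out = GlobalGroupSemantics()(torch.randn(2, 5, 64, 32, 32))
```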

Despite the recent advances in the field of object detection, common architectures are still ill-suited to incrementally detect new categories over time. They are vulnerable to catastrophic forgetting: they forget what has already been learned while updating their parameters in the absence of the original training data. Previous works extended standard classification methods to the object detection task, mainly adopting the knowledge distillation framework. However, we argue that object detection introduces an additional problem that has been overlooked. While objects belonging to new classes are learned thanks to their annotations, if no supervision is provided for other objects that may still be present in the input, the model learns to associate them with background regions. We propose to handle these missing annotations by revisiting the standard knowledge distillation framework, and we show that our approach outperforms current state-of-the-art methods in every setting of the Pascal-VOC 2007 dataset. We further propose a simple extension to instance segmentation, which also outperforms the other baselines.
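
For readers unfamiliar with the distillation term such methods build on, here is a generic temperature-scaled knowledge-distillation loss on classification logits. It is a textbook (Hinton-style) formulation, not this paper's exact loss, which additionally accounts for the unannotated-object problem.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)      # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """Mean KL(teacher || student) over temperature-softened class scores.
    Keeping this term small while training on new-class annotations is what
    discourages the updated detector from drifting on the old classes."""
    p = softmax(teacher_logits / T)
    q = softmax(student_logits / T)
    kl = np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)
    return float(np.mean(kl))
```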

Annotating data for supervised learning can be costly. When the annotation budget is limited, active learning can be used to select and annotate those observations that are likely to give the most gain in model performance. We propose an active learning algorithm that, in addition to selecting which observation to annotate, selects the precision of the annotation that is acquired. Assuming that annotations with low precision are cheaper to obtain, this allows the model to explore a larger part of the input space, with the same annotation costs. We build our acquisition function on the previously proposed BALD objective for Gaussian Processes, and empirically demonstrate the gains of being able to adjust the annotation precision in the active learning loop.
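
Under a Gaussian-likelihood assumption, the information a single noisy label carries about a GP posterior has a closed form, which makes the idea easy to sketch: score every (candidate point, annotation precision) pair by information gain per unit cost. The gain-per-cost selection rule and all names below are illustrative assumptions, not the paper's acquisition function.

```python
import numpy as np

def info_gain(sigma_f2, sigma_n2):
    """Information (in nats) a noisy label carries about the GP posterior at
    one input, for Gaussian likelihoods: 0.5 * log(1 + signal/noise)."""
    return 0.5 * np.log1p(sigma_f2 / sigma_n2)

def pick_query(sigma_f2, noise_levels, costs):
    """Choose the (candidate, precision) pair maximizing gain per unit cost.

    sigma_f2     : (n,) GP posterior variance at each candidate input
    noise_levels : (k,) observation-noise variance of each precision level
    costs        : (k,) annotation cost per precision (low noise = expensive)
    """
    gain = info_gain(np.asarray(sigma_f2)[:, None], np.asarray(noise_levels)[None, :])
    return np.unravel_index(np.argmax(gain / np.asarray(costs)[None, :]), gain.shape)

# Example: 4 candidates, 2 precision levels (coarse-and-cheap vs fine-and-costly).
print(pick_query([0.1, 0.5, 1.0, 0.2], noise_levels=[1.0, 0.05], costs=[1.0, 5.0]))
```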

Despite the recent progress, the existing multi-view unsupervised feature selection methods mostly suffer from two limitations. First, they generally utilize either cluster structure or similarity structure to guide the feature selection, neglecting the possibility of a joint formulation with mutual benefits. Second, they often learn the similarity structure by either global structure learning or local structure learning, lacking the capability of graph learning with both global and local structural awareness. In light of this, this paper presents a joint multi-view unsupervised feature selection and graph learning (JMVFG) approach. Particularly, we formulate the multi-view feature selection with orthogonal decomposition, where each target matrix is decomposed into a view-specific basis matrix and a view-consistent cluster indicator. Cross-space locality preservation is incorporated to bridge the cluster structure learning in the projected space and the similarity learning (i.e., graph learning) in the original space. Further, a unified objective function is presented to enable the simultaneous learning of the cluster structure, the global and local similarity structures, and the multi-view consistency and inconsistency, upon which an alternating optimization algorithm is developed with theoretically proven convergence. Extensive experiments demonstrate the superiority of our approach for both multi-view feature selection and graph learning tasks.

An important challenge in statistical analysis lies in controlling the estimation bias when handling ever-increasing data sizes and model complexity. For example, approximate methods are increasingly used to address the analytical and/or computational challenges of implementing standard estimators, but they often lead to inconsistent estimators. Consistent estimators can thus be difficult to obtain, especially for complex models and/or in settings where the number of parameters diverges with the sample size. We propose a general simulation-based estimation framework that allows one to construct consistent and bias-corrected estimators for parameters of increasing dimension. The key advantage of the proposed framework is that it only requires computing a simple inconsistent estimator multiple times. The resulting Just Identified iNdirect Inference estimator (JINI) enjoys desirable properties, including consistency, asymptotic normality, and better finite-sample bias correction than alternative methods. We further provide a simple algorithm to construct the JINI in a computationally efficient manner. The JINI is therefore especially useful in settings where standard methods may be challenging to apply, for example in the presence of misclassification and rounding. We conduct comprehensive simulation studies and analyze an alcohol consumption data example to illustrate the excellent performance and usefulness of the method.
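
The "compute a simple inconsistent estimator multiple times" recipe can be sketched as an iterative bootstrap: repeatedly simulate data at the current parameter value, apply the same inconsistent estimator to the simulations, and shift the parameter by the estimated bias. This is a minimal sketch under strong simplifications (just-identified case, fixed simulation budget), not the paper's exact construction.

```python
import numpy as np

def jini(pi_obs, simulate, pi_hat, n_iter=50, n_sim=20, seed=0):
    """Iterative-bootstrap sketch of a just-identified indirect-inference fit.

    pi_obs   : the simple (inconsistent) estimate computed on observed data
    simulate : simulate(theta, rng) -> synthetic dataset drawn under theta
    pi_hat   : pi_hat(data) -> the same inconsistent estimator on any dataset
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(pi_obs, dtype=float)     # initialize at the naive estimate
    for _ in range(n_iter):
        sims = np.mean([pi_hat(simulate(theta, rng)) for _ in range(n_sim)], axis=0)
        theta = theta + (pi_obs - sims)         # remove the simulated bias
    return theta

# Toy example: estimating a Bernoulli probability from misclassified labels.
rng = np.random.default_rng(1)
p_true, flip, n = 0.3, 0.1, 5000
y = (rng.random(n) < p_true) ^ (rng.random(n) < flip)      # labels flipped w.p. 0.1
sim = lambda th, r: (r.random(n) < np.clip(th, 0, 1)) ^ (r.random(n) < flip)
print(jini(np.mean(y), sim, np.mean))                      # close to 0.3
```

The naive mean converges to p + flip - 2*p*flip (0.34 here, not 0.3); matching the observed statistic to its simulated counterpart under the candidate parameter is what undoes that bias.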

In this monograph, I introduce the basic concepts of Online Learning through a modern view of Online Convex Optimization. Here, online learning refers to the framework of regret minimization under worst-case assumptions. I present first-order and second-order algorithms for online learning with convex losses, in Euclidean and non-Euclidean settings. All the algorithms are clearly presented as instantiations of Online Mirror Descent or Follow-The-Regularized-Leader and their variants. Particular attention is given to the issue of tuning the parameters of the algorithms and to learning in unbounded domains, through adaptive and parameter-free online learning algorithms. Non-convex losses are dealt with through convex surrogate losses and through randomization. The bandit setting is also briefly discussed, touching on the problem of adversarial and stochastic multi-armed bandits. These notes do not require prior knowledge of convex analysis, and all the required mathematical tools are rigorously explained. Moreover, all the proofs have been carefully chosen to be as simple and as short as possible.
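
As a taste of the style of algorithm covered, the sketch below is projected online (sub)gradient descent, i.e., Online Mirror Descent instantiated with the Euclidean regularizer. The decreasing step size is one standard choice yielding O(sqrt(T)) regret for bounded convex losses; the monograph also covers adaptive and parameter-free tunings that remove the need to pick `eta`.

```python
import numpy as np

def ogd(grad_fn, x0, T, eta=1.0, proj=lambda x: x):
    """Projected online (sub)gradient descent: Online Mirror Descent with the
    Euclidean regularizer. An eta/sqrt(t) step size gives O(sqrt(T)) regret
    for bounded convex losses on a bounded domain."""
    x = np.asarray(x0, dtype=float)
    for t in range(1, T + 1):
        g = grad_fn(t, x)                        # gradient of the round-t loss at x
        x = proj(x - (eta / np.sqrt(t)) * g)
    return x

# Toy run: track a drifting stream under squared loss l_t(x) = (x - z_t)^2.
z = np.sin(np.linspace(0, 3, 200))
print(ogd(lambda t, x: 2 * (x - z[t - 1]), x0=0.0, T=200))
```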

Graph Neural Networks (GNNs), which generalize deep neural networks to graph-structured data, have drawn considerable attention and achieved state-of-the-art performance in numerous graph related tasks. However, existing GNN models mainly focus on designing graph convolution operations. The graph pooling (or downsampling) operations, which play an important role in learning hierarchical representations, are usually overlooked. In this paper, we propose a novel graph pooling operator, called Hierarchical Graph Pooling with Structure Learning (HGP-SL), which can be integrated into various graph neural network architectures. HGP-SL incorporates graph pooling and structure learning into a unified module to generate hierarchical representations of graphs. More specifically, the graph pooling operation adaptively selects a subset of nodes to form an induced subgraph for the subsequent layers. To preserve the integrity of the graph's topological information, we further introduce a structure learning mechanism to learn a refined graph structure for the pooled graph at each layer. By combining the HGP-SL operator with graph neural networks, we perform graph-level representation learning with a focus on the graph classification task. Experimental results on six widely used benchmarks demonstrate the effectiveness of our proposed model.
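
A stripped-down sketch of score-based top-k node pooling, in the spirit of (but not identical to) the HGP-SL pooling step: nodes whose features are poorly reconstructed by a neighborhood average are deemed most informative and kept, and the induced subgraph is passed on. The scoring rule is an illustrative proxy, and HGP-SL additionally relearns a refined edge structure on the pooled graph, which this sketch omits.

```python
import numpy as np

def topk_pool(A, X, ratio=0.5):
    """Keep the highest-scoring nodes and return the induced subgraph.

    A : (n, n) adjacency matrix, X : (n, d) node features.
    A node scores highly when its features are poorly explained by the
    average of its neighbors (an illustrative 'information' proxy).
    """
    deg = A.sum(axis=1, keepdims=True) + 1e-9
    neighborhood_avg = (A @ X) / deg
    score = np.linalg.norm(X - neighborhood_avg, axis=1)
    k = max(1, int(ratio * len(X)))
    keep = np.argsort(-score)[:k]
    return A[np.ix_(keep, keep)], X[keep]        # induced subgraph + its features
```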

Many tasks in natural language processing can be viewed as multi-label classification problems. However, most of the existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all the labels, which completely ignores the complexity of and dependencies among different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn the training policies and prediction policies for different labels. The training policies are then used to train the classifier with the cross-entropy loss function, and the prediction policies are used at prediction time. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method can obtain more accurate multi-label classification results.
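
To make the "fixed prediction policy" baseline concrete, the sketch below tunes one threshold per label on held-out data by maximizing per-label F1. This simple non-learned policy is only a baseline illustration; the paper instead meta-learns the training and prediction policies jointly.

```python
import numpy as np

def f1(pred, true):
    """F1 score for one label given boolean predictions and 0/1 targets."""
    tp = np.sum(pred & (true == 1))
    fp = np.sum(pred & (true == 0))
    fn = np.sum(~pred & (true == 1))
    return 2 * tp / max(2 * tp + fp + fn, 1)

def per_label_thresholds(probs, y_true, grid=np.linspace(0.05, 0.95, 19)):
    """Tune one decision threshold per label on held-out data, rather than
    using a fixed 0.5 for every label.

    probs  : (n, L) predicted probabilities, y_true : (n, L) 0/1 targets.
    """
    thresholds = np.empty(probs.shape[1])
    for j in range(probs.shape[1]):
        scores = [f1(probs[:, j] >= t, y_true[:, j]) for t in grid]
        thresholds[j] = grid[int(np.argmax(scores))]
    return thresholds
```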

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
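
The effective-number formula is explicit enough to compute directly. The sketch below turns per-class counts into class-balanced loss weights by inverting the effective number; normalizing the weights to sum to the number of classes is one common convention, not part of the formula itself.

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class loss weights from the effective number of samples
    E_n = (1 - beta**n) / (1 - beta), normalized to sum to the class count."""
    n = np.asarray(samples_per_class, dtype=float)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    w = 1.0 / effective_num
    return w * len(n) / w.sum()

# A long-tailed toy split: the rarest class gets by far the largest weight.
print(class_balanced_weights([5000, 500, 50, 5]))
```

Note the limiting behavior: beta = 0 makes every class weight equal (no re-balancing), while beta close to 1 approaches inverse-frequency weighting.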

Deep learning has yielded state-of-the-art performance on many natural language processing tasks including named entity recognition (NER). However, this typically requires large amounts of labeled data. In this work, we demonstrate that the amount of labeled training data can be drastically reduced when deep learning is combined with active learning. While active learning is sample-efficient, it can be computationally expensive since it requires iterative retraining. To speed this up, we introduce a lightweight architecture for NER, viz., the CNN-CNN-LSTM model consisting of convolutional character and word encoders and a long short term memory (LSTM) tag decoder. The model achieves nearly state-of-the-art performance on standard datasets for the task while being computationally much more efficient than the best performing models. We carry out incremental active learning during the training process and are able to nearly match state-of-the-art performance with just 25\% of the original training data.
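
The incremental active learning loop itself is easy to state generically: retrain (warm-started) on the labeled set, score the pool by model uncertainty, and move the top-scoring points into the labeled set. The sketch below assumes user-supplied `train` and `uncertainty` callables and says nothing about the CNN-CNN-LSTM model; it illustrates the loop, not the paper's system.

```python
import numpy as np

def active_learning_loop(X_pool, y_pool, train, uncertainty,
                         rounds=10, batch=100, seed=0):
    """Generic uncertainty-sampling loop with warm-started retraining.
    `train(X, y, warm_start=model)` and `uncertainty(model, X)` are
    user-supplied callables; nothing here is NER-specific."""
    rng = np.random.default_rng(seed)
    labeled = list(rng.choice(len(X_pool), size=batch, replace=False))
    model = None
    for _ in range(rounds):
        model = train(X_pool[labeled], y_pool[labeled], warm_start=model)
        u = np.asarray(uncertainty(model, X_pool), dtype=float)
        u[labeled] = -np.inf                     # only query unlabeled points
        labeled.extend(int(i) for i in np.argsort(-u)[:batch])
    return model, labeled
```

Warm-starting each round from the previous model is what the paper calls incremental training; it is the main lever for keeping the iterative retraining affordable.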
