
We propose a methodology for detecting multiple change points in the mean of an otherwise stationary, autocorrelated, linear time series. It combines solution path generation based on the wild contrast maximisation principle, and an information criterion-based model selection strategy termed the gappy Schwarz algorithm. The former is well-suited to separating shifts in the mean from fluctuations due to serial correlations, while the latter simultaneously estimates the dependence structure and the number of change points without performing the difficult task of estimating the level of the noise as quantified, e.g., by the long-run variance. We provide a modular investigation of their theoretical properties and show that the combined methodology, named WCM.gSa, achieves consistency in estimating both the total number and the locations of the change points. The good performance of WCM.gSa is demonstrated via extensive simulation studies, and we further illustrate its usefulness by applying the methodology to London air quality data.
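To make the contrast-maximisation idea concrete, below is a minimal sketch of locating a single mean shift with a standard CUSUM contrast on simulated data. It is not the WCM.gSa procedure above (there are no wild intervals, no gappy Schwarz selection, and no handling of serial correlation); the function name and toy data are illustrative assumptions only.

```python
import numpy as np

def cusum_change_point(x):
    """Locate a single mean shift by maximising the standard CUSUM contrast.

    Illustration only: handles one change point and ignores serial
    correlation, unlike the WCM.gSa methodology described above.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    total = x.sum()
    cum = 0.0
    best_k, best_stat = None, -np.inf
    for k in range(1, n):                      # candidate split after the k-th observation
        cum += x[k - 1]
        stat = abs(np.sqrt(n / (k * (n - k))) * (cum - k / n * total))
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat

# toy example: mean jumps from 0 to 1 at t = 150
rng = np.random.default_rng(0)
x = np.concatenate([np.zeros(150), np.ones(150)]) + rng.normal(size=300)
print(cusum_change_point(x))                   # estimated location near 150
```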

Related content

Imperfections in data annotation, known as label noise, are detrimental to the training of machine learning models and have an often-overlooked confounding effect on the assessment of model performance. Nevertheless, employing experts to remove label noise by fully re-annotating large datasets is infeasible in resource-constrained settings, such as healthcare. This work advocates for a data-driven approach to prioritising samples for re-annotation - which we term "active label cleaning". We propose to rank instances according to estimated label correctness and labelling difficulty of each sample, and introduce a simulation framework to evaluate relabelling efficacy. Our experiments on natural images and on a new medical imaging benchmark show that cleaning noisy labels mitigates their negative impact on model training, evaluation, and selection. Crucially, the proposed active label cleaning enables correcting labels up to 4 times more effectively than typical random selection in realistic conditions, making better use of experts' valuable time for improving dataset quality.
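As a rough illustration of ranking samples for re-annotation, the sketch below scores each sample by the model's posterior probability of its given label, with an entropy term as a proxy for labelling difficulty. The scoring rule, the default weight, and the function name are illustrative assumptions, not the paper's criterion.

```python
import numpy as np

def rank_for_relabelling(pred_probs, given_labels, difficulty_weight=0.1):
    """Order samples for re-annotation: likely-mislabelled samples first,
    preferring those that also look easy (low predictive entropy) to resolve.

    pred_probs   : (n, K) model posteriors for each sample
    given_labels : (n,) possibly-noisy labels
    """
    n = len(given_labels)
    correctness = pred_probs[np.arange(n), given_labels]           # p(given label | x)
    entropy = -(pred_probs * np.log(pred_probs + 1e-12)).sum(axis=1)
    score = (1.0 - correctness) - difficulty_weight * entropy      # high score = relabel first
    return np.argsort(-score)
```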

We study a new generative modeling technique based on adversarial training (AT). We show that in a setting where the model is trained to discriminate in-distribution data from adversarial examples perturbed from out-distribution samples, the model learns the support of the in-distribution data. The learning process is also closely related to MCMC-based maximum likelihood learning of energy-based models (EBMs), and can be considered as an approximate maximum likelihood learning method. We show that this AT generative model achieves competitive image generation performance to state-of-the-art EBMs, and at the same time is stable to train and has better sampling efficiency. We demonstrate that the AT generative model is well-suited for the task of image translation and worst-case out-of-distribution detection.
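The sketch below illustrates the basic training signal described above, assuming PyTorch: a binary discriminator is trained to separate in-distribution data from PGD-perturbed uniform noise acting as out-distribution proposals. The architecture, attack budget, and optimiser settings are placeholder assumptions rather than the authors' configuration.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

def perturb_out_distribution(x, steps=10, step_size=0.05):
    """Push out-distribution samples towards the in-distribution region
    by descending the discriminator's 'in-distribution' loss (PGD-style)."""
    x = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        loss = bce(net(x), torch.ones(len(x), 1))
        grad, = torch.autograd.grad(loss, x)
        x = (x - step_size * grad.sign()).clamp(0, 1).detach().requires_grad_(True)
    return x.detach()

def train_step(x_in):
    """One update: in-distribution data vs adversarially perturbed noise."""
    x_adv = perturb_out_distribution(torch.rand_like(x_in))
    logits = net(torch.cat([x_in, x_adv]))
    labels = torch.cat([torch.ones(len(x_in), 1), torch.zeros(len(x_adv), 1)])
    loss = bce(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# e.g. train_step(batch) for a batch of flattened images scaled to [0, 1]
```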

In this work we present point-level region contrast, a self-supervised pre-training approach for the task of object detection. This approach is motivated by the two key factors in detection: localization and recognition. While accurate localization favors models that operate at the pixel- or point-level, correct recognition typically relies on a more holistic, region-level view of objects. Incorporating this perspective in pre-training, our approach performs contrastive learning by directly sampling individual point pairs from different regions. Compared to an aggregated representation per region, our approach is more robust to the change in input region quality, and further enables us to implicitly improve initial region assignments via online knowledge distillation during training. Both advantages are important when dealing with imperfect regions encountered in the unsupervised setting. Experiments show point-level region contrast improves on state-of-the-art pre-training methods for object detection and segmentation across multiple tasks and datasets, and we provide extensive ablation studies and visualizations to aid understanding. Code will be made available.
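A minimal sketch of a point-level contrastive term, assuming PyTorch: matched point features from two views are pulled together with an InfoNCE loss, while all other sampled points act as negatives. Region sampling, point-to-region assignment, and the online distillation component of the method are omitted.

```python
import torch
import torch.nn.functional as F

def point_level_contrast(feat_a, feat_b, tau=0.2):
    """InfoNCE over sampled point features from two views.

    feat_a, feat_b : (N, D) features of N matched points; the point at
    index i in view A has its positive at index i in view B, and every
    other sampled point serves as a negative.
    """
    feat_a = F.normalize(feat_a, dim=1)
    feat_b = F.normalize(feat_b, dim=1)
    logits = feat_a @ feat_b.t() / tau            # (N, N) similarities
    targets = torch.arange(len(feat_a))           # positives on the diagonal
    return F.cross_entropy(logits, targets)
```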

We investigate the large-sample behavior of change-point tests based on weighted two-sample U-statistics, in the case of short-range dependent data. Under some mild mixing conditions, we establish convergence of the test statistic to an extreme value distribution. A simulation study shows that the weighted tests are superior to the non-weighted versions when the change-point occurs near the boundary of the time interval, while they lose power in the center.
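The following numpy sketch illustrates the shape of such a weighted scan statistic: a two-sample U-statistic is computed at every candidate split and multiplied by a boundary-emphasising weight. The kernel, the weight exponent `gamma`, and the normalisation are illustrative choices, not the exact statistic analysed in the paper.

```python
import numpy as np

def weighted_u_scan(x, h=lambda a, b: a - b, gamma=0.25):
    """Scan for a change point with a weighted two-sample U-statistic.

    For each split k, U_k = sum_{i <= k} sum_{j > k} h(X_i, X_j) is averaged
    and multiplied by the weight (k/n * (1 - k/n))^(-gamma), which boosts
    splits near the boundary of the observation window.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    stats = np.full(n, -np.inf)
    for k in range(1, n):
        u = sum(h(a, b) for a in x[:k] for b in x[k:])    # O(n^2) per split, for clarity
        weight = (k / n * (1 - k / n)) ** (-gamma)
        stats[k] = weight * abs(u) / (k * (n - k))
    k_hat = int(np.argmax(stats))
    return k_hat, stats[k_hat]
```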

Semi-supervised learning (SSL), which aims at leveraging a few labeled images and a large number of unlabeled images for network training, is beneficial for relieving the burden of data annotation in medical image segmentation. According to the experience of medical imaging experts, local attributes such as texture, luster and smoothness are very important factors for identifying target objects like lesions and polyps in medical images. Motivated by this, we propose a cross-level contrastive learning scheme to enhance representation capacity for local features in semi-supervised medical image segmentation. Compared to existing image-wise, patch-wise and point-wise contrastive learning algorithms, our devised method is capable of exploring more complex similarity cues, namely the relational characteristics between global point-wise and local patch-wise representations. Additionally, to make full use of cross-level semantic relations, we devise a novel consistency constraint that compares the predictions of patches against those of the full image. With the help of the cross-level contrastive learning and consistency constraint, the unlabeled data can be effectively exploited to improve segmentation performance on two medical image datasets, for polyp and skin lesion segmentation respectively. Code of our approach is available.
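The sketch below illustrates only the cross-level consistency idea, assuming PyTorch and a segmentation network that preserves spatial resolution: the prediction for a random patch is compared against the corresponding crop of the full-image prediction. The contrastive term and the exact loss of the paper are not reproduced; the function name, patch size, and loss choice are placeholders.

```python
import torch
import torch.nn.functional as F

def patch_full_consistency(model, image, patch_size=128):
    """Penalise disagreement between the prediction for a random patch and
    the corresponding crop of the full-image prediction.

    Assumes `model` maps a (B, C, H, W) image to per-pixel class logits of
    the same spatial size, and that H, W >= patch_size.
    """
    full_pred = model(image)                                   # (B, K, H, W)
    _, _, H, W = image.shape
    top = torch.randint(0, H - patch_size + 1, (1,)).item()
    left = torch.randint(0, W - patch_size + 1, (1,)).item()
    patch = image[:, :, top:top + patch_size, left:left + patch_size]
    patch_pred = model(patch)
    full_crop = full_pred[:, :, top:top + patch_size, left:left + patch_size]
    return F.mse_loss(patch_pred.softmax(1), full_crop.softmax(1).detach())
```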

A solution to control for nonresponse bias consists of multiplying the design weights of respondents by the inverse of estimated response probabilities to compensate for the nonrespondents. Maximum likelihood and calibration are two approaches that can be applied to obtain estimated response probabilities. The paper develops asymptotic properties of the resulting estimator when calibration is applied. A logistic regression model for the response probabilities is postulated, and the data are assumed to be missing at random. The author shows that estimators using calibration-estimated response probabilities are asymptotically equivalent to unbiased estimators, and that estimating the response probabilities via calibration yields a gain in efficiency compared to using the true response probabilities.
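A schematic sketch of the calibration step, assuming numpy/scipy: the logistic-model coefficients are chosen so that inverse-probability-weighted respondent totals of the auxiliary variables match the full-sample design-weighted totals. The notation and estimating equations here are illustrative and may differ from the paper's.

```python
import numpy as np
from scipy.optimize import fsolve

def calibrated_response_probabilities(X, r, d):
    """Estimate response probabilities p_i = expit(x_i' lam) by solving the
    calibration equations

        sum_{i: r_i = 1} d_i / p_i(lam) * x_i  =  sum_i d_i x_i,

    so that inverse-probability-weighted respondent totals of the auxiliary
    variables reproduce the full-sample design-weighted totals.

    X : (n, p) auxiliary variables observed for the whole sample
    r : (n,) response indicators (0/1),  d : (n,) design weights
    """
    full_totals = (d[:, None] * X).sum(axis=0)

    def calibration_equations(lam):
        p = 1.0 / (1.0 + np.exp(-X @ lam))
        respondent_totals = ((r * d / p)[:, None] * X).sum(axis=0)
        return respondent_totals - full_totals

    lam_hat = fsolve(calibration_equations, np.zeros(X.shape[1]))
    return 1.0 / (1.0 + np.exp(-X @ lam_hat))
```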

The key challenge in learning dense correspondences lies in the lack of ground-truth matches for real image pairs. While photometric consistency losses provide unsupervised alternatives, they struggle with large appearance changes, which are ubiquitous in geometric and semantic matching tasks. Moreover, methods relying on synthetic training pairs often suffer from poor generalisation to real data. We propose Warp Consistency, an unsupervised learning objective for dense correspondence regression. Our objective is effective even in settings with large appearance and viewpoint changes. Given a pair of real images, we first construct an image triplet by applying a randomly sampled warp to one of the original images. We derive and analyze all flow-consistency constraints arising within the triplet. From our observations and empirical results, we design a general unsupervised objective employing two of the derived constraints. We validate our warp consistency loss by training three recent dense correspondence networks for the geometric and semantic matching tasks. Our approach sets a new state-of-the-art on several challenging benchmarks, including MegaDepth, RobotCar and TSS. Code and models will be released at //github.com/PruneTruong/DenseMatching.
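A small numpy/scipy sketch of the flow-composition operation underlying such consistency constraints: two dense flows are composed so that, for example, predicted flows routed through the warped image can be compared with the known synthetic warp. Border handling is simplified and this is not the authors' implementation or their choice of constraints.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def compose_flows(flow_ab, flow_bc):
    """Compose dense flows: F_ac(x) = F_ab(x) + F_bc(x + F_ab(x)).

    Flows are (2, H, W) arrays of (dy, dx) pixel offsets; the second flow
    is bilinearly sampled at the positions reached by the first.
    """
    _, H, W = flow_ab.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    target_y = ys + flow_ab[0]
    target_x = xs + flow_ab[1]
    sampled = np.stack([
        map_coordinates(flow_bc[c], [target_y, target_x], order=1, mode="nearest")
        for c in range(2)
    ])
    return flow_ab + sampled

# warp-consistency style check: composing the predicted flows I -> J and
# J -> I' should approximately reproduce the known synthetic warp I -> I'.
```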

Outlier detection is an important topic in machine learning and has been used in a wide range of applications. In this paper, we approach outlier detection as a binary classification problem by sampling potential outliers from a uniform reference distribution. However, due to the sparsity of data in high-dimensional space, a limited number of potential outliers may fail to provide sufficient information to help the classifier describe a boundary that can separate outliers from normal data effectively. To address this, we propose a novel Single-Objective Generative Adversarial Active Learning (SO-GAAL) method for outlier detection, which can directly generate informative potential outliers based on the mini-max game between a generator and a discriminator. Moreover, to prevent the generator from suffering mode collapse, training should be stopped once SO-GAAL is able to provide sufficient information; without any prior information, however, determining this stopping point is extremely difficult. Therefore, we expand the network structure of SO-GAAL from a single generator to multiple generators with different objectives (MO-GAAL), which can generate a reasonable reference distribution for the whole dataset. We empirically compare the proposed approach with several state-of-the-art outlier detection methods on both synthetic and real-world datasets. The results show that MO-GAAL outperforms its competitors in the majority of cases, especially for datasets with varied cluster types or a high proportion of irrelevant variables.
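Below is a minimal single-generator sketch of the generative adversarial active learning idea, assuming PyTorch: a generator proposes potential outliers, a discriminator learns to separate them from the real data, and 1 - D(x) is used as an outlier score. Network sizes and the training schedule are illustrative, and the multi-generator MO-GAAL extension is not shown.

```python
import torch
import torch.nn as nn

def train_gaal(X, latent_dim=8, epochs=200):
    """Single-generator sketch: the generator proposes potential outliers,
    the discriminator separates them from the data, and 1 - D(x) serves as
    an outlier score. Returns the scoring function."""
    dim = X.shape[1]
    G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, dim))
    D = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
    opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
    opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
    bce = nn.BCELoss()
    real = torch.as_tensor(X, dtype=torch.float32)
    ones, zeros = torch.ones(len(real), 1), torch.zeros(len(real), 1)
    for _ in range(epochs):
        fake = G(torch.randn(len(real), latent_dim))
        # discriminator: real data -> 1, generated potential outliers -> 0
        d_loss = bce(D(real), ones) + bce(D(fake.detach()), zeros)
        opt_d.zero_grad(); d_loss.backward(); opt_d.step()
        # generator: move the potential outliers towards the data support
        g_loss = bce(D(fake), ones)
        opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    def outlier_score(x):
        with torch.no_grad():
            return 1.0 - D(torch.as_tensor(x, dtype=torch.float32)).squeeze(1)
    return outlier_score
```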

Stochastic gradient Markov chain Monte Carlo (SGMCMC) has become a popular method for scalable Bayesian inference. These methods are based on sampling a discrete-time approximation to a continuous-time process, such as the Langevin diffusion. When applied to distributions defined on a constrained space, such as the simplex, the time-discretisation error can dominate when we are near the boundary of the space. We demonstrate that while current SGMCMC methods for the simplex perform well in certain cases, they struggle with sparse simplex spaces, i.e. when many of the components are close to zero. However, most popular large-scale applications of Bayesian inference on simplex spaces, such as network or topic models, are sparse. We argue that this poor performance is due to the biases of SGMCMC caused by the discretisation error. To get around this, we propose the stochastic CIR process, which removes all discretisation error, and we prove that samples from the stochastic CIR process are asymptotically unbiased. Use of the stochastic CIR process within an SGMCMC algorithm is shown to give substantially better performance for a topic model and a Dirichlet process mixture model than existing SGMCMC approaches.
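For intuition, the sketch below draws an exact transition of a CIR diffusion using its known noncentral chi-square law, which is the sense in which discretisation error can be removed entirely. The parameterisation is the textbook one; how the process is embedded in the authors' SGMCMC algorithm, including the stochastic-gradient aspect, is not reproduced here, and the parameter values in the usage example are arbitrary.

```python
import numpy as np

def cir_exact_step(x, h, kappa, theta, sigma, rng=None):
    """Draw X_{t+h} | X_t = x exactly for the CIR diffusion

        dX_t = kappa * (theta - X_t) dt + sigma * sqrt(X_t) dW_t

    using its noncentral chi-square transition law, so no time-discretisation
    error is introduced.
    """
    rng = np.random.default_rng() if rng is None else rng
    c = sigma**2 * (1.0 - np.exp(-kappa * h)) / (4.0 * kappa)
    df = 4.0 * kappa * theta / sigma**2
    noncentrality = x * np.exp(-kappa * h) / c
    return c * rng.noncentral_chisquare(df, noncentrality)

# e.g. simulate one near-zero component, as in a sparse simplex setting
x = 1e-4
for _ in range(1000):
    x = cir_exact_step(x, h=0.01, kappa=1.0, theta=0.05, sigma=1.0)
```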

In this paper, we propose to tackle the problem of reducing discrepancies between multiple domains, referred to as multi-source domain adaptation, and consider it under the target shift assumption: in all domains we aim to solve a classification problem with the same output classes, but with label proportions differing across them. We design a method based on optimal transport, a theory that is gaining momentum for tackling adaptation problems in machine learning due to its efficiency in aligning probability distributions. Our method performs multi-source adaptation and target shift correction simultaneously by learning the class probabilities of the unlabeled target sample and the coupling that aligns two (or more) probability distributions. Experiments on both synthetic and real-world data from a satellite image segmentation task show the superiority of the proposed method over the state-of-the-art.
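A hedged sketch of the optimal-transport alignment step using the POT library: one labelled source is transported onto the unlabelled target with entropic OT, with source masses set from assumed target class proportions to mimic a target-shift correction. The joint estimation of those class proportions across multiple sources, which is the core of the proposed method, is not implemented here; the function name and arguments are placeholders.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

def align_source_to_target(Xs, ys, Xt, target_class_props, reg=0.1):
    """Transport one labelled source domain onto the unlabelled target.

    Source samples of class c jointly receive mass target_class_props[c],
    mimicking a target-shift correction; `target_class_props` is assumed
    known here, whereas the paper estimates it jointly with the coupling.
    """
    a = np.array([target_class_props[c] / np.sum(ys == c) for c in ys])
    a = a / a.sum()
    b = np.full(len(Xt), 1.0 / len(Xt))
    M = ot.dist(Xs, Xt)                          # squared Euclidean cost matrix
    G = ot.sinkhorn(a, b, M, reg)                # entropic OT coupling
    # barycentric mapping of the source samples into the target domain
    return (G / G.sum(axis=1, keepdims=True)) @ Xt
```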
