
This paper investigates the efficiency of the K-fold cross-validation (CV) procedure and a debiased version thereof as a means of estimating the generalization risk of a learning algorithm. We work under the general assumption of uniform algorithmic stability. We show that the K-fold risk estimate may not be consistent under such general stability assumptions, by constructing non-vanishing lower bounds on the error in realistic contexts such as regularized empirical risk minimisation and stochastic gradient descent. We thus advocate the use of a debiased version of the K-fold and prove an error bound with exponential tail decay regarding this version. Our result is applicable to the large class of uniformly stable algorithms, in contrast to earlier works focusing on specific tasks such as density estimation. We illustrate the relevance of the debiased K-fold CV on a simple model selection problem and demonstrate empirically the usefulness of the promoted approach on real-world classification and regression datasets.

Related content

Cross-validation


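Cross-validation, sometimes called rotation estimation or out-of-sample testing, is any of various similar model validation techniques for assessing how the results of a statistical analysis will generalize to an independent dataset. It is mainly used in settings where the goal is prediction and one wants to estimate how accurately a predictive model will perform in practice. In a prediction problem, a model is usually given a dataset of known data on which it is trained (the training dataset) and a dataset of unknown (or first-seen) data against which it is tested (the validation dataset or test set). The goal of cross-validation is to test the model's ability to predict new data that was not used in estimating it, in order to flag problems such as overfitting or selection bias, and to give insight into how the model will generalize to an independent dataset (that is, an unknown dataset, for instance one from a real problem). One round of cross-validation involves partitioning a sample of data into complementary subsets, performing the analysis on one subset (the training set), and validating the analysis on the other subset (the validation set or test set). To reduce variability, most methods perform multiple rounds of cross-validation using different partitions and combine the validation results (for example, by averaging) over the rounds to estimate the model's predictive performance. In summary, cross-validation combines (averages) measures of fit in prediction to derive a more accurate estimate of the model's predictive performance.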
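The partition, fit, validate, and average steps described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the helper names (`k_fold_indices`, `fit_mean`, `k_fold_risk`) are hypothetical, the "model" is a toy that predicts the training mean, and the loss is squared error. It shows the plain K-fold estimate, not the debiased variant studied in the paper above.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Split indices 0..n-1 into k disjoint folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_mean(train_y):
    """Toy 'model' (an assumption): predict the mean of the training targets."""
    return sum(train_y) / len(train_y)

def k_fold_risk(y, k=5, seed=0):
    """Average the held-out squared error over the k rounds."""
    folds = k_fold_indices(len(y), k, seed)
    fold_risks = []
    for held_out in folds:
        held = set(held_out)
        # Train on everything outside the current fold.
        train = [y[i] for i in range(len(y)) if i not in held]
        pred = fit_mean(train)
        # Validate on the held-out fold only.
        errs = [(y[i] - pred) ** 2 for i in held_out]
        fold_risks.append(sum(errs) / len(errs))
    return sum(fold_risks) / k

if __name__ == "__main__":
    y = [float(i % 7) for i in range(35)]
    print(k_fold_risk(y, k=5))
```

Any learner and loss could replace `fit_mean` and the squared error; only the splitting and averaging logic is specific to K-fold.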

Networking · CASE · Knowledge · Node · Identifiable ·

August 3, 2023

Trade-off between Time, Space, and Workload: the case of the Self-stabilizing Unison

Stéphane Devismes,David Ilcinkas,Colette Johnen,Frédéric Mazoit

from arxiv, arXiv admin note: substantial text overlap with arXiv:2307.06635

We present a self-stabilizing algorithm for the (asynchronous) unison problem which achieves an efficient trade-off between time, workload, and space in a weak model. Precisely, our algorithm is defined in the atomic-state model and works in anonymous networks in which even local ports are unlabeled. It makes no assumption on the daemon and thus stabilizes under the weakest one: the distributed unfair daemon. In an $n$-node network of diameter $D$ and assuming a period $B \geq 2D+2$, our algorithm only requires $O(\log B)$ bits per node to achieve full polynomiality as it stabilizes in at most $2D-2$ rounds and $O(\min(n^2B, n^3))$ moves. In particular and to the best of our knowledge, it is the first self-stabilizing unison for arbitrary anonymous networks achieving an asymptotically optimal stabilization time in rounds using a bounded memory at each node. Finally, we show that our solution makes it possible to efficiently simulate synchronous self-stabilizing algorithms in an asynchronous environment. This provides a new state-of-the-art algorithm solving both the leader election and the spanning tree construction problem in any identified connected network which, to the best of our knowledge, beats all existing solutions in the literature.

Markovian · Estimation/Estimator · SGD · Stochastic gradient descent · Mutually independent ·

August 3, 2023

Online covariance estimation for stochastic gradient descent under Markovian sampling

Abhishek Roy,Krishnakumar Balasubramanian

We study the online overlapping batch-means covariance estimator for Stochastic Gradient Descent (SGD) under Markovian sampling. We show that the convergence rates of the covariance estimator are $O\big(\sqrt{d}\,n^{-1/8}(\log n)^{1/4}\big)$ and $O\big(\sqrt{d}\,n^{-1/8}\big)$ under state-dependent and state-independent Markovian sampling, respectively, with $d$ representing dimensionality and $n$ denoting the number of observations or SGD iterations. Remarkably, these rates match the best-known convergence rate previously established for the independent and identically distributed (i.i.d.) case by Zhu et al. (2021), up to logarithmic factors. Our analysis overcomes significant challenges that arise due to Markovian sampling, leading to the introduction of additional error terms and complex dependencies between the blocks of the batch-means covariance estimator. Moreover, we establish the convergence rate for the first four moments of the $\ell_2$ norm of the error of SGD dynamics under state-dependent Markovian data, which holds potential interest as an independent result. To validate our theoretical findings, we provide numerical illustrations to derive confidence intervals for SGD when training linear and logistic regression models under Markovian sampling. Additionally, we apply our approach to tackle the intriguing problem of strategic classification with logistic regression, where adversaries can adaptively modify features during the training process to increase their chances of being classified in a specific target class.

Linear · Estimation/Estimator · Mutually independent · Discretization · Analysis ·

August 2, 2023

An Eulerian finite element method for the linearized Navier--Stokes problem in an evolving domain

Michael Neilan,Maxim Olshanskii

The paper addresses an error analysis of an Eulerian finite element method used for solving a linearized Navier--Stokes problem in a time-dependent domain. In this study, the domain's evolution is assumed to be known and independent of the solution to the problem at hand. The numerical method employed in the study combines a standard Backward Differentiation Formula (BDF)-type time-stepping procedure with a geometrically unfitted finite element discretization technique. Additionally, Nitsche's method is utilized to enforce the boundary conditions. The paper presents a convergence estimate for several velocity--pressure elements that are inf-sup stable. The estimate demonstrates optimal order convergence in the energy norm for the velocity component and a scaled $L^2(H^1)$-type norm for the pressure component.

Mutually independent · Random variable · Mean · Monte Carlo · Joint distribution ·

August 2, 2023

Asymptotic Independence of the Quadratic form and Maximum of Independent Random Variables with Applications to High-Dimensional Tests

Dachuan Chen,Decai Liang,Long Feng

This paper establishes the asymptotic independence between the quadratic form and the maximum of a sequence of independent random variables. Based on this theoretical result, we find the asymptotic joint distribution of the quadratic form and the maximum, which can be applied to high-dimensional testing problems. By combining the sum-type test and the max-type test, we propose Fisher's combination tests for the one-sample mean test and the two-sample mean test. Under this novel general framework, several strong assumptions in the existing literature are relaxed. Monte Carlo simulations show that our proposed tests are strongly robust to both sparse and dense data.
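Combining a sum-type and a max-type test through Fisher's method can be sketched as follows. This is a generic illustration of Fisher's combination for two independent p-values, not the authors' exact procedure: the function name and the input p-values are hypothetical, and the asymptotic independence established in the paper is what would justify treating the two p-values as independent.

```python
import math

def fisher_combination(p_sum, p_max):
    """Combine two independent p-values with Fisher's method.

    Under the null, T = -2 * (log p_sum + log p_max) follows a chi-square
    distribution with 4 degrees of freedom, whose survival function has
    the closed form exp(-t / 2) * (1 + t / 2).
    """
    if not (0.0 < p_sum <= 1.0 and 0.0 < p_max <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    t = -2.0 * (math.log(p_sum) + math.log(p_max))
    return math.exp(-t / 2.0) * (1.0 + t / 2.0)

if __name__ == "__main__":
    # Hypothetical p-values from a sum-type and a max-type test.
    print(fisher_combination(0.05, 0.05))
```

The closed form avoids any dependency on a statistics library; with more than two tests, the chi-square distribution would have twice as many degrees of freedom as there are p-values.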

Performer · Comprehensibility · Discretization · Analysis · Linear ·

August 2, 2023

Non-intrusive implementation of a wide variety of Multiscale Finite Element Methods

Rutger A. Biezemans,Claude Le Bris,Frédéric Legoll,Alexei Lozinski

from arxiv, 50 pages, 4 figures; typos corrected (notably in the proof of Lemma 3), some clarifications added, numerical results added in Section 7

Multiscale Finite Element Methods (MsFEMs) are now well-established finite element type approaches dedicated to multiscale problems. They first compute local, oscillatory, problem-dependent basis functions that generate a suitable discretization space, and next perform a Galerkin approximation of the problem on that space. We investigate here how these approaches can be implemented in a non-intrusive way, in order to facilitate their dissemination within industrial codes or non-academic environments. We develop an abstract framework that covers a wide variety of MsFEMs for linear second-order partial differential equations. Non-intrusive MsFEM approaches are developed within the full generality of this framework, which may moreover be beneficial to steering software development and improving the theoretical understanding and analysis of MsFEMs.

Minimax · SimPLe · Divergence · Statistical theory ·

August 2, 2023

A short note on an inequality between KL and TV

Clément L. Canonne

from arxiv, Update: positive answer to Question 8 ("from the TFL to the BH bound"), communicated to me by Hao-Chung Cheng

The goal of this short note is to discuss the relation between Kullback--Leibler divergence and total variation distance, starting with the celebrated Pinsker's inequality relating the two, before switching to a simple, yet (arguably) more useful inequality, apparently not as well known, due to Bretagnolle and Huber. We also discuss applications of this bound for minimax testing lower bounds.
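The two inequalities can be compared numerically. The sketch below assumes the standard statements, Pinsker's inequality TV <= sqrt(KL / 2) and the Bretagnolle-Huber inequality TV <= sqrt(1 - exp(-KL)), and evaluates both on a pair of Bernoulli distributions whose parameters were chosen for illustration only. The function names are hypothetical.

```python
import math

def bernoulli_kl(p, q):
    """KL(Bern(p) || Bern(q)) in nats, for p and q strictly inside (0, 1)."""
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def pinsker_bound(kl):
    """Pinsker: TV <= sqrt(KL / 2). Vacuous once KL exceeds 2."""
    return math.sqrt(kl / 2.0)

def bretagnolle_huber_bound(kl):
    """Bretagnolle-Huber: TV <= sqrt(1 - exp(-KL)). Always below 1."""
    return math.sqrt(1.0 - math.exp(-kl))

if __name__ == "__main__":
    p, q = 0.5, 0.999            # illustrative parameters (an assumption)
    tv = abs(p - q)              # total variation between two Bernoullis
    kl = bernoulli_kl(p, q)
    print("TV:", tv)
    print("KL:", kl)
    print("Pinsker bound:", pinsker_bound(kl))
    print("Bretagnolle-Huber bound:", bretagnolle_huber_bound(kl))
```

For large KL divergence the Pinsker bound exceeds 1 and says nothing about a quantity that is at most 1, while the Bretagnolle-Huber bound stays informative; this is the regime that matters for minimax lower bounds.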

Functional · Estimation/Estimator · Mean · tuning · MoDELS ·

August 1, 2023

Unified unconditional regression for multivariate quantiles, M-quantiles and expectiles

Luca Merlo,Lea Petrella,Nicola Salvati,Nikos Tzavidis

In this paper, we develop a unified regression approach to model unconditional quantiles, M-quantiles and expectiles of multivariate dependent variables exploiting the multidimensional Huber's function. To assess the impact of changes in the covariates across the entire unconditional distribution of the responses, we extend the work of Firpo et al. (2009) by running a mean regression of the recentered influence function on the explanatory variables. We discuss the estimation procedure and establish the asymptotic properties of the derived estimators. A data-driven procedure is also presented to select the tuning constant of the Huber's function. The validity of the proposed methodology is explored with simulation studies and through an application using the Survey of Household Income and Wealth 2016 conducted by the Bank of Italy.

Neural Networks · Parse · Networking · 粵港澳大灣區數字經濟研究院 · Parse tree ·

February 25, 2021

How to represent part-whole hierarchies in a neural network

Geoffrey Hinton

from arxiv, 43 pages, 5 figures

This paper does not describe a working system. Instead, it presents a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy which has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.

Pegasus · Performer · state-of-the-art · MoDELS · ROUGE ·

June 2, 2020

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

Jingqing Zhang,Yao Zhao,Mohammad Saleh,Peter J. Liu

from arxiv, Added Human Evaluation results; Code link added; Accepted for ICML 2020

Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks including text summarization. However, pre-training objectives tailored for abstractive text summarization have not been explored. Furthermore there is a lack of systematic evaluation across diverse domains. In this work, we propose pre-training large Transformer-based encoder-decoder models on massive text corpora with a new self-supervised objective. In PEGASUS, important sentences are removed/masked from an input document and are generated together as one output sequence from the remaining sentences, similar to an extractive summary. We evaluated our best PEGASUS model on 12 downstream summarization tasks spanning news, science, stories, instructions, emails, patents, and legislative bills. Experiments demonstrate it achieves state-of-the-art performance on all 12 downstream datasets measured by ROUGE scores. Our model also shows surprising performance on low-resource summarization, surpassing previous state-of-the-art results on 6 datasets with only 1000 examples. Finally we validated our results using human evaluation and show that our model summaries achieve human performance on multiple datasets.
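The gap-sentence objective described above (remove important sentences from the input and generate them as the output) can be sketched as a data-preparation step. This is a simplified illustration, not the released PEGASUS code: the mask token string, the function names, and the unigram-overlap F1 used as a stand-in for a ROUGE-style importance score are all assumptions.

```python
from collections import Counter

MASK = "<mask_1>"  # hypothetical placeholder token for a removed sentence

def unigram_f1(candidate, reference):
    """Unigram-overlap F1, a rough stand-in for a ROUGE-1 style score."""
    c = Counter(candidate.lower().split())
    r = Counter(reference.lower().split())
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)

def gap_sentence_example(sentences, num_masked=1):
    """Mask the sentences most similar to the rest of the document.

    Returns (source, target): the document with the selected sentences
    replaced by MASK, and the selected sentences joined in document order.
    """
    scores = []
    for i, sent in enumerate(sentences):
        rest = " ".join(s for j, s in enumerate(sentences) if j != i)
        scores.append(unigram_f1(sent, rest))
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    masked = sorted(ranked[:num_masked])
    source = " ".join(MASK if i in masked else s for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in masked)
    return source, target

if __name__ == "__main__":
    doc = [
        "the model masks sentences",
        "masked sentences form the target",
        "the target is generated by the model",
    ]
    src, tgt = gap_sentence_example(doc, num_masked=1)
    print("source:", src)
    print("target:", tgt)
```

An encoder-decoder model would then be trained to map `source` to `target`; the sketch covers only how one training pair could be constructed.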

Dice similarity · Graph · Gaussian mixture (model) · Optimizer · Learned ·

January 25, 2018

Deep LOGISMOS: Deep Learning Graph-based 3D Segmentation of Pancreatic Tumors on CT scans

Zhihui Guo,Ling Zhang,Le Lu,Mohammadhadi Bagheri,Ronald M. Summers,Milan Sonka,Jianhua Yao

from arxiv, 4 pages, 3 figures

This paper reports the Deep LOGISMOS approach to 3D tumor segmentation, which incorporates boundary information derived from deep contextual learning into LOGISMOS (layered optimal graph image segmentation of multiple objects and surfaces). Accurate and reliable tumor segmentation is essential to tumor growth analysis and treatment selection. A fully convolutional network (FCN), UNet, is first trained using three adjacent 2D patches centered at the tumor, providing contextual UNet segmentation and a probability map for each 2D patch. The UNet segmentation is then refined by a Gaussian Mixture Model (GMM) and morphological operations. The refined UNet segmentation is used to provide the initial shape boundary to build a segmentation graph. The cost for each node of the graph is determined by the UNet probability maps. Finally, a max-flow algorithm is employed to find the globally optimal solution, thus obtaining the final segmentation. For evaluation, we applied the method to pancreatic tumor segmentation on a dataset of 51 CT scans, among which 30 scans were used for training and 21 for testing. With Deep LOGISMOS, the DICE Similarity Coefficient (DSC) and Relative Volume Difference (RVD) reached 83.2±7.8% and 18.6±17.4% respectively; both are significantly improved (p<0.05) compared with contextual UNet and/or LOGISMOS alone.
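The two evaluation metrics named above can be sketched on voxel sets. The definitions assumed here are the common ones, DSC = 2|A ∩ B| / (|A| + |B|) and RVD = | |A| - |B| | / |B| with B the reference segmentation; the paper may use a slightly different RVD convention (for example a signed difference). The function names and the toy cubes are hypothetical.

```python
from itertools import product

def dice_coefficient(pred, ref):
    """Dice similarity between two sets of voxel coordinates."""
    pred, ref = set(pred), set(ref)
    if not pred and not ref:
        return 1.0  # two empty segmentations agree perfectly
    return 2.0 * len(pred & ref) / (len(pred) + len(ref))

def relative_volume_difference(pred, ref):
    """Absolute volume difference relative to the reference volume."""
    pred, ref = set(pred), set(ref)
    if not ref:
        raise ValueError("reference segmentation must be non-empty")
    return abs(len(pred) - len(ref)) / len(ref)

if __name__ == "__main__":
    # Toy data: a 4x4x4 reference cube and a prediction shifted by one voxel.
    ref = set(product(range(4), range(4), range(4)))
    pred = set(product(range(1, 5), range(4), range(4)))
    print("DSC:", dice_coefficient(pred, ref))
    print("RVD:", relative_volume_difference(pred, ref))
```

The shifted cube has the same volume as the reference, so RVD is zero while Dice drops below one, which is why the two metrics are usually reported together.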