国产亚洲欧美日韩精品色狠二区_高清国产三级在线播放_九九99精品国产精品欧洲_精品一区二区成人精品91_91人人揉人人捏人人添_女生羞羞视频网站入口_中国少妇一级毛片

In the online balanced graph repartitioning problem, one has to maintain a clustering of $n$ nodes into $\ell$ clusters, each having $k = n / \ell$ nodes. During runtime, an online algorithm is given a stream of communication requests between pairs of nodes: an inter-cluster communication costs one unit, while the intra-cluster communication is free. An algorithm can change the clustering, paying unit cost for each moved node. This natural problem admits a simple $O(\ell^2 \cdot k^2)$-competitive algorithm COMP, whose performance is far apart from the best known lower bound of $\Omega(\ell \cdot k)$. One of open questions is whether the dependency on $\ell$ can be made linear; this question is of practical importance as in the typical datacenter application where virtual machines are clustered on physical servers, $\ell$ is of several orders of magnitude larger than $k$. We answer this question affirmatively, proving that a simple modification of COMP is $(\ell \cdot 2^{O(k)})$-competitive. On the technical level, we achieve our bound by translating the problem to a system of linear integer equations and using Graver bases to show the existence of a ``small'' solution.

相關內容

簇(cu)

關注 1

線性的 · 線性分類 · UniFormer · 泛化理論 · 學成 ·

2021 年 9 月 3 日

A Limitation of the PAC-Bayes Framework

Roi Livni,Shay Moran

PAC-Bayes is a useful framework for deriving generalization bounds which was introduced by McAllester ('98). This framework has the flexibility of deriving distribution- and algorithm-dependent bounds, which are often tighter than VC-related uniform convergence bounds. In this manuscript we present a limitation for the PAC-Bayes framework. We demonstrate an easy learning task that is not amenable to a PAC-Bayes analysis. Specifically, we consider the task of linear classification in 1D; it is well-known that this task is learnable using just $O(\log(1/\delta)/\epsilon)$ examples. On the other hand, we show that this fact can not be proved using a PAC-Bayes analysis: for any algorithm that learns 1-dimensional linear classifiers there exists a (realizable) distribution for which the PAC-Bayes bound is arbitrarily large.

Oracle · 邊 · 圖 · 無向 · Weight ·

2021 年 9 月 2 日

Improved Distance Sensitivity Oracles with Subcubic Preprocessing Time

Hanlin Ren

from arxiv, journal version

We consider the problem of building Distance Sensitivity Oracles (DSOs). Given a directed graph $G=(V, E)$ with edge weights in $\{1, 2, \dots, M\}$, we need to preprocess it into a data structure, and answer the following queries: given vertices $u,v\in V$ and a failed vertex or edge $f\in (V\cup E)$, output the length of the shortest path from $u$ to $v$ that does not go through $f$. Our main result is a simple DSO with $\tilde{O}(n^{2.7233}M)$ preprocessing time and $O(1)$ query time. Moreover, if the input graph is undirected, the preprocessing time can be improved to $\tilde{O}(n^{2.6865}M)$. The preprocessing algorithm is randomized with correct probability $\ge 1-1/n^C$, for a constant $C$ that can be made arbitrarily large. Previously, there is a DSO with $\tilde{O}(n^{2.8729}M)$ preprocessing time and $\operatorname{polylog}(n)$ query time [Chechik and Cohen, STOC'20]. At the core of our DSO is the following observation from [Bernstein and Karger, STOC'09]: if there is a DSO with preprocessing time $P$ and query time $Q$, then we can construct a DSO with preprocessing time $P+\tilde{O}(n^2)\cdot Q$ and query time $O(1)$. (Here $\tilde{O}(\cdot)$ hides $\operatorname{polylog}(n)$ factors.)

估計/估計量 · 可約的 · 向量化 · 方差 · MoDELS ·

2021 年 9 月 1 日

Improved Estimators for Semi-supervised High-dimensional Regression Model

Ilan Livne,David Azriel,Yair Goldberg

We study a linear high-dimensional regression model in a semi-supervised setting, where for many observations only the vector of covariates $X$ is given with no response $Y$. We do not make any sparsity assumptions on the vector of coefficients, and aim at estimating $\mathrm{Var}(Y|X)$. We propose an estimator, which is unbiased, consistent, and asymptotically normal. This estimator can be improved by adding zero-estimators arising from the unlabelled data. Adding zero-estimators does not affect the bias and potentially can reduce variance. In order to achieve optimal improvement, many zero-estimators should be used, but this raises the problem of estimating many parameters. Therefore, we introduce covariate selection algorithms that identify which zero-estimators should be used in order to improve the above estimator. We further illustrate our approach for other estimators, and present an algorithm that improves estimation for any given variance estimator. Our theoretical results are demonstrated in a simulation study.

近鄰 · Bagging · 可約的 · Performer · 集成學習 ·

2021 年 9 月 1 日

Under-bagging Nearest Neighbors for Imbalanced Classification

Hanyuan Hang,Yuchao Cai,Hanfang Yang,Zhouchen Lin

In this paper, we propose an ensemble learning algorithm called \textit{under-bagging $k$-nearest neighbors} (\textit{under-bagging $k$-NN}) for imbalanced classification problems. On the theoretical side, by developing a new learning theory analysis, we show that with properly chosen parameters, i.e., the number of nearest neighbors $k$, the expected sub-sample size $s$, and the bagging rounds $B$, optimal convergence rates for under-bagging $k$-NN can be achieved under mild assumptions w.r.t.~the arithmetic mean (AM) of recalls. Moreover, we show that with a relatively small $B$, the expected sub-sample size $s$ can be much smaller than the number of training data $n$ at each bagging round, and the number of nearest neighbors $k$ can be reduced simultaneously, especially when the data are highly imbalanced, which leads to substantially lower time complexity and roughly the same space complexity. On the practical side, we conduct numerical experiments to verify the theoretical results on the benefits of the under-bagging technique by the promising AM performance and efficiency of our proposed algorithm.

表示學習 · 學成 · 簇 · INFORMS · Performer ·

2019 年 11 月 26 日

Self-labelling via simultaneous clustering and representation learning

Yuki Markus Asano,Christian Rupprecht,Andrea Vedaldi

Combining clustering and representation learning is one of the most promising approaches for unsupervised learning of deep neural networks. However, doing so naively leads to ill posed learning problems with degenerate solutions. In this paper, we propose a novel and principled learning formulation that addresses these issues. The method is obtained by maximizing the information between labels and input data indices. We show that this criterion extends standard cross-entropy minimization to an optimal transport problem, which we solve efficiently for millions of input images and thousands of labels using a fast variant of the Sinkhorn-Knopp algorithm. The resulting method is able to self-label visual data so as to train highly competitive image representations without manual labels. Our method achieves state of the art representation learning performance for AlexNet and ResNet-50 on SVHN, CIFAR-10, CIFAR-100 and ImageNet.

圖卷積神經網絡/圖卷積網絡 · 簇 · 圖卷積 · 圖 · Networking ·

2019 年 3 月 27 日

Linkage Based Face Clustering via Graph Convolution Network

Zhongdao Wang,Liang Zheng,Yali Li,Shengjin Wang

from arxiv, To appear in CVPR 2019

In this paper, we present an accurate and scalable approach to the face clustering task. We aim at grouping a set of faces by their potential identities. We formulate this task as a link prediction problem: a link exists between two faces if they are of the same identity. The key idea is that we find the local context in the feature space around an instance (face) contains rich information about the linkage relationship between this instance and its neighbors. By constructing sub-graphs around each instance as input data, which depict the local context, we utilize the graph convolution network (GCN) to perform reasoning and infer the likelihood of linkage between pairs in the sub-graphs. Experiments show that our method is more robust to the complex distribution of faces than conventional methods, yielding favorably comparable results to state-of-the-art methods on standard face clustering benchmarks, and is scalable to large datasets. Furthermore, we show that the proposed method does not need the number of clusters as prior, is aware of noises and outliers, and can be extended to a multi-view version for more accurate clustering accuracy.

Neural Networks · 自動問答 · Networking · 泛函 · 知識庫 ·

2018 年 10 月 5 日

Improving Question Answering by Commonsense-Based Pre-Training

Wanjun Zhong,Duyu Tang,Nan Duan,Ming Zhou,Jiahai Wang,Jian Yin

from arxiv, 8 pages

Although neural network approaches achieve remarkable success on a variety of NLP tasks, many of them struggle to answer questions that require commonsense knowledge. We believe the main reason is the lack of commonsense connections between concepts. To remedy this, we provide a simple and effective method that leverages external commonsense knowledge base such as ConceptNet. We pre-train direct and indirect relational functions between concepts, and show that these pre-trained functions could be easily added to existing neural network models. Results show that incorporating commonsense-based function improves the state-of-the-art on two question answering tasks that require commonsense reasoning. Further analysis shows that our system discovers and leverages useful evidences from an external commonsense knowledge base, which is missing in existing neural network models and help derive the correct answer.

Performer · 圖像分割 · Seven · CASE · 相互獨立的 ·

2018 年 9 月 27 日

nnU-Net: Self-adapting Framework for U-Net-Based Medical Image Segmentation

Fabian Isensee,Jens Petersen,Andre Klein,David Zimmerer,Paul F. Jaeger,Simon Kohl,Jakob Wasserthal,Gregor Koehler,Tobias Norajitra,Sebastian Wirkert,Klaus H. Maier-Hein

The U-Net was presented in 2015. With its straight-forward and successful architecture it quickly evolved to a commonly used benchmark in medical image segmentation. The adaptation of the U-Net to novel problems, however, comprises several degrees of freedom regarding the exact architecture, preprocessing, training and inference. These choices are not independent of each other and substantially impact the overall performance. The present paper introduces the nnU-Net ('no-new-Net'), which refers to a robust and self-adapting framework on the basis of 2D and 3D vanilla U-Nets. We argue the strong case for taking away superfluous bells and whistles of many proposed network designs and instead focus on the remaining aspects that make out the performance and generalizability of a method. We evaluate the nnU-Net in the context of the Medical Segmentation Decathlon challenge, which measures segmentation performance in ten disciplines comprising distinct entities, image modalities, image geometries and dataset sizes, with no manual adjustments between datasets allowed. At the time of manuscript submission, nnU-Net achieves the highest mean dice scores across all classes and seven phase 1 tasks (except class 1 in BrainTumour) in the online leaderboard of the challenge.

網絡嵌入 · Networking · 可約的 · DC · 鏈路預測 ·

2018 年 5 月 7 日

Billion-scale Network Embedding with Iterative Random Projection

Ziwei Zhang,Peng Cui,Haoyang Li,Xiao Wang,Wenwu Zhu

from arxiv, 7 pages, 5 figures

Network embedding has attracted considerable research attention recently. However, the existing methods are incapable of handling billion-scale networks, because they are computationally expensive and, at the same time, difficult to be accelerated by distributed computing schemes. To address these problems, we propose RandNE, a novel and simple billion-scale network embedding method. Specifically, we propose a Gaussian random projection approach to map the network into a low-dimensional embedding space while preserving the high-order proximities between nodes. To reduce the time complexity, we design an iterative projection procedure to avoid the explicit calculation of the high-order proximities. Theoretical analysis shows that our method is extremely efficient, and friendly to distributed computing schemes without any communication cost in the calculation. We demonstrate the efficacy of RandNE over state-of-the-art methods in network reconstruction and link prediction tasks on multiple datasets with different scales, ranging from thousands to billions of nodes and edges.

圖像分割 · 簇 · 有偏 · 劃分 · 圖 ·

2018 年 3 月 27 日

Compassionately Conservative Balanced Cuts for Image Segmentation

Nathan D. Cahill,Tyler L. Hayes,Renee T. Meinhold,John F. Hamilton

from arxiv, Long version of paper accepted at CVPR 2018

The Normalized Cut (NCut) objective function, widely used in data clustering and image segmentation, quantifies the cost of graph partitioning in a way that biases clusters or segments that are balanced towards having lower values than unbalanced partitionings. However, this bias is so strong that it avoids any singleton partitions, even when vertices are very weakly connected to the rest of the graph. Motivated by the B\"uhler-Hein family of balanced cut costs, we propose the family of Compassionately Conservative Balanced (CCB) Cut costs, which are indexed by a parameter that can be used to strike a compromise between the desire to avoid too many singleton partitions and the notion that all partitions should be balanced. We show that CCB-Cut minimization can be relaxed into an orthogonally constrained $\ell_{\tau}$-minimization problem that coincides with the problem of computing Piecewise Flat Embeddings (PFE) for one particular index value, and we present an algorithm for solving the relaxed problem by iteratively minimizing a sequence of reweighted Rayleigh quotients (IRRQ). Using images from the BSDS500 database, we show that image segmentation based on CCB-Cut minimization provides better accuracy with respect to ground truth and greater variability in region size than NCut-based image segmentation.