We study principal component analysis (PCA), where given a dataset in $\mathbb{R}^d$ from a distribution, the task is to find a unit vector $v$ that approximately maximizes the variance of the distribution after being projected along $v$. Despite being a classical task, standard estimators fail drastically if the data contains even a small fraction of outliers, motivating the problem of robust PCA. Recent work has developed computationally-efficient algorithms for robust PCA that either take super-linear time or have sub-optimal error guarantees. Our main contribution is to develop a nearly-linear time algorithm for robust PCA with near-optimal error guarantees. We also develop a single-pass streaming algorithm for robust PCA with memory usage nearly-linear in the dimension.

Related content

PCA

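In statistics, principal component analysis (PCA) is a method that projects data from a higher-dimensional space into a lower-dimensional space by maximizing the variance along each dimension. Given a collection of points in two-, three-, or higher-dimensional space, a "best-fitting" line can be defined as the line that minimizes the average squared distance from the points to the line. The next best-fitting line can be chosen similarly from the directions perpendicular to the first line. Repeating this process yields an orthogonal basis in which the individual dimensions of the data are uncorrelated. These basis vectors are called principal components.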
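As a rough illustration of the definition above, the following sketch finds the first principal component by power iteration on the sample covariance matrix. The function name `top_principal_component`, the toy data, and the choice of power iteration are assumptions made for this example only; it is a plain, non-robust estimator and not the algorithm of the paper summarized above. The second call adds a single far-away point to show how strongly one outlier can rotate the estimated direction.

```python
import math

def top_principal_component(points, iters=200):
    """Return (unit vector, variance along it) via power iteration.

    Illustrative helper: the all-equal start vector is assumed not to be
    orthogonal to the top eigenvector, which holds for the toy data below.
    """
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    # Sample covariance matrix (normalized by n).
    cov = [[sum(row[a] * row[b] for row in centered) / n for b in range(d)]
           for a in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    variance = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
    return v, variance

# Clean data lying exactly on the line y = x / 2.
clean = [(-2.0, -1.0), (-1.0, -0.5), (0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
v_clean, var_clean = top_principal_component(clean)

# The same data plus one gross outlier far from that line.
v_out, var_out = top_principal_component(clean + [(0.0, 50.0)])
```

On the clean data the direction is proportional to (2, 1). With the outlier added, the estimate swings almost entirely onto the y-axis, which is the failure mode that motivates robust PCA.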

Exact · Heuristic algorithms · ILP · State-of-the-art · Optimizers

June 16, 2023

Exact and Heuristic Algorithms for the Domination Problem

Ernesto Parra Inza,Frank Angel Hernández Mira,José María Sigarreta Almira,Nodari Vakhania

In a simple connected graph $G=(V,E)$, a subset of vertices $S \subseteq V$ is a dominating set if every vertex $v \in V\setminus S$ is adjacent to some vertex $x$ in $S$. A number of real-life problems can be modeled using this problem, which is known to be among the difficult NP-hard problems in its class. We formulate the problem as an integer linear program (ILP) and compare its performance with two existing state-of-the-art exact algorithms and with the exact implicit enumeration and heuristic algorithms that we propose here. Our exact algorithm found optimal solutions much faster than ILP and the two earlier exact algorithms for middle-dense instances. For graphs of considerable size, our heuristic algorithm was much faster than both ILP and our exact algorithm. It found an optimal solution for more than half of the tested instances, and it improved the earlier known state-of-the-art solutions for almost all the tested benchmark instances. Among the instances where the optimum was not found, it gave an average approximation error of $1.18$.
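To make the definition of a dominating set concrete, here is a minimal sketch with a checker and a textbook greedy heuristic that repeatedly picks the vertex covering the most still-undominated vertices. The names `is_dominating_set` and `greedy_dominating_set` and the small example graph are assumptions for illustration; this is a generic baseline, not the exact or heuristic algorithm proposed in the paper.

```python
def is_dominating_set(adj, subset):
    """Check that every vertex is in `subset` or adjacent to a member of it."""
    chosen = set(subset)
    return all(v in chosen or any(u in chosen for u in adj[v]) for v in adj)

def greedy_dominating_set(adj):
    """Greedy heuristic: pick the vertex whose closed neighbourhood
    covers the most vertices that are not yet dominated."""
    undominated = set(adj)
    chosen = []
    while undominated:
        best = max(
            sorted(adj),
            key=lambda v: len(({v} | set(adj[v])) & undominated),
        )
        chosen.append(best)
        undominated -= {best} | set(adj[best])
    return chosen

# Hypothetical example graph as an adjacency list.
adj = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0, 4], 4: [3, 5], 5: [4]}
solution = greedy_dominating_set(adj)
```

Each round dominates at least one new vertex, so the loop terminates. The greedy choice carries no optimality guarantee in general, which is why exact methods such as an ILP formulation remain of interest.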

Clusters · Optimizers · Robustness · Extensibility · Outliers

June 16, 2023

Adversarially robust clustering with optimality guarantees

Soham Jana,Kun Yang,Sanjeev Kulkarni

from arxiv, 36 pages, 5 figures

We consider the problem of clustering data points coming from sub-Gaussian mixtures. Existing methods that provably achieve the optimal mislabeling error, such as the Lloyd algorithm, are usually vulnerable to outliers. In contrast, clustering methods seemingly robust to adversarial perturbations are not known to satisfy the optimal statistical guarantees. We propose a simple algorithm that obtains the optimal mislabeling rate even when we allow adversarial outliers to be present. Our algorithm achieves the optimal error rate in a constant number of iterations when a weak initialization condition is satisfied. In the absence of outliers, in fixed dimensions, our theoretical guarantees are similar to those of the Lloyd algorithm. Extensive experiments on various simulated data sets are conducted to support the theoretical guarantees of our method.

Functionals · Statistics · Approximation · Inference · Estimation/Estimators

June 15, 2023

Data-Driven Influence Functions for Optimization-Based Causal Inference

Michael I. Jordan,Yixin Wang,Angela Zhou

from arxiv, Extended version of conference version "Empirical Gateaux Derivatives for Causal Inference" accepted at Neurips 2022; new results on optimization and sensitivity analysis

We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing, with a focus on functionals that arise in causal inference. We study the case where probability distributions are not known a priori but need to be estimated from data. These estimated distributions lead to empirical Gateaux derivatives, and we study the relationships between empirical, numerical, and analytical Gateaux derivatives. Starting with a case study of the interventional mean (average potential outcome), we delineate the relationship between finite differences and the analytical Gateaux derivative. We then derive requirements on the rates of numerical approximation in perturbation and smoothing that preserve the statistical benefits of one-step adjustments, such as rate double robustness. We then study more complicated functionals such as dynamic treatment regimes, the linear-programming formulation for policy optimization in infinite-horizon Markov decision processes, and sensitivity analysis in causal inference. More broadly, we study optimization-based estimators, since this begets a class of estimands where identification via regression adjustment is straightforward but obtaining influence functions under minor variations thereof is not. The ability to approximate bias adjustments in the presence of arbitrary constraints illustrates the usefulness of constructive approaches for Gateaux derivatives. We also find that the statistical structure of the functional (rate double robustness) can permit less conservative rates for finite-difference approximation. This property, however, can be specific to particular functionals; e.g., it occurs for the average potential outcome (hence average treatment effect) but not the infinite-horizon MDP policy value.
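The core idea of approximating a Gateaux derivative by finite differencing can be sketched on a simple functional. The example below perturbs the empirical distribution $P_n$ toward a point mass, $(1-\epsilon)P_n + \epsilon\,\delta_x$, and compares the finite difference with the known analytical influence function of the variance, $(x-\mu)^2 - \sigma^2$. The helper names, the choice of the variance functional, and the step size are assumptions for this illustration; the paper's functionals and rate conditions are more involved.

```python
def variance_functional(values, weights):
    """Variance of a discrete distribution given by points and weights."""
    mean = sum(w * x for x, w in zip(values, weights))
    return sum(w * (x - mean) ** 2 for x, w in zip(values, weights))

def finite_difference_gateaux(functional, sample, x, eps=1e-4):
    """Approximate the Gateaux derivative of `functional` at the empirical
    distribution of `sample`, in the direction of a point mass at `x`."""
    n = len(sample)
    base_values = list(sample)
    base_weights = [1.0 / n] * n
    # Mixture (1 - eps) * P_n + eps * delta_x as a weighted sample.
    pert_values = base_values + [x]
    pert_weights = [(1.0 - eps) * w for w in base_weights] + [eps]
    base = functional(base_values, base_weights)
    perturbed = functional(pert_values, pert_weights)
    return (perturbed - base) / eps

sample = [1.0, 2.0, 3.0, 4.0]
x = 6.0
numerical = finite_difference_gateaux(variance_functional, sample, x)

# Analytical influence function of the variance at P_n.
mu = sum(sample) / len(sample)
sigma2 = sum((s - mu) ** 2 for s in sample) / len(sample)
analytical = (x - mu) ** 2 - sigma2
```

For this functional the finite-difference error is $\epsilon\,(x-\mu)^2$, so it shrinks linearly with the perturbation size. This gives a small-scale picture of why the rate of numerical approximation matters.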

Pseudo-labeling · Generalization theory · INFORMS · Generalization error · Labeling

June 15, 2023

How Does Pseudo-Labeling Affect the Generalization Error of the Semi-Supervised Gibbs Algorithm?

Haiyun He,Gholamali Aminian,Yuheng Bu,Miguel Rodrigues,Vincent Y. F. Tan

from arxiv, 30 pages, 4 figures

We provide an exact characterization of the expected generalization error (gen-error) for semi-supervised learning (SSL) with pseudo-labeling via the Gibbs algorithm. The gen-error is expressed in terms of the symmetrized KL information between the output hypothesis, the pseudo-labeled dataset, and the labeled dataset. Distribution-free upper and lower bounds on the gen-error can also be obtained. Our findings offer new insights that the generalization performance of SSL with pseudo-labeling is affected not only by the information between the output hypothesis and input training data but also by the information shared between the labeled and pseudo-labeled data samples. This serves as a guideline to choose an appropriate pseudo-labeling method from a given family of methods. To deepen our understanding, we further explore two examples: mean estimation and logistic regression. In particular, we analyze how the ratio $\lambda$ of the number of unlabeled to labeled data affects the gen-error under both scenarios. As $\lambda$ increases, the gen-error for mean estimation decreases and then saturates at a value larger than when all the samples are labeled. The gap can be quantified exactly with our analysis and depends on the cross-covariance between the labeled and pseudo-labeled data samples. For logistic regression, the gen-error and the variance component of the excess risk also decrease as $\lambda$ increases.

Exact · Robustness · ReLU · Negative correlation · Reducible

June 15, 2023

Exact Count of Boundary Pieces of ReLU Classifiers: Towards the Proper Complexity Measure for Classification

Paweł Piwek,Adam Klukowski,Tianyang Hu

from arxiv, Accepted to UAI 2023

Classic learning theory suggests that proper regularization is the key to good generalization and robustness. In classification, current training schemes only target the complexity of the classifier itself, which can be misleading and ineffective. Instead, we advocate directly measuring the complexity of the decision boundary. Existing literature is limited in this area with few well-established definitions of boundary complexity. As a proof of concept, we start by analyzing ReLU neural networks, whose boundary complexity can be conveniently characterized by the number of affine pieces. With the help of tropical geometry, we develop a novel method that can explicitly count the exact number of boundary pieces, and as a by-product, the exact number of total affine pieces. Numerical experiments are conducted and distinctive properties of our boundary complexity are uncovered. First, the boundary piece count appears largely independent of other measures, e.g., total piece count, and $l_2$ norm of weights, during the training process. Second, the boundary piece count is negatively correlated with robustness, where popular robust training techniques, e.g., adversarial training or random noise injection, are found to reduce the number of boundary pieces.

Clusters · Approximation · Graphs · Biased · Motivation

June 14, 2023

Multi-class Graph Clustering via Approximated Effective $p$-Resistance

Shota Saito,Mark Herbster

from arxiv, To appear at ICML 2023

This paper develops an approximation to the (effective) $p$-resistance and applies it to multi-class clustering. Spectral methods based on the graph Laplacian and its generalization to the graph $p$-Laplacian have been a backbone of non-Euclidean clustering techniques. The advantage of the $p$-Laplacian is that the parameter $p$ induces a controllable bias on cluster structure. The drawback of $p$-Laplacian eigenvector based methods is that the third and higher eigenvectors are difficult to compute. Thus, instead, we are motivated to use the $p$-resistance induced by the $p$-Laplacian for clustering. For $p$-resistance, small $p$ biases towards clusters with high internal connectivity, while large $p$ biases towards clusters of small "extent," that is, a preference for smaller shortest-path distances between vertices in the cluster. However, the $p$-resistance is expensive to compute. We overcome this by developing an approximation to the $p$-resistance. We prove upper and lower bounds on this approximation and observe that it is exact when the graph is a tree. We also provide theoretical justification for the use of $p$-resistance for clustering. Finally, we provide experiments comparing our approximated $p$-resistance clustering to other $p$-Laplacian based methods.
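For intuition, the classical $p=2$ case (ordinary effective resistance) can be computed directly from the graph Laplacian. The sketch below grounds one vertex, injects a unit current at the other, and solves the reduced linear system by Gauss-Jordan elimination. The function name `effective_resistance` and the unweighted example graphs are assumptions for illustration; this covers only $p=2$ and is not the approximation developed in the paper.

```python
def effective_resistance(n, edges, u, v):
    """Effective (p = 2) resistance between distinct vertices u and v of a
    connected, unweighted graph on vertices 0..n-1."""
    # Build the graph Laplacian.
    lap = [[0.0] * n for _ in range(n)]
    for a, b in edges:
        lap[a][a] += 1.0
        lap[b][b] += 1.0
        lap[a][b] -= 1.0
        lap[b][a] -= 1.0
    # Ground v (drop its row and column) and inject unit current at u.
    keep = [i for i in range(n) if i != v]
    m = len(keep)
    aug = [[lap[i][j] for j in keep] + [1.0 if i == u else 0.0] for i in keep]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0.0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    potentials = {node: aug[i][m] for i, node in enumerate(keep)}
    # With v at potential 0 and unit current, the potential at u is R(u, v).
    return potentials[u]

path = [(0, 1), (1, 2), (2, 3)]   # a tree: resistance equals path length
cycle = path + [(3, 0)]           # two parallel routes between 0 and 2
r_path = effective_resistance(4, path, 0, 3)
r_cycle = effective_resistance(4, cycle, 0, 2)
```

On the path, which is a tree, the resistance between the endpoints equals the number of edges between them. On the 4-cycle, the two parallel two-edge routes lower the resistance between opposite vertices to 1.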

Regularization term · Square matrices · Analysis · Linear · CASE

June 14, 2023

Refined $F_5$ Algorithms for Ideals of Minors of Square Matrices

Sriram Gopalakrishnan,Vincent Neiger,Mohab Safey El Din

from arxiv, 21 pages, 3 algorithms

We consider the problem of computing a grevlex Gröbner basis for the set $F_r(M)$ of minors of size $r$ of an $n\times n$ matrix $M$ of generic linear forms over a field of characteristic zero or large enough. Such sets are not regular sequences; in fact, the ideal $\langle F_r(M) \rangle$ cannot be generated by a regular sequence. As such, when using the general-purpose algorithm $F_5$ to find the sought Gröbner basis, some computing time is wasted on reductions to zero. We use known results about the first syzygy module of $F_r(M)$ to refine the $F_5$ algorithm in order to detect more reductions to zero. In practice, our approach avoids a significant number of reductions to zero. In particular, in the case $r=n-2$, we prove that our new algorithm avoids all reductions to zero, and we provide a corresponding complexity analysis which improves upon the previously known estimates.

CC · Optimizers · Kernel regression · Kernelization · Approximation error

June 14, 2023

Nearly Optimal Algorithms with Sublinear Computational Complexity for Online Kernel Regression

Junfan Li,Shizhong Liao

The trade-off between regret and computational cost is a fundamental problem for online kernel regression, and previous algorithms that address this trade-off cannot keep optimal regret bounds at a sublinear computational complexity. In this paper, we propose two new algorithms, AOGD-ALD and NONS-ALD, which keep nearly optimal regret bounds at a sublinear computational complexity, and we give sufficient conditions under which our algorithms work. Both algorithms dynamically maintain a set of nearly orthogonal basis elements used to approximate the kernel mapping, and keep nearly optimal regret bounds by controlling the approximation error. The number of basis elements depends on the approximation error and the decay rate of the eigenvalues of the kernel matrix. If the eigenvalues decay exponentially, then AOGD-ALD and NONS-ALD achieve regrets of $O(\sqrt{L(f)})$ and $O(\mathrm{d}_{\mathrm{eff}}(\mu)\ln{T})$, respectively, at a computational complexity in $O(\ln^2{T})$. If the eigenvalues decay polynomially with degree $p\geq 1$, then our algorithms keep the same regret bounds at a computational complexity in $o(T)$ in the case of $p>4$ and $p\geq 10$, respectively. Here $L(f)$ is the cumulative loss of $f$ and $\mathrm{d}_{\mathrm{eff}}(\mu)$ is the effective dimension of the problem. The two regret bounds are nearly optimal and are not comparable.

Cumulative distribution function · MoDELS · Statistics · MCMC · AI

June 13, 2023

Generative AI for Bayesian Computation

Nicholas G. Polson,Vadim Sokolov

from arxiv, arXiv admin note: text overlap with arXiv:2209.02163

Generative AI (Gen-AI) methods are developed for Bayesian computation. Gen-AI naturally applies to Bayesian models which can be easily simulated. First, we generate a large training dataset of data and parameters from the joint probability model. Second, we find a summary/sufficient statistic for dimensionality reduction. Third, we use a deep neural network to uncover the inverse Bayes map between parameters and data. This finds the inverse posterior cumulative distribution function. Bayesian computation is then equivalent to high-dimensional regression with dimensionality reduction (a.k.a. feature selection) and nonlinearity (a.k.a. deep learning). The main advantage of Gen-AI is that it is density-free and hence avoids MCMC simulation of the posterior. Architecture design is important, and we propose deep quantile NNs as a general framework for inference and decision making. To illustrate our methodology, we provide three examples: a stylized synthetic example, a traffic flow prediction problem, and a satellite data set. Finally, we conclude with directions for future research.

Graphs · Estimation/Estimators · Identifiable · Directed acyclic graph · DAG

June 13, 2023

Linear-Time Algorithms for Front-Door Adjustment in Causal Graphs

Marcel Wienöbst,Benito van der Zander,Maciej Liśkiewicz

from arxiv, Removed Theorem 4 from version [v2] because of an error in the proof

Causal effect estimation from observational data is a fundamental task in empirical sciences. It becomes particularly challenging when unobserved confounders are involved in a system. This paper focuses on front-door adjustment, a classic technique which, using observed mediators, allows one to identify causal effects even in the presence of unobserved confounding. While the statistical properties of front-door estimation are quite well understood, its algorithmic aspects remained unexplored for a long time. Recently, Jeong, Tian, and Bareinboim [NeurIPS 2022] presented the first polynomial-time algorithm for finding sets satisfying the front-door criterion in a given directed acyclic graph (DAG), with an $O(n^3(n+m))$ run time, where $n$ denotes the number of variables and $m$ the number of edges of the causal graph. In our work, we give the first linear-time, i.e., $O(n+m)$, algorithm for this task, which thus reaches the asymptotically optimal time complexity. This result implies an $O(n(n+m))$ delay enumeration algorithm of all front-door adjustment sets, again improving previous work by Jeong et al. by a factor of $n^3$. Moreover, we provide the first linear-time algorithm for finding a minimal front-door adjustment set. We offer implementations of our algorithms in multiple programming languages to facilitate practical usage and empirically validate their feasibility, even for large graphs.
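As background for the abstract above, the standard front-door adjustment formula is given below. It is stated as textbook material rather than taken from the abstract, with the symbols $X$ (treatment), $Y$ (outcome), and $Z$ (mediator set) introduced here for illustration. If $Z$ satisfies the front-door criterion relative to $(X, Y)$ and $P(x, z) > 0$, the causal effect is identified from observational quantities as:

```latex
P(y \mid \mathrm{do}(x)) \;=\; \sum_{z} P(z \mid x) \sum_{x'} P(y \mid x', z)\, P(x')
```

The inner sum adjusts for confounding between $Z$ and $Y$ by averaging over the treatment values $x'$, and the outer sum propagates the effect of $X$ through the mediators. Finding a set $Z$ for which this formula is valid is the search problem the algorithms above address.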