亚洲欧洲日产国-国产免费一级无码婬片AA片

Persistent homology is a leading tool in topological data analysis (TDA). Many problems in TDA can be solved via homological -- and indeed, linear -- algebra. However, matrices in this domain are typically large, with rows and columns numbered in billions. Low-rank approximation of such arrays typically destroys essential information; thus, new mathematical and computational paradigms are needed for very large, sparse matrices. We present the U-match matrix factorization scheme to address this challenge. U-match has two desirable features. First, it admits a compressed storage format that reduces the number of nonzero entries held in computer memory by one or more orders of magnitude over other common factorizations. Second, it permits direct solution of diverse problems in linear and homological algebra, without decompressing matrices stored in memory. These problems include look-up and retrieval of rows and columns; evaluation of birth/death times, and extraction of generators in persistent (co)homology; and, calculation of bases for boundary and cycle subspaces of filtered chain complexes. Such bases are key to unlocking a range of other topological techniques for use in TDA, and U-match factorization is designed to make such calculations broadly accessible to practitioners. As an application, we show that individual cycle representatives in persistent homology can be retrieved at time and memory costs orders of magnitude below current state of the art, via global duality. Moreover, the algebraic machinery needed to achieve this computation already exists in many modern solvers.

因子分解 · 稀疏 · 可約的 · 分解的 · 線性的 ·

2021 年 8 月 19 日

U-match factorization: sparse homological algebra, lazy cycle representatives, and dualities in persistent(co)homology

Haibin Hang,Chad Giusti,Lori Ziegelmeier,Gregory Henselman-Petrusek

from arxiv, 50 pages, 4 appendices

Persistent homology is a leading tool in topological data analysis (TDA). Many problems in TDA can be solved via homological -- and indeed, linear -- algebra. However, matrices in this domain are typically large, with rows and columns numbered in billions. Low-rank approximation of such arrays typically destroys essential information; thus, new mathematical and computational paradigms are needed for very large, sparse matrices. We present the U-match matrix factorization scheme to address this challenge. U-match has two desirable features. First, it admits a compressed storage format that reduces the number of nonzero entries held in computer memory by one or more orders of magnitude over other common factorizations. Second, it permits direct solution of diverse problems in linear and homological algebra, without decompressing matrices stored in memory. These problems include look-up and retrieval of rows and columns; evaluation of birth/death times, and extraction of generators in persistent (co)homology; and, calculation of bases for boundary and cycle subspaces of filtered chain complexes. Such bases are key to unlocking a range of other topological techniques for use in TDA, and U-match factorization is designed to make such calculations broadly accessible to practitioners. As an application, we show that individual cycle representatives in persistent homology can be retrieved at time and memory costs orders of magnitude below current state of the art, via global duality. Moreover, the algebraic machinery needed to achieve this computation already exists in many modern solvers.

估計/估計量 · 協方差矩陣 · 線性的 · 線性組合 · 樣本 ·

2021 年 8 月 19 日

Linear pooling of sample covariance matrices

Elias Raninen,David E. Tyler,Esa Ollila

We consider the problem of estimating high-dimensional covariance matrices of $K$-populations or classes in the setting where the samples sizes are comparable to the data dimension. We propose estimating each class covariance matrix as a distinct linear combination of all class sample covariance matrices. This approach is shown to reduce the estimation error when the sample sizes are limited, and the true class covariance matrices share a somewhat similar structure. We develop an effective method for estimating the coefficients in the linear combination that minimize the mean squared error under the general assumption that the samples are drawn from (unspecified) elliptically symmetric distributions possessing finite fourth-order moments. To this end, we utilize the spatial sign covariance matrix, which we show (under rather general conditions) to be an unbiased estimator of the normalized covariance matrix as the dimension grows to infinity. We also show how the proposed method can be used in choosing the regularization parameters for multiple target matrices in a single class covariance matrix estimation problem. We assess the proposed method via numerical simulation studies including an application in global minimum variance portfolio optimization using real stock data.

估計/估計量 · UniFormer · 噪聲 · 線性的 · 設計矩陣 ·

2021 年 8 月 17 日

Non-Asymptotic Bounds for the $\ell_{\infty}$ Estimator in Linear Regression with Uniform Noise

Yufei Yi,Matey Neykov

from arxiv, 34 pages, 1 figure, 1 table

The Chebyshev or $\ell_{\infty}$ estimator is an unconventional alternative to the ordinary least squares in solving linear regressions. It is defined as the minimizer of the $\ell_{\infty}$ objective function \begin{align*} \hat{\boldsymbol{\beta}} := \arg\min_{\boldsymbol{\beta}} \|\boldsymbol{Y} - \mathbf{X}\boldsymbol{\beta}\|_{\infty}. \end{align*} The asymptotic distribution of the Chebyshev estimator under fixed number of covariates were recently studied (Knight, 2020), yet finite sample guarantees and generalizations to high-dimensional settings remain open. In this paper, we develop non-asymptotic upper bounds on the estimation error $\|\hat{\boldsymbol{\beta}}-\boldsymbol{\beta}^*\|_2$ for a Chebyshev estimator $\hat{\boldsymbol{\beta}}$, in a regression setting with uniformly distributed noise $\varepsilon_i\sim U([-a,a])$ where $a$ is either known or unknown. With relatively mild assumptions on the (random) design matrix $\mathbf{X}$, we can bound the error rate by $\frac{C_p}{n}$ with high probability, for some constant $C_p$ depending on the dimension $p$ and the law of the design. Furthermore, we illustrate that there exist designs for which the Chebyshev estimator is (nearly) minimax optimal. In addition we show that "Chebyshev's LASSO" has advantages over the regular LASSO in high dimensional situations, provided that the noise is uniform. Specifically, we argue that it achieves a much faster rate of estimation under certain assumptions on the growth rate of the sparsity level and the ambient dimension with respect to the sample size.

統計量 · 穩健性 · 可約的 · TOOLS · 輸出 ·

2021 年 8 月 5 日

Perturbed M-Estimation: A Further Investigation of Robust Statistics for Differential Privacy

Aleksandra Slavkovic,Roberto Molinari

Differential Privacy (DP) provides an elegant mathematical framework for defining a provable disclosure risk in the presence of arbitrary adversaries; it guarantees that whether an individual is in a database or not, the results of a DP procedure should be similar in terms of their probability distribution. While DP mechanisms are provably effective in protecting privacy, they often negatively impact the utility of the query responses, statistics and/or analyses that come as outputs from these mechanisms. To address this problem, we use ideas from the area of robust statistics which aims at reducing the influence of outlying observations on statistical inference. Based on the preliminary known links between differential privacy and robust statistics, we modify the objective perturbation mechanism by making use of a new bounded function and define a bounded M-Estimator with adequate statistical properties. The resulting privacy mechanism, named "Perturbed M-Estimation", shows important potential in terms of improved statistical utility of its outputs as suggested by some preliminary results. These results consequently support the need to further investigate the use of robust statistical tools for differential privacy.

INFORMS · Networking · 網絡嵌入 · 鏈路預測 · 結點 ·

2018 年 11 月 28 日

Attributed Network Embedding for Incomplete Structure Information

Chengbin Hou,Shan He,Ke Tang

from arxiv, 8 pages, 5 figures

An attributed network enriches a pure network by encoding a part of widely accessible node auxiliary information into node attributes. Learning vector representation of each node a.k.a. Network Embedding (NE) for such an attributed network by considering both structure and attribute information has recently attracted considerable attention, since each node embedding is simply a unified low-dimension vector representation that makes downstream tasks e.g. link prediction more efficient and much easier to realize. Most of previous works have not considered the significant case of a network with incomplete structure information, which however, would often appear in our real-world scenarios e.g. the abnormal users in a social network who intentionally hide their friendships. And different networks obviously have different levels of incomplete structure information, which imposes more challenges to balance two sources of information. To tackle that, we propose a robust NE method called Attributed Biased Random Walks (ABRW) to employ attribute information for compensating incomplete structure information by using transition matrices. The experiments of link prediction and node classification tasks on real-world datasets confirm the robustness and effectiveness of our method to the different levels of the incomplete structure information.

Siamese · 注意力機制 · 孿生網絡 · Re-ID · Extensibility ·

2018 年 11 月 23 日

Re-Identification with Consistent Attentive Siamese Networks

Meng Zheng,Srikrishna Karanam,Ziyan Wu,Richard J. Radke

from arxiv, 10 pages, 8 figures, 3 tables

We propose a new deep architecture for person re-identification (re-id). While re-id has seen much recent progress, spatial localization and view-invariant representation learning for robust cross-view matching remain key, unsolved problems. We address these questions by means of a new attention-driven Siamese learning architecture, called the Consistent Attentive Siamese Network. Our key innovations compared to existing, competing methods include (a) a flexible framework design that produces attention with only identity labels as supervision, (b) explicit mechanisms to enforce attention consistency among images of the same person, and (c) a new Siamese framework that integrates attention and attention consistency, producing principled supervisory signals as well as the first mechanism that can explain the reasoning behind the Siamese framework's predictions. We conduct extensive evaluations on the CUHK03-NP, DukeMTMC-ReID, and Market-1501 datasets, and establish a new state of the art, with our proposed method resulting in mAP performance improvements of 6.4%, 4.2%, and 1.2% respectively.

秩 · MoDELS · 優化器 · 奇異值分解 · 列 ·

2018 年 10 月 18 日

Testing Matrix Rank, Optimally

Maria-Florina Balcan,Yi Li,David P. Woodruff,Hongyang Zhang

from arxiv, 51 pages. To appear in SODA 2019

We show that for the problem of testing if a matrix $A \in F^{n \times n}$ has rank at most $d$, or requires changing an $\epsilon$-fraction of entries to have rank at most $d$, there is a non-adaptive query algorithm making $\widetilde{O}(d^2/\epsilon)$ queries. Our algorithm works for any field $F$. This improves upon the previous $O(d^2/\epsilon^2)$ bound (SODA'03), and bypasses an $\Omega(d^2/\epsilon^2)$ lower bound of (KDD'14) which holds if the algorithm is required to read a submatrix. Our algorithm is the first such algorithm which does not read a submatrix, and instead reads a carefully selected non-adaptive pattern of entries in rows and columns of $A$. We complement our algorithm with a matching query complexity lower bound for non-adaptive testers over any field. We also give tight bounds of $\widetilde{\Theta}(d^2)$ queries in the sensing model for which query access comes in the form of $\langle X_i, A\rangle:=tr(X_i^\top A)$; perhaps surprisingly these bounds do not depend on $\epsilon$. We next develop a novel property testing framework for testing numerical properties of a real-valued matrix $A$ more generally, which includes the stable rank, Schatten-$p$ norms, and SVD entropy. Specifically, we propose a bounded entry model, where $A$ is required to have entries bounded by $1$ in absolute value. We give upper and lower bounds for a wide range of problems in this model, and discuss connections to the sensing model above.

優化器 · 近鄰 · Performer · 邊緣化 · 可行 ·

2018 年 5 月 2 日

Feasibility Based Large Margin Nearest Neighbor Metric Learning

Babak Hosseini,Barbara Hammer

from arxiv, This is the preprint of the conference paper published in ESANN2018

Large margin nearest neighbor (LMNN) is a metric learner which optimizes the performance of the popular $k$NN classifier. However, its resulting metric relies on pre-selected target neighbors. In this paper, we address the feasibility of LMNN's optimization constraints regarding these target points, and introduce a mathematical measure to evaluate the size of the feasible region of the optimization problem. We enhance the optimization framework of LMNN by a weighting scheme which prefers data triplets which yield a larger feasible region. This increases the chances to obtain a good metric as the solution of LMNN's problem. We evaluate the performance of the resulting feasibility-based LMNN algorithm using synthetic and real datasets. The empirical results show an improved accuracy for different types of datasets in comparison to regular LMNN.

秩 · 成對型 · 極大似然 · 排序 · state-of-the-art ·

2018 年 2 月 28 日

SQL-Rank: A Listwise Approach to Collaborative Ranking

Liwei Wu,Cho-Jui Hsieh,James Sharpnack

In this paper, we propose a listwise approach for constructing user-specific rankings in recommendation systems in a collaborative fashion. We contrast the listwise approach to previous pointwise and pairwise approaches, which are based on treating either each rating or each pairwise comparison as an independent instance respectively. By extending the work of (Cao et al. 2007), we cast listwise collaborative ranking as maximum likelihood under a permutation model which applies probability mass to permutations based on a low rank latent score matrix. We present a novel algorithm called SQL-Rank, which can accommodate ties and missing data and can run in linear time. We develop a theoretical framework for analyzing listwise ranking methods based on a novel representation theory for the permutation model. Applying this framework to collaborative ranking, we derive asymptotic statistical rates as the number of users and items grow together. We conclude by demonstrating that our SQL-Rank method often outperforms current state-of-the-art algorithms for implicit feedback such as Weighted-MF and BPR and achieve favorable results when compared to explicit feedback algorithms such as matrix factorization and collaborative ranking.

亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

相關內容