Multiple Tensor-Times-Matrix (Multi-TTM) is a key computation in algorithms for computing and operating with the Tucker tensor decomposition, which is frequently used in multidimensional data analysis. We establish communication lower bounds that determine how much data movement is required to perform the Multi-TTM computation in parallel. The crux of the proof relies on analytically solving a constrained, nonlinear optimization problem. We also present a parallel algorithm to perform this computation that organizes the processors into a logical grid with twice as many modes as the input tensor. We show that with correct choices of grid dimensions, the communication cost of the algorithm attains the lower bounds and is therefore communication optimal. Finally, we show that our algorithm can significantly reduce communication compared to the straightforward approach of expressing the computation as a sequence of tensor-times-matrix operations.
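To make the computation concrete, here is a minimal sequential sketch of Multi-TTM for a 3-way tensor stored as nested lists. It only illustrates what is being computed, not the parallel algorithm or its processor grid. The function names (`ttm`, `multi_ttm_sequence`, `multi_ttm_all_at_once`) and the convention that each matrix has shape r_k × n_k are assumptions made for this example. The first routine applies one tensor-times-matrix operation per mode in sequence; the second forms every output entry directly from the input tensor and all matrices at once.

```python
from itertools import product


def ttm(X, A, mode):
    """Mode-`mode` tensor-times-matrix for a 3-way tensor (nested lists).

    A has shape (r, n_mode); the result has size r along `mode`.
    """
    dims = [len(X), len(X[0]), len(X[0][0])]
    out_dims = list(dims)
    out_dims[mode] = len(A)
    Y = [[[0] * out_dims[2] for _ in range(out_dims[1])]
         for _ in range(out_dims[0])]
    for idx in product(*(range(d) for d in out_dims)):
        total = 0
        for k in range(dims[mode]):
            src = list(idx)
            src[mode] = k  # contract over the original index of this mode
            total += A[idx[mode]][k] * X[src[0]][src[1]][src[2]]
        Y[idx[0]][idx[1]][idx[2]] = total
    return Y


def multi_ttm_sequence(X, mats):
    """Multi-TTM expressed as a sequence of single-mode TTM operations."""
    Y = X
    for mode, A in enumerate(mats):
        Y = ttm(Y, A, mode)
    return Y


def multi_ttm_all_at_once(X, mats):
    """Multi-TTM computed directly from its elementwise definition."""
    A1, A2, A3 = mats
    n1, n2, n3 = len(X), len(X[0]), len(X[0][0])
    r1, r2, r3 = len(A1), len(A2), len(A3)
    Y = [[[0] * r3 for _ in range(r2)] for _ in range(r1)]
    for a, b, c in product(range(r1), range(r2), range(r3)):
        Y[a][b][c] = sum(
            A1[a][i] * A2[b][j] * A3[c][k] * X[i][j][k]
            for i, j, k in product(range(n1), range(n2), range(n3))
        )
    return Y


# Small example: a 2 x 2 x 2 tensor reduced to 1 x 2 x 1.
X = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
mats = [[[1, 1]], [[1, 0], [0, 1]], [[1, 2]]]
Y_seq = multi_ttm_sequence(X, mats)
Y_all = multi_ttm_all_at_once(X, mats)
```

Both routines produce the same tensor. They differ in which intermediate data they materialize, and that difference is what matters for data movement once the computation is distributed across processors.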

Related content

Optimizer · Analysis · Lecture notes · CASE · Partition

September 19, 2022

Nonlocal Optimized Schwarz Methods for time-harmonic electromagnetics

Xavier Claeys, Francis Collino, Emile Parolin

We introduce a new domain decomposition strategy for time harmonic Maxwell's equations that is valid in the case of automatically generated subdomain partitions with possible presence of cross-points. The convergence of the algorithm is guaranteed and we present a complete analysis of the matrix form of the method. The method involves transmission matrices responsible for imposing coupling between subdomains. We discuss the choice of such matrices, their construction and the impact of this choice on the convergence of the domain decomposition algorithm. Numerical results and algorithms are provided.

LIDAR · Estimation/Estimator · Cluster · Reducible · Null space

September 19, 2022

Efficient and Consistent Bundle Adjustment on Lidar Point Clouds

Zheng Liu, Xiyuan Liu, Fu Zhang

from arxiv, 30 pages, 15 figures

Bundle Adjustment (BA) refers to the problem of simultaneous determination of sensor poses and scene geometry, which is a fundamental problem in robot vision. This paper presents an efficient and consistent bundle adjustment method for lidar sensors. The method employs edge and plane features to represent the scene geometry, and directly minimizes the natural Euclidean distance from each raw point to the respective geometry feature. A nice property of this formulation is that the geometry features can be analytically solved, drastically reducing the dimension of the numerical optimization. To represent and solve the resultant optimization problem more efficiently, this paper then proposes a novel concept, point clusters, which encodes all raw points associated to the same feature by a compact set of parameters, the point cluster coordinates. We derive the closed-form derivatives, up to the second order, of the BA optimization based on the point cluster coordinates and show their theoretical properties such as the null spaces and sparsity. Based on these theoretical results, this paper develops an efficient second-order BA solver. Besides estimating the lidar poses, the solver also exploits the second order information to estimate the pose uncertainty caused by measurement noises, leading to consistent estimates of lidar poses. Moreover, thanks to the use of point cluster, the developed solver fundamentally avoids the enumeration of each raw point (which is very time-consuming due to the large number) in all steps of the optimization: cost evaluation, derivatives evaluation and uncertainty evaluation. The implementation of our method is open sourced to benefit the robotics community and beyond.

Analysis · Networking · NOMA · Optimizer · Maximum

September 19, 2022

Capacity Analysis and Sum Rate Maximization for the SCMA Cellular Network Coexisting with D2D Communications

Yukai Liu, Wen Chen

from arxiv, 15 pages, 9 figures

Sparse code multiple access (SCMA) is the most concerning scheme among non-orthogonal multiple access (NOMA) technologies for 5G wireless communication new interface. Another efficient technique in 5G aimed to improve spectral efficiency for local communications is device-to-device (D2D) communications. Therefore, we utilize the SCMA cellular network coexisting with D2D communications for the connection demand of the Internet of things (IOT), and improve the system sum rate performance of the hybrid network. We first derive the information-theoretic expression of the capacity for all users and find the capacity bound of cellular users based on the mutual interference between cellular users and D2D users. Then we consider the power optimization problem for the cellular users and D2D users jointly to maximize the system sum rate. To tackle the non-convex optimization problem, we propose a geometric programming (GP) based iterative power allocation algorithm. Simulation results demonstrate that the proposed algorithm converges fast and well improves the sum rate performance.

Cluster · Directed · Networking · Graph · GPU

September 18, 2022

DIGRAC: Digraph Clustering Based on Flow Imbalance

Yixuan He, Gesine Reinert, Mihai Cucuringu

from arxiv, 40 pages (9.5 pages for main text)

Node clustering is a powerful tool in the analysis of networks. We introduce a graph neural network framework to obtain node embeddings for directed networks in a self-supervised manner, including a novel probabilistic imbalance loss, which can be used for network clustering. Here, we propose directed flow imbalance measures, which are tightly related to directionality, to reveal clusters in the network even when there is no density difference between clusters. In contrast to standard approaches in the literature, in this paper, directionality is not treated as a nuisance, but rather contains the main signal. DIGRAC optimizes directed flow imbalance for clustering without requiring label supervision, unlike existing graph neural network methods, and can naturally incorporate node features, unlike existing spectral methods. Extensive experimental results on synthetic data, in the form of directed stochastic block models, and real-world data at different scales, demonstrate that our method, based on flow imbalance, attains state-of-the-art results on directed graph clustering when compared against 10 state-of-the-art methods from the literature, for a wide range of noise and sparsity levels, graph structures and topologies, and even outperforms supervised methods.

Storage · Server · INFORMS · Performance · CASE

September 18, 2022

Information-Theoretically Private Matrix Multiplication From MDS-Coded Storage

Jinbao Zhu, Songze Li, Jie Li

We study two problems of private matrix multiplication, over a distributed computing system consisting of a master node, and multiple servers who collectively store a family of public matrices using Maximum-Distance-Separable (MDS) codes. In the first problem of Private and Secure Matrix Multiplication from Colluding servers (MDS-C-PSMM), the master intends to compute the product of its confidential matrix $\mathbf{A}$ with a target matrix stored on the servers, without revealing any information about $\mathbf{A}$ and the index of target matrix to some colluding servers. In the second problem of Fully Private Matrix Multiplication from Colluding servers (MDS-C-FPMM), the matrix $\mathbf{A}$ is also selected from another family of public matrices stored at the servers in MDS form. In this case, the indices of the two target matrices should both be kept private from colluding servers. We develop novel strategies for MDS-C-PSMM and MDS-C-FPMM, which simultaneously guarantee information-theoretic data/index privacy and computation correctness. The key ingredient is a careful design of secret sharings of the matrix $\mathbf{A}$ and the private indices, which are tailored to matrix multiplication task and MDS storage structure, such that the computation results from the servers can be viewed as evaluations of a polynomial at distinct points, from which the intended result can be obtained through polynomial interpolation. We compare the proposed MDS-C-PSMM strategy with a previous MDS-PSMM strategy with a weaker privacy guarantee (non-colluding servers), and demonstrate substantial improvements over the previous strategy in terms of communication and computation performance.

Tensor · CP · Estimation/Estimator · PCA · Singular vector

September 18, 2022

Tensor Principal Component Analysis in High Dimensional CP Models

Yuefeng Han, Cun-Hui Zhang

The CP decomposition for high dimensional non-orthogonal spiked tensors is an important problem with broad applications across many disciplines. However, previous works with theoretical guarantee typically assume restrictive incoherence conditions on the basis vectors for the CP components. In this paper, we propose new computationally efficient composite PCA and concurrent orthogonalization algorithms for tensor CP decomposition with theoretical guarantees under mild incoherence conditions. The composite PCA applies the principal component or singular value decompositions twice, first to a matrix unfolding of the tensor data to obtain singular vectors and then to the matrix folding of the singular vectors obtained in the first step. It can be used as an initialization for any iterative optimization schemes for the tensor CP decomposition. The concurrent orthogonalization algorithm iteratively estimates the basis vector in each mode of the tensor by simultaneously applying projections to the orthogonal complements of the spaces generated by other CP components in other modes. It is designed to improve the alternating least squares estimator and other forms of the high order orthogonal iteration for tensors with low or moderately high CP ranks, and it is guaranteed to converge rapidly when the error of any given initial estimator is bounded by a small constant. Our theoretical investigation provides estimation accuracy and convergence rates for the two proposed algorithms. Both proposed algorithms are applicable to deterministic tensor, its noisy version, and the order-$2K$ covariance tensor of order-$K$ tensor data in a factor model with uncorrelated factors. Our implementations on synthetic data demonstrate significant practical superiority of our approach over existing methods.

Optimizer · Principle · Commentator · Example · Design

September 15, 2022

Pricing Optimal Outcomes in Coupled and Non-Convex Markets: Theory and Applications to Electricity Markets

Mete Şeref Ahunbay, Martin Bichler, Johannes Knörr

from arxiv, 41 pages, 2 figures

Classical results in general equilibrium theory assume divisible goods and convex preferences of market participants. In many real-world markets, participants have non-convex preferences and the allocation problem needs to consider complex constraints. Electricity markets are a prime example. In such markets, Walrasian prices are impossible, and heuristic pricing rules based on the dual of the relaxed allocation problem are used in practice. However, these rules have been criticized for high side-payments and inadequate congestion signals. We show that existing pricing heuristics optimize specific design goals that can be conflicting. The trade-offs can be substantial, and we establish that the design of pricing rules is fundamentally a multi-objective optimization problem addressing different incentives. In addition to traditional multi-objective optimization techniques using weighing of individual objectives, we introduce a novel parameter-free pricing rule that minimizes incentives for market participants to deviate locally. Our findings show how the new pricing rule capitalizes on the upsides of existing pricing rules under scrutiny today. It leads to prices that incur low make-whole payments while providing adequate congestion signals and low lost opportunity costs. Our suggested pricing rule does not require weighing of objectives, it is computationally scalable, and balances trade-offs in a principled manner, addressing an important policy issue in electricity markets.

Weight · Edge · Graph · Directed · Lecture notes

September 15, 2022

Algorithms and Lower Bounds for Replacement Paths under Multiple Edge Failures

Virginia Vassilevska Williams, Eyob Woldeghebriel, Yinzhan Xu

from arxiv, To appear in FOCS 2022; Abstract shortened to fit arXiv requirements

This paper considers a natural fault-tolerant shortest paths problem: for some constant integer $f$, given a directed weighted graph with no negative cycles and two fixed vertices $s$ and $t$, compute (either explicitly or implicitly) for every tuple of $f$ edges, the distance from $s$ to $t$ if these edges fail. We call this problem $f$-Fault Replacement Paths ($f$FRP). We first present an $\tilde{O}(n^3)$ time algorithm for $2$FRP in $n$-vertex directed graphs with arbitrary edge weights and no negative cycles. As $2$FRP is a generalization of the well-studied Replacement Paths problem (RP) that asks for the distances between $s$ and $t$ for any single edge failure, $2$FRP is at least as hard as RP. Since RP in graphs with arbitrary weights is equivalent in a fine-grained sense to All-Pairs Shortest Paths (APSP) [Vassilevska Williams and Williams FOCS'10, J.~ACM'18], $2$FRP is at least as hard as APSP, and thus a substantially subcubic time algorithm in the number of vertices for $2$FRP would be a breakthrough. Therefore, our algorithm in $\tilde{O}(n^3)$ time is conditionally nearly optimal. Our algorithm implies an $\tilde{O}(n^{f+1})$ time algorithm for the $f$FRP problem, giving the first improvement over the straightforward $O(n^{f+2})$ time algorithm. Then we focus on the restriction of $2$FRP to graphs with small integer weights bounded by $M$ in absolute values. Using fast rectangular matrix multiplication, we obtain a randomized algorithm that runs in $\tilde{O}(M^{2/3}n^{2.9153})$ time. This implies an improvement over our $\tilde{O}(n^{f+1})$ time arbitrary weight algorithm for all $f>1$. We also present a data structure variant of the algorithm that can trade off pre-processing and query time. In addition to the algebraic algorithms, we also give an $n^{8/3-o(1)}$ conditional lower bound for combinatorial $2$FRP algorithms in directed unweighted graphs.

Matrix theory · Linear · Euclidean space · Backpropagation algorithm · AIM

January 1, 2022

Matrix Decomposition and Applications

Jun Lu

from arxiv, arXiv admin note: substantial text overlap with arXiv:2107.02579

In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments on matrix decomposition that favored a (block) LU decomposition: the factorization of a matrix into the product of lower and upper triangular matrices. And now, matrix decomposition has become a core technology in machine learning, largely due to the development of the back propagation algorithm in fitting a neural network. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, we clearly realize our inability to cover all the useful and interesting results concerning matrix decomposition and given the paucity of scope to present this discussion, e.g., the separated analysis of the Euclidean space, Hermitian space, Hilbert space, and things in the complex domain. We refer the reader to literature in the field of linear algebra for a more detailed introduction to the related fields.

Generalization theory · Black box · Learned · INFORMS · Supervised learning algorithm

October 4, 2021

Information-theoretic generalization bounds for black-box learning algorithms

Hrayr Harutyunyan, Maxim Raginsky, Greg Ver Steeg, Aram Galstyan

from arxiv, NeurIPS 2021

We derive information-theoretic generalization bounds for supervised learning algorithms based on the information contained in predictions rather than in the output of the training algorithm. These bounds improve over the existing information-theoretic bounds, are applicable to a wider range of algorithms, and solve two key challenges: (a) they give meaningful results for deterministic algorithms and (b) they are significantly easier to estimate. We show experimentally that the proposed bounds closely follow the generalization gap in practical scenarios for deep learning.