Polynomial based approaches, such as the Mat-Dot and entangled polynomial codes (EPC) have been used extensively within coded matrix computations to obtain schemes with good recovery thresholds. However, these schemes are well-recognized to suffer from poor numerical stability in decoding. Moreover, the encoding process in these schemes involves linearly combining a large number of input submatrices, i.e., the encoding weight is high. For the practically relevant case of sparse input matrices, this can have the undesirable effect of significantly increasing the worker node computation time. In this work, we propose a generalization of the EPC scheme by combining the idea of gradient coding along with the basic EPC encoding. Our technique allows us to reduce the weight of the encoding and arrive at schemes that exhibit much better numerical stability; this is achieved at the expense of a worse threshold. By appropriately setting parameters in our scheme, we recover several well-known schemes in the literature. Simulation results show that our scheme provides excellent numerical stability and fast computation speed (for sparse input matrices) as compared to EPC and Mat-Dot codes.
We propose and analyze an approximate message passing (AMP) algorithm for the matrix tensor product model, which is a generalization of the standard spiked matrix models that allows for multiple types of pairwise observations over a collection of latent variables. A key innovation for this algorithm is a method for optimally weighing and combining multiple estimates in each iteration. Building upon an AMP convergence theorem for non-separable functions, we prove a state evolution for non-separable functions that provides an asymptotically exact description of its performance in the high-dimensional limit. We leverage this state evolution result to provide necessary and sufficient conditions for recovery of the signal of interest. Such conditions depend on the singular values of a linear operator derived from an appropriate generalization of a signal-to-noise ratio for our model. Our results recover as special cases a number of recently proposed methods for contextual models (e.g., covariate assisted clustering) as well as inhomogeneous noise models.
Integrating coded caching (CC) techniques into multi-input multi-output (MIMO) setups provides a substantial performance boost in terms of the achievable degrees of freedom (DoF). In this paper, we study cache-aided MIMO setups where a single server with $L$ transmit antennas communicates with a number of users each with $G$ receive antennas. We extend a baseline CC scheme, originally designed for multi-input single-output (MISO) systems, to the considered MIMO setup. However, in a proposed MIMO approach, instead of merely replicating the transmit strategy from the baseline MISO scheme, we adjust the number of users served in each transmission to maximize the achievable DoF. This approach not only makes the extension more flexible in terms of supported network parameters but also results in an improved DoF of $\max_{\beta \le G} \beta \lfloor \frac{L-1}{\beta} \rfloor + \beta (t+1)$, where $t$ is the coded caching gain. In addition, we also propose a high-performance multicast transmission design for the considered MIMO-CC setup by formulating a symmetric rate maximization problem in terms of the transmit covariance matrices for the multicast signals and solving the resulting non-convex problem using successive convex approximation. Finally, we use numerical simulations to verify both improved DoF results and enhanced MIMO multicasting performance.
A novel numerical strategy is introduced for computing approximations of solutions to a Cahn-Hilliard model with degenerate mobilities. This model has recently been introduced as a second-order phase-field approximation for surface diffusion flows. Its numerical discretization is challenging due to the degeneracy of the mobilities, which generally requires an implicit treatment to avoid stability issues at the price of increased complexity costs. To mitigate this drawback, we consider new first- and second-order Scalar Auxiliary Variable (SAV) schemes that, differently from existing approaches, focus on the relaxation of the mobility, rather than the Cahn-Hilliard energy. These schemes are introduced and analysed theoretically in the general context of gradient flows and then specialised for the Cahn-Hilliard equation with mobilities. Various numerical experiments are conducted to highlight the advantages of these new schemes in terms of accuracy, effectiveness and computational cost.
In this paper, we design a novel two-phase unsourced random access (URA) scheme in massive multiple input multiple output (MIMO). In the first phase, we collect a sequence of information bits to jointly acquire the user channel state information (CSI) and the associated information bits. In the second phase, the residual information bits of all the users are partitioned into sub-blocks with a very short length to exhibit a higher spectral efficiency and a lower computational complexity than the existing transmission schemes in massive MIMO URA. By using the acquired CSI in the first phase, the sub-block recovery in the second phase is cast as a compressed sensing (CS) problem. From the perspective of the statistical physics, we provide a theoretical framework for our proposed URA scheme to analyze the induced problem based on the replica method. The analytical results show that the performance metrics of our URA scheme can be linked to the system parameters by a single-valued free entropy function. An AMP-based recovery algorithm is designed to achieve the performance indicated by the proposed theoretical framework. Simulations verify that our scheme outperforms the most recent counterparts.
Federated edge learning (FEL) can training a global model from terminal nodes' local dataset, which can make full use of the computing resources of terminal nodes and performs more extensive and efficient machine learning on terminal nodes with protecting user information requirements. Performance of FEL will be suffered from long delay or fault decision as the master collects partial gradients from stragglers which cannot return correct results within a deadline. Inspired by this, in this paper, we propose a novel coded FEL to mitigate stragglers for synchronous gradient with a two-stage dynamic scheme, where we start with part of workers for a duration of before starting the second stage, and on completion of at the first stage, we start remaining workers in the second stage. In particular, the computation latency and transmission latency is essential and should be quantitatively analyzed. Then the dynamically coded coefficients scheme is proposed which is based on historical information including worker completion time. For performance optimization of FEL, a Lyapunov function is designed to maximize admission data balancing fairness and two stage dynamic coding scheme is designed to maximize arrival data among workers. Experimental evidence verifies the derived properties and demonstrates that our proposed solution achieves a better performance for practical network parameters and benchmark datasets in terms of accuracy and resource utilization in the FEL system.
In multivariate time series analysis, the coherence measures the linear dependency between two-time series at different frequencies. However, real data applications often exhibit nonlinear dependency in the frequency domain. Conventional coherence analysis fails to capture such dependency. The quantile coherence, on the other hand, characterizes nonlinear dependency by defining the coherence at a set of quantile levels based on trigonometric quantile regression. Although quantile coherence is a more powerful tool, its estimation remains challenging due to the high level of noise. This paper introduces a new estimation technique for quantile coherence. The proposed method is semi-parametric, which uses the parametric form of the spectrum of the vector autoregressive (VAR) model as an approximation to the quantile spectral matrix, along with nonparametric smoothing across quantiles. For each fixed quantile level, we obtain the VAR parameters from the quantile periodograms, then, using the Durbin-Levinson algorithm, we calculate the preliminary estimate of quantile coherence using the VAR parameters. Finally, we smooth the preliminary estimate of quantile coherence across quantiles using a nonparametric smoother. Numerical results show that the proposed estimation method outperforms nonparametric methods. We show that quantile coherence-based bivariate time series clustering has advantages over the ordinary VAR coherence. For applications, the identified clusters of financial stocks by quantile coherence with a market benchmark are shown to have an intriguing and more accurate structure of diversified investment portfolios that may be used by investors to make better decisions.
In this paper, due to the important value in practical applications, we consider the coded distributed matrix multiplication problem of computing $AA^\top$ in a distributed computing system with $N$ worker nodes and a master node, where the input matrices $A$ and $A^\top$ are partitioned into $m$-by-$p$ and $p$-by-$m$ blocks of equal-size sub-matrices respectively. For effective straggler mitigation, we propose a novel computation strategy, named \emph{folded polynomial code}, which is obtained by modifying the entangled polynomial codes. Moreover, we characterize a lower bound on the optimal recovery threshold among all linear computation strategies when the underlying field is the real number field, and our folded polynomial codes can achieve this bound in the case of $m=1$. Compared with all known computation strategies for coded distributed matrix multiplication, our folded polynomial codes outperform them in terms of recovery threshold, download cost, and decoding complexity.
This paper considers the computation of the matrix exponential $\mathrm{e}^A$ with numerical quadrature. Although several quadrature-based algorithms have been proposed, they focus on (near) Hermitian matrices. In order to deal with non-Hermitian matrices, we use another integral representation including an oscillatory term and consider applying the double exponential (DE) formula specialized to Fourier integrals. The DE formula transforms the given integral into another integral whose interval is infinite, and therefore it is necessary to truncate the infinite interval. In this paper, to utilize the DE formula, we analyze the truncation error and propose two algorithms. The first one approximates $\mathrm{e}^A$ with the fixed mesh size which is a parameter in the DE formula affecting the accuracy. Second one computes $\mathrm{e}^A$ based on the first one with automatic selection of the mesh size depending on the given error tolerance.
Graph Neural Networks (GNNs) have been successfully used in many problems involving graph-structured data, achieving state-of-the-art performance. GNNs typically employ a message-passing scheme, in which every node aggregates information from its neighbors using a permutation-invariant aggregation function. Standard well-examined choices such as the mean or sum aggregation functions have limited capabilities, as they are not able to capture interactions among neighbors. In this work, we formalize these interactions using an information-theoretic framework that notably includes synergistic information. Driven by this definition, we introduce the Graph Ordering Attention (GOAT) layer, a novel GNN component that captures interactions between nodes in a neighborhood. This is achieved by learning local node orderings via an attention mechanism and processing the ordered representations using a recurrent neural network aggregator. This design allows us to make use of a permutation-sensitive aggregator while maintaining the permutation-equivariance of the proposed GOAT layer. The GOAT model demonstrates its increased performance in modeling graph metrics that capture complex information, such as the betweenness centrality and the effective size of a node. In practical use-cases, its superior modeling capability is confirmed through its success in several real-world node classification benchmarks.
Substantial progress has been made recently on developing provably accurate and efficient algorithms for low-rank matrix factorization via nonconvex optimization. While conventional wisdom often takes a dim view of nonconvex optimization algorithms due to their susceptibility to spurious local minima, simple iterative methods such as gradient descent have been remarkably successful in practice. The theoretical footings, however, had been largely lacking until recently. In this tutorial-style overview, we highlight the important role of statistical models in enabling efficient nonconvex optimization with performance guarantees. We review two contrasting approaches: (1) two-stage algorithms, which consist of a tailored initialization step followed by successive refinement; and (2) global landscape analysis and initialization-free algorithms. Several canonical matrix factorization problems are discussed, including but not limited to matrix sensing, phase retrieval, matrix completion, blind deconvolution, robust principal component analysis, phase synchronization, and joint alignment. Special care is taken to illustrate the key technical insights underlying their analyses. This article serves as a testament that the integrated consideration of optimization and statistics leads to fruitful research findings.