Unsplittable flow problems cover a wide range of telecommunication and transportation problems, and their efficient resolution is key to a number of applications. In this work, we study algorithms that can scale up to large graphs and large numbers of commodities. We present and analyze in detail a heuristic based on the linear relaxation of the problem followed by randomized rounding. We provide empirical evidence that this approach is competitive with state-of-the-art resolution methods, either in its scaling performance or in the quality of its solutions. We provide a variation of the heuristic that has the same approximation factor as the state-of-the-art approximation algorithm. We also derive a tighter analysis of the approximation factor of both this variation and the state-of-the-art algorithm. We introduce a new objective function for the unsplittable flow problem and discuss how it differs from the classical congestion objective. Finally, we discuss the gap in practical performance and theoretical guarantees among all the aforementioned algorithms.
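A minimal sketch of the randomized-rounding step described above, assuming the linear relaxation has already been solved and decomposed into fractional path flows per commodity; the names `fractional_paths` and `demands` are illustrative placeholders, not the authors' implementation.

```python
# Randomized rounding of a fractional multicommodity flow: each commodity is
# routed on a single path chosen with probability equal to its fractional share.
import random
from collections import defaultdict

def randomized_rounding(fractional_paths, demands):
    """fractional_paths: {commodity: [(path_as_tuple_of_edges, fraction), ...]},
    with fractions summing to 1 per commodity. demands: {commodity: demand}.
    Returns one path per commodity and the resulting load on every edge."""
    edge_load = defaultdict(float)
    chosen = {}
    for k, paths in fractional_paths.items():
        routes, weights = zip(*paths)
        # Pick a single path with probability proportional to its fractional flow.
        path = random.choices(routes, weights=weights, k=1)[0]
        chosen[k] = path
        for edge in path:
            edge_load[edge] += demands[k]
    return chosen, edge_load

# Toy example with two commodities sharing an edge.
paths = {"k1": [((("s", "a"), ("a", "t")), 0.7), ((("s", "b"), ("b", "t")), 0.3)],
         "k2": [((("s", "a"), ("a", "t")), 1.0)]}
print(randomized_rounding(paths, {"k1": 2.0, "k2": 1.0}))
```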
For solving linear inverse problems, particularly of the type that appears in tomographic imaging and compressive sensing, this paper develops two new approaches. The first is an iterative algorithm that minimizes a regularized least squares objective function in which the regularization is based on a compound Gaussian prior distribution. The compound Gaussian prior subsumes many of the priors commonly used in image reconstruction, including those of sparsity-based approaches. The developed iterative algorithm gives rise to the paper's second new approach: a deep neural network that corresponds to an "unrolling" or "unfolding" of the iterative algorithm. Unrolled deep neural networks have interpretable layers and outperform standard deep learning methods. This paper includes a detailed computational theory that provides insight into the construction and performance of both algorithms. The conclusion is that both algorithms outperform other state-of-the-art approaches to tomographic image formation and compressive sensing, especially in the difficult regime of limited training data.
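A minimal sketch of what "unrolling" an iterative reconstruction scheme means, using a simple soft-thresholding (l1) step as a stand-in for the compound Gaussian prior update; the paper's actual prior and update rules are more involved, and the step size and threshold here would become learnable per-layer parameters in the unrolled network.

```python
# Unrolled gradient/prox reconstruction for y ~ A @ x: each loop iteration
# corresponds to one interpretable layer of the unrolled network.
import numpy as np

def soft_threshold(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def unrolled_reconstruction(A, y, n_layers=200, lam=0.1):
    step = 1.0 / np.linalg.norm(A, 2) ** 2            # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):                         # one "layer" per iteration
        x = x - step * A.T @ (A @ x - y)              # data-fidelity gradient step
        x = soft_threshold(x, step * lam)             # prior (regularization) step
    return x

rng = np.random.default_rng(0)
A = rng.standard_normal((30, 100))
x_true = np.zeros(100); x_true[[3, 40, 77]] = [1.0, -2.0, 0.5]
y = A @ x_true + 0.01 * rng.standard_normal(30)
print(np.round(unrolled_reconstruction(A, y)[[3, 40, 77]], 2))
```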
The prevalence of large-scale graphs poses great challenges in time and storage for training and deploying graph neural networks (GNNs). Several recent works have explored pruning the large original graph into a small and highly informative one, such that training and inference on the pruned and original graphs perform comparably. Although empirically effective, current research focuses on static or non-temporal graphs and is not directly applicable to dynamic scenarios. In addition, these methods require labels as ground truth to learn the informative structure, limiting their applicability to new problem domains where labels are hard to obtain. To resolve this dilemma, we propose and study the problem of unsupervised graph pruning on dynamic graphs. We approach the problem with STEP, a self-supervised temporal pruning framework that learns to remove potentially redundant edges from input dynamic graphs. From a technical and industrial viewpoint, our method overcomes the trade-off between performance and time and memory overheads. Our results on three real-world datasets demonstrate the advantages of our approach in improving the efficacy, robustness, and efficiency of GNNs on dynamic node classification tasks. Most notably, STEP is able to prune more than 50% of the edges of a million-scale industrial graph, Alipay (7M nodes, 21M edges), while retaining up to 98% of the original performance. Code is available at //github.com/EdisonLeeeee/STEP.
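A toy illustration of what pruning a dynamic (temporal) graph means: keep only the most recent few interactions per node pair and drop the rest as redundant. This is a naive recency heuristic for illustration only; STEP learns which edges to remove in a self-supervised way rather than applying a fixed rule, and the function and parameter names here are hypothetical.

```python
# Prune a temporal edge list by keeping the newest K interactions per node pair.
from collections import defaultdict

def prune_temporal_edges(events, keep_per_pair=2):
    """events: list of (src, dst, timestamp) interactions, in any order.
    Returns a reduced event list with at most keep_per_pair events per pair."""
    by_pair = defaultdict(list)
    for src, dst, t in events:
        by_pair[(src, dst)].append((src, dst, t))
    kept = []
    for pair_events in by_pair.values():
        pair_events.sort(key=lambda e: e[2], reverse=True)   # newest first
        kept.extend(pair_events[:keep_per_pair])             # drop older duplicates
    return kept

events = [(0, 1, 10), (0, 1, 12), (0, 1, 15), (2, 1, 11), (2, 1, 19)]
print(prune_temporal_edges(events, keep_per_pair=2))
```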
AI-enabled precision medicine promises a transformational improvement in healthcare outcomes by enabling data-driven personalized diagnosis, prognosis, and treatment. However, the well-known "curse of dimensionality" and the clustered structure of biomedical data interact to present a joint challenge in the high-dimensional, limited-observation regime of precision medicine. To overcome both issues simultaneously, we propose a simple and scalable approach to joint clustering and embedding that combines standard embedding methods with a convex clustering penalty in a modular way. This novel, cluster-aware embedding approach overcomes the complexity and limitations of current joint embedding and clustering methods, which we demonstrate with straightforward implementations of hierarchically clustered principal component analysis (PCA), locally linear embedding (LLE), and canonical correlation analysis (CCA). Through both numerical experiments and real-world examples, we demonstrate that our approach outperforms traditional and contemporary clustering methods on highly underdetermined problems (e.g., with just tens of observations) as well as on large-sample datasets. Importantly, our approach does not require the user to choose the desired number of clusters, but instead yields interpretable dendrograms of hierarchically clustered embeddings. Our approach thus improves significantly on existing methods for identifying patient subgroups in multiomics and neuroimaging data, enabling scalable and interpretable biomarkers for precision medicine.
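To make the modular combination concrete, a representative form of such a joint objective for a PCA-style embedding (our reading of the general construction, not necessarily the authors' exact formulation) is $\min_{U,V} \|X - U V^\top\|_F^2 + \lambda \sum_{i<j} w_{ij} \|u_i - u_j\|_2$, where the first term is the usual embedding/reconstruction loss, the second is the convex clustering penalty with pairwise weights $w_{ij}$, and increasing $\lambda$ progressively fuses the embedded rows $u_i$, which is what produces the hierarchical dendrogram without a prespecified number of clusters.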
This work, for the first time, introduces two constant-factor approximation algorithms with linear query complexity for non-monotone submodular maximization over a ground set of size $n$ subject to a knapsack constraint: $\mathsf{DLA}$ and $\mathsf{RLA}$. $\mathsf{DLA}$ is a deterministic algorithm that provides an approximation factor of $6+\epsilon$, while $\mathsf{RLA}$ is a randomized algorithm with an approximation factor of $4+\epsilon$. Both run with $O(n \log(1/\epsilon)/\epsilon)$ query complexity. The key idea for obtaining a constant approximation ratio with linear query complexity lies in: (1) dividing the ground set into two appropriate subsets and finding a near-optimal solution over these subsets with linear queries, and (2) combining a threshold greedy with properties of two disjoint sets, or with a random selection process, to improve solution quality. In addition to the theoretical analysis, we evaluate our proposed solutions in three applications: Revenue Maximization, Image Summarization, and Maximum Weighted Cut, showing that our algorithms not only return results comparable to state-of-the-art algorithms but also require significantly fewer queries.
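A minimal sketch of the threshold-greedy ingredient mentioned above, on a toy coverage function under a knapsack constraint. It illustrates the linear-query flavor (each element is examined once per threshold level), not the full $\mathsf{DLA}$/$\mathsf{RLA}$ algorithms or their guarantees; the helper names and the toy instance are our own.

```python
# Threshold greedy: accept elements whose marginal gain per unit cost clears a
# geometrically decreasing threshold, subject to the knapsack budget.
def threshold_greedy(elements, cost, budget, f, eps=0.2):
    S, spent = set(), 0.0
    d = max(f({e}) / cost[e] for e in elements)        # largest singleton density
    tau = d
    while tau > eps * d / len(elements):
        for e in elements:
            if e in S or spent + cost[e] > budget:
                continue
            gain = f(S | {e}) - f(S)
            if gain / cost[e] >= tau:                  # density clears the threshold
                S.add(e)
                spent += cost[e]
        tau *= (1 - eps)                               # lower the threshold and re-scan
    return S

# Toy weighted-coverage instance: f(S) = number of items covered by the sets in S.
universe = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 5, 6}}
f = lambda S: len(set().union(*(universe[e] for e in S))) if S else 0
print(threshold_greedy(list(universe), {"a": 2, "b": 1, "c": 1, "d": 2}, budget=3, f=f))
```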
The problem of packing equal spheres in a spherical container is a classic global optimization problem, which has attracted extensive study in academia and found various applications in industry. This problem is computationally challenging, and much of the literature focuses on small-scale instances with fewer than 200 spherical items. In this work, we propose an efficient local search heuristic algorithm, named solution space exploring and descent, for solving this problem; it quantifies the solution's quality to determine the number of exploring actions and quickly discovers a high-quality solution. In addition, we propose an adaptive neighbor object maintenance method to speed up the convergence of the continuous optimization process and reduce the time consumption. Computational experiments on a large number of benchmark instances with $5 \leq n \leq 400$ spherical items show that our algorithm significantly outperforms the state-of-the-art algorithm. In particular, it improves the best-known results on 274 of the 396 well-known benchmark instances and matches them on another 84.
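A toy local-search sketch for the underlying packing objective: place $n$ centers in the unit ball and maximize the common radius allowed by the container clearance and pairwise separation. It only illustrates the generic perturb-and-descend idea, not the paper's solution-space-exploring heuristic or its neighbor maintenance; all function names are illustrative.

```python
# Maximize the common radius of n equal spheres inside the unit ball by accepting
# improving single-sphere perturbations.
import numpy as np

def achievable_radius(centers):
    """Largest equal radius so spheres at 'centers' fit inside the unit ball."""
    r_container = (1.0 - np.linalg.norm(centers, axis=1)).min()   # clearance to boundary
    dist = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)
    return min(r_container, dist.min() / 2.0)                     # half the closest gap

def local_search(n=8, iters=20000, step=0.05, seed=0):
    rng = np.random.default_rng(seed)
    centers = rng.uniform(-0.5, 0.5, size=(n, 3))
    best = achievable_radius(centers)
    for _ in range(iters):
        trial = centers.copy()
        trial[rng.integers(n)] += rng.normal(scale=step, size=3)  # perturb one sphere
        r = achievable_radius(trial)
        if r > best:                                              # accept improving moves only
            centers, best = trial, r
    return best, centers

print(round(local_search()[0], 4))
```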
We introduce a filtering technique for Discontinuous Galerkin approximations of hyperbolic problems. Following an approach already proposed for Hamilton-Jacobi equations by other authors, we aim to reduce the spurious oscillations that arise in the presence of discontinuities when high-order spatial discretizations are employed. This goal is achieved using a filter function that keeps the high-order scheme where the solution is regular and switches to a monotone low-order approximation where it is not. The method has been implemented in the framework of the $deal.II$ numerical library, whose mesh adaptation capabilities are also used to reduce the region in which the low-order approximation is used. A number of numerical experiments demonstrate the potential of the proposed filtering technique.
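A minimal 1D finite-difference sketch of the filtering mechanism on linear advection $u_t + u_x = 0$: blend a high-order update with a monotone low-order (upwind) update through a filter that keeps the high-order correction only where it stays small. This illustrates the switching idea only, not the DG/deal.II implementation; the threshold parameter `M` is an assumed tuning knob.

```python
# Filtered scheme: low-order update plus a filtered high-order correction.
import numpy as np

def filter_fn(y):
    """Keep the high-order correction where it is small (|y| <= 1), drop it otherwise."""
    return np.where(np.abs(y) <= 1.0, y, 0.0)

def filtered_step(u, c, M):
    """One periodic time step with CFL number c; M sets the filter threshold."""
    u_m, u_p = np.roll(u, 1), np.roll(u, -1)
    low = u - c * (u - u_m)                                             # monotone upwind
    high = u - 0.5 * c * (u_p - u_m) + 0.5 * c**2 * (u_p - 2*u + u_m)   # Lax-Wendroff
    return low + M * filter_fn((high - low) / M)                        # filtered blend

x = np.linspace(0.0, 1.0, 200, endpoint=False)
u = (x < 0.5).astype(float)                           # discontinuous initial datum
for _ in range(100):
    u = filtered_step(u, c=0.5, M=x[1] - x[0])
print(round(float(u.min()), 3), round(float(u.max()), 3))   # extrema stay near [0, 1]
```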
Many machine learning applications and tasks rely on the stochastic gradient descent (SGD) algorithm and its variants. Effective step length selection is crucial for the success of these algorithms, which has motivated the development of methods such as ADAM and AdaGrad. In this paper, we propose a novel algorithm for adaptive step length selection in the classical SGD framework, which can be readily adapted to other stochastic algorithms. Our proposed algorithm is inspired by traditional nonlinear optimization techniques and is supported by analytical findings. We show that, under reasonable conditions, the algorithm produces step lengths in line with well-established theoretical requirements and generates iterates that converge to a stationary neighborhood of a solution in expectation. We test the proposed algorithm on logistic regression and deep neural network problems and demonstrate that it can generate step lengths comparable to the best step lengths obtained from manual tuning.
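A minimal sketch of adaptive step-length selection in SGD, in the spirit of the classical line-search techniques the paper draws on: shrink the step when a sampled sufficient-decrease test fails and grow it cautiously otherwise. This mimics an Armijo-style rule on a toy least-squares problem; it is not the specific algorithm or conditions analyzed in the paper.

```python
# SGD with an adaptive step length driven by a stochastic sufficient-decrease test.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 20)); x_true = rng.standard_normal(20)
b = A @ x_true + 0.1 * rng.standard_normal(500)

def sample_loss_grad(x, batch=32):
    idx = rng.integers(0, A.shape[0], size=batch)
    r = A[idx] @ x - b[idx]
    return 0.5 * np.mean(r**2), A[idx].T @ r / batch

x, alpha = np.zeros(20), 1.0
for t in range(2000):
    loss, g = sample_loss_grad(x)
    trial = x - alpha * g
    trial_loss, _ = sample_loss_grad(trial)
    if trial_loss <= loss - 1e-4 * alpha * g @ g:   # Armijo-like check on a fresh sample
        x, alpha = trial, alpha * 1.1               # accept and cautiously enlarge the step
    else:
        alpha *= 0.5                                # reject the step length and shrink it
print(round(float(np.linalg.norm(x - x_true)), 3), round(alpha, 4))   # error and final step
```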
Given a randomized experiment with binary outcomes, exact confidence intervals for the average causal effect of the treatment can be computed through a series of permutation tests. This approach requires minimal assumptions and is valid for all sample sizes, as it does not rely on large-sample approximations such as the central limit theorem. We show that these confidence intervals can be found in $O(n \log n)$ permutation tests in the case of balanced designs, where the treatment and control groups have equal sizes, and $O(n^2)$ permutation tests in the general case. Prior to this work, the most efficient known constructions required $O(n^2)$ such tests in the balanced case [Li and Ding, 2016], and $O(n^4)$ tests in the general case [Rigdon and Hudgens, 2015]. Our results thus facilitate exact inference as a viable option for randomized experiments far larger than those accessible by previous methods.
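A minimal sketch of the basic building block, a Monte Carlo permutation test of one hypothesized potential-outcomes table; inverting a collection of such tests over candidate tables yields the exact confidence interval. The paper's contribution is doing this inversion with only $O(n \log n)$ tests in the balanced case (or $O(n^2)$ in general); that search is not reproduced here, and the toy table below is our own.

```python
# Permutation test of a sharp hypothesis given full potential outcomes (y1, y0).
import numpy as np

def permutation_test(y1, y0, treated, n_perm=10000, seed=0):
    """Two-sided Monte Carlo p-value for the difference-in-means statistic
    under re-randomization of the treatment assignment."""
    rng = np.random.default_rng(seed)
    n, m = len(y1), int(treated.sum())
    tau_hyp = (y1 - y0).mean()                       # effect implied by the hypothesized table
    def stat(assign):
        return y1[assign == 1].mean() - y0[assign == 0].mean()
    observed = stat(treated)
    draws = np.empty(n_perm)
    for b in range(n_perm):
        assign = np.zeros(n, dtype=int)
        assign[rng.choice(n, size=m, replace=False)] = 1   # re-randomize the treatment
        draws[b] = stat(assign)
    return float(np.mean(np.abs(draws - tau_hyp) >= np.abs(observed - tau_hyp)))

# Hypothesized binary table: treatment flips the outcomes of units 0, 1, 3, and 5 to 1.
y0 = np.array([0, 0, 1, 0, 1, 0, 0, 1, 0, 0])
y1 = y0.copy(); y1[[0, 1, 3, 5]] = 1
treated = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])
print(permutation_test(y1, y0, treated))
```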
Isolated training with Gaussian priors (TGP) of the component autoencoders of turbo-autoencoder architectures enables faster, more consistent training and better generalization to arbitrary decoding iterations than training based on deep unfolding. We propose fitting the components via extrinsic information transfer (EXIT) charts to a desired behavior, which enables scaling to larger message lengths ($k \approx 1000$) while retaining competitive performance. To the best of our knowledge, this is the first autoencoder that performs close to classical codes in this regime. Although the binary cross-entropy (BCE) loss function optimizes the bit error rate (BER) of the components, the design via EXIT charts makes it possible to focus on the block error rate (BLER). In serially concatenated systems, the component-wise TGP approach is well known for inner components with a fixed outer binary interface, e.g., a learned inner code or equalizer combined with an outer binary error-correcting code. In this paper we extend the component training to structures with both an inner and an outer autoencoder, for which we propose a new 1-bit quantization strategy for the encoder outputs based on the underlying communication problem. Finally, we discuss the model complexity of the learned components during design time (training) and inference, and show that the number of weights in the encoder can be reduced by 99.96%.
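A minimal sketch of a 1-bit quantizer trained with the standard straight-through estimator (STE), included only to illustrate what quantizing encoder outputs to one bit per symbol involves; the paper's specific quantization strategy is tied to the underlying communication problem and is not reproduced here.

```python
# 1-bit quantization with a straight-through estimator for the backward pass.
import numpy as np

def quantize_1bit(x):
    """Forward pass: map real-valued encoder outputs to +/-1 (one bit per symbol)."""
    return np.where(x >= 0, 1.0, -1.0)

def quantize_1bit_backward(grad_out, x, clip=1.0):
    """STE backward pass: pass gradients through unchanged inside the clipping
    region and zero them outside, as if the quantizer were the identity."""
    return grad_out * (np.abs(x) <= clip)

x = np.array([-1.7, -0.2, 0.4, 2.3])
print(quantize_1bit(x), quantize_1bit_backward(np.ones_like(x), x))
```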
Tensor ring (TR) decomposition has recently received increased attention due to its superior expressive performance for high-order tensors. However, the applicability of traditional TR decomposition algorithms to real-world applications is hindered by prevalent large data sizes, missing entries, and corruption with outliers. In this work, we propose a scalable and robust TR decomposition algorithm capable of handling large-scale tensor data with missing entries and gross corruptions. We first develop a novel auto-weighted steepest descent method that can adaptively fill in the missing entries and identify the outliers during the decomposition process. Further, taking advantage of the tensor ring model, we develop a novel fast Gram matrix computation (FGMC) approach and a randomized subtensor sketching (RStS) strategy, which yield significant reductions in storage and computational complexity. Experimental results demonstrate that the proposed method outperforms existing TR decomposition methods in the presence of outliers and runs significantly faster than existing robust tensor completion algorithms.
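A minimal sketch of the tensor-ring model itself: each entry of an order-$d$ tensor is the trace of a product of lateral slices of the core tensors. This is only the representation the algorithm operates on; the auto-weighted descent, FGMC, and RStS components are not reproduced here, and the small random cores are illustrative.

```python
# Evaluate entries of a tensor-ring decomposition and materialize a small tensor.
import numpy as np

def tr_entry(cores, index):
    """cores[k] has shape (r_k, n_k, r_{k+1}); the ring closes, so r_d = r_0."""
    prod = np.eye(cores[0].shape[0])
    for core, i in zip(cores, index):
        prod = prod @ core[:, i, :]          # multiply the selected lateral slices
    return np.trace(prod)                    # close the ring with a trace

def tr_to_full(cores):
    """Materialize the full tensor entry by entry (small examples only)."""
    shape = tuple(c.shape[1] for c in cores)
    out = np.empty(shape)
    for idx in np.ndindex(shape):
        out[idx] = tr_entry(cores, idx)
    return out

rng = np.random.default_rng(0)
dims, ranks = [4, 5, 3], [2, 3, 2, 2]        # ring ranks r_0..r_3 with r_3 = r_0
cores = [rng.standard_normal((ranks[k], dims[k], ranks[k + 1])) for k in range(3)]
T = tr_to_full(cores)
print(T.shape, float(abs(tr_entry(cores, (1, 2, 0)) - T[1, 2, 0])))
```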