Several novel mixed-integer linear and bilinear formulations are proposed for the optimum communication spanning tree problem. They implement the distance-based approach: graph distances are directly modeled by continuous, integral, or binary variables, and interconnection between distance variables is established using the recursive Bellman-type conditions or using matrix equations from algebraic graph theory. These non-linear relations are used either directly giving rise to the bilinear formulations, or, through the big-M reformulation, resulting in the linear programs. A branch-and-bound framework of Gurobi 9.0 optimization software is employed to compare performance of the novel formulations on the example of an optimum requirement spanning tree problem with additional vertex degree constraints. Several real-world requirements matrices from transportation industry are used to generate a number of examples of different size, and computational experiments show the superiority of the two novel linear distance-based formulations over the the traditional multicommodity flow model.
Model-based planning holds great promise for improving both sample efficiency and generalization in reinforcement learning (RL). We show that energy-based models (EBMs) are a promising class of models to use for model-based planning. EBMs naturally support inference of intermediate states given start and goal state distributions. We provide an online algorithm to train EBMs while interacting with the environment, and show that EBMs allow for significantly better online learning than corresponding feed-forward networks. We further show that EBMs support maximum entropy state inference and are able to generate diverse state space plans. We show that inference purely in state space - without planning actions - allows for better generalization to previously unseen obstacles in the environment and prevents the planner from exploiting the dynamics model by applying uncharacteristic action sequences. Finally, we show that online EBM training naturally leads to intentionally planned state exploration which performs significantly better than random exploration.
We consider sparse superposition codes (SPARCs) over complex AWGN channels. Such codes can be efficiently decoded by an approximate message passing (AMP) decoder, whose performance can be predicted via so-called state evolution in the large-system limit. In this paper, we mainly focus on how to use concatenation of SPARCs and cyclic redundancy check (CRC) codes on the encoding side and use list decoding on the decoding side to improve the finite-length performance of the AMP decoder for SPARCs over complex AWGN channels. Simulation results show that such a concatenated coding scheme works much better than SPARCs with the original AMP decoder and results in a steep waterfall-like behavior in the bit-error rate performance curves. Furthermore, we apply our proposed concatenated coding scheme to spatially coupled SPARCs. Besides that, we also introduce a novel class of design matrices, i.e., matrices that describe the encoding process, based on circulant matrices derived from Frank or from Milewski sequences. This class of design matrices has comparable encoding and decoding computational complexity as well as very close performance with the commonly-used class of design matrices based on discrete Fourier transform (DFT) matrices, but gives us more degrees of freedom when designing SPARCs for various applications.
We are interested in the numerical solution of coupled nonlinear partial differential equations (PDEs) in two and three dimensions. Under certain assumptions on the domain, we take advantage of the Kronecker structure arising in standard space discretizations of the differential operators and illustrate how the resulting system of ordinary differential equations (ODEs) can be treated directly in matrix or tensor form. Moreover, in the framework of the proper orthogonal decomposition (POD) and the discrete empirical interpolation method (DEIM) we derive a two- and three-sided model order reduction strategy that is applied directly to the ODE system in matrix and tensor form respectively. We discuss how to integrate the reduced order model and, in particular, how to solve the tensor-valued linear system arising at each timestep of a semi-implicit time discretization scheme. We illustrate the efficiency of the proposed method through a comparison to existing techniques on classical benchmark problems such as the two- and three-dimensional Burgers equation.
Modern neural network architectures typically have many millions of parameters and can be pruned significantly without substantial loss in effectiveness which demonstrates they are over-parameterized. The contribution of this work is two-fold. The first is a method for approximating a multivariate Bernoulli random variable by means of a deterministic and differentiable transformation of any real-valued multivariate random variable. The second is a method for model selection by element-wise multiplication of parameters with approximate binary gates that may be computed deterministically or stochastically and take on exact zero values. Sparsity is encouraged by the inclusion of a surrogate regularization to the $L_0$ loss. Since the method is differentiable it enables straightforward and efficient learning of model architectures by an empirical risk minimization procedure with stochastic gradient descent and theoretically enables conditional computation during training. The method also supports any arbitrary group sparsity over parameters or activations and therefore offers a framework for unstructured or flexible structured model pruning. To conclude experiments are performed to demonstrate the effectiveness of the proposed approach.
In this paper, we investigate how to minimize the age of information when a source can transmit status updates over two heterogeneous channels. The work is motivated by recent developments of 5G mmWave technology, where transmissions may occur over an unreliable but fast (e.g., mmWave) channel or a slow reliable (e.g., sub-6GHz) channel. The unreliable channel is modeled using the Gilbert-Elliot channel model, where information can be transmitted at a high rate when the channel is in the 'ON' state. The reliable channel is assumed to provide a deterministic but lower data rate. The scheduling strategy is to select which channel to transmit on over time in order to minimize the time-average age of information. The problem can be formulated as a Markov Decision Process (MDP). The MDP structures based on two largely different channels with time correlation is complicated, which makes our problem challenging. However, we still efficiently derive an exact solution. We first show that there exists an optimal threshold-type scheduling policy to minimize age. We then develop a low-complexity algorithm to derive the exact value of the optimal thresholds. Numerical simulations are provided to compare different policies.
Rank-metric code-based cryptography relies on the hardness of decoding a random linear code in the rank metric. The Rank Support Learning problem (RSL) is a variant where an attacker has access to N decoding instances whose errors have the same support and wants to solve one of them. This problem is for instance used in the Durandal signature scheme. In this paper, we propose an algebraic attack on RSL which clearly outperforms the previous attacks to solve this problem. We build upon Bardet et al., Asiacrypt 2020, where similar techniques are used to solve MinRank and RD. However, our analysis is simpler and overall our attack relies on very elementary assumptions compared to standard Gr{\"o}bner bases attacks. In particular, our results show that key recovery attacks on Durandal are more efficient than was previously thought.
Importance sampling is one of the most widely used variance reduction strategies in Monte Carlo rendering. In this paper, we propose a novel importance sampling technique that uses a neural network to learn how to sample from a desired density represented by a set of samples. Our approach considers an existing Monte Carlo rendering algorithm as a black box. During a scene-dependent training phase, we learn to generate samples with a desired density in the primary sample space of the rendering algorithm using maximum likelihood estimation. We leverage a recent neural network architecture that was designed to represent real-valued non-volume preserving ('Real NVP') transformations in high dimensional spaces. We use Real NVP to non-linearly warp primary sample space and obtain desired densities. In addition, Real NVP efficiently computes the determinant of the Jacobian of the warp, which is required to implement the change of integration variables implied by the warp. A main advantage of our approach is that it is agnostic of underlying light transport effects, and can be combined with many existing rendering techniques by treating them as a black box. We show that our approach leads to effective variance reduction in several practical scenarios.
In this paper, we propose a listwise approach for constructing user-specific rankings in recommendation systems in a collaborative fashion. We contrast the listwise approach to previous pointwise and pairwise approaches, which are based on treating either each rating or each pairwise comparison as an independent instance respectively. By extending the work of (Cao et al. 2007), we cast listwise collaborative ranking as maximum likelihood under a permutation model which applies probability mass to permutations based on a low rank latent score matrix. We present a novel algorithm called SQL-Rank, which can accommodate ties and missing data and can run in linear time. We develop a theoretical framework for analyzing listwise ranking methods based on a novel representation theory for the permutation model. Applying this framework to collaborative ranking, we derive asymptotic statistical rates as the number of users and items grow together. We conclude by demonstrating that our SQL-Rank method often outperforms current state-of-the-art algorithms for implicit feedback such as Weighted-MF and BPR and achieve favorable results when compared to explicit feedback algorithms such as matrix factorization and collaborative ranking.
Superpixel segmentation has become an important research problem in image processing. In this paper, we propose an Iterative Spanning Forest (ISF) framework, based on sequences of Image Foresting Transforms, where one can choose i) a seed sampling strategy, ii) a connectivity function, iii) an adjacency relation, and iv) a seed pixel recomputation procedure to generate improved sets of connected superpixels (supervoxels in 3D) per iteration. The superpixels in ISF structurally correspond to spanning trees rooted at those seeds. We present five ISF methods to illustrate different choices of its components. These methods are compared with approaches from the state-of-the-art in effectiveness and efficiency. The experiments involve 2D and 3D datasets with distinct characteristics, and a high level application, named sky image segmentation. The theoretical properties of ISF are demonstrated in the supplementary material and the results show that some of its methods are competitive with or superior to the best baselines in effectiveness and efficiency.
This paper considers the integrated problem of quay crane assignment, quay crane scheduling, yard location assignment, and vehicle dispatching operations at a container terminal. The main objective is to minimize vessel turnover times and maximize the terminal throughput, which are key economic drivers in terminal operations. Due to their computational complexities, these problems are not optimized jointly in existing work. This paper revisits this limitation and proposes Mixed Integer Programming (MIP) and Constraint Programming (CP) models for the integrated problem, under some realistic assumptions. Experimental results show that the MIP formulation can only solve small instances, while the CP model finds optimal solutions in reasonable times for realistic instances derived from actual container terminal operations.