Shannon-Hartley theorem can accurately calculate the channel capacity when the signal observation time is infinite. However, the calculation of finite-time capacity, which remains unknown, is essential for guiding the design of practical communication systems. In this paper, we investigate the capacity between two correlated Gaussian processes within a finite-time observation window. We first derive the finite-time capacity by providing a limit expression. Then we numerically compute the maximum transmission rate within a single finite-time window. We reveal that the number of bits transmitted per second within the finite-time window can exceed the classical Shannon capacity, which is called as the Exceed-Shannon phenomenon. Furthermore, we derive a finite-time capacity formula under a typical signal autocorrelation case by utilizing the Mercer expansion of trace class operators, and reveal the connection between the finite-time capacity problem and the operator theory. Finally, we analytically prove the existence of the Exceed-Shannon phenomenon in this typical case, and demonstrate the achievability of the finite-time capacity and its compatibility with the classical Shannon capacity.
Let $W$ be a binary-input memoryless symmetric (BMS) channel with Shannon capacity $I(W)$ and fix any $\alpha > 0$. We construct, for any sufficiently small $\delta > 0$, binary linear codes of block length $O(1/\delta^{2+\alpha})$ and rate $I(W)-\delta$ that enable reliable communication on $W$ with quasi-linear time encoding and decoding. Shannon's noisy coding theorem established the \emph{existence} of such codes (without efficient constructions or decoding) with block length $O(1/\delta^2)$. This quadratic dependence on the gap $\delta$ to capacity is known to be best possible. Our result thus yields a constructive version of Shannon's theorem with near-optimal convergence to capacity as a function of the block length. This resolves a central theoretical challenge associated with the attainment of Shannon capacity. Previously such a result was only known for the erasure channel. Our codes are a variant of Ar{\i}kan's polar codes based on multiple carefully constructed local kernels, one for each intermediate channel that arises in the decoding. A crucial ingredient in the analysis is a strong converse of the noisy coding theorem when communicating using random linear codes on arbitrary BMS channels. Our converse theorem shows extreme unpredictability of even a single message bit for random coding at rates slightly above capacity.
We investigate the problem of message transmission over time-varying single-user multiple-input multiple-output (MIMO) Rayleigh fading channels with average power constraint and with complete channel state information available at the receiver side (CSIR). To describe the channel variations over the time, we consider a first-order Gauss-Markov model. We completely solve the problem by giving a single-letter characterization of the channel capacity in closed form and by providing a rigorous proof of it.
In this paper, we investigate the secure rate-splitting for the two-user multiple-input multiple-output (MIMO) broadcast channel with imperfect channel state information at the transmitter (CSIT) and a multiple-antenna jammer, where each receiver has equal number of antennas and the jammer has perfect channel state information (CSI). Specifically, we design the secure rate-splitting multiple-access in this scenario, where the security of splitted private and common messages is ensured by precoder design with joint nulling and aligning the leakage information, regarding to different antenna configurations. As a result, we show that the sum-secure degrees-of-freedom (SDoF) achieved by secure rate-splitting outperforms that by conventional zero-forcing. Therefore, we validate the superiority of rate-splitting for the secure purpose in the two-user MIMO broadcast channel with imperfect CSIT and a jammer.
We consider a parallel server system with so-called cancel-on-completion redundancy. There are $n$ servers and multiple job classes $j$. An arriving class $j$ job consists of $d_j$ components, placed on a randomly selected subset of servers; the job service is complete as soon as $k_j$ components out of $d_j$ (with $k_j \le d_j$) complete their service, at which point the unfinished service of all remaining $d_j-k_j$ components is canceled. The system is in general non-work-conserving, in the sense that the average amount of new workload added to the system by an arriving class $j$ job is not defined a priori -- it depends on the system state at the time of arrival. This poses the main challenge for the system analysis. For the system with a fixed number of servers $n$ our main results include: the stability properties; the property that the stationary distributions of the relative server workloads remain tight, uniformly in the system load. We also consider the mean-field asymptotic regime when $n\to\infty$ while each job class arrival rate per server remains constant. The main question we address here is: under which conditions the steady-state asymptotic independence (SSAI) of server workloads holds, and in particular when the SSAI for the full range of loads (SSAI-FRL) holds. (Informally, SSAI-FRL means that SSAI holds for any system load less than $1$.) We obtain sufficient conditions for SSAI and SSAI-FRL. In particular, we prove that SSAI-FRL holds in the important special case when job components of each class $j$ are i.i.d. with an increasing-hazard-rate distribution.
We introduce an approach to designing FPGA-accelerated middleboxes that simplifies development, debugging, and performance tuning by decoupling the tasks of hardware accelerator implementation and software application programming. Shire is a framework that links hardware accelerators to a high-performance packet processing pipeline through a standardized hardware/software interface. This separation of concerns allows hardware developers to focus on optimizing custom accelerators while freeing software programmers to reuse, configure, and debug accelerators in a fashion akin to software libraries. We show the benefits of Shire framework by building a firewall based on a large blacklist and porting the Pigasus IDS pattern-matching accelerator in less than a month. Our experiments demonstrate Shire delivers high performance, serving ~200 Gbps of traffic while adding only 0.7-7 microseconds of latency.
Reward is the driving force for reinforcement-learning agents. This paper is dedicated to understanding the expressivity of reward as a way to capture tasks that we would want an agent to perform. We frame this study around three new abstract notions of "task" that might be desirable: (1) a set of acceptable behaviors, (2) a partial ordering over behaviors, or (3) a partial ordering over trajectories. Our main results prove that while reward can express many of these tasks, there exist instances of each task type that no Markov reward function can capture. We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists. We conclude with an empirical study that corroborates and illustrates our theoretical findings.
Modern neural network architectures can leverage large amounts of data to generalize well within the training distribution. However, they are less capable of systematic generalization to data drawn from unseen but related distributions, a feat that is hypothesized to require compositional reasoning and reuse of knowledge. In this work, we present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules, which we call \emph{functions}. Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. The proposed architecture can flexibly compose computation along width and depth, and lends itself well to capacity extension after training. To demonstrate the versatility of Neural Interpreters, we evaluate it in two distinct settings: image classification and visual abstract reasoning on Raven Progressive Matrices. In the former, we show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferrable to a new task in a sample efficient manner. In the latter, we find that Neural Interpreters are competitive with respect to the state-of-the-art in terms of systematic generalization
Many important real-world problems have action spaces that are high-dimensional, continuous or both, making full enumeration of all possible actions infeasible. Instead, only small subsets of actions can be sampled for the purpose of policy evaluation and improvement. In this paper, we propose a general framework to reason in a principled way about policy evaluation and improvement over such sampled action subsets. This sample-based policy iteration framework can in principle be applied to any reinforcement learning algorithm based upon policy iteration. Concretely, we propose Sampled MuZero, an extension of the MuZero algorithm that is able to learn in domains with arbitrarily complex action spaces by planning over sampled actions. We demonstrate this approach on the classical board game of Go and on two continuous control benchmark domains: DeepMind Control Suite and Real-World RL Suite.
Conventional neural autoregressive decoding commonly assumes a fixed left-to-right generation order, which may be sub-optimal. In this work, we propose a novel decoding algorithm -- InDIGO -- which supports flexible sequence generation in arbitrary orders through insertion operations. We extend Transformer, a state-of-the-art sequence generation model, to efficiently implement the proposed approach, enabling it to be trained with either a pre-defined generation order or adaptive orders obtained from beam-search. Experiments on four real-world tasks, including word order recovery, machine translation, image caption and code generation, demonstrate that our algorithm can generate sequences following arbitrary orders, while achieving competitive or even better performance compared to the conventional left-to-right generation. The generated sequences show that InDIGO adopts adaptive generation orders based on input information.
When a recurrent neural network language model is used for caption generation, the image information can be fed to the neural network either by directly incorporating it in the RNN -- conditioning the language model by `injecting' image features -- or in a layer following the RNN -- conditioning the language model by `merging' image features. While both options are attested in the literature, there is as yet no systematic comparison between the two. In this paper we empirically show that it is not especially detrimental to performance whether one architecture is used or another. The merge architecture does have practical advantages, as conditioning by merging allows the RNN's hidden state vector to shrink in size by up to four times. Our results suggest that the visual and linguistic modalities for caption generation need not be jointly encoded by the RNN as that yields large, memory-intensive models with few tangible advantages in performance; rather, the multimodal integration should be delayed to a subsequent stage.