It is known that fiber nonlinearities induce crosstalk in a wavelength division multiplexed (WDM) system, which limits the capacity of such systems as the transmitted signal power is increased. A network user in a WDM system is an entity that operates around a given optical wavelength. Traditionally, the channel capacity of a WDM system has been analyzed under different assumptions for the transmitted signals of the other users, while treating the interference arising from these users as noise. In this paper, we instead take a multiuser information theoretic view and treat the optical WDM system impaired by cross-phase modulation and dispersion as an interference channel. We characterize an outer bound on the capacity region of simultaneously achievable rate pairs, assuming a simplified K-user perturbative channel model using genie-aided techniques. Furthermore, an achievable rate region is obtained by time-sharing between certain single-user strategies. It is shown that such time-sharing can achieve better rate tuples compared to treating nonlinear interference as noise. For the single-polarization single-span system under consideration and a power 4.4 dB above the optimum launch power, treating nonlinear interference as noise results in a rate of 1.67 bit/sym, while time-sharing gives a rate of 6.33 bit/sym.
Communication efficiency plays an important role in accelerating the distributed training of Deep Neural Networks (DNNs). All-reduce is the key communication primitive to reduce model parameters in distributed DNN training. Most existing all-reduce algorithms are designed for traditional electrical interconnect systems, which cannot meet the communication requirements for distributed training of large DNNs. One promising alternative to electrical interconnect is optical interconnect, which can provide high bandwidth, low transmission delay, and low power cost. We propose an efficient scheme called WRHT (Wavelength Reused Hierarchical Tree) for implementing the all-reduce operation in an optical interconnect system, which can take advantage of WDM (Wavelength Division Multiplexing) to reduce the communication time of distributed data-parallel DNN training. We further derive the minimum number of communication steps and communication time to realize the all-reduce using WRHT. Simulation results show that the communication time of WRHT is reduced by 75.59%, 49.25%, and 70.1%, respectively, compared with three traditional all-reduce algorithms simulated in an optical interconnect system. Simulation results also show that WRHT can reduce the communication time for the all-reduce operation by 86.69% and 84.71% in comparison with two existing all-reduce algorithms in an electrical interconnect system.
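To make the all-reduce primitive concrete, the following is a minimal sketch of a generic binary-tree all-reduce (reduce to one worker, then broadcast back), with a count of communication steps. It is not the WRHT scheme: wavelength reuse and the optical topology are not modeled, and the function name `tree_all_reduce` is hypothetical.

```python
def tree_all_reduce(values):
    """Sum all-reduce over a binary tree (illustrative, not WRHT).

    values: one number per worker.
    Returns (results, steps): the value held by each worker afterwards,
    and the number of communication steps used.
    """
    n = len(values)
    buf = list(values)  # copy so the caller's list is not modified
    steps = 0

    # Reduce phase: in each step, worker i adds the partial sum
    # held by worker i + stride.
    stride = 1
    while stride < n:
        for i in range(0, n, 2 * stride):
            if i + stride < n:
                buf[i] += buf[i + stride]
        stride *= 2
        steps += 1

    # Broadcast phase: mirror the reduce phase, pushing the total
    # from worker 0 back down the tree.
    stride //= 2
    while stride >= 1:
        for i in range(0, n, 2 * stride):
            if i + stride < n:
                buf[i + stride] = buf[i]
        stride //= 2
        steps += 1

    return buf, steps


if __name__ == "__main__":
    results, steps = tree_all_reduce([1, 2, 3, 4, 5, 6, 7, 8])
    print(results, steps)
```

For n workers this baseline uses 2 * ceil(log2 n) steps, which is the kind of step count a hierarchical scheme would aim to lower.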
The hypergraph Moore bound is an elegant statement that characterizes the extremal trade-off between the girth - the number of hyperedges in the smallest cycle or even cover (a subhypergraph with all degrees even) and size - the number of hyperedges in a hypergraph. For graphs (i.e., $2$-uniform hypergraphs), a bound tight up to the leading constant was proven in a classical work of Alon, Hoory and Linial [AHL02]. For hypergraphs of uniformity $k>2$, an appropriate generalization was conjectured by Feige [Fei08]. The conjecture was settled up to an additional $\log^{4k+1} n$ factor in the size in a recent work of Guruswami, Kothari and Manohar [GKM21]. Their argument relies on a connection between the existence of short even covers and the spectrum of a certain randomly signed Kikuchi matrix. Their analysis, especially for the case of odd $k$, is significantly complicated. In this work, we present a substantially simpler and shorter proof of the hypergraph Moore bound. Our key idea is the use of a new reweighted Kikuchi matrix and an edge deletion step that allows us to drop several involved steps in [GKM21]'s analysis such as combinatorial bucketing of rows of the Kikuchi matrix and the use of the Schudy-Sviridenko polynomial concentration. Our simpler proof also obtains tighter parameters: in particular, the argument gives a new proof of the classical Moore bound of [AHL02] with no loss (the proof in [GKM21] loses a $\log^3 n$ factor), and loses only a single logarithmic factor for all $k>2$-uniform hypergraphs. As in [GKM21], our ideas naturally extend to yield a simpler proof of the full trade-off for strongly refuting smoothed instances of constraint satisfaction problems with similarly improved parameters.
We propose a novel automatic parameter selection strategy for variational imaging problems under Poisson noise corruption. The selection of a suitable regularization parameter, whose value is crucial in order to achieve high quality reconstructions, is known to be a particularly hard task in low photon-count regimes. In this work, we extend the so-called residual whiteness principle originally designed for additive white noise to Poisson data. The proposed strategy relies on the study of the whiteness property of a standardized Poisson noise process. After deriving the theoretical properties that motivate our proposal, we solve the target minimization problem with a linearized version of the alternating direction method of multipliers, which is particularly suitable in the presence of a general linear forward operator. Our strategy is extensively tested on image restoration and computed tomography reconstruction problems, and compared with the well-known discrepancy principle for Poisson noise proposed by Zanella et al. and with a nearly exact version of it previously proposed by the authors.
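The idea of testing a standardized Poisson residual for whiteness can be sketched as follows. This is an illustrative 1D version under stated assumptions: the residual is standardized as (y - lambda) / sqrt(lambda), and whiteness is scored by the sum of squared normalized circular autocorrelations at nonzero lags. The exact functional used in the paper is not given in the abstract, and the function names are hypothetical.

```python
import math


def standardized_residual(y, lam):
    """Standardize Poisson data y against the predicted means lam.

    For Poisson data the variance equals the mean, so dividing by
    sqrt(lam) gives a residual with (approximately) unit variance.
    """
    return [(yi - li) / math.sqrt(li) for yi, li in zip(y, lam)]


def whiteness_measure(z):
    """Sum of squared normalized circular autocorrelations over
    all nonzero lags. Values near 0 indicate a white residual.

    Assumes z is not identically zero.
    """
    n = len(z)
    energy = sum(v * v for v in z)
    total = 0.0
    for lag in range(1, n):
        r = sum(z[i] * z[(i + lag) % n] for i in range(n))
        total += (r / energy) ** 2
    return total


if __name__ == "__main__":
    z = standardized_residual([6.0, 3.0, 5.0, 2.0], [4.0, 4.0, 4.0, 4.0])
    print(z, whiteness_measure(z))
```

In a parameter selection loop, one would compute this score for the residual produced by each candidate regularization parameter and prefer the parameter giving the smallest score.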
Uncertainty quantification techniques such as the time-dependent generalized polynomial chaos (TD-gPC) use an adaptive orthogonal basis to better represent the stochastic part of the solution space (aka random function space) in time. However, because the random function space is constructed using tensor products, TD-gPC-based methods are known to suffer from the curse of dimensionality. In this paper, we introduce a new numerical method called the 'flow-driven spectral chaos' (FSC) which overcomes this curse of dimensionality at the random-function-space level. The proposed method is not only computationally more efficient than existing TD-gPC-based methods but is also far more accurate. The FSC method uses the concept of 'enriched stochastic flow maps' to track the evolution of a finite-dimensional random function space efficiently in time. To transfer the probability information from one random function space to another, two approaches are developed and studied herein. In the first approach, the probability information is transferred in the mean-square sense, whereas in the second approach the transfer is done exactly using a new theorem that was developed for this purpose. The FSC method can quantify uncertainties with high fidelity, especially for the long-time response of stochastic dynamical systems governed by ODEs of arbitrary order. Six representative numerical examples, including a nonlinear problem (the Van-der-Pol oscillator), are presented to demonstrate the performance of the FSC method and corroborate the claims of its superior numerical properties. Finally, a parametric, high-dimensional stochastic problem is used to demonstrate that when the FSC method is used in conjunction with Monte Carlo integration, the curse of dimensionality can be overcome altogether.
We derive conditions for the existence of fixed points of cone mappings without assuming scalability of functions. Monotonicity and scalability are often inseparable in the literature in the context of searching for fixed points of interference mappings. In applications, such mappings are approximated by non-negative neural networks. It turns out, however, that the process of training non-negative networks requires imposing an artificial constraint on the weights of the model. Yet, for specific non-negative data, a non-negative mapping need not have only non-negative weights. Therefore, we consider the problem of the existence of fixed points for general neural networks, assuming tangency conditions with respect to specific cones. This does not relax the physical assumptions, because even when the input and output are required to be non-negative, the weights can take (small but) negative values. Such properties (often found in papers on the interpretability of weights of neural networks) lead to the weakening of the assumptions about the monotonicity or scalability of the mapping associated with the neural network. To the best of our knowledge, this paper is the first to study this phenomenon.
The fundamental challenge of drawing causal inference is that counterfactual outcomes are not fully observed for any unit. Furthermore, in observational studies, treatment assignment is likely to be confounded. Many statistical methods have emerged for causal inference under unconfoundedness conditions given pre-treatment covariates, including propensity score-based methods, prognostic score-based methods, and doubly robust methods. Unfortunately for applied researchers, there is no `one-size-fits-all' causal method that can perform optimally universally. In practice, causal methods are primarily evaluated quantitatively on handcrafted simulated data. Such data-generative procedures can be of limited value because they are typically stylized models of reality. They are simplified for tractability and lack the complexities of real-world data. For applied researchers, it is critical to understand how well a method performs for the data at hand. Our work introduces a deep generative model-based framework, Credence, to validate causal inference methods. The framework's novelty stems from its ability to generate synthetic data anchored at the empirical distribution for the observed sample, and therefore virtually indistinguishable from the latter. The approach allows the user to specify ground truth for the form and magnitude of causal effects and confounding bias as functions of covariates. Thus simulated data sets are used to evaluate the potential performance of various causal estimation methods when applied to data similar to the observed sample. We demonstrate Credence's ability to accurately assess the relative performance of causal estimation techniques in an extensive simulation study and two real-world data applications from Lalonde and Project STAR studies.
When users exchange data with unmanned aerial vehicles (UAVs) over air-to-ground (A2G) wireless communication networks, they expose the link to attacks that could increase packet loss and might disrupt connectivity. For example, in emergency deliveries, losing control information (i.e., data related to the UAV control communication) might result in accidents that cause UAV destruction and damage to buildings or other elements in a city. To prevent these problems, such attacks must be addressed in 5G and 6G scenarios. This research offers a deep learning (DL) approach for detecting attacks in UAVs equipped with orthogonal frequency division multiplexing (OFDM) receivers on Clustered Delay Line (CDL) channels in highly complex scenarios involving authenticated terrestrial users, as well as attackers in unknown locations. We use the two observable parameters available in 5G UAV connections: the Received Signal Strength Indicator (RSSI) and the Signal to Interference plus Noise Ratio (SINR). The proposed algorithm generalizes to identifying attacks that do not occur during training. Further, it can identify all the attackers in an environment with 20 terrestrial users. A deeper investigation into the timing requirements for recognizing attacks shows that, after training, the minimum time needed to detect an attack after it begins is 100 ms, and the minimum attack power is 2 dBm, which is the same power that the authenticated UAV uses. Our algorithm also detects moving attackers from a distance of 500 m.
Knowledge graphs capture interlinked information between entities and they represent an attractive source of structured information that can be harnessed for recommender systems. However, existing recommender engines use knowledge graphs by manually designing features, do not allow for end-to-end training, or provide poor scalability. Here we propose Knowledge Graph Convolutional Networks (KGCN), an end-to-end trainable framework that harnesses item relationships captured by the knowledge graph to provide better recommendations. Conceptually, KGCN computes user-specific item embeddings by first applying a trainable function that identifies important knowledge graph relations for a given user and then transforming the knowledge graph into a user-specific weighted graph. Then, KGCN applies a graph convolutional neural network that computes an embedding of an item node by propagating and aggregating knowledge graph neighborhood information. Moreover, to provide better inductive bias, KGCN uses label smoothness (LS), which provides regularization over edge weights, and we prove that it is equivalent to a label propagation scheme on a graph. Finally, we unify KGCN and LS regularization, and present a scalable minibatch implementation for the KGCN-LS model. Experiments show that KGCN-LS outperforms strong baselines on four datasets. KGCN-LS also achieves great performance in sparse scenarios and is highly scalable with respect to the knowledge graph size.
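The two conceptual steps described above (scoring relations per user, then aggregating neighborhood information over the resulting weighted graph) can be sketched for a single item node. This is a minimal illustration under assumptions not stated in the abstract: the relation-scoring function is taken to be a dot product between user and relation embeddings, the scores are normalized with a softmax, and the aggregator is a plain sum with no trained weights or nonlinearity. All names are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_item(user, item, neighbors):
    """One user-specific aggregation step for a single item node.

    user, item: embedding vectors (lists of floats).
    neighbors: list of (relation_embedding, entity_embedding) pairs
               for the item's knowledge graph neighbors.
    """
    # Step 1: user-specific edge weights from user-relation scores
    # (dot-product scoring is an assumption of this sketch).
    weights = softmax([dot(user, rel) for rel, _ in neighbors])

    # Step 2: weighted combination of neighbor entity embeddings.
    dim = len(item)
    neigh = [
        sum(w * ent[d] for w, (_, ent) in zip(weights, neighbors))
        for d in range(dim)
    ]

    # Step 3: sum aggregator combining the item with its neighborhood.
    return [item[d] + neigh[d] for d in range(dim)]


if __name__ == "__main__":
    user = [1.0, 0.0]
    item = [0.0, 0.0]
    neighbors = [([1.0, 0.0], [2.0, 0.0]), ([1.0, 5.0], [0.0, 2.0])]
    print(aggregate_item(user, item, neighbors))
```

Because the weights depend on the user embedding, two users can obtain different embeddings for the same item from the same knowledge graph neighborhood.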
Graph Neural Networks (GNNs) for representation learning of graphs broadly follow a neighborhood aggregation framework, where the representation vector of a node is computed by recursively aggregating and transforming feature vectors of its neighboring nodes. Many GNN variants have been proposed and have achieved state-of-the-art results on both node and graph classification tasks. However, despite GNNs revolutionizing graph representation learning, there is limited understanding of their representational properties and limitations. Here, we present a theoretical framework for analyzing the expressive power of GNNs in capturing different graph structures. Our results characterize the discriminative power of popular GNN variants, such as Graph Convolutional Networks and GraphSAGE, and show that they cannot learn to distinguish certain simple graph structures. We then develop a simple architecture that is provably the most expressive among the class of GNNs and is as powerful as the Weisfeiler-Lehman graph isomorphism test. We empirically validate our theoretical findings on a number of graph classification benchmarks, and demonstrate that our model achieves state-of-the-art performance.
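A small example illustrates why the choice of neighborhood aggregator matters for discriminative power. The sketch below compares sum, mean, and max aggregation on two neighbor multisets that contain the same feature value but differ in size; the toy feature values and the helper name `aggregate` are assumptions made for illustration and are not taken from the paper.

```python
def aggregate(neighbor_features, how):
    """Aggregate a multiset of scalar neighbor features."""
    if how == "sum":
        return sum(neighbor_features)
    if how == "mean":
        return sum(neighbor_features) / len(neighbor_features)
    if how == "max":
        return max(neighbor_features)
    raise ValueError(f"unknown aggregator: {how}")


if __name__ == "__main__":
    # Two nodes whose neighbors all carry the same feature,
    # but one node has two neighbors and the other has three.
    node_a = [1.0, 1.0]
    node_b = [1.0, 1.0, 1.0]
    for how in ("sum", "mean", "max"):
        same = aggregate(node_a, how) == aggregate(node_b, how)
        print(how, "cannot distinguish" if same else "distinguishes")
```

Mean and max map both multisets to the same value, so a layer built on them cannot separate these two neighborhoods, whereas sum retains the multiset size.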
To address the sparsity and cold start problem of collaborative filtering, researchers usually make use of side information, such as social networks or item attributes, to improve recommendation performance. This paper considers the knowledge graph as the source of side information. To address the limitations of existing embedding-based and path-based methods for knowledge-graph-aware recommendation, we propose Ripple Network, an end-to-end framework that naturally incorporates the knowledge graph into recommender systems. Similar to actual ripples propagating on the surface of water, Ripple Network stimulates the propagation of user preferences over the set of knowledge entities by automatically and iteratively extending a user's potential interests along links in the knowledge graph. The multiple "ripples" activated by a user's historically clicked items are thus superposed to form the preference distribution of the user with respect to a candidate item, which could be used for predicting the final clicking probability. Through extensive experiments on real-world datasets, we demonstrate that Ripple Network achieves substantial gains in a variety of scenarios, including movie, book and news recommendation, over several state-of-the-art baselines.
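The iterative extension of a user's interests along knowledge graph links can be sketched as a hop-by-hop expansion from the user's clicked items. This is a minimal illustration only: it collects the triples reached at each hop and omits the learned relevance weighting and the superposition into a preference distribution. The function name `ripple_sets` and the toy triples are hypothetical.

```python
def ripple_sets(triples, seeds, hops):
    """Collect, for each hop, the knowledge graph triples whose head
    entity was reached in the previous hop.

    triples: iterable of (head, relation, tail).
    seeds: entities for the user's historically clicked items.
    hops: number of expansion steps.
    """
    frontier = set(seeds)
    result = []
    for _ in range(hops):
        hop = [(h, r, t) for (h, r, t) in triples if h in frontier]
        result.append(hop)
        # The tails reached in this hop seed the next hop.
        frontier = {t for (_, _, t) in hop}
    return result


if __name__ == "__main__":
    # Toy knowledge graph, for illustration only.
    kg = [
        ("Forrest Gump", "actor", "Tom Hanks"),
        ("Tom Hanks", "acted_in", "Cast Away"),
        ("Cast Away", "genre", "Drama"),
    ]
    for k, hop in enumerate(ripple_sets(kg, {"Forrest Gump"}, 2), start=1):
        print(k, hop)
```

Each successive hop reaches entities further from the clicked items, which mirrors the ripple analogy in the text.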