Semantic communication, which focuses on conveying the meaning of information rather than exact bit reconstruction, has gained considerable attention in recent years. Meanwhile, reconfigurable intelligent surface (RIS) is a promising technology that can achieve high spectral and energy efficiency by dynamically reflecting incident signals through programmable passive components. In this paper, we put forth a semantic communication scheme aided by RIS. Using text transmission as an example, experimental results demonstrate that the RIS-assisted semantic communication system outperforms the point-to-point semantic communication system in terms of BLEU scores in Rayleigh fading channels, especially at low signal-to-noise ratio (SNR) regimes. In addition, the RIS-assisted semantic communication system exhibits superior robustness against channel estimation errors compared to its point-to-point counterpart. RIS can improve performance as it provides extra line-of-sight (LoS) paths and enhances signal propagation conditions compared to point-to-point systems.
Rare event simulation and rare event probability estimation are important tasks within the analysis of systems subject to uncertainty and randomness. Simultaneously, accurately estimating rare event probabilities is an inherently difficult task that calls for dedicated tools and methods. One way to improve estimation efficiency on difficult rare event estimation problems is to leverage gradients of the computational model representing the system in consideration, e.g., to explore the rare event faster and more reliably. We present a novel approach for estimating rare event probabilities using such model gradients by drawing on a technique to generate samples from non-normalized posterior distributions in Bayesian inference - the Stein variational gradient descent. We propagate samples generated from a tractable input distribution towards a near-optimal rare event importance sampling distribution by exploiting a similarity of the latter with Bayesian posterior distributions. Sample propagation takes the shape of passing samples through a sequence of invertible transforms such that their densities can be tracked and used to construct an unbiased importance sampling estimate of the rare event probability - the Stein variational rare event estimator. We discuss settings and parametric choices of the algorithm and suggest a method for balancing convergence speed with stability by choosing the step width or base learning rate adaptively. We analyze the method's performance on several analytical test functions and two engineering examples in low to high stochastic dimensions ($d = 2 - 869$) and find that it consistently outperforms other state-of-the-art gradient-based rare event simulation methods.
Internet service providers (ISPs) have a variety of quality attributes that determine their attractiveness for data transmission, ranging from quality-of-service metrics such as jitter to security properties such as the presence of DDoS defense systems. ISPs should optimize these attributes in line with their profit objective, i.e., maximize revenue from attracted traffic while minimizing attribute-related cost, all in the context of alternative offers by competing ISPs. However, this attribute optimization is difficult not least because many aspects of ISP competition are barely understood on a systematic level, e.g., the multi-dimensional and cost-driving nature of path quality, and the distributed decision making of ISPs on the same path. In this paper, we improve this understanding by analyzing how ISP competition affects path quality and ISP profits. To that end, we develop a game-theoretic model in which ISPs (i) affect path quality via multiple attributes that entail costs, (ii) are on paths together with other selfish ISPs, and (iii) are in competition with alternative paths when attracting traffic. The model enables an extensive theoretical analysis, surprisingly showing that competition can have both positive and negative effects on path quality and ISP profits, depending on the network topology and the cost structure of ISPs. However, a large-scale simulation, which draws on real-world data to instantiate the model, shows that the positive effects will likely prevail in practice: If the number of selectable paths towards any destination increases from 1 to 5, the prevalence of quality attributes increases by at least 50%, while 75% of ISPs improve their profit.
Interface problems have long been a major focus of scientific computing, leading to the development of various numerical methods. Traditional mesh-based methods often employ time-consuming body-fitted meshes with standard discretization schemes or unfitted meshes with tailored schemes to achieve controllable accuracy and convergence rate. Along another line, mesh-free methods bypass mesh generation but lack robustness in terms of convergence and accuracy due to the low regularity of solutions. In this study, we propose a novel method for solving interface problems within the framework of the random feature method. This approach utilizes random feature functions in conjunction with a partition of unity as approximation functions. It evaluates partial differential equations, boundary conditions, and interface conditions on collocation points in equal footing, and solves a linear least-squares system to obtain the approximate solution. To address the issue of low regularity, two sets of random feature functions are used to approximate the solution on each side of the interface, which are then coupled together via interface conditions. We validate our method through a series of increasingly complex numerical examples. Our findings show that despite the solution often being only continuous or even discontinuous, our method not only eliminates the need for mesh generation but also maintains high accuracy, akin to the spectral collocation method for smooth solutions. Remarkably, for the same accuracy requirement, our method requires two to three orders of magnitude fewer degrees of freedom than traditional methods, demonstrating its significant potential for solving interface problems with complex geometries.
Quotient regularization models (QRMs) are a class of powerful regularization techniques that have gained considerable attention in recent years, due to their ability to handle complex and highly nonlinear data sets. However, the nonconvex nature of QRM poses a significant challenge in finding its optimal solution. We are interested in scenarios where both the numerator and the denominator of QRM are absolutely one-homogeneous functions, which is widely applicable in the fields of signal processing and image processing. In this paper, we utilize a gradient flow to minimize such QRM in combination with a quadratic data fidelity term. Our scheme involves solving a convex problem iteratively.The convergence analysis is conducted on a modified scheme in a continuous formulation, showing the convergence to a stationary point. Numerical experiments demonstrate the effectiveness of the proposed algorithm in terms of accuracy, outperforming the state-of-the-art QRM solvers.
Neural networks are vulnerable to adversarial attacks: adding well-crafted, imperceptible perturbations to their input can modify their output. Adversarial training is one of the most effective approaches to training robust models against such attacks. Unfortunately, this method is much slower than vanilla training of neural networks since it needs to construct adversarial examples for the entire training data at every iteration. By leveraging the theory of coreset selection, we show how selecting a small subset of training data provides a principled approach to reducing the time complexity of robust training. To this end, we first provide convergence guarantees for adversarial coreset selection. In particular, we show that the convergence bound is directly related to how well our coresets can approximate the gradient computed over the entire training data. Motivated by our theoretical analysis, we propose using this gradient approximation error as our adversarial coreset selection objective to reduce the training set size effectively. Once built, we run adversarial training over this subset of the training data. Unlike existing methods, our approach can be adapted to a wide variety of training objectives, including TRADES, $\ell_p$-PGD, and Perceptual Adversarial Training. We conduct extensive experiments to demonstrate that our approach speeds up adversarial training by 2-3 times while experiencing a slight degradation in the clean and robust accuracy.
The explosive growth of computation and energy cost of artificial intelligence has spurred strong interests in new computing modalities as potential alternatives to conventional electronic processors. Photonic processors that execute operations using photons instead of electrons, have promised to enable optical neural networks with ultra-low latency and power consumption. However, existing optical neural networks, limited by the underlying network designs, have achieved image recognition accuracy much lower than state-of-the-art electronic neural networks. In this work, we close this gap by introducing a large-kernel spatially-varying convolutional neural network learned via low-dimensional reparameterization techniques. We experimentally instantiate the network with a flat meta-optical system that encompasses an array of nanophotonic structures designed to induce angle-dependent responses. Combined with an extremely lightweight electronic backend with approximately 2K parameters we demonstrate a nanophotonic neural network reaches 73.80\% blind test classification accuracy on CIFAR-10 dataset, and, as such, the first time, an optical neural network outperforms the first modern digital neural network -- AlexNet (72.64\%) with 57M parameters, bringing optical neural network into modern deep learning era.
Graph Neural Networks (GNNs) have been successfully used in many problems involving graph-structured data, achieving state-of-the-art performance. GNNs typically employ a message-passing scheme, in which every node aggregates information from its neighbors using a permutation-invariant aggregation function. Standard well-examined choices such as the mean or sum aggregation functions have limited capabilities, as they are not able to capture interactions among neighbors. In this work, we formalize these interactions using an information-theoretic framework that notably includes synergistic information. Driven by this definition, we introduce the Graph Ordering Attention (GOAT) layer, a novel GNN component that captures interactions between nodes in a neighborhood. This is achieved by learning local node orderings via an attention mechanism and processing the ordered representations using a recurrent neural network aggregator. This design allows us to make use of a permutation-sensitive aggregator while maintaining the permutation-equivariance of the proposed GOAT layer. The GOAT model demonstrates its increased performance in modeling graph metrics that capture complex information, such as the betweenness centrality and the effective size of a node. In practical use-cases, its superior modeling capability is confirmed through its success in several real-world node classification benchmarks.
Recently, a considerable literature has grown up around the theme of Graph Convolutional Network (GCN). How to effectively leverage the rich structural information in complex graphs, such as knowledge graphs with heterogeneous types of entities and relations, is a primary open challenge in the field. Most GCN methods are either restricted to graphs with a homogeneous type of edges (e.g., citation links only), or focusing on representation learning for nodes only instead of jointly propagating and updating the embeddings of both nodes and edges for target-driven objectives. This paper addresses these limitations by proposing a novel framework, namely the Knowledge Embedding based Graph Convolutional Network (KE-GCN), which combines the power of GCNs in graph-based belief propagation and the strengths of advanced knowledge embedding (a.k.a. knowledge graph embedding) methods, and goes beyond. Our theoretical analysis shows that KE-GCN offers an elegant unification of several well-known GCN methods as specific cases, with a new perspective of graph convolution. Experimental results on benchmark datasets show the advantageous performance of KE-GCN over strong baseline methods in the tasks of knowledge graph alignment and entity classification.
Graph Neural Networks (GNN) is an emerging field for learning on non-Euclidean data. Recently, there has been increased interest in designing GNN that scales to large graphs. Most existing methods use "graph sampling" or "layer-wise sampling" techniques to reduce training time. However, these methods still suffer from degrading performance and scalability problems when applying to graphs with billions of edges. This paper presents GBP, a scalable GNN that utilizes a localized bidirectional propagation process from both the feature vectors and the training/testing nodes. Theoretical analysis shows that GBP is the first method that achieves sub-linear time complexity for both the precomputation and the training phases. An extensive empirical study demonstrates that GBP achieves state-of-the-art performance with significantly less training/testing time. Most notably, GBP can deliver superior performance on a graph with over 60 million nodes and 1.8 billion edges in less than half an hour on a single machine.
We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.