We investigate the possibility of solving continuous non-convex optimization problems using a network of interacting quantum optical oscillators. We propose a native encoding of continuous variables in analog signals associated with the quadrature operators of a set of quantum optical modes. Optical coupling of the modes and noise introduced by vacuum fluctuations from external reservoirs or by weak measurements of the modes are used to optically simulate a diffusion process on a set of continuous random variables. The process is run sufficiently long for it to relax into the steady state of an energy potential defined on a continuous domain. As a first demonstration, we numerically benchmark solving box-constrained quadratic programming (BoxQP) problems using these settings. We consider delay-line and measurement-feedback variants of the experiment. Our benchmarking results demonstrate that in both cases the optical network is capable of solving BoxQP problems over three orders of magnitude faster than a state-of-the-art classical heuristic.
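To make the optimization target concrete, here is a minimal classical sketch (not the optical implementation) of relaxing a BoxQP energy with a noisy-gradient diffusion in Python; the instance Q and c, the step size, and the noise level are all hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical BoxQP instance: minimize 0.5 * x^T Q x + c^T x subject to 0 <= x <= 1.
    n = 10
    A = rng.standard_normal((n, n))
    Q = A + A.T                      # symmetric and generally indefinite, hence non-convex
    c = rng.standard_normal(n)

    def energy(x):
        return 0.5 * x @ Q @ x + c @ x

    # Noisy gradient descent (a Langevin-type diffusion) with projection onto the box,
    # loosely mimicking relaxation toward low-energy states of the potential.
    x = rng.uniform(0.0, 1.0, n)
    step, noise = 1e-3, 1e-2
    for _ in range(20000):
        grad = Q @ x + c
        x = np.clip(x - step * grad + noise * rng.standard_normal(n), 0.0, 1.0)

    print("final energy:", energy(x))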
In this paper, a simultaneously transmitting and reflecting (STAR) reconfigurable intelligent surface (RIS) is investigated in a multi-user mobile edge computing (MEC) system to improve the computation rate. Compared with traditional RIS-aided MEC, STAR-RIS extends the service coverage from half-space to full-space and provides new flexibility for improving the computation rate of end users. However, the design of a STAR-RIS-aided MEC system is challenging because the binary amplitude coefficients are non-smooth and non-convex and are coupled with the phase shifts. To address this challenge, this paper formulates a computation rate maximization problem via the joint design of the STAR-RIS phase shifts, the reflection and transmission amplitude coefficients, the receive beamforming vectors, and the energy partition strategies for local computing and offloading. To tackle the discontinuity caused by the binary variables, we propose an efficient smoothing-based method that reduces the convergence error, in contrast to the conventional penalty-based method, which introduces many undesired stationary points and local optima. Furthermore, a fast iterative algorithm is proposed to obtain a stationary point of the joint optimization problem, with each subproblem solved by a low-complexity algorithm, making the proposed design scalable to a massive number of users and STAR-RIS elements. Simulation results validate the strength of the proposed smoothing-based method and show that the proposed fast iterative algorithm achieves a higher computation rate than the conventional method while reducing the computation time by at least an order of magnitude. Moreover, the resultant STAR-RIS-aided MEC system significantly improves the computation rate compared to baseline schemes with conventional reflect-only/transmit-only RIS.
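The abstract does not spell out the smoothing, but a common way to handle binary amplitude coefficients is to replace each hard 0/1 variable with a sigmoid of an unconstrained variable whose temperature is annealed toward zero; the toy Python sketch below illustrates only that relaxation, with hypothetical values.

    import numpy as np

    def smooth_binary(theta, tau):
        """Smooth surrogate for a binary amplitude coefficient, taking values in (0, 1).
        As the temperature tau goes to 0, the sigmoid approaches a hard 0/1 decision."""
        return 1.0 / (1.0 + np.exp(-theta / tau))

    theta = np.array([-2.0, 0.3, 1.5])      # hypothetical unconstrained variables
    for tau in [1.0, 0.3, 0.1, 0.01]:       # annealed temperature schedule
        print(tau, np.round(smooth_binary(theta, tau), 3))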
Integer linear programming models a wide range of practical combinatorial optimization problems and has a significant impact in industry and management sectors. This work develops the first standalone local search solver for general integer linear programming validated on a large heterogeneous problem dataset. We propose a local search framework that switches between three modes, namely Search, Improve, and Restore, and design tailored operators adapted to each mode, thus improving the quality of the current solution according to the situation at hand. For the Search and Restore modes, we propose an operator named tight move, which adaptively modifies variables' values to try to make some constraint tight. For the Improve mode, an efficient operator named lift move is proposed to improve the objective value while maintaining feasibility. Putting these together, we develop a local search solver for integer linear programming called Local-ILP. Experiments conducted on the MIPLIB dataset show the effectiveness of our solver in solving large-scale hard integer linear programming problems within a reasonably short time. Local-ILP is competitive with and complementary to the state-of-the-art commercial solver Gurobi and significantly outperforms the state-of-the-art non-commercial solver SCIP. Moreover, our solver establishes new records for 6 MIPLIB open instances.
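As a rough illustration of the tight-move idea (Local-ILP's actual operator may differ in its scoring and tie-breaking), the Python sketch below shifts a single integer variable by the smallest amount that satisfies a violated a·x <= b constraint, making it tight when the coefficients divide evenly.

    import math

    def tight_move(x, a, b, j):
        """Given a constraint a.x <= b, change integer variable x[j] by the smallest
        amount that satisfies the constraint (tight if possible).
        Returns the new value for x[j], or None if a[j] == 0."""
        if a[j] == 0:
            return None
        slack = b - sum(ai * xi for ai, xi in zip(a, x))   # negative when violated
        if slack >= 0:
            return x[j]                                    # already satisfied
        # Need a[j] * delta <= slack; pick the delta of smallest magnitude that works.
        if a[j] > 0:
            delta = math.floor(slack / a[j])
        else:
            delta = math.ceil(slack / a[j])
        return x[j] + delta

    # Hypothetical violated constraint: 2*x0 + 3*x1 <= 7 with x = (2, 3), i.e. lhs = 13.
    x = [2, 3]
    print(tight_move(x, a=[2, 3], b=7, j=1))   # lowers x1 to 1, making the constraint tight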
As phasor measurement units (PMUs) become more widely used in transmission power systems, a fast state estimation (SE) algorithm that can take advantage of their high sample rates is needed. To accomplish this, we present a method that uses graph neural networks (GNNs) to learn estimates of complex bus voltages from PMU voltage and current measurements. We propose an original implementation of GNNs over the power system's factor graph to simplify the integration of various types and quantities of measurements on power system buses and branches. Furthermore, we augment the factor graph to improve the robustness of the GNN predictions. The model is highly efficient and scalable, as its computational complexity is linear in the number of nodes in the power system. Training and test examples were generated by randomly sampling sets of power system measurements and annotating them with the exact solutions of linear SE with PMUs. The numerical results demonstrate that the GNN model provides an accurate approximation of the SE solutions. Furthermore, errors caused by PMU malfunctions or communication failures that would normally render the SE problem unobservable have only a local effect and do not degrade the results in the rest of the power system.
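The factor-graph GNN in the paper is more elaborate, but the message-passing step it builds on can be sketched in a few lines of Python; the adjacency matrix, feature dimension, and weights below are hypothetical.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical graph with 5 nodes (e.g. buses, branches, or measurement factors)
    # and 4-dimensional node features.
    A = np.array([[0, 1, 0, 0, 1],
                  [1, 0, 1, 0, 0],
                  [0, 1, 0, 1, 0],
                  [0, 0, 1, 0, 1],
                  [1, 0, 0, 1, 0]], dtype=float)
    H = rng.standard_normal((5, 4))                            # node features
    W_self, W_nbr = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))

    def gnn_layer(A, H, W_self, W_nbr):
        """One message-passing layer: combine a node's own features with the
        mean of its neighbors' features, then apply a nonlinearity."""
        deg = A.sum(axis=1, keepdims=True)
        nbr_mean = (A @ H) / np.maximum(deg, 1.0)
        return np.tanh(H @ W_self + nbr_mean @ W_nbr)

    H1 = gnn_layer(A, H, W_self, W_nbr)
    print(H1.shape)   # (5, 4); the cost per layer is linear in the number of nodes and edges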
We study the power of price discrimination via an intermediary in bilateral trade, where a revenue-maximizing seller sells an item to a buyer with a private value drawn from a prior. Between the seller and the buyer, there is an intermediary that can segment the market by releasing information about the true values to the seller. This is termed signaling, and it enables the seller to price discriminate. In this setting, Bergemann et al. showed the existence of a signaling scheme that simultaneously raises the optimal consumer surplus, guarantees that the item always sells, and ensures the seller's revenue does not increase. Our work extends the positive result of Bergemann et al. to settings where the type space is larger and where the optimal auction is randomized, possibly over a menu that can be exponentially large. In particular, we consider two settings motivated by budgets: the first is when there is a publicly known budget constraint on the price the seller can charge, and the second is the FedEx problem, where the buyer has a private deadline or service level (equivalently, a private budget that is guaranteed never to bind). For both settings, we present a novel signaling scheme and its analysis via a continuous construction process that recreates the optimal consumer surplus guarantee of Bergemann et al. The settings we consider are special cases of the more general problem in which the buyer has a private budget constraint in addition to a private value. Finally, we show that our positive results do not extend to this more general setting: any efficient signaling scheme necessarily transfers almost all the surplus to the seller rather than the buyer.
Neural networks are rapidly gaining interest in nonlinear system identification due to their ability to capture complex input-output relations directly from data. However, despite the flexibility of the approach, there are still concerns about the safety of these models in this context, as well as about the need for large amounts of potentially expensive data. Aluminum electrolysis is a highly nonlinear production process, and most of the data must be sampled manually, making the sampling process expensive and infrequent. With infrequent measurements of the state variables, the accuracy and open-loop stability of long-term predictions become highly important. Standard neural networks struggle to provide stable long-term predictions with limited training data. In this work, we investigate the effect of combining concatenated skip connections and the sparsity-promoting $\ell_1$ regularization on the open-loop stability and accuracy of forecasts with short, medium, and long prediction horizons. The case study is conducted on a high-dimensional, nonlinear simulator representing the mass and energy balance of an aluminum electrolysis cell. The proposed model structure contains concatenated skip connections from the input layer and all intermediate layers to the output layer, and is referred to as InputSkip; its $\ell_1$-regularized variant is called sparse InputSkip. The results show that sparse InputSkip outperforms dense and sparse standard feedforward neural networks and dense InputSkip in terms of open-loop stability and long-term predictive accuracy. The results hold for models trained on small, medium, and large training sets and for short, medium, and long prediction horizons.
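Under one reading of the description above, an InputSkip network feeds the raw input and every hidden activation into the output layer, and sparse InputSkip adds an $\ell_1$ penalty on the weights; a minimal PyTorch sketch with hypothetical layer sizes follows.

    import torch
    import torch.nn as nn

    class InputSkipNet(nn.Module):
        """Rough sketch of the InputSkip idea: the output layer sees a concatenation
        of the raw input and every hidden layer's activation. Sizes are hypothetical."""
        def __init__(self, n_in, n_out, hidden=(32, 32)):
            super().__init__()
            self.hidden = nn.ModuleList()
            prev = n_in
            for h in hidden:
                self.hidden.append(nn.Linear(prev, h))
                prev = h
            self.out = nn.Linear(n_in + sum(hidden), n_out)

        def forward(self, x):
            acts, h = [x], x
            for layer in self.hidden:
                h = torch.relu(layer(h))
                acts.append(h)
            return self.out(torch.cat(acts, dim=-1))

    model = InputSkipNet(n_in=8, n_out=8)
    x, y = torch.randn(16, 8), torch.randn(16, 8)

    # Sparsity-promoting l1 penalty on the weight matrices ("sparse InputSkip").
    l1 = sum(p.abs().sum() for p in model.parameters() if p.dim() > 1)
    loss = nn.functional.mse_loss(model(x), y) + 1e-4 * l1
    loss.backward()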
Increasing online activity and the growing number of connected devices are leading to more frequent and more diverse cyber attacks. This continuously evolving attack activity makes signature-based detection methods ineffective. Once malware has infiltrated a LAN, whether by bypassing an external gateway or entering via an unsecured mobile device, it can potentially infect all nodes in the LAN and carry out nefarious activities such as stealing valuable data, leading to financial damage and loss of reputation. Such infiltration can be viewed as an insider attack, increasing the need for LAN monitoring and security. In this paper, we aim to detect such inner-LAN activity by studying the variations in Address Resolution Protocol (ARP) calls within the LAN. We find anomalous nodes by modelling inner-LAN traffic using hierarchical forecasting methods. We substantially reduce the false positives that are ever present in anomaly detection by using a method based on extreme value theory. We use a dataset from a real inner-LAN monitoring project, containing over 10M ARP calls from 362 nodes. Furthermore, the small number of false positives generated by our methods offers a potential solution to the "alert fatigue" commonly reported by security experts.
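The abstract does not give the exact extreme value theory procedure, but a standard peaks-over-threshold recipe fits a generalized Pareto distribution to the forecast-error exceedances and alerts only on residuals in the far tail; below is a Python sketch with synthetic residuals and hypothetical parameters.

    import numpy as np
    from scipy.stats import genpareto

    rng = np.random.default_rng(0)

    # Hypothetical forecast residuals (|observed - predicted| ARP call counts per node).
    residuals = rng.exponential(scale=1.0, size=5000)

    # Peaks-over-threshold: fit a generalized Pareto distribution to exceedances over a
    # high empirical quantile, then place the alert threshold at a very small tail
    # probability to keep false positives low.
    u = np.quantile(residuals, 0.95)
    exceedances = residuals[residuals > u] - u
    shape, loc, scale = genpareto.fit(exceedances, floc=0.0)

    target_tail_prob = 1e-4                  # desired per-observation false-alarm rate
    p_exceed = (residuals > u).mean()        # empirical probability of exceeding u at all
    alert_threshold = u + genpareto.ppf(1.0 - target_tail_prob / p_exceed,
                                        shape, loc=0.0, scale=scale)
    print("alert threshold:", alert_threshold)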
ChatGPT is a large language model recently released by OpenAI. In this technical report, we explore for the first time the capability of ChatGPT for programming numerical algorithms. Specifically, we examine the capability of ChatGPT for generating codes for numerical algorithms in different programming languages, for debugging and improving codes written by users, for completing missing parts of numerical codes, for rewriting available codes in other programming languages, and for parallelizing serial codes. Additionally, we assess whether ChatGPT can recognize if given codes are written by humans or machines. To reach this goal, we consider a variety of mathematical problems such as the Poisson equation, the diffusion equation, the incompressible Navier-Stokes equations, compressible inviscid flow, eigenvalue problems, solving linear systems of equations, storing sparse matrices, etc. Furthermore, we give examples from scientific machine learning, such as physics-informed neural networks and convolutional neural networks, with applications to computational physics. Through these examples, we investigate the successes, failures, and challenges of ChatGPT. Examples of failures include producing singular matrices, performing operations on arrays with incompatible sizes, and interrupting the generation of relatively long codes. Our outcomes suggest that ChatGPT can successfully program numerical algorithms in different programming languages, but certain limitations and challenges exist that require further improvement of this machine learning model.
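For a flavor of the kind of task posed to ChatGPT, here is our own illustrative snippet (not model output): a minimal finite-difference solver in Python for the 1D Poisson equation with homogeneous Dirichlet boundary conditions.

    import numpy as np

    # Solve -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0 using second-order
    # central finite differences on a uniform grid.
    n = 99                        # number of interior grid points
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    f = np.sin(np.pi * x)         # hypothetical right-hand side

    # Tridiagonal stiffness matrix of the 1D Laplacian.
    A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    u = np.linalg.solve(A, f)

    # The exact solution for this f is sin(pi x) / pi^2; check the discretization error.
    print("max error:", np.max(np.abs(u - np.sin(np.pi * x) / np.pi**2)))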
Recently, graph neural networks have been gaining considerable attention for simulating dynamical systems due to their inductive nature, which leads to zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. There is a growing volume of literature that attempts to combine these two approaches. Here, we evaluate the performance of thirteen different graph neural networks, namely Hamiltonian and Lagrangian graph neural networks, graph neural ODEs, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulation, highlighting the similarities and differences in the inductive biases and graph architectures of these systems. We evaluate these models on spring, pendulum, gravitational, and 3D deformable solid systems to compare their performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and the decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than the training systems, thus providing a promising route to simulating large-scale realistic systems.
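The core inductive bias behind a Hamiltonian graph network can be sketched without any learning: predict a scalar H(q, p) and roll the system out with Hamilton's equations. In the toy Python sketch below, a hand-written mass-spring Hamiltonian stands in for the learned network.

    def hamiltonian(q, p, k=1.0, m=1.0):
        """Toy Hamiltonian of a 1D mass-spring system (stands in for the scalar
        H(q, p) that a Hamiltonian GNN would learn): H = p^2/(2m) + k q^2/2."""
        return p * p / (2 * m) + 0.5 * k * q * q

    def rollout_step(q, p, dt=1e-2, k=1.0, m=1.0):
        """Hamilton's equations dq/dt = dH/dp, dp/dt = -dH/dq, integrated with a
        symplectic (semi-implicit) Euler step so the energy stays bounded."""
        p = p - dt * k * q      # -dH/dq for this H
        q = q + dt * p / m      #  dH/dp for this H
        return q, p

    q, p = 1.0, 0.0
    e0 = hamiltonian(q, p)
    for _ in range(10_000):
        q, p = rollout_step(q, p)
    print("relative energy drift:", abs(hamiltonian(q, p) - e0) / e0)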
Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph structure is available. In practice, however, real-world graphs are often noisy and incomplete, or might not be available at all. In this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution over the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
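One way to picture the discrete edge distribution: keep a learnable probability for every candidate edge, sample a graph from those Bernoullis, and run a GCN forward pass on the sample. The toy numpy sketch below shows only this inner step, with hypothetical sizes and weights.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 6, 4

    # Learnable edge probabilities (the outer-level variables of the bilevel program).
    edge_prob = rng.uniform(0.1, 0.9, size=(n, n))
    edge_prob = (edge_prob + edge_prob.T) / 2          # undirected graph
    np.fill_diagonal(edge_prob, 0.0)

    # Inner level: sample a discrete graph and run one GCN layer on it.
    A = (rng.uniform(size=(n, n)) < edge_prob).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                                        # keep the sample symmetric
    A_hat = A + np.eye(n)                              # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    X = rng.standard_normal((n, d))                    # node features
    W = rng.standard_normal((d, d))                    # layer weights

    H = np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)   # one GCN layer
    print(H.shape)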
Dynamic programming (DP) solves a variety of structured combinatorial problems by iteratively breaking them down into smaller subproblems. In spite of their versatility, DP algorithms are usually non-differentiable, which hampers their use as a layer in neural networks trained by backpropagation. To address this issue, we propose to smooth the max operator in the dynamic programming recursion using a strongly convex regularizer. This allows us to relax both the optimal value and the optimal solution of the original combinatorial problem, and turns a broad class of DP algorithms into differentiable operators. Theoretically, we provide a new probabilistic perspective on backpropagating through these DP operators and relate them to inference in graphical models. We derive two particular instantiations of our framework: a smoothed Viterbi algorithm for sequence prediction and a smoothed DTW algorithm for time-series alignment. We showcase these instantiations on two structured prediction tasks and on structured and sparse attention for neural machine translation.
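The key device is replacing the hard max (or min) in the DP recursion with a smoothed version whose gradient is a softmax; the Python sketch below applies a log-sum-exp soft-min inside a DTW-style recursion on a hypothetical cost matrix.

    import numpy as np

    def softmin(values, gamma):
        """Smoothed min via a negative log-sum-exp; as gamma -> 0 it approaches min,
        and its gradient is a softmax, which makes the DP recursion differentiable."""
        values = np.asarray(values) / -gamma
        m = values.max()
        return -gamma * (m + np.log(np.exp(values - m).sum()))

    def soft_dtw(cost, gamma=0.1):
        """Smoothed DTW value computed on a pairwise cost matrix (soft-DTW-style recursion)."""
        n, m = cost.shape
        R = np.full((n + 1, m + 1), np.inf)
        R[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                R[i, j] = cost[i - 1, j - 1] + softmin(
                    [R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]], gamma)
        return R[n, m]

    # Hypothetical time series and their squared-distance cost matrix.
    a, b = np.sin(np.linspace(0, 3, 20)), np.sin(np.linspace(0, 3, 25) + 0.2)
    C = (a[:, None] - b[None, :]) ** 2
    print(soft_dtw(C, gamma=0.1))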