The radio access network (RAN) part of the next-generation wireless networks will require efficient solutions for satisfying low latency and high-throughput services. The open RAN (O-RAN) is one of the candidates to achieve this goal, in addition to increasing vendor diversity and promoting openness. In the O-RAN architecture, network functions are executed in central units (CU), distributed units (DU), and radio units (RU). These entities are virtualized on general-purpose CPUs and form a processing pool. These processing pools can be located in different geographical places and have limited capacity, affecting the energy consumption and the performance of networks. Additionally, since user demand is not deterministic, special attention should be paid to allocating resource blocks to users by ensuring their expected quality of service for latency-sensitive traffic flows. In this paper, we propose a joint optimization solution to enhance energy efficiency and provide delay guarantees to the users in the O-RAN architecture. We formulate this novel problem and linearize it to provide a solution with a mixed-integer linear problem (MILP) solver. We compare this with a baseline that addresses this optimization problem using a disjoint approach. The results show that our approach outperforms the baseline method in terms of energy efficiency.
For decision making under uncertainty, min-max regret has been established as a popular methodology to find robust solutions. In this approach, we compare the performance of our solution against the best possible performance had we known the true scenario in advance. We introduce a generalization of this setting which allows us to compare against solutions that are also affected by uncertainty, which we call balanced regret. Using budgeted uncertainty sets, this allows for a wider range of possible alternatives the decision maker may choose from. We analyze this approach for general combinatorial problems, providing an iterative solution method and insights into solution properties. We then consider a type of selection problem in more detail and show that, while the classic regret setting with budgeted uncertainty sets can be solved in polynomial time, the balanced regret problem becomes NP-hard. In computational experiments using random and real-world data, we show that balanced regret solutions provide a useful trade-off for the performance in classic performance measures.
Intelligent reflecting surfaces (IRSs) are emerging as promising enablers for the next generation of wireless communication systems, because of their ability to customize favorable radio propagation environments. However, with the conventional passive architecture, IRSs can only adjust the phase of the incident signals limiting the achievable beamforming gain. To fully unleash the potential of IRSs, in this paper, we consider a more general IRS architecture, i.e., active IRSs, which can adapt the phase and amplify the magnitude of the reflected incident signal simultaneously with the support of an additional power source. To realize green communication in active IRS-assisted multiuser systems, we jointly optimize the reflection matrix at the IRS and the beamforming vector at the base station (BS) for the minimization of the BS transmit power. The resource allocation algorithm design is formulated as an optimization problem taking into account the maximum power budget of the active IRS and the quality-of-service (QoS) requirements of the users. To handle the non-convex design problem, we develop a novel and computationally efficient algorithm based on the bilinear transformation and inner approximation methods. The proposed algorithm is guaranteed to converge to a locally optimal solution of the considered problem. Simulation results illustrate the effectiveness of the proposed scheme compared to the two baseline schemes. Moreover, the results unveil that deploying active IRSs is a promising approach to enhance the system performance compared to conventional passive IRSs, especially when strong direct links exist.
Bregman proximal point algorithm (BPPA), as one of the centerpieces in the optimization toolbox, has been witnessing emerging applications. With simple and easy to implement update rule, the algorithm bears several compelling intuitions for empirical successes, yet rigorous justifications are still largely unexplored. We study the computational properties of BPPA through classification tasks with separable data, and demonstrate provable algorithmic regularization effects associated with BPPA. We show that BPPA attains non-trivial margin, which closely depends on the condition number of the distance generating function inducing the Bregman divergence. We further demonstrate that the dependence on the condition number is tight for a class of problems, thus showing the importance of divergence in affecting the quality of the obtained solutions. In addition, we extend our findings to mirror descent (MD), for which we establish similar connections between the margin and Bregman divergence. We demonstrate through a concrete example, and show BPPA/MD converges in direction to the maximal margin solution with respect to the Mahalanobis distance. Our theoretical findings are among the first to demonstrate the benign learning properties BPPA/MD, and also provide corroborations for a careful choice of divergence in the algorithmic design.
Many problems in signal processing and machine learning can be formalized as weak submodular optimization tasks. For such problems, a simple greedy algorithm (\textsc{Greedy}) is guaranteed to find a solution achieving the objective with a value no worse than $1-e^{-1/c}$ of the optimal, where $c$ is the multiplicative weak-submodularity constant. Due to the high cost of querying large-scale systems, the complexity of \textsc{Greedy} becomes prohibitive in contemporary applications. In this work, we study the tradeoff between performance and complexity when one resorts to random sampling strategies to reduce the query complexity of \textsc{Greedy}. Specifically, we quantify the effect of uniform sampling strategies on \textsc{Greedy}'s performance through two metrics: (i) probability of identifying an optimal subset, and (ii) suboptimality with respect to the optimal solution. The latter implies that uniform sampling strategies with a fixed sampling size achieve a non-trivial approximation factor; however, we show that with overwhelming probability, these methods fail to find the optimal subset. Our analysis shows that the failure of uniform sampling strategies with fixed sample size can be circumvented by successively increasing the size of the search space. Building upon this insight, we propose a simple progressive stochastic greedy algorithm and study its approximation guarantees. Moreover, we demonstrate effectiveness of the proposed method in dimensionality reduction applications and feature selection tasks for clustering and object tracking.
As spiking-based deep learning inference applications are increasing in embedded systems, these systems tend to integrate neuromorphic accelerators such as $\mu$Brain to improve energy efficiency. We propose a $\mu$Brain-based scalable many-core neuromorphic hardware design to accelerate the computations of spiking deep convolutional neural networks (SDCNNs). To increase energy efficiency, cores are designed to be heterogeneous in terms of their neuron and synapse capacity (big cores have higher capacity than the little ones), and they are interconnected using a parallel segmented bus interconnect, which leads to lower latency and energy compared to a traditional mesh-based Network-on-Chip (NoC). We propose a system software framework called SentryOS to map SDCNN inference applications to the proposed design. SentryOS consists of a compiler and a run-time manager. The compiler compiles an SDCNN application into subnetworks by exploiting the internal architecture of big and little $\mu$Brain cores. The run-time manager schedules these sub-networks onto cores and pipeline their execution to improve throughput. We evaluate the proposed big little many-core neuromorphic design and the system software framework with five commonlyused SDCNN inference applications and show that the proposed solution reduces energy (between 37% and 98%), reduces latency (between 9% and 25%), and increases application throughput (between 20% and 36%). We also show that SentryOS can be easily extended for other spiking neuromorphic accelerators.
Reconfigurable intelligent surfaces (RISs) represent a promising candidate for sixth-generation (6G) wireless networks, as the RIS technology provides a new solution to control the propagation channel in order to improve the efficiency of a wireless link through enhancing the received signal power. In this paper, we propose RIS-assisted receive quadrature space-shift keying (RIS-RQSSK), which enhances the spectral efficiency of an RIS-based index modulation (IM) system by using the real and imaginary dimensions independently for the purpose of IM. Therefore, the error rate performance of the system is improved as all RIS elements reflect the incident transmit signal toward both selected receive antennas. At the receiver, a low-complexity but effective greedy detector (GD) can be employed which determines the maximum energy per dimension at the receive antennas. A max-min optimization problem is defined to maximize the received signal-to-noise ratio (SNR) components at both selected receive antennas; an analytical solution is provided based on Lagrange duality. In particular, the multi-variable optimization problem is shown to reduce to the solution of a single-variable equation, which results in a very simple design procedure. In addition, we investigate the average bit error probability (ABEP) of the proposed RIS-RQSSK system and derive a closed-form approximate upper bound on the ABEP. We also provide extensive numerical simulations to validate our derivations. Numerical results show that the proposed RIS-RQSSK scheme substantially outperforms recent prominent benchmark schemes. This enhancement considerably increases with an increasing number of receive antennas.
A new conservative finite element solver for the three-dimensional steady magnetohydrodynamic (MHD) kinematics equations is presented.The solver utilizes magnetic vector potential and current density as solution variables, which are discretized by H(curl)-conforming edge-element and H(div)-conforming face element respectively. As a result, the divergence-free constraints of discrete current density and magnetic induction are both satisfied. Moreover the solutions also preserve the total magnetic helicity. The generated linear algebraic equation is a typical dual saddle-point problem that is ill-conditioned and indefinite. To efficiently solve it, we develop a block preconditioner based on constraint preconditioning framework and devise a preconditioned FGMRES solver. Numerical experiments verify the conservative properties, the convergence rate of the discrete solutions and the robustness of the preconditioner.
In this paper, we provide an optimal additive noise mechanism for database queries with discrete answers on a finite support. The noise provides the minimum error rate for a given $(\epsilon,\delta)$ pair. Popular schemes apply random additive noise with infinite support and then clamp the resulting query response to the desired range. Clamping, unfortunately, compromises the privacy guarantees. Using modulo addition, rather than clamping, we show that, for any $(\epsilon,\delta)$ pair, the optimum additive noise distribution can be obtained by solving a mixed integer linear program (MILP). Having introduced our optimal noise design formulation, we derive closed form solutions for the optimal noise probability mass function (PMF) and the probability of error for two special cases. In our performance studies, we show that the proposed optimal mechanism outperforms state of the art for a given probability of error and for any budget $\epsilon >0$.
Tensors, which provide a powerful and flexible model for representing multi-attribute data and multi-way interactions, play an indispensable role in modern data science across various fields in science and engineering. A fundamental task is to faithfully recover the tensor from highly incomplete measurements in a statistically and computationally efficient manner. Harnessing the low-rank structure of tensors in the Tucker decomposition, this paper develops a scaled gradient descent (ScaledGD) algorithm to directly recover the tensor factors with tailored spectral initializations, and shows that it provably converges at a linear rate independent of the condition number of the ground truth tensor for two canonical problems -- tensor completion and tensor regression -- as soon as the sample size is above the order of $n^{3/2}$ ignoring other parameter dependencies, where $n$ is the dimension of the tensor. This leads to an extremely scalable approach to low-rank tensor estimation compared with prior art, which suffers from at least one of the following drawbacks: extreme sensitivity to ill-conditioning, high per-iteration costs in terms of memory and computation, or poor sample complexity guarantees. To the best of our knowledge, ScaledGD is the first algorithm that achieves near-optimal statistical and computational complexities simultaneously for low-rank tensor completion with the Tucker decomposition. Our algorithm highlights the power of appropriate preconditioning in accelerating nonconvex statistical estimation, where the iteration-varying preconditioners promote desirable invariance properties of the trajectory with respect to the underlying symmetry in low-rank tensor factorization.
We present a new clustering method in the form of a single clustering equation that is able to directly discover groupings in the data. The main proposition is that the first neighbor of each sample is all one needs to discover large chains and finding the groups in the data. In contrast to most existing clustering algorithms our method does not require any hyper-parameters, distance thresholds and/or the need to specify the number of clusters. The proposed algorithm belongs to the family of hierarchical agglomerative methods. The technique has a very low computational overhead, is easily scalable and applicable to large practical problems. Evaluation on well known datasets from different domains ranging between 1077 and 8.1 million samples shows substantial performance gains when compared to the existing clustering techniques.