Every computer system -- from schedulers in clouds (e.g. Amazon) to computer networks to operating systems -- performs resource allocation across system users. The defacto allocation policies are max-min fairness (MMF) for single resources and dominant resource fairness (DRF) for multiple resources which guarantee properties like incentive compatibility, envy-freeness, and Pareto efficiency, assuming user demands are static (time-independent). However, in real-world systems, user demands are dynamic, i.e. time-dependant. As a result, there is now a fundamental mismatch between the goals of computer systems and the properties enabled by classic resource allocation policies. We aim to bridge this mismatch. When demands are dynamic, instant-by-instant MMF can be extremely unfair over longer periods of time, i.e. lead to unbalanced user allocations as previous allocations have no effect in the present. We consider a natural generalization of MMF and DRF for multiple resources and users with dynamic demands: this algorithm ensures that user allocations are as max-min fair as possible up to any time instant, given past allocations. This dynamic mechanism remains Pareto optimal and envy-free, but not incentive compatible. However, our results show that the possible increase in utility by misreporting is bounded and, since this can lead to significant decrease in overall useful allocation, this suggests that it is not a useful strategy. Our main result is to show that our dynamic DRF algorithm is $(1+\rho)$-incentive compatible, where $\rho$ quantifies the relative importance of a resource for different users; we show that this factor is tight even with only two resources. We also present a $3/2$ upper bound and a $\sqrt 2$ lower bound for incentive compatibility when there is only one resource. We also offer extensions for the case when users are weighted to prioritize them differently.
This paper investigates a new downlink nonorthogonal multiple access (NOMA) system, where a multiantenna unmanned aerial vehicle (UAV) is powered by wireless power transfer (WPT) and serves as the base station for multiple pairs of ground users (GUs) running NOMA in each pair. An energy efficiency (EE) maximization problem is formulated to jointly optimize the WPT time and the placement for the UAV, and the allocation of the UAV's transmit power between different NOMA user pairs and within each pair. To efficiently solve this nonconvex problem, we decompose the problem into three subproblems using block coordinate descent. For the subproblem of intra-pair power allocation within each NOMA user pair, we construct a supermodular game with confirmed convergence to a Nash equilibrium. Given the intra-pair power allocation, successive convex approximation is applied to convexify and solve the subproblem of WPT time allocation and inter-pair power allocation between the user pairs. Finally, we solve the subproblem of UAV placement by using the Lagrange multiplier method. Simulations show that our approach can substantially outperform its alternatives that do not use NOMA and WPT techniques or that do not optimize the UAV location.
We consider the question of adaptive data analysis within the framework of convex optimization. We ask how many samples are needed in order to compute $\epsilon$-accurate estimates of $O(1/\epsilon^2)$ gradients queried by gradient descent, and we provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) $\Omega(1/\epsilon^3)$ samples are required. This rules out the possibility of a foolproof mechanism. Our construction builds upon a new lower bound (that may be of interest of its own right) for an analyst that may ask several non adaptive questions in a batch of fixed and known $T$ rounds of adaptivity and requires a fraction of true discoveries. We show that for such an analyst $\Omega (\sqrt{T}/\epsilon^2)$ samples are necessary. Second, we show that, under certain assumptions on the oracle, in an interaction with gradient descent $\tilde \Omega(1/\epsilon^{2.5})$ samples are necessary. Our assumptions are that the oracle has only \emph{first order access} and is \emph{post-hoc generalizing}. First order access means that it can only compute the gradients of the sampled function at points queried by the algorithm. Our assumption of \emph{post-hoc generalization} follows from existing lower bounds for statistical queries. More generally then, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. These results are in contrast with classical bounds that show that with $O(1/\epsilon^2)$ samples one can optimize the population risk to accuracy of $O(\epsilon)$ but, as it turns out, with spurious gradients.
Multi-Agent Systems (MAS) are notoriously complex and hard to verify. In fact, it is not trivial to model a MAS, and even when a model is built, it is not always possible to verify, in a formal way, that it is actually behaving as we expect. Usually, it is relevant to know whether an agent is capable of fulfilling its own goals. One possible way to check this is through Model Checking. Specifically, by verifying Alternating-time Temporal Logic (ATL) properties, where the notion of strategies for achieving goals can be described. Unfortunately, the resulting model checking problem is not decidable in general. In this paper, we present a verification procedure based on combining Model Checking and Runtime Verification, where sub-models of the MAS model belonging to decidable fragments are verified by a model checker, and runtime monitors are used to verify the rest. Furthermore, we implement our technique and show experimental results.
Following the research agenda initiated by Munoz & Vassilvitskii [1] and Lykouris & Vassilvitskii [2] on learning-augmented online algorithms for classical online optimization problems, in this work, we consider the Online Facility Location problem under this framework. In Online Facility Location (OFL), demands arrive one-by-one in a metric space and must be (irrevocably) assigned to an open facility upon arrival, without any knowledge about future demands. We present an online algorithm for OFL that exploits potentially imperfect predictions on the locations of the optimal facilities. We prove that the competitive ratio decreases smoothly from sublogarithmic in the number of demands to constant, as the error, i.e., the total distance of the predicted locations to the optimal facility locations, decreases towards zero. We complement our analysis with a matching lower bound establishing that the dependence of the algorithm's competitive ratio on the error is optimal, up to constant factors. Finally, we evaluate our algorithm on real world data and compare our learning augmented approach with the current best online algorithm for the problem.
As machine learning algorithms become increasingly integrated in crucial decision-making scenarios, such as healthcare, recruitment, and risk assessment, there have been increasing concerns about the privacy and fairness of such systems. Federated learning has been viewed as a promising solution for collaboratively training of machine learning models among multiple parties while maintaining the privacy of their local data. However, federated learning also poses new challenges in mitigating the potential bias against certain populations (e.g., demographic groups), as this typically requires centralized access to the sensitive information (e.g., race, gender) of each data point. Motivated by the importance and challenges of group fairness in federated learning, in this work, we propose FairFed, a novel algorithm to enhance group fairness via a fairness-aware aggregation method, which aims to provide fair model performance across different sensitive groups (e.g., racial, gender groups) while maintaining high utility. This formulation can further provide more flexibility in the customized local debiasing strategies for each client. We build our FairFed algorithm around the secure aggregation protocol of federated learning. When running federated training on widely investigated fairness datasets, we demonstrate that our proposed method outperforms the state-of-the-art fair federated learning frameworks under a high heterogeneous sensitive attribute distribution. We also investigate the performance of FairFed on naturally distributed real-life data collected from different geographical locations or departments within an organization.
On-demand delivery has become increasingly popular around the world. Motivated by a large grocery chain store who offers fast on-demand delivery services, we model and solve a stochastic dynamic driver dispatching and routing problem for last-mile delivery systems where on-time performance is the main target. The system operator needs to dispatch a set of drivers and specify their delivery routes facing random demand that arrives over a fixed number of periods. The resulting stochastic dynamic program is challenging to solve due to the curse of dimensionality. We propose a novel structured approximation framework to approximate the value function via a parametrized dispatching and routing policy. We analyze the structural properties of the approximation framework and establish its performance guarantee under large-demand scenarios. We then develop efficient exact algorithms for the approximation problem based on Benders decomposition and column generation, which deliver verifiably optimal solutions within minutes. The evaluation results on a real-world data set show that our framework outperforms the current policy of the company by 36.53% on average in terms of delivery time. We also perform several policy experiments to understand the value of dynamic dispatching and routing with varying fleet sizes and dispatch frequencies.
Reinforcement Learning (RL) approaches are lately deployed for orchestrating wireless communications empowered by Reconfigurable Intelligent Surfaces (RISs), leveraging their online optimization capabilities. Most commonly, in RL-based formulations for realistic RISs with low resolution phase-tunable elements, each configuration is modeled as a distinct reflection action, resulting to inefficient exploration due to the exponential nature of the search space. In this paper, we consider RISs with 1-bit phase resolution elements, and model the action of each of them as a binary vector including the feasible reflection coefficients. We then introduce two variations of the well-established Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) agents, aiming for effective exploration of the binary action spaces. For the case of DQN, we make use of an efficient approximation of the Q-function, whereas a discretization post-processing step is applied to the output of DDPG. Our simulation results showcase that the proposed techniques greatly outperform the baseline in terms of the rate maximization objective, when large-scale RISs are considered. In addition, when dealing with moderate scale RIS sizes, where the conventional DQN based on configuration-based action spaces is feasible, the performance of the latter technique is similar to the proposed learning approach.
Representation learning enables us to automatically extract generic feature representations from a dataset to solve another machine learning task. Recently, extracted feature representations by a representation learning algorithm and a simple predictor have exhibited state-of-the-art performance on several machine learning tasks. Despite its remarkable progress, there exist various ways to evaluate representation learning algorithms depending on the application because of the flexibility of representation learning. To understand the current representation learning, we review evaluation methods of representation learning algorithms and theoretical analyses. On the basis of our evaluation survey, we also discuss the future direction of representation learning. Note that this survey is the extended version of Nozawa and Sato (2022).
We provide a decision theoretic analysis of bandit experiments. The setting corresponds to a dynamic programming problem, but solving this directly is typically infeasible. Working within the framework of diffusion asymptotics, we define suitable notions of asymptotic Bayes and minimax risk for bandit experiments. For normally distributed rewards, the minimal Bayes risk can be characterized as the solution to a nonlinear second-order partial differential equation (PDE). Using a limit of experiments approach, we show that this PDE characterization also holds asymptotically under both parametric and non-parametric distribution of the rewards. The approach further describes the state variables it is asymptotically sufficient to restrict attention to, and therefore suggests a practical strategy for dimension reduction. The upshot is that we can approximate the dynamic programming problem defining the bandit experiment with a PDE which can be efficiently solved using sparse matrix routines. We derive the optimal Bayes and minimax policies from the numerical solutions to these equations. The proposed policies substantially dominate existing methods such as Thompson sampling. The framework also allows for substantial generalizations to the bandit problem such as time discounting and pure exploration motives.
Deep convolutional neural networks (CNNs) have recently achieved great success in many visual recognition tasks. However, existing deep neural network models are computationally expensive and memory intensive, hindering their deployment in devices with low memory resources or in applications with strict latency requirements. Therefore, a natural thought is to perform model compression and acceleration in deep networks without significantly decreasing the model performance. During the past few years, tremendous progress has been made in this area. In this paper, we survey the recent advanced techniques for compacting and accelerating CNNs model developed. These techniques are roughly categorized into four schemes: parameter pruning and sharing, low-rank factorization, transferred/compact convolutional filters, and knowledge distillation. Methods of parameter pruning and sharing will be described at the beginning, after that the other techniques will be introduced. For each scheme, we provide insightful analysis regarding the performance, related applications, advantages, and drawbacks etc. Then we will go through a few very recent additional successful methods, for example, dynamic capacity networks and stochastic depths networks. After that, we survey the evaluation matrix, the main datasets used for evaluating the model performance and recent benchmarking efforts. Finally, we conclude this paper, discuss remaining challenges and possible directions on this topic.