We study a general online combinatorial auction problem in algorithmic mechanism design. A provider allocates multiple types of capacity-limited resources to customers that arrive in a sequential and arbitrary manner. Each customer has a private valuation function on bundles of resources that she can purchase (e.g., a combination of different resources such as CPU and RAM in cloud computing). The provider charges payment from customers who purchase a bundle of resources and incurs an increasing supply cost with respect to the totality of resources allocated. The goal is to maximize the social welfare, namely, the total valuation of customers for their purchased bundles, minus the total supply cost of the provider for all the resources that have been allocated. We adopt the competitive analysis framework and provide posted-price mechanisms with optimal competitive ratios. Our pricing mechanism is optimal in the sense that no other online algorithms can achieve a better competitive ratio. We validate the theoretic results via empirical studies of online resource allocation in cloud computing. Our numerical results demonstrate that the proposed pricing mechanism is competitive and robust against system uncertainties and outperforms existing benchmarks.
How can we collect the most useful labels to learn a model selection policy, when presented with arbitrary heterogeneous data streams? In this paper, we formulate this task as an online contextual active model selection problem, where at each round the learner receives an unlabeled data point along with a context. The goal is to output the best model for any given context without obtaining an excessive amount of labels. In particular, we focus on the task of selecting pre-trained classifiers, and propose a contextual active model selection algorithm (CAMS), which relies on a novel uncertainty sampling query criterion defined on a given policy class for adaptive model selection. In comparison to prior art, our algorithm does not assume a globally optimal model. We provide rigorous theoretical analysis for the regret and query complexity under both adversarial and stochastic settings. Our experiments on several benchmark classification datasets demonstrate the algorithm's effectiveness in terms of both regret and query complexity. Notably, to achieve the same accuracy, CAMS incurs less than 10% of the label cost when compared to the best online model selection baselines on CIFAR10.
A biomechanical model often requires parameter estimation and selection in a known but complicated nonlinear function. Motivated by observing that data from a head-neck position tracking system, one of biomechanical models, show multiplicative time dependent errors, we develop a modified penalized weighted least squares estimator. The proposed method can be also applied to a model with non-zero mean time dependent additive errors. Asymptotic properties of the proposed estimator are investigated under mild conditions on a weight matrix and the error process. A simulation study demonstrates that the proposed estimation works well in both parameter estimation and selection with time dependent error. The analysis and comparison with an existing method for head-neck position tracking data show better performance of the proposed method in terms of the variance accounted for (VAF).
Deep Reinforcement Learning (RL) is mainly studied in a setting where the training and the testing environments are similar. But in many practical applications, these environments may differ. For instance, in control systems, the robot(s) on which a policy is learned might differ from the robot(s) on which a policy will run. It can be caused by different internal factors (e.g., calibration issues, system attrition, defective modules) or also by external changes (e.g., weather conditions). There is a need to develop RL methods that generalize well to variations of the training conditions. In this article, we consider the simplest yet hard to tackle generalization setting where the test environment is unknown at train time, forcing the agent to adapt to the system's new dynamics. This online adaptation process can be computationally expensive (e.g., fine-tuning) and cannot rely on meta-RL techniques since there is just a single train environment. To do so, we propose an approach where we learn a subspace of policies within the parameter space. This subspace contains an infinite number of policies that are trained to solve the training environment while having different parameter values. As a consequence, two policies in that subspace process information differently and exhibit different behaviors when facing variations of the train environment. Our experiments carried out over a large variety of benchmarks compare our approach with baselines, including diversity-based methods. In comparison, our approach is simple to tune, does not need any extra component (e.g., discriminator) and learns policies able to gather a high reward on unseen environments.
In this paper, we investigate the robust resource allocation design for secure communication in an integrated sensing and communication (ISAC) system. A multi-antenna dual-functional radar-communication (DFRC) base station (BS) serves multiple single-antenna legitimate users and senses for targets simultaneously, where already identified targets are treated as potential single-antenna eavesdroppers. The DFRC BS scans a sector with a sequence of dedicated beams, and the ISAC system takes a snapshot of the environment during the transmission of each beam. Based on the sensing information, the DFRC BS can acquire the channel state information (CSI) of the potential eavesdroppers. Different from existing works that focused on the resource allocation design for a single snapshot, in this paper, we propose a novel optimization framework that jointly optimizes the communication and sensing resources over a sequence of snapshots with adjustable durations. To this end, we jointly optimize the duration of each snapshot, the beamforming vector, and the covariance matrix of the AN for maximization of the system sum secrecy rate over a sequence of snapshots while guaranteeing a minimum required average achievable rate and a maximum information leakage constraint for each legitimate user. The resource allocation algorithm design is formulated as a non-convex optimization problem, where we account for the imperfect CSI of both the legitimate users and the potential eavesdroppers. To make the problem tractable, we derive a bound for the uncertainty region of the potential eavesdroppers' small-scale fading based on a safe approximation, which facilitates the development of a block coordinate descent-based iterative algorithm for obtaining an efficient suboptimal solution.
In randomized experiments and observational studies, weighting methods are often used to generalize and transport treatment effect estimates to a target population. Traditional methods construct the weights by separately modeling the treatment assignment and study selection probabilities and then multiplying functions (e.g., inverses) of their estimates. However, these estimated multiplicative weights may not produce adequate covariate balance and can be highly variable, resulting in biased and unstable estimators, especially when there is limited covariate overlap across populations or treatment groups. To address these limitations, we propose a general weighting approach that weights each treatment group towards the target population in a single step. We present a framework and provide a justification for this one-step approach in terms of generic probability distributions. We show a formal connection between our method and inverse probability and inverse odds weighting. By construction, the proposed approach balances covariates and produces stable estimators. We show that our estimator for the target average treatment effect is consistent, asymptotically Normal, multiply robust, and semiparametrically efficient. We demonstrate the performance of this approach using a simulation study and a randomized case study on the effects of physician racial diversity on preventive healthcare utilization among Black men in California.
Understanding the impact of the most effective policies or treatments on a response variable of interest is desirable in many empirical works in economics, statistics and other disciplines. Due to the widespread winner's curse phenomenon, conventional statistical inference assuming that the top policies are chosen independent of the random sample may lead to overly optimistic evaluations of the best policies. In recent years, given the increased availability of large datasets, such an issue can be further complicated when researchers include many covariates to estimate the policy or treatment effects in an attempt to control for potential confounders. In this manuscript, to simultaneously address the above-mentioned issues, we propose a resampling-based procedure that not only lifts the winner's curse in evaluating the best policies observed in a random sample, but also is robust to the presence of many covariates. The proposed inference procedure yields accurate point estimates and valid frequentist confidence intervals that achieve the exact nominal level as the sample size goes to infinity for multiple best policy effect sizes. We illustrate the finite-sample performance of our approach through Monte Carlo experiments and two empirical studies, evaluating the most effective policies in charitable giving and the most beneficial group of workers in the National Supported Work program.
We study a submodular maximization problem motivated by applications in online retail. A platform displays a list of products to a user in response to a search query. The user inspects the first $k$ items in the list for a $k$ chosen at random from a given distribution, and decides whether to purchase an item from that set based on a choice model. The goal of the platform is to maximize the engagement of the shopper defined as the probability of purchase. This problem gives rise to a less-studied variation of submodular maximization in which we are asked to choose an $\textit{ordering}$ of a set of elements to maximize a linear combination of different submodular functions. First, using a reduction to maximizing submodular functions over matroids, we give an optimal $\left(1-1/e\right)$-approximation for this problem. We then consider a variant in which the platform cares not only about user engagement, but also about diversification across various groups of users, that is, guaranteeing a certain probability of purchase in each group. We characterize the polytope of feasible solutions and give a bi-criteria $((1-1/e)^2,(1-1/e)^2)$-approximation for this problem by rounding an approximate solution of a linear programming relaxation. For rounding, we rely on our reduction and the particular rounding techniques for matroid polytopes. For the special case in which underlying submodular functions are coverage functions -- which is practically relevant in online retail -- we propose an alternative LP relaxation and a simpler randomized rounding for the problem. This approach yields to an optimal bi-criteria $(1-1/e,1-1/e)$-approximation algorithm for the special case of the problem with coverage functions.
Motivated by applications in cloud computing spot markets and selling banner ads on popular websites, we study the online resource allocation problem with "costly buyback". To model this problem, we consider the classic edge-weighted fractional online matching problem with a tweak, where the decision maker can recall (i.e., buyback) any fraction of an offline resource that is pre-allocated to an earlier online vertex; however, by doing so not only the decision maker loses the previously allocated reward (which equates the edge-weight), it also has to pay a non-negative constant factor $f$ of this edge-weight as an extra penalty. Parameterizing the problem by the buyback factor $f$, our main result is obtaining optimal competitive algorithms for all possible values of $f$ through a novel primal-dual family of algorithms. We establish the optimality of our results by obtaining separate lower-bounds for each of small and large buyback factor regimes, and showing how our primal-dual algorithm exactly matches this lower-bound by appropriately tuning a parameter as a function of $f$. We further study lower and upper bounds on the competitive ratio in variants of this model, e.g., single-resource with different demand sizes, or matching with deterministic integral allocations. We show how algorithms in the our family of primal-dual algorithms can obtain the exact optimal competitive ratio in all of these variants -- which in turn demonstrates the power of our algorithmic framework for online resource allocations with costly buyback.
Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing and speech recognition. However, their superior performance comes at the considerable cost of computational complexity, which greatly hinders their applications in many resource-constrained devices, such as mobile phones and Internet of Things (IoT) devices. Therefore, methods and techniques that are able to lift the efficiency bottleneck while preserving the high accuracy of DNNs are in great demand in order to enable numerous edge AI applications. This paper provides an overview of efficient deep learning methods, systems and applications. We start from introducing popular model compression methods, including pruning, factorization, quantization as well as compact model design. To reduce the large design cost of these manual solutions, we discuss the AutoML framework for each of them, such as neural architecture search (NAS) and automated pruning and quantization. We then cover efficient on-device training to enable user customization based on the local data on mobile devices. Apart from general acceleration techniques, we also showcase several task-specific accelerations for point cloud, video and natural language processing by exploiting their spatial sparsity and temporal/token redundancy. Finally, to support all these algorithmic advancements, we introduce the efficient deep learning system design from both software and hardware perspectives.
Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence. It has great fundamental importance and strong industrial needs. Deep neural networks (DNNs) have largely boosted their performances on many concrete tasks, with the help of large amounts of training data and new powerful computation resources. Though recognition accuracy is usually the first concern for new progresses, efficiency is actually rather important and sometimes critical for both academic research and industrial applications. Moreover, insightful views on the opportunities and challenges of efficiency are also highly required for the entire community. While general surveys on the efficiency issue of DNNs have been done from various perspectives, as far as we are aware, scarcely any of them focused on visual recognition systematically, and thus it is unclear which progresses are applicable to it and what else should be concerned. In this paper, we present the review of the recent advances with our suggestions on the new possible directions towards improving the efficiency of DNN-related visual recognition approaches. We investigate not only from the model but also the data point of view (which is not the case in existing surveys), and focus on three most studied data types (images, videos and points). This paper attempts to provide a systematic summary via a comprehensive survey which can serve as a valuable reference and inspire both researchers and practitioners who work on visual recognition problems.