We investigate the age of information (AoI) of a relay-assisted cooperative communication system, where a source node sends status update packets to the destination node as timely as possible with the aid of a relay node. For time-slotted systems without relaying, prior works have shown that the source should generate and send a new packet to the destination every time slot to minimize the average AoI, regardless of whether the destination has successfully decoded the packet in the previous slot. However, when a dedicated relay is involved, whether the relay can improve the AoI performance requires an in-depth study. In particular, the packet generation and transmission strategy of the source should be carefully designed to cooperate with the relay. Depending on whether the source and the relay are allowed to transmit simultaneously, two relay-assisted schemes are investigated: time division multiple access (TDMA) and non-orthogonal multiple access (NOMA) schemes. A key challenge in deriving their theoretical average AoI is that the destination has different probabilities of successfully receiving an update packet in different time slots. We model each scheme using a Markov chain to derive the corresponding closed-form average AoI. Interestingly, our theoretical analysis indicates that the relay-assisted schemes can only outperform the non-relay scheme in average AoI when the signal-to-noise ratio of the source-destination link is below -2dB. Furthermore, comparing the merits of relay-assisted schemes, simulation results show that the TDMA scheme has a lower energy consumption, while the NOMA counterpart typically achieves a lower average AoI.
The categorization of massive e-Commerce data is a crucial, well-studied task, which is prevalent in industrial settings. In this work, we aim to improve an existing product categorization model that is already in use by a major web company, serving multiple applications. At its core, the product categorization model is a text classification model that takes a product title as an input and outputs the most suitable category out of thousands of available candidates. Upon a closer inspection, we found inconsistencies in the labeling of similar items. For example, minor modifications of the product title pertaining to colors or measurements majorly impacted the model's output. This phenomenon can negatively affect downstream recommendation or search applications, leading to a sub-optimal user experience. To address this issue, we propose a new framework for consistent text categorization. Our goal is to improve the model's consistency while maintaining its production-level performance. We use a semi-supervised approach for data augmentation and presents two different methods for utilizing unlabeled samples. One method relies directly on existing catalogs, while the other uses a generative model. We compare the pros and cons of each approach and present our experimental results.
Federated learning (FL) is a new distributed machine learning framework known for its benefits on data privacy and communication efficiency. Since full client participation in many cases is infeasible due to constrained resources, partial participation FL algorithms have been investigated that proactively select/sample a subset of clients, aiming to achieve learning performance close to the full participation case. This paper studies a passive partial client participation scenario that is much less well understood, where partial participation is a result of external events, namely client dropout, rather than a decision of the FL algorithm. We cast FL with client dropout as a special case of a larger class of FL problems where clients can submit substitute (possibly inaccurate) local model updates. Based on our convergence analysis, we develop a new algorithm FL-FDMS that discovers friends of clients (i.e., clients whose data distributions are similar) on-the-fly and uses friends' local updates as substitutes for the dropout clients, thereby reducing the substitution error and improving the convergence performance. A complexity reduction mechanism is also incorporated into FL-FDMS, making it both theoretically sound and practically useful. Experiments on MNIST and CIFAR-10 confirmed the superior performance of FL-FDMS in handling client dropout in FL.
Mobile edge computing (MEC) enables resource-limited IoT devices to complete computation-intensive or delay-sensitive task by offloading the task to adjacent edge server deployed at the base station (BS), thus becoming an important technology in 5G and beyond. Due to channel occlusion, some users may not be able to access the computation capability directly from the BS. Confronted with this issue, many other devices in the MEC system can serve as cooperative nodes to collect the tasks of these users and further forward them to the BS. In this paper, we study a MEC system in which multiple users continuously generate the tasks and offload the tasks to the BS through a cooperative node. As the tasks are continuously generated, users should simultaneously execute the task generation in the current time frame and the task offloading of the last time frame, i.e. the task is processed in a streaming model. To optimize the power consumption of the users and the cooperative node for finishing these streaming tasks, we investigate the duration of each step in finishing the tasks together with multiuser offloading ratio and bandwidth allocation within two cases: the BS has abundant computation capacity (Case I) and the BS has limited computation capacity (Case II). For both cases, the formulated optimization problems are nonconvex due to fractional structure of the objective function and complicated variable coupling. To address this issue, we propose optimal solution algorithm with low complexity. Finally, simulation is carried out to verify the effectiveness of the proposed methods and reveal the performance of the considered system.
We propose a new auto-regressive model for the statistical analysis of multivariate distributional time series. The data of interest consist of a collection of multiple series of probability measures supported over a bounded interval of the real line, and that are indexed by distinct time instants. The probability measures are modelled as random objects in the Wasserstein space. We establish the auto-regressive model in the tangent space at the Lebesgue measure by first centering all the raw measures so that their Fr\'echet means turn to be the Lebesgue measure. Using the theory of iterated random function systems, results on the existence, uniqueness and stationarity of the solution of such a model are provided. We also propose a consistent estimator for the model coefficient. In addition to the analysis of simulated data, the proposed model is illustrated with two real data sets made of observations from age distribution in different countries and bike sharing network in Paris. Finally, due to the positive and boundedness constraints that we impose on the model coefficients, the proposed estimator that is learned under these constraints, naturally has a sparse structure. The sparsity allows furthermore the application of the proposed model in learning a graph of temporal dependency from the multivariate distributional time series.
Risk-limiting audits (RLAs) are a significant tool in increasing confidence in the accuracy of elections. They consist of randomized algorithms which check that an election's vote tally, as reported by a vote tabulation system, corresponds to the correct candidates winning. If an initial vote count leads to the wrong election winner, an RLA guarantees to identify the error with high probability over its own randomness. These audits operate by sequentially sampling and examining ballots until they can either confirm the reported winner or identify the true winner. The first part of this work suggests a new generic method, called ``Batchcomp", for converting classical (ballot-level) RLAs into ones that operate on batches. As a concrete application of the suggested method, we develop the first ballot-level RLA for the Israeli Knesset elections, and convert it to one which operates on batches. We ran the suggested ``Batchcomp" procedure on the results of 22nd, 23rd and 24th Knesset elections, both with and without errors. The second part of this work suggests a new use-case for RLAs: verifying that a population census leads to the correct allocation of political power to a nation's districts or federal-states. We present an adaptation of ALPHA, an existing RLA method, to a method which applies to censuses. Our census-RLA is applicable in nations where parliament seats are allocated to geographical regions in proportion to their population according to a certain class of functions (highest averages). It relies on data from both the census and from an additional procedure which is already conducted in many countries today, called a post-enumeration survey.
Most existing studies on linear bandits focus on the one-dimensional characterization of the overall system. While being representative, this formulation may fail to model applications with high-dimensional but favorable structures, such as the low-rank tensor representation for recommender systems. To address this limitation, this work studies a general tensor bandits model, where actions and system parameters are represented by tensors as opposed to vectors, and we particularly focus on the case that the unknown system tensor is low-rank. A novel bandit algorithm, coined TOFU (Tensor Optimism in the Face of Uncertainty), is developed. TOFU first leverages flexible tensor regression techniques to estimate low-dimensional subspaces associated with the system tensor. These estimates are then utilized to convert the original problem to a new one with norm constraints on its system parameters. Lastly, a norm-constrained bandit subroutine is adopted by TOFU, which utilizes these constraints to avoid exploring the entire high-dimensional parameter space. Theoretical analyses show that TOFU improves the best-known regret upper bound by a multiplicative factor that grows exponentially in the system order. A novel performance lower bound is also established, which further corroborates the efficiency of TOFU.
Multipath QUIC is a transport protocol that allows for the use of multiple network interfaces for a single connection. It thereby offers, on the one hand, the possibility to gather a higher throughput, while, on the other hand, multiple paths can also be used to transmit data redundantly. Selective redundancy combines these two applications and thereby offers the potential to transmit time-critical data. This paper considers scenarios where data with real-time requirements are transmitted redundantly while at the same time, non-critical data should make use of the aggregated throughput. A new model called congestion window reservation is proposed, which enables an immediate transmission of time-critical data. The performance of this method and its combination with selective redundancy is evaluated using emulab with real data. The results show that this technique leads to a smaller end-to-end latency and reliability for periodically generated priority data.
We consider a joint sampling and compression system for timely status updates. Samples are taken, quantized and encoded into binary sequences, which are sent to the destination. We formulate an optimization problem to jointly design sampler, quantizer and encoder, minimizing the age of information (AoI) on the basis of satisfying a mean-squared error (MSE) distortion constraint of the samples. We prove that the zero-wait sampling, the uniform quantization, and the real-valued AoI-optimal coding policies together provide an asymptotically optimal solution to this problem, i.e., as the average distortion approaches zero, the combination achieves the minimum AoI asymptotically. Furthermore, we prove that the AoI of this solution is asymptotically linear with respect to the log MSE distortion with a slope of $-\frac{3}{4}$. We also show that the real-valued Shannon coding policy suffices to achieve the optimal performance asymptotically. Numerical simulations corroborate the analysis.
Bid optimization for online advertising from single advertiser's perspective has been thoroughly investigated in both academic research and industrial practice. However, existing work typically assume competitors do not change their bids, i.e., the wining price is fixed, leading to poor performance of the derived solution. Although a few studies use multi-agent reinforcement learning to set up a cooperative game, they still suffer the following drawbacks: (1) They fail to avoid collusion solutions where all the advertisers involved in an auction collude to bid an extremely low price on purpose. (2) Previous works cannot well handle the underlying complex bidding environment, leading to poor model convergence. This problem could be amplified when handling multiple objectives of advertisers which are practical demands but not considered by previous work. In this paper, we propose a novel multi-objective cooperative bid optimization formulation called Multi-Agent Cooperative bidding Games (MACG). MACG sets up a carefully designed multi-objective optimization framework where different objectives of advertisers are incorporated. A global objective to maximize the overall profit of all advertisements is added in order to encourage better cooperation and also to protect self-bidding advertisers. To avoid collusion, we also introduce an extra platform revenue constraint. We analyze the optimal functional form of the bidding formula theoretically and design a policy network accordingly to generate auction-level bids. Then we design an efficient multi-agent evolutionary strategy for model optimization. Offline experiments and online A/B tests conducted on the Taobao platform indicate both single advertiser's objective and global profit have been significantly improved compared to state-of-art methods.
Recently, neural networks have been widely used in e-commerce recommender systems, owing to the rapid development of deep learning. We formalize the recommender system as a sequential recommendation problem, intending to predict the next items that the user might be interacted with. Recent works usually give an overall embedding from a user's behavior sequence. However, a unified user embedding cannot reflect the user's multiple interests during a period. In this paper, we propose a novel controllable multi-interest framework for the sequential recommendation, called ComiRec. Our multi-interest module captures multiple interests from user behavior sequences, which can be exploited for retrieving candidate items from the large-scale item pool. These items are then fed into an aggregation module to obtain the overall recommendation. The aggregation module leverages a controllable factor to balance the recommendation accuracy and diversity. We conduct experiments for the sequential recommendation on two real-world datasets, Amazon and Taobao. Experimental results demonstrate that our framework achieves significant improvements over state-of-the-art models. Our framework has also been successfully deployed on the offline Alibaba distributed cloud platform.