We consider signal source localization from range-difference measurements. First, we give some readily checked conditions on the measurement noises and the sensor deployment that guarantee asymptotic identifiability of the model, and we show the consistency and asymptotic normality of the maximum likelihood (ML) estimator. Then, we devise an estimator that has the same asymptotic properties as the ML one. Specifically, we prove that the negative log-likelihood function converges to a limit function that has a unique minimum and a positive definite Hessian at the true source position. Hence, it is promising to run local iterations, e.g., the Gauss-Newton (GN) algorithm, starting from a consistent estimate. The main issue is then obtaining a preliminary consistent estimate. To this end, we construct a linear least-squares problem via algebraic manipulation and constraint relaxation and obtain its closed-form solution. We then focus on deriving and eliminating the bias of the linear least-squares estimator, which yields an asymptotically unbiased (and thus consistent) estimate. Noting that the bias is a function of the noise variance, we further devise a consistent noise variance estimator that involves rooting a third-order polynomial. Starting from the preliminary consistent location estimate, a single GN iteration suffices to achieve the same asymptotic properties as the ML estimator. Simulation results demonstrate the superiority of the proposed algorithm in the large-sample regime.
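To make the final refinement stage concrete, here is a minimal sketch of a single Gauss-Newton iteration for range-difference localization. The sensor geometry, reference sensor, and initial estimate below are hypothetical placeholders, and this is an illustrative implementation rather than the paper's exact algorithm.

```python
import numpy as np

def gauss_newton_step(x0, sensors, ref, d):
    """One Gauss-Newton step for range-difference (TDOA) localization.

    x0      : preliminary consistent estimate of the source position, shape (dim,)
    sensors : non-reference sensor positions, shape (m, dim)
    ref     : reference sensor position, shape (dim,)
    d       : measured range differences ||x - s_i|| - ||x - ref||, shape (m,)
    """
    diff_i = x0 - sensors                      # (m, dim)
    diff_0 = x0 - ref                          # (dim,)
    r_i = np.linalg.norm(diff_i, axis=1)
    r_0 = np.linalg.norm(diff_0)
    residual = (r_i - r_0) - d                 # model minus measurement
    # Jacobian of the residual with respect to the source position
    J = diff_i / r_i[:, None] - diff_0 / r_0
    # Gauss-Newton update: x1 = x0 - (J^T J)^{-1} J^T residual
    step, *_ = np.linalg.lstsq(J, residual, rcond=None)
    return x0 - step

# Toy usage with a hypothetical geometry
rng = np.random.default_rng(0)
src = np.array([2.0, 3.0])
ref = np.zeros(2)
sensors = rng.uniform(-10, 10, size=(8, 2))
d = np.linalg.norm(src - sensors, axis=1) - np.linalg.norm(src - ref)
d += 0.01 * rng.standard_normal(8)             # small measurement noise
x0 = src + 0.3 * rng.standard_normal(2)        # stand-in for the consistent initial estimate
x1 = gauss_newton_step(x0, sensors, ref, d)
```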
In this article, we propose an approach for federated domain adaptation, a setting in which distributional shift exists among clients and some clients have only unlabeled data. The proposed framework, FedDaDiL, tackles the resulting challenge through dictionary learning of empirical distributions. In our setting, clients' distributions represent particular domains, and FedDaDiL collectively trains a federated dictionary of empirical distributions. In particular, we build upon the Dataset Dictionary Learning framework by designing collaborative communication protocols and aggregation operations. The chosen protocols keep clients' data private, thus enhancing overall privacy compared to the centralized counterpart. Through extensive experiments on the (i) Caltech-Office, (ii) TEP, and (iii) CWRU benchmarks, we empirically demonstrate that our approach successfully generates labeled data on the target domain. Furthermore, we compare our method to its centralized counterpart and to other benchmarks in federated domain adaptation.
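As a rough illustration of what server-side aggregation of a federated dictionary could look like, the sketch below averages clients' local copies of shared atoms in a FedAvg-style manner. The atom representation, weighting scheme, and all names are assumptions for the sketch, not FedDaDiL's actual protocol.

```python
import numpy as np

def aggregate_atoms(client_atoms, client_sizes):
    """FedAvg-style aggregation of shared dictionary atoms.

    client_atoms : list over clients; each entry is a list of K atoms,
                   each atom being an (n, d) array of support points.
    client_sizes : number of local samples per client, used as weights.
    """
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    n_atoms = len(client_atoms[0])
    aggregated = []
    for k in range(n_atoms):
        # weighted average of each client's local copy of atom k
        atom_k = sum(w * atoms[k] for w, atoms in zip(weights, client_atoms))
        aggregated.append(atom_k)
    return aggregated
```

In such a scheme, only atom parameters are exchanged, while clients' raw samples never leave the client, which is the sense in which the federated protocol enhances privacy over a centralized formulation.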
We study the design of prior-independent auctions in a setting with heterogeneous bidders. In particular, we consider the setting of selling to $n$ bidders whose values are drawn from $n$ independent but not necessarily identical distributions. We work in the robust auction design regime, where we assume the seller has no knowledge of the bidders' value distributions and must design a mechanism that is prior-independent. While there have been many strong results on prior-independent auction design in the i.i.d. setting, not much is known for the heterogeneous setting, even though the latter is of significant practical importance. Unfortunately, no prior-independent mechanism can hope to always guarantee any approximation to Myerson's revenue in the heterogeneous setting; similarly, no prior-independent mechanism can consistently do better than the second-price auction. In light of this, we design a family of (parametrized) randomized auctions that approximates at least one of these benchmarks: for heterogeneous bidders with regular value distributions, our mechanisms either achieve a good approximation of the expected revenue of an optimal mechanism (which knows the bidders' distributions) or exceed that of the second-price auction by a certain multiplicative factor. The factor in the latter case naturally trades off against the approximation ratio of the former case. We show that our mechanisms are optimal for this trade-off between the two cases by establishing a matching lower bound. Our result extends to selling $k$ identical items to heterogeneous bidders, with an additional $O\big(\ln^2 k\big)$ factor in the trade-off between the two cases.
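For flavor only, here is a toy parametrized randomized auction in this spirit; it is a generic single-sample-reserve construction, not the mechanism designed in the paper. With probability $p$ it runs a plain second-price auction; otherwise it removes a uniformly random bidder and uses that bidder's bid as a reserve in a second-price auction among the remaining bidders.

```python
import random

def randomized_auction(bids, p=0.5, rng=random):
    """Toy parametrized prior-independent mechanism (illustrative only).

    Returns (winner index or None, payment).
    """
    n = len(bids)
    if rng.random() < p or n < 2:
        # plain second-price auction
        order = sorted(range(n), key=lambda i: bids[i], reverse=True)
        payment = bids[order[1]] if n > 1 else 0.0
        return order[0], payment
    # single-sample reserve: a random bidder's bid becomes the reserve price
    j = rng.randrange(n)
    reserve = bids[j]
    rest = [i for i in range(n) if i != j]
    order = sorted(rest, key=lambda i: bids[i], reverse=True)
    winner = order[0]
    if bids[winner] < reserve:
        return None, 0.0                       # no sale
    runner_up = bids[order[1]] if len(order) > 1 else 0.0
    return winner, max(reserve, runner_up)

# Example: randomized_auction([3.0, 5.0, 2.0], p=0.3)
```

The parameter $p$ plays the role of the knob that trades off closeness to the second-price benchmark against revenue gains from the reserve.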
In this study, we examine the evolving landscape of machine learning research. First, using Latent Dirichlet Allocation, we identify pivotal themes and fundamental concepts that have emerged within machine learning. We then conduct a comprehensive analysis to track the evolutionary trajectories of these identified themes. To quantify the novelty and divergence of research contributions, we employ the Kullback-Leibler Divergence metric. This statistical measure serves as a proxy for ``surprise'', indicating the extent of differentiation between the content of academic papers and subsequent developments in research. Combining these insights, we are able to ascertain the pivotal roles played by prominent researchers and the significance of specific academic venues (periodicals and conferences) within the machine learning domain.
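A minimal sketch of this pipeline using off-the-shelf scikit-learn and SciPy tools is shown below; the toy corpus, the number of topics, and the choice of which pair of documents to compare are placeholders, not the study's actual data or settings.

```python
import numpy as np
from scipy.stats import entropy
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical corpus of paper abstracts
abstracts = [
    "deep learning for image classification with convolutional networks",
    "bayesian inference and probabilistic graphical models",
    "reinforcement learning agents for sequential decision making",
    "topic models for text mining of scientific literature",
]

# Fit LDA to discover latent themes
counts = CountVectorizer(stop_words="english").fit_transform(abstracts)
lda = LatentDirichletAllocation(n_components=3, random_state=0).fit(counts)
doc_topics = lda.transform(counts)            # per-document topic distributions

# "Surprise" of a later paper relative to an earlier one:
# KL divergence between their topic distributions (smoothed to avoid zeros)
eps = 1e-12
surprise = entropy(doc_topics[1] + eps, doc_topics[0] + eps)
print(f"KL-based surprise: {surprise:.3f}")
```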
Many trials are designed to collect outcomes at or around pre-specified times after randomization. In practice, there can be substantial variability in the times when participants are actually assessed. Such irregular assessment times pose a challenge to learning the effect of treatment since not all participants have outcome assessments at the times of interest. Furthermore, observed outcome values may not be representative of all participants' outcomes at a given time. This problem, known as informative assessment times, can arise if participants tend to have assessments when their outcomes are better (or worse) than at other times, or if participants with better outcomes tend to have more (or fewer) assessments. Methods have been developed that account for some types of informative assessment; however, since these methods rely on untestable assumptions, sensitivity analyses are needed. We develop a sensitivity analysis methodology by extending existing weighting methods. Our method accounts for the possibility that participants with worse outcomes at a given time are more (or less) likely than other participants to have an assessment at that time, even after controlling for variables observed earlier in the study. We apply our method to a randomized trial of low-income individuals with uncontrolled asthma. We illustrate the implementation of our influence-function-based estimation procedure in detail, and we derive the large-sample distribution of our estimator and evaluate its finite-sample performance.
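A highly simplified sketch of the weighting idea appears below, assuming an exponential-tilting sensitivity model and a previously fitted assessment-probability model; the functional form, the parameter gamma, and all names are our illustrative assumptions, not the paper's estimator.

```python
import numpy as np

def sensitivity_weights(p_assess, y, gamma=0.0):
    """Inverse-intensity weights under an exponential-tilting sensitivity model.

    p_assess : modeled probability of having an assessment at the target time,
               given observed history (e.g., from a fitted logistic model).
    y        : observed outcomes among assessed participants.
    gamma    : sensitivity parameter; the posited assessment probability is
               proportional to p_assess * exp(gamma * y), so gamma = 0 recovers
               the standard "assessment at random given history" analysis.
    """
    w = np.exp(-gamma * y) / p_assess      # inverse of the posited probability
    return w / w.mean()                    # stabilized weights

def weighted_mean_outcome(y, p_assess, gamma):
    """Sensitivity-adjusted mean outcome, recomputed over a grid of gamma values."""
    return np.average(y, weights=sensitivity_weights(p_assess, y, gamma))
```

Repeating the analysis over a grid of gamma values shows how conclusions change as the assumed dependence of assessment on the current outcome strengthens.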
In this work, we consider a real-time IoT monitoring system in which an energy harvesting sensor with a finite-size battery measures a physical process and transmits status updates to an aggregator. The aggregator, equipped with caching capabilities, can serve the external requests of a destination network with either a stored update or a fresh update from the sensor. We assume the destination network acts as a gossiping network in which the update packets are forwarded among the nodes in a randomized fashion. We utilize the Markov Decision Process framework to model and optimize the network's average Version Age of Information (AoI) and obtain the optimal policy at the aggregator. The structure of the optimal policy is analytically demonstrated and numerically verified. Numerical results highlight the effect of the system parameters on the average Version AoI. The simulations reveal the superior performance of the optimal policy compared to a set of baseline policies.
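As a generic illustration of the optimization machinery (not the paper's specific state space or transition model), relative value iteration for an average-cost MDP of this kind can be sketched as follows; the cost array would encode the version AoI, and the transition tensor would encode battery, caching, and gossiping dynamics.

```python
import numpy as np

def relative_value_iteration(P, c, tol=1e-9, max_iter=10_000):
    """Relative value iteration for an average-cost MDP.

    P : transition probabilities, shape (n_actions, n_states, n_states)
    c : immediate costs (e.g., version AoI), shape (n_actions, n_states)
    Returns (average cost g, bias function h, greedy policy).
    """
    n_actions, n_states, _ = P.shape
    h = np.zeros(n_states)
    for _ in range(max_iter):
        q = c + P @ h                      # action values, shape (n_actions, n_states)
        h_new = q.min(axis=0)
        h_new -= h_new[0]                  # normalize relative to a reference state
        if np.max(np.abs(h_new - h)) < tol:
            h = h_new
            break
        h = h_new
    q = c + P @ h
    g = q.min(axis=0)[0]                   # approximate optimal average cost
    policy = q.argmin(axis=0)              # greedy (optimal) action per state
    return g, h, policy
```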
In this paper, we consider simultaneous estimation of Poisson parameters in situations where we can use side information in aggregated data. We use standardized squared error and entropy loss functions. Bayesian shrinkage estimators are derived based on conjugate priors. We compare the risk functions of direct estimators and Bayesian estimators with respect to different priors that are constructed based on different subsets of observations. We obtain conditions for domination and also prove minimaxity and admissibility in a simple setting.
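As a worked illustration of the conjugate setup (the independent Gamma priors and the hyperparameters $\alpha, \beta$ are our notation for a standard instance, not necessarily the paper's exact construction):

$$ X_i \mid \lambda_i \sim \mathrm{Poisson}(\lambda_i), \qquad \lambda_i \sim \mathrm{Gamma}(\alpha, \beta) \;\Longrightarrow\; \lambda_i \mid X_i = x_i \sim \mathrm{Gamma}(\alpha + x_i,\ \beta + 1), $$

$$ \hat{\lambda}_i^{\mathrm{Bayes}} \;=\; \mathbb{E}[\lambda_i \mid x_i] \;=\; \frac{\alpha + x_i}{\beta + 1} \;=\; \frac{1}{\beta + 1}\, x_i \;+\; \frac{\beta}{\beta + 1}\cdot\frac{\alpha}{\beta}, $$

i.e., the posterior mean shrinks the direct (ML) estimate $x_i$ toward the prior mean $\alpha/\beta$. Under the standardized squared error loss $(d - \lambda)^2/\lambda$, the Bayes rule instead becomes $1/\mathbb{E}[\lambda_i^{-1} \mid x_i] = (\alpha + x_i - 1)/(\beta + 1)$, provided $\alpha + x_i > 1$.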
Self-driving software pipelines include components that are learned from large numbers of training examples, yet it remains challenging to evaluate the overall system's safety and generalization performance. As the real-world deployment of autonomous vehicles scales up, it is critically important to automatically find simulation scenarios in which the driving policies fail. We propose a method that efficiently generates adversarial simulation scenarios for autonomous driving by solving an optimal control problem that aims to maximally perturb the policy from its nominal trajectory. Given an image-based driving policy, we show that we can inject new objects into a neural rendering representation of the deployment scene and optimize their texture in order to generate adversarial sensor inputs to the policy. We demonstrate that adversarial scenarios discovered purely in the neural renderer (the surrogate scene) can often be transferred successfully to the deployment scene without further optimization. We demonstrate that this transfer occurs in both simulated and real environments, provided the learned surrogate scene is sufficiently close to the deployment scene.
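Conceptually, the texture optimization can be sketched as projected gradient ascent on the policy's deviation from its nominal controls. In the sketch below, `render` and `policy` are placeholder differentiable callables, and the loss and update rule are illustrative assumptions rather than the paper's exact objective.

```python
import torch

def optimize_adversarial_texture(render, policy, texture, nominal_controls,
                                 steps=200, lr=1e-2):
    """Gradient-ascent sketch of adversarial texture optimization.

    render           : differentiable surrogate renderer, maps texture -> images
                       along the nominal trajectory (placeholder).
    policy           : image-based driving policy, maps images -> controls (placeholder).
    texture          : texture parameters of the injected object; a leaf tensor
                       with requires_grad=True.
    nominal_controls : policy outputs along the unperturbed rollout.
    """
    opt = torch.optim.Adam([texture], lr=lr)
    for _ in range(steps):
        images = render(texture)                        # surrogate sensor inputs
        controls = policy(images)
        # maximize deviation of the policy from its nominal behaviour
        deviation = (controls - nominal_controls).pow(2).mean()
        loss = -deviation
        opt.zero_grad()
        loss.backward()
        opt.step()
        texture.data.clamp_(0.0, 1.0)                   # keep texture in a valid range
    return texture.detach()
```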
It has been shown that deep neural networks are prone to overfitting on biased training data. To address this issue, meta-learning employs a meta model for correcting the training bias. Despite promising performance, extremely slow training is currently the bottleneck of meta-learning approaches. In this paper, we introduce a novel Faster Meta Update Strategy (FaMUS) that replaces the most expensive step in the meta gradient computation with a faster layer-wise approximation. We empirically find that FaMUS yields not only a reasonably accurate but also a low-variance approximation of the meta gradient. We conduct extensive experiments to verify the proposed method on two tasks. We show that our method is able to save two-thirds of the training time while maintaining comparable, or achieving even better, generalization performance. In particular, our method achieves state-of-the-art performance on both synthetic and realistic noisy labels, and obtains promising performance on long-tailed recognition on standard benchmarks.
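For context, here is a minimal sketch of the standard sample-reweighting meta-gradient step that such methods build on, using a toy linear model; it shows the expensive second-order step that FaMUS approximates layer-wise, but the layer-wise approximation itself is not reproduced here.

```python
import torch
import torch.nn.functional as F

# Toy linear model standing in for the full network
w = torch.randn(2, 10, requires_grad=True)
b = torch.zeros(2, requires_grad=True)
x_train, y_train = torch.randn(32, 10), torch.randint(0, 2, (32,))
x_meta, y_meta = torch.randn(16, 10), torch.randint(0, 2, (16,))
lr = 0.1

# Per-sample weights to be meta-learned
eps = torch.zeros(32, requires_grad=True)
loss_train = F.cross_entropy(x_train @ w.t() + b, y_train, reduction="none")
weighted_loss = (eps * loss_train).sum()

# Virtual SGD step on the model, keeping the graph for second-order gradients
gw, gb = torch.autograd.grad(weighted_loss, (w, b), create_graph=True)
w_virtual, b_virtual = w - lr * gw, b - lr * gb

# Meta loss on clean validation data at the virtual parameters;
# back-propagating it to eps is the expensive step a meta update relies on
meta_loss = F.cross_entropy(x_meta @ w_virtual.t() + b_virtual, y_meta)
grad_eps = torch.autograd.grad(meta_loss, eps)[0]
sample_weights = torch.clamp(-grad_eps, min=0)
sample_weights = sample_weights / (sample_weights.sum() + 1e-8)
```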
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy-, computation-, and memory-intensive. This impedes the deployment of large DNNs on low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirements, energy consumption, and number of operations without significantly decreasing accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. For the techniques in each category, we analyze their accuracy, advantages, and disadvantages, as well as potential solutions to their open problems. We also discuss new evaluation metrics as a guideline for future research.
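As one concrete example from category (1), post-training dynamic quantization in PyTorch stores the weights of selected layers in int8 and dequantizes them on the fly at inference time; the toy model below is a placeholder, not one of the surveyed architectures.

```python
import torch
import torch.nn as nn

# A small fully-connected model standing in for a larger DNN
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Post-training dynamic quantization of the Linear layers' weights to int8
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x).shape)   # same interface, smaller memory footprint
```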
Crowdsourcing is a fast, low-cost alternative for obtaining new labeled data; in exchange, crowd annotations from non-experts may be of lower quality than those from experts. In this paper, we propose an approach to crowd annotation learning for Chinese Named Entity Recognition (NER) that makes full use of noisy sequence labels from multiple annotators. Inspired by adversarial learning, our approach uses a common Bi-LSTM and a private Bi-LSTM to represent annotator-generic and annotator-specific information, respectively. The annotator-generic information is the common knowledge about entities that is easily mastered by the crowd. Finally, we build our Chinese NE tagger on the LSTM-CRF model. In our experiments, we create two datasets for Chinese NER tasks from two domains. The experimental results show that our system achieves better scores than strong baseline systems.
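A skeletal PyTorch version of the encoder described above is sketched below, with a common Bi-LSTM shared across annotators and one private Bi-LSTM per annotator; the hyperparameters are placeholders, and the CRF decoding layer of the LSTM-CRF tagger is omitted for brevity.

```python
import torch
import torch.nn as nn

class CrowdNERTagger(nn.Module):
    """Sketch of a common + annotator-private Bi-LSTM encoder for crowd NER.

    The common Bi-LSTM captures annotator-generic knowledge, while each
    annotator has a private Bi-LSTM for annotator-specific labeling behaviour.
    """

    def __init__(self, vocab_size, n_annotators, n_labels,
                 emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.common = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.private = nn.ModuleList(
            nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
            for _ in range(n_annotators)
        )
        self.emission = nn.Linear(4 * hidden, n_labels)

    def forward(self, tokens, annotator_id):
        emb = self.embed(tokens)                         # (batch, seq, emb_dim)
        h_common, _ = self.common(emb)                   # (batch, seq, 2*hidden)
        h_private, _ = self.private[annotator_id](emb)   # (batch, seq, 2*hidden)
        feats = torch.cat([h_common, h_private], dim=-1)
        return self.emission(feats)                      # per-token label scores
```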