Reserve systems are used to accommodate multiple essential or underrepresented groups in allocating indivisible scarce resources by creating categories that prioritize their respective beneficiaries. Some applications include the optimal allocation of vaccines, or assignment of minority students to elite colleges in India. An allocation is called smart if it optimizes the number of units distributed. Previous literature mostly assumed baseline priorities, which impose significant interdependencies between the priority ordering of different categories. It also assumes either everybody is eligible for receiving a unit from any category, or only the beneficiaries are eligible. The comprehensive Threshold Model we propose allows independent priority orderings among categories and arbitrary beneficiary and eligibility thresholds, enabling policymakers to avoid comparing incomparables in affirmative action systems. We present a new smart reserve system that optimizes two objectives simultaneously to allocate scarce resources. Our Smart Pipeline Matching Mechanism achieves all desirable properties in the most general domain possible. Our results apply to any resource allocation market, but we focus our attention on the vaccine allocation problem.
The recently introduced graphical continuous Lyapunov models provide a new approach to statistical modeling of correlated multivariate data. The models view each observation as a one-time cross-sectional snapshot of a multivariate dynamic process in equilibrium. The covariance matrix for the data is obtained by solving a continuous Lyapunov equation that is parametrized by the drift matrix of the dynamic process. In this context, different statistical models postulate different sparsity patterns in the drift matrix, and it becomes a crucial problem to clarify whether a given sparsity assumption allows one to uniquely recover the drift matrix parameters from the covariance matrix of the data. We study this identifiability problem by representing sparsity patterns by directed graphs. Our main result proves that the drift matrix is globally identifiable if and only if the graph for the sparsity pattern is simple (i.e., does not contain directed two-cycles). Moreover, we present a necessary condition for generic identifiability and provide a computational classification of small graphs with up to 5 nodes.
Gradual verification, which supports explicitly partial specifications and verifies them with a combination of static and dynamic checks, makes verification more incremental and provides earlier feedback to developers. While an abstract, weakest precondition-based approach to gradual verification was previously proven sound, the approach did not provide sufficient guidance for implementation and optimization of the required run-time checks. More recently, gradual verification was implemented using symbolic execution techniques, but the soundness of the approach (as with related static checkers based on implicit dynamic frames) was an open question. This paper puts practical gradual verification on a sound footing with a formalization of symbolic execution, optimized run-time check generation, and run time execution. We prove our approach is sound; our proof also covers a core subset of the Viper tool, for which we are aware of no previous soundness result. Our formalization enabled us to find a soundness bug in an implemented gradual verification tool and describe the fix necessary to make it sound.
Quantile regression and conditional density estimation can reveal structure that is missed by mean regression, such as multimodality and skewness. In this paper, we introduce a deep learning generative model for joint quantile estimation called Penalized Generative Quantile Regression (PGQR). Our approach simultaneously generates samples from many random quantile levels, allowing us to infer the conditional distribution of a response variable given a set of covariates. Our method employs a novel variability penalty to avoid the problem of vanishing variability, or memorization, in deep generative models. Further, we introduce a new family of partial monotonic neural networks (PMNN) to circumvent the problem of crossing quantile curves. A major benefit of PGQR is that it can be fit using a single optimization, thus bypassing the need to repeatedly train the model at multiple quantile levels or use computationally expensive cross-validation to tune the penalty parameter. We illustrate the efficacy of PGQR through extensive simulation studies and analysis of real datasets. Code to implement our method is available at //github.com/shijiew97/PGQR.
We consider dynamic pricing with covariates under a generalized linear demand model: a seller can dynamically adjust the price of a product over a horizon of $T$ time periods, and at each time period $t$, the demand of the product is jointly determined by the price and an observable covariate vector $x_t\in\mathbb{R}^d$ through a generalized linear model with unknown co-efficients. Most of the existing literature assumes the covariate vectors $x_t$'s are independently and identically distributed (i.i.d.); the few papers that relax this assumption either sacrifice model generality or yield sub-optimal regret bounds. In this paper, we show that UCB and Thompson sampling-based pricing algorithms can achieve an $O(d\sqrt{T}\log T)$ regret upper bound without assuming any statistical structure on the covariates $x_t$. Our upper bound on the regret matches the lower bound up to logarithmic factors. We thus show that (i) the i.i.d. assumption is not necessary for obtaining low regret, and (ii) the regret bound can be independent of the (inverse) minimum eigenvalue of the covariance matrix of the $x_t$'s, a quantity present in previous bounds. Moreover, we consider a constrained setting of the dynamic pricing problem where there is a limited and unreplenishable inventory and we develop theoretical results that relate the best achievable algorithm performance to a variation measure with respect to the temporal distribution shift of the covariates. We also discuss conditions under which a better regret is achievable and demonstrate the proposed algorithms' performance with numerical experiments.
We define a new class of predicates called equilevel predicates on a distributive lattice which eases the analysis of parallel algorithms. Many combinatorial problems such as the vertex cover problem, the bipartite matching problem, and the minimum spanning tree problem can be modeled as detecting an equilevel predicate. The problem of detecting an equilevel problem is NP-complete, but equilevel predicates with the helpful property can be detected in polynomial time in an online manner. An equilevel predicate has the helpful property with a polynomial time algorithm if the algorithm can return a nonempty set of indices such that advancing on any of them can be used to detect the predicate. Furthermore, the refined independently helpful property allows online parallel detection of such predicates in NC. When the independently helpful property holds, advancing on all the specified indices in parallel can be used to detect the predicate in polylogarithmic time. We also define a special class of equilevel predicates called solitary predicates. Unless NP = RP, this class of predicate also does not admit efficient algorithms. Earlier work has shown that solitary predicates with the efficient advancement can be detected in polynomial time. We introduce two properties called the antimonotone advancement and the efficient rejection which yield the detection of solitary predicates in NC. Finally, we identify the minimum spanning tree, the shortest path, and the conjunctive predicate detection as problems satisfying such properties, giving alternative certifications of their NC memberships as a result.
A community reveals the features and connections of its members that are different from those in other communities in a network. Detecting communities is of great significance in network analysis. Despite the classical spectral clustering and statistical inference methods, we notice a significant development of deep learning techniques for community detection in recent years with their advantages in handling high dimensional network data. Hence, a comprehensive overview of community detection's latest progress through deep learning is timely to both academics and practitioners. This survey devises and proposes a new taxonomy covering different categories of the state-of-the-art methods, including deep learning-based models upon deep neural networks, deep nonnegative matrix factorization and deep sparse filtering. The main category, i.e., deep neural networks, is further divided into convolutional networks, graph attention networks, generative adversarial networks and autoencoders. The survey also summarizes the popular benchmark data sets, model evaluation metrics, and open-source implementations to address experimentation settings. We then discuss the practical applications of community detection in various domains and point to implementation scenarios. Finally, we outline future directions by suggesting challenging topics in this fast-growing deep learning field.
Graph neural networks (GNNs) are a popular class of machine learning models whose major advantage is their ability to incorporate a sparse and discrete dependency structure between data points. Unfortunately, GNNs can only be used when such a graph-structure is available. In practice, however, real-world graphs are often noisy and incomplete or might not be available at all. With this work, we propose to jointly learn the graph structure and the parameters of graph convolutional networks (GCNs) by approximately solving a bilevel program that learns a discrete probability distribution on the edges of the graph. This allows one to apply GCNs not only in scenarios where the given graph is incomplete or corrupted but also in those where a graph is not available. We conduct a series of experiments that analyze the behavior of the proposed method and demonstrate that it outperforms related methods by a significant margin.
It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research efforts. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate adversarial perturbations efficiently for any instance, so as to potentially accelerate adversarial training as defenses. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have high attack success rate under state-of-the-art defenses compared to other attacks. Our attack has placed the first with 92.76% accuracy on a public MNIST black-box attack challenge.
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.