The quantum no-cloning theorem gives rise to the intriguing possibility of quantum copy protection, where we encode a program in a quantum state such that a user in possession of k such states cannot create k + 1 working copies. Introduced by Aaronson (CCC 09) over a decade ago, copy protection has proven to be notoriously hard to achieve. In this work, we construct public-key encryption and functional encryption schemes whose secret keys are copy-protected against unbounded collusions in the plain model (i.e., without any idealized oracles), assuming (post-quantum) subexponentially secure iO, one-way functions, and LWE. This resolves a long-standing open question, raised by multiple previous works, of constructing fully collusion-resistant copy-protected functionalities. Prior to our work, copy-protected functionalities were known only in restricted collusion models: either an a priori bound on the collusion size was needed, in the plain model with the same assumptions as ours (Liu, Liu, Qian, Zhandry [TCC 22]), or the adversary was only prevented from doubling their number of working programs, in a structured quantum oracle model (Aaronson [CCC 09]). We obtain our results through a novel technique that uses identity-based encryption to construct unbounded-collusion-resistant copy-protection schemes from 1-to-2 secure schemes. This is analogous to the technique of using digital signatures to construct full-fledged quantum money from single-banknote schemes (Lutomirski et al. [ICS 09], Farhi et al. [ITCS 12], Aaronson and Christiano [STOC 12]). We believe our technique is of independent interest. Along the way, we also construct a puncturable functional encryption scheme whose master secret key can be punctured at all functions f such that f(m0) ≠ f(m1). This might also be of independent interest.
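One way to state the puncturing property mentioned at the end (a schematic sketch in our own notation; the paper's formalization and security definitions may differ):

```latex
% Schematic: the punctured master secret key can only issue functional
% keys for functions that do not distinguish the two messages m_0, m_1.
\[
  \mathsf{msk}^{*} \leftarrow \mathsf{Puncture}(\mathsf{msk}, m_0, m_1),
  \qquad
  \mathsf{KeyGen}(\mathsf{msk}^{*}, f)
  \text{ is supported only if } f(m_0) = f(m_1),
\]
\[
  \text{i.e., } \mathsf{msk} \text{ is punctured at the set }
  \{\, f : f(m_0) \neq f(m_1) \,\}.
\]
```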
Classical methods for quantile regression fail when the quantile of interest is extreme and only a few or no training data points exceed it. Asymptotic results from extreme value theory can be used to extrapolate beyond the range of the data, and several approaches exist that use linear regression, kernel methods, or generalized additive models. Most of these methods break down if the predictor space has more than a few dimensions or if the regression function of extreme quantiles is complex. We propose a method for extreme quantile regression that combines the flexibility of random forests with the theory of extrapolation. Our extremal random forest (ERF) estimates the parameters of a generalized Pareto distribution, conditional on the predictor vector, by maximizing a local likelihood with weights extracted from a quantile random forest. We penalize the shape parameter in this likelihood to regularize its variability in the predictor space. Under general domain-of-attraction conditions, we show consistency of the estimated parameters in both the unpenalized and penalized cases. Simulation studies show that our ERF outperforms both classical quantile regression methods and existing regression approaches from extreme value theory. We apply our methodology to extreme quantile prediction for U.S. wage data.
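As a rough illustration of the fitting step, the following is a minimal sketch of a penalized, weighted GPD likelihood; the threshold choice, weight construction, and exact penalty form are assumptions here, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize

def penalized_gpd_fit(exceedances, weights, lam=1.0, xi0=0.0):
    """Fit generalized Pareto parameters (sigma, xi) at a test point by
    maximizing a weighted (local) log-likelihood with a ridge-type penalty
    pulling the shape xi towards a reference value xi0. `exceedances` are
    responses above an intermediate threshold; `weights` are similarity
    weights, e.g. extracted from a quantile random forest."""
    exceedances = np.asarray(exceedances, dtype=float)
    weights = np.asarray(weights, dtype=float)

    def neg_pen_loglik(theta):
        log_sigma, xi = theta
        sigma = np.exp(log_sigma)                 # keep the scale positive
        if abs(xi) < 1e-6:                        # exponential limit xi -> 0
            loglik = -np.log(sigma) - exceedances / sigma
        else:
            z = 1.0 + xi * exceedances / sigma
            if np.any(z <= 0):
                return np.inf                     # outside the GPD support
            loglik = -np.log(sigma) - (1.0 + 1.0 / xi) * np.log(z)
        return -np.sum(weights * loglik) + lam * (xi - xi0) ** 2

    res = minimize(neg_pen_loglik,
                   x0=np.array([np.log(np.mean(exceedances)), 0.1]),
                   method="Nelder-Mead")
    log_sigma, xi = res.x
    return np.exp(log_sigma), xi
```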
We propose a new method for estimating subject-specific mean functions from longitudinal data. We aim to do this in a flexible manner (without restrictive assumptions about the shape of the subject-specific mean functions), while exploiting similarities in the mean functions between different subjects. Functional principal components analysis fulfils both requirements, and methods for functional principal components analysis have been developed for longitudinal data. However, we find that these existing methods sometimes give fitted mean functions which are more complex than needed to provide a good fit to the data. We develop a new penalised likelihood approach to flexibly model longitudinal data, with a penalty term to control the balance between fit to the data and smoothness of the subject-specific mean curves. We run simulation studies to demonstrate that the new method substantially improves the quality of inference relative to existing methods across a range of examples, and apply the method to data on changes in body composition in adolescent girls.
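One way to write the kind of objective described above (a schematic sketch in our notation; the paper's exact likelihood and penalty may differ):

```latex
% Schematic penalized log-likelihood: data fit for subject i's
% observations at times t_{ij}, plus a roughness penalty on each
% subject-specific mean curve mu_i; lambda controls the balance
% between fit and smoothness.
\[
  \ell_{\mathrm{pen}}
  \;=\;
  \sum_{i=1}^{n} \sum_{j=1}^{n_i}
    \log p\!\left( y_{ij} \mid \mu_i(t_{ij}), \sigma^2 \right)
  \;-\;
  \lambda \sum_{i=1}^{n} \int \bigl( \mu_i''(t) \bigr)^{2} \, dt .
\]
```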
In the context of interactive proofs, a "folding scheme" (popularized by Nova) is a way to combine multiple instances of a constraint system into a single instance, so that the validity of the multiple instances can be statistically reduced to the validity of a single one. We show how Nova folding can be generalized to "custom" gates and extra rounds of verifier randomness. As an application of this extension, we present Origami, the first example, to our knowledge, of a folding scheme for lookups.
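As background, here is our paraphrase of the standard Nova folding rule for relaxed R1CS, which the paper generalizes; commitments and the non-interactive compilation are omitted.

```latex
% Relaxed R1CS: with z = (W, x, u), an instance-witness pair satisfies
%   (Az) \circ (Bz) = u \cdot (Cz) + E.
% To fold two instances, the prover sends the cross term
%   T = Az_1 \circ Bz_2 + Az_2 \circ Bz_1 - u_1 \cdot Cz_2 - u_2 \cdot Cz_1,
% the verifier replies with random r, and both sides set
\[
  E \leftarrow E_1 + r\,T + r^{2} E_2, \qquad
  u \leftarrow u_1 + r\,u_2, \qquad
  \mathsf{x} \leftarrow \mathsf{x}_1 + r\,\mathsf{x}_2, \qquad
  W \leftarrow W_1 + r\,W_2 .
\]
```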
The paper introduces a new estimation method for the standard linear regression model. The procedure is not driven by the optimisation of any objective function; rather, it is a simple weighted average of slopes from observation pairs. The paper shows that such an estimator is consistent for carefully selected weights. Other properties, such as asymptotic distributions, have also been derived to facilitate valid statistical inference. Unlike traditional methods such as Least Squares and Maximum Likelihood, among others, the estimated residual of this estimator is not by construction orthogonal to the explanatory variables of the model. This property allows a wide range of practical applications, such as testing for endogeneity, i.e., correlation between the explanatory variables and the disturbance terms, and potentially several others.
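For intuition, here is a minimal sketch of the single-predictor version of such an estimator; the particular weights below are an illustrative choice, not the paper's.

```python
import numpy as np

def pairwise_slope_estimator(x, y):
    """Slope estimate as a weighted average of pairwise slopes
    (y_j - y_i) / (x_j - x_i). The weights here are proportional to
    (x_j - x_i)^2, an illustrative choice under which the estimator
    coincides with OLS; other weight choices give different estimators."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    i, j = np.triu_indices(len(x), k=1)       # all pairs with i < j
    dx, dy = x[j] - x[i], y[j] - y[i]
    keep = dx != 0                            # drop pairs with equal x
    slopes = dy[keep] / dx[keep]
    weights = dx[keep] ** 2
    return np.sum(weights * slopes) / np.sum(weights)
```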
Machinery for data analysis often requires a numeric representation of the input. Towards that end, a common practice is to embed components of structured data into a high-dimensional vector space. We study the embedding of the tuples of a relational database, where existing techniques are often based on optimization tasks over a collection of random walks from the database. The focus of this paper is the recent FoRWaRD algorithm, which is designed for dynamic databases, where walks are sampled by following foreign keys between tuples. Importantly, different walks have different schemas, or "walk schemes", that are derived by listing the relations and attributes along the walk. Also importantly, different walk schemes describe relationships of different natures in the database. We show that by focusing on a few informative walk schemes, we can obtain tuple embeddings significantly faster, while retaining their quality. We define the problem of scheme selection for tuple embedding, devise several approaches and strategies for scheme selection, and conduct a thorough empirical study of the performance over a collection of downstream tasks. Our results confirm that with effective strategies for scheme selection, we can obtain high-quality embeddings considerably (e.g., three times) faster, preserve the extensibility to newly inserted tuples, and even achieve an increase in precision on some tasks.
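A schematic sketch of the kind of selection step studied here; the attribute name `scheme` and the scoring function are hypothetical placeholders, not the specific strategies evaluated in the paper.

```python
from collections import defaultdict

def select_walk_schemes(walks, k, score):
    """Group sampled walks by their walk scheme (the sequence of relations
    and attributes along the walk), score each scheme with a user-supplied
    informativeness measure, and keep only walks from the top-k schemes
    for the downstream embedding step."""
    by_scheme = defaultdict(list)
    for walk in walks:
        by_scheme[walk.scheme].append(walk)   # `walk.scheme` is assumed
    ranked = sorted(by_scheme, key=lambda s: score(by_scheme[s]), reverse=True)
    selected = set(ranked[:k])
    return [w for w in walks if w.scheme in selected]
```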
Aggregated HPC resources have rigid allocation systems and programming models that struggle to adapt to diverse and changing workloads. Consequently, HPC systems fail to efficiently use their large pools of unused memory and to increase the utilization of idle computing resources. Prior work attempted to increase the throughput and efficiency of supercomputing systems through workload co-location and resource disaggregation. However, these methods fall short of providing a solution that can be applied to existing systems without major hardware modifications and performance losses. In this paper, we improve the utilization of supercomputers by employing the new cloud paradigm of serverless computing. We show how serverless functions provide fine-grained access to the resources of batch-managed cluster nodes. We present an HPC-oriented Function-as-a-Service (FaaS) platform that satisfies the requirements of high-performance applications. We demonstrate a "software resource disaggregation" approach in which placing functions on unallocated and underutilized nodes allows idle cores and accelerators to be utilized while retaining near-native performance.
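A deliberately simplified sketch of the placement idea; all names here (e.g. `idle_cores`, `invoke_on`) are hypothetical placeholders, and the actual system integrates with the cluster's batch manager and FaaS runtime.

```python
def place_function(nodes, func, payload, min_idle_cores=1):
    """Dispatch a serverless function invocation to the batch-managed node
    with the most idle cores, returning None if no node currently has
    spare capacity (the caller could then queue the invocation)."""
    candidates = [n for n in nodes if n.idle_cores >= min_idle_cores]
    if not candidates:
        return None                               # nothing idle right now
    target = max(candidates, key=lambda n: n.idle_cores)
    return target.invoke_on(func, payload)        # hypothetical FaaS call
```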
Machine learning models often deteriorate in performance when used to predict outcomes on data they were not trained on. Such scenarios frequently arise in the real world when the distribution of the data changes gradually or abruptly due to major events like a pandemic. There have been many attempts in machine learning research to devise techniques that are resilient to such concept drift. However, there is no principled framework for identifying the drivers behind a drift in model performance. In this paper, we propose a novel framework, DBShap, that uses Shapley values to identify the main contributors to the drift and to quantify their respective contributions. The proposed framework not only quantifies the importance of individual features in driving the drift but also includes the change in the underlying relation between the input and the output as a possible driver. The explanations provided by DBShap can be used to understand the root cause of the drift and to make the model resilient to it.
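To make the attribution idea concrete, here is a generic exact-Shapley computation over a small set of candidate drift drivers. The value function `v` is left to the user (for example, the performance change observed when only the drivers in a coalition are switched to the new distribution); this is an illustrative sketch, not DBShap itself.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values: phi[p] averages the marginal contribution of
    player p, v(S + {p}) - v(S), over all coalitions S not containing p,
    with the standard Shapley weights."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for size in range(n):
            for S in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                total += weight * (v(set(S) | {p}) - v(set(S)))
        phi[p] = total
    return phi

# Example: the players could be each feature's distribution shift plus the
# change in P(y | x); v(S) would be the performance drop when only the
# drivers in S shift.
```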
Humans perceive the world by concurrently processing and fusing high-dimensional inputs from multiple modalities such as vision and audio. Machine perception models, in stark contrast, are typically modality-specific and optimised for unimodal benchmarks; hence, late-stage fusion of final representations or predictions from each modality ('late-fusion') is still a dominant paradigm for multimodal video classification. Instead, we introduce a novel transformer-based architecture that uses 'fusion bottlenecks' for modality fusion at multiple layers. Compared to traditional pairwise self-attention, our model forces information between different modalities to pass through a small number of bottleneck latents, requiring the model to collate and condense the most relevant information in each modality and only share what is necessary. We find that such a strategy improves fusion performance while at the same time reducing computational cost. We conduct thorough ablation studies and achieve state-of-the-art results on multiple audio-visual classification benchmarks, including Audioset, Epic-Kitchens and VGGSound. All code and models will be released.
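A minimal sketch of one bottleneck-fusion layer, simplified to standard transformer encoder layers; the real model uses modality-specific patch tokenisers, more bottleneck tokens, and fusion only from a chosen layer onwards.

```python
import torch
import torch.nn as nn

class BottleneckFusionLayer(nn.Module):
    """One fusion layer: each modality attends only to its own tokens plus
    a small set of shared bottleneck tokens, so cross-modal information
    must flow through the bottlenecks."""
    def __init__(self, dim=256, heads=4, n_bottleneck=4):
        super().__init__()
        self.audio_layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.video_layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.bottleneck = nn.Parameter(torch.randn(1, n_bottleneck, dim))

    def forward(self, audio_tokens, video_tokens):
        b = audio_tokens.size(0)
        z = self.bottleneck.expand(b, -1, -1)
        # Each modality processes [own tokens ; bottleneck tokens].
        a = self.audio_layer(torch.cat([audio_tokens, z], dim=1))
        v = self.video_layer(torch.cat([video_tokens, z], dim=1))
        n = z.size(1)
        audio_tokens, z_a = a[:, :-n], a[:, -n:]
        video_tokens, z_v = v[:, :-n], v[:, -n:]
        z = 0.5 * (z_a + z_v)   # merge the two bottleneck updates
        return audio_tokens, video_tokens, z
```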
It is important to detect anomalous inputs when deploying machine learning systems. The use of larger and more complex inputs in deep learning magnifies the difficulty of distinguishing between anomalous and in-distribution examples. At the same time, diverse image and text data are available in enormous quantities. We propose leveraging these data to improve deep anomaly detection by training anomaly detectors against an auxiliary dataset of outliers, an approach we call Outlier Exposure (OE). This enables anomaly detectors to generalize and detect unseen anomalies. In extensive experiments on natural language processing and small- and large-scale vision tasks, we find that Outlier Exposure significantly improves detection performance. We also observe that cutting-edge generative models trained on CIFAR-10 may assign higher likelihoods to SVHN images than to CIFAR-10 images; we use OE to mitigate this issue. We also analyze the flexibility and robustness of Outlier Exposure, and identify characteristics of the auxiliary dataset that improve performance.
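For a multiclass classifier, the OE objective is often instantiated as the usual cross-entropy on in-distribution data plus a term pushing the model's predictive distribution on auxiliary outliers towards uniform; a minimal sketch of that instantiation:

```python
import torch.nn.functional as F

def oe_loss(logits_in, labels_in, logits_out, lam=0.5):
    """Outlier Exposure objective for a classifier: standard cross-entropy
    on in-distribution batches plus a term that pushes the softmax on
    auxiliary outlier batches towards the uniform distribution."""
    in_loss = F.cross_entropy(logits_in, labels_in)
    # Cross-entropy from the uniform distribution to the softmax:
    # average of -log softmax over classes and over the outlier batch.
    out_loss = -F.log_softmax(logits_out, dim=1).mean()
    return in_loss + lam * out_loss
```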
We propose a new method for the event extraction (EE) task based on an imitation learning framework, specifically inverse reinforcement learning (IRL) via a generative adversarial network (GAN). The GAN estimates proper rewards according to the difference between the actions committed by the expert (i.e., the ground truth) and by the agent among complicated states in the environment. The EE task benefits from these dynamic rewards because instances and labels exhibit varying degrees of difficulty, and the gains are expected to be diverse -- e.g., an ambiguous but correctly detected trigger or argument should receive a high gain -- whereas traditional RL models usually neglect such differences and pay equal attention to all instances. Moreover, our experiments demonstrate that the proposed framework outperforms state-of-the-art methods without explicit feature engineering.
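A schematic sketch of where the dynamic reward comes from, using a generic GAN-style reward shaping; the discriminator interface, state/action encoding, and exact reward form in the paper may differ.

```python
import torch

def dynamic_reward(discriminator, state, action):
    """GAN-based reward: the discriminator scores a (state, action) pair by
    how expert-like it looks, and the reward grows with that score, so
    hard-but-correct decisions can receive larger rewards than easy ones."""
    with torch.no_grad():
        d = torch.sigmoid(discriminator(state, action))  # P(expert | s, a)
    # Log-odds form of the reward; the small epsilon avoids log(0).
    return torch.log(d + 1e-8) - torch.log(1.0 - d + 1e-8)
```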