For simulation-based systems, finding a minimal-cost set of test cases by exploring multiple goals is a complex task. Domain-specific optimization goals (e.g., maximizing output variance) are useful for rapidly selecting candidate test cases via mutation. However, evaluating the selected test cases for mutation effectiveness (i.e., their ability to distinguish the current program from mutated versions of the system) is a different goal from domain-specific optimization. While the optimization goals can guide the mutation analysis, that guidance should be treated as a weak indicator, since focusing too heavily on the optimization goals can hurt the mutation effectiveness goals. Based on the above, this paper proposes DoLesS (Domination with Least Squares Approximation), which selects a minimal and effective set of test cases by averaging over a coarse-grained grid of the information gained from multiple optimization goals. DoLesS applies an inverted least-squares approximation to find a minimal set of tests that can distinguish better from worse regions of the optimization goals. When tested on multiple simulation-based systems, DoLesS performs as well as or even better than the prior state of the art, while running 80-360 times faster on average (seconds instead of hours).
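To make the least-squares step concrete, here is a minimal sketch (not the authors' implementation): assuming a synthetic pass/fail matrix over candidate mutants and a stand-in per-mutant domination score, the tests carrying the largest least-squares weights are the ones that best explain the better-versus-worse distinction.

```python
# Hedged sketch of the inverted least-squares idea; all data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_mutants, n_tests = 200, 40
X = rng.integers(0, 2, size=(n_mutants, n_tests)).astype(float)  # pass/fail matrix
y = rng.random(n_mutants)  # stand-in domination score per mutant over the goals

# Solve X w ~= y; tests with large |w| carry the most information about y.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
selected = np.argsort(-np.abs(w))[:5]  # keep a minimal test set
print("selected tests:", selected)
```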
This work aims to address the general order manipulation issue in blockchain-based decentralized exchanges (DEX) by exploring the benefits of employing differentially order-fair atomic broadcast (of-ABC) mechanisms for transaction ordering and frequent batch auction (FBA) for execution. Under the proposed of-ABC approach, transactions submitted to a sufficient number of blockchain validators are ordered before or along with later transactions. FBA then executes transactions with a uniform price double auction that prioritizes price over transaction order within the same committed batch. To demonstrate the effectiveness of our order-but-not-execute-in-order design, we compare the welfare loss and liquidity provision in DEX under FBA and its continuous counterpart, the Central Limit Order Book (CLOB). Assuming that the exchange is realized over an of-ABC protocol, we find that FBA achieves better social welfare than CLOB when (1) public information affecting the fundamental value of an asset is revealed more frequently than private information, (2) the block generation interval is sufficiently large, or (3) the priority fees attached to submitted transactions are small compared to the asset price changes. Our findings further indicate that (4) liquidity provision is better under FBA when the market is not thin, meaning that a higher number of transactions is submitted by investors and traders in a block, or (5) when fewer privately informed traders are present. Overall, in the settings mentioned above, the adoption of FBA and of-ABC mechanisms in DEX demonstrates improved performance in terms of social welfare and liquidity provision compared to the continuous CLOB model.
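The execution rule can be illustrated with a toy uniform-price double auction, assuming unit-size limit orders and a midpoint clearing price (both are simplifications not specified above):

```python
# Illustrative sketch of a uniform-price double auction for one committed batch.
def uniform_price_auction(bids, asks):
    """bids/asks: lists of limit prices for unit-size buy/sell orders."""
    bids = sorted(bids, reverse=True)   # highest willingness to pay first
    asks = sorted(asks)                 # lowest asking price first
    trades = 0
    while trades < min(len(bids), len(asks)) and bids[trades] >= asks[trades]:
        trades += 1
    if trades == 0:
        return None, 0
    # One clearing price for the whole batch, e.g. midpoint of marginal orders.
    price = (bids[trades - 1] + asks[trades - 1]) / 2
    return price, trades

price, qty = uniform_price_auction([10.2, 10.0, 9.8], [9.7, 9.9, 10.3])
print(price, qty)  # all matched orders execute at the same price
```

Because every matched order in a batch trades at the same price, intra-batch ordering confers no execution advantage, which is the point of the order-but-not-execute-in-order design.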
Repeated use of a data sample via adaptively chosen queries can rapidly lead to overfitting, wherein the empirical evaluation of queries on the sample significantly deviates from their mean with respect to the underlying data distribution. It turns out that simple noise-addition algorithms suffice to prevent this issue, and differential privacy-based analysis of these algorithms shows that they can handle an asymptotically optimal number of queries. However, differential privacy's worst-case nature entails scaling such noise to the range of the queries even for highly concentrated queries, or introducing more complex algorithms. In this paper, we prove that straightforward noise-addition algorithms already provide variance-dependent guarantees that also extend to unbounded queries. This improvement stems from a novel characterization that illuminates the core problem of adaptive data analysis. We show that the harm of adaptivity results from the covariance between the new query and a Bayes factor-based measure of how much information about the data sample was encoded in the responses given to past queries. We then leverage this characterization to introduce a new data-dependent stability notion that can bound this covariance.
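A minimal sketch of the kind of noise-addition mechanism in question, with an illustrative noise scale `sigma` (not a calibrated choice from the paper):

```python
# Answer each (adaptively chosen) statistical query on the sample with Gaussian
# noise, so the analyst never sees exact empirical means.
import numpy as np

rng = np.random.default_rng(1)
sample = rng.normal(size=1000)  # the data sample; distribution unknown to analyst

def answer(query, sigma=0.05):
    """query: a function mapping a data point to a real value."""
    empirical = np.mean([query(x) for x in sample])
    return empirical + rng.normal(scale=sigma)

# An adaptively chosen sequence of queries; each answer may inform the next.
print(answer(lambda x: float(x > 0)))
print(answer(lambda x: float(abs(x) < 1)))
```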
Extended Dynamic Mode Decomposition (EDMD) is a data-driven tool for forecasting and model reduction of dynamics, which has been extensively taken up in the physical sciences. While the method is conceptually simple, in deterministic chaos it is unclear what its properties are or even what it converges to. In particular, it is not clear how EDMD's least-squares approximation treats the classes of regular functions needed to make sense of chaotic dynamics. We develop for the first time a general, rigorous theory of EDMD on the simplest examples of chaotic maps: analytic expanding maps of the circle. To do this, we prove a new, basic approximation result in the theory of orthogonal polynomials on the unit circle (OPUC) and apply methods from transfer operator theory. We show that in the infinite-data limit, the least-squares projection error is exponentially small for trigonometric polynomial observable dictionaries. As a result, we show that the forecasts and Koopman spectral data produced using EDMD in this setting converge to the physically meaningful limits, exponentially fast with respect to the size of the dictionary. This demonstrates that with only a relatively small polynomial dictionary, EDMD can be very effective, even when the sampling measure is not uniform. Furthermore, our OPUC result suggests that data-based least-squares projections may be a very effective approximation strategy.
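As a concrete illustration, here is a minimal EDMD computation on the doubling map of the circle with a trigonometric polynomial dictionary; the uniform sampling measure and dictionary size below are illustrative choices, not the paper's experimental setup:

```python
# Hedged sketch of EDMD for an analytic expanding circle map: x -> 2x (mod 1).
import numpy as np

rng = np.random.default_rng(2)
N, K = 10000, 5                      # data points, max Fourier mode
x = rng.random(N)                    # samples from a (here uniform) measure
y = (2 * x) % 1.0                    # one step of the doubling map

def dictionary(pts, K):
    # Fourier modes e^{2 pi i k x}, k = -K..K
    ks = np.arange(-K, K + 1)
    return np.exp(2j * np.pi * np.outer(pts, ks))

Psi_x, Psi_y = dictionary(x, K), dictionary(y, K)
# Least-squares Koopman matrix: Psi_x @ K_edmd ~= Psi_y.
K_edmd, *_ = np.linalg.lstsq(Psi_x, Psi_y, rcond=None)
print(np.round(np.abs(np.linalg.eigvals(K_edmd)), 2))  # approximate Koopman spectrum
```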
Policy learning is an important component of many real-world learning systems. A major challenge in policy learning is how to adapt efficiently to unseen environments or tasks. Recently, it has been suggested that invariant conditional distributions can be exploited to learn models that generalize better to unseen environments. However, assuming invariance of entire conditional distributions (which we call full invariance) may be too strong an assumption in practice. In this paper, we introduce a relaxation of full invariance called effect-invariance (e-invariance for short) and prove that it is sufficient, under suitable assumptions, for zero-shot policy generalization. We also discuss an extension that exploits e-invariance when we have a small sample from the test environment, enabling few-shot policy generalization. Our work does not assume an underlying causal graph or that the data are generated by a structural causal model; instead, we develop testing procedures to test e-invariance directly from data. We present empirical results using simulated data and a mobile health intervention dataset to demonstrate the effectiveness of our approach.
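As a purely conceptual illustration (the paper's actual testing procedures differ), the simulation below has an action effect that is invariant across environments even though the full conditional distribution of the outcome is not:

```python
# Toy example: outcome means shift across environments, but the effect of the
# action A on the outcome Y stays the same, which is the weaker e-invariance idea.
import numpy as np

rng = np.random.default_rng(8)

def effect(env_shift, n=20000):
    S = rng.normal(size=n)                      # state
    A = rng.integers(0, 2, size=n)              # binary action
    # Environment shifts the outcome level, not the action effect (1.5).
    Y = 1.5 * A + 0.5 * S + env_shift + rng.normal(size=n)
    return Y[A == 1].mean() - Y[A == 0].mean()

effects = [effect(shift) for shift in [0.0, 2.0, -1.0]]  # three environments
print(np.round(effects, 2))  # similar contrasts despite different conditionals
```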
The assumption of elliptical distribution underlies many multivariate statistical methods. For example, in sufficient dimension reduction and statistical graphical models, this assumption is routinely imposed to simplify the data dependence structure. Before applying such methods, we need to decide whether the data are elliptically distributed. Existing tests either focus exclusively on spherical distributions, rely on the bootstrap to determine the null distribution, or require specific forms of the alternative distribution. In this paper, we introduce a general nonparametric test for elliptical distributions based on kernel embedding of the probability measure that embodies the two properties characterizing an elliptical distribution: namely, after centering and rescaling, (1) the direction and length of the random vector are independent, and (2) the directional vector is uniformly distributed on the unit sphere. We derive the null asymptotic distribution of the test statistic via von Mises expansion, develop the sample-level procedure to determine the rejection region, and establish the consistency and validity of the proposed test. We apply our test to a SENIC dataset with and without a transformation aimed at achieving ellipticity.
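The two characterizing properties are easy to see in code; the sketch below (not the kernel-embedding test itself) centers and whitens the data and then splits each observation into its length and direction:

```python
# Decompose centered, rescaled data into length and direction; for elliptical
# data the two are independent and the direction is uniform on the sphere.
import numpy as np

rng = np.random.default_rng(3)
X = rng.multivariate_normal(mean=[1.0, -2.0], cov=[[2.0, 0.5], [0.5, 1.0]], size=5000)

# Center and whiten: Z = Sigma^{-1/2} (X - mu), via a Cholesky factor of Sigma.
mu = X.mean(axis=0)
L = np.linalg.cholesky(np.cov(X, rowvar=False))
Z = np.linalg.solve(L, (X - mu).T).T

length = np.linalg.norm(Z, axis=1)
direction = Z / length[:, None]
print(np.corrcoef(length, direction[:, 0])[0, 1])  # crude check, near 0 here
print(direction.mean(axis=0))                      # near the zero vector
```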
Supervised machine learning (ML) and deep learning (DL) algorithms excel at predictive tasks, but it is commonly assumed that they often do so by exploiting non-causal correlations, which may limit both interpretability and generalizability. Here, we show that this trade-off between explanation and prediction is not as deep and fundamental as expected. Whereas ML and DL algorithms will indeed tend to use non-causal features for prediction when fed indiscriminately with all data, it is possible to constrain the learning process of any ML and DL algorithm by selecting features according to Pearl's backdoor adjustment criterion. In such a situation, some algorithms, in particular deep neural networks, can provide near-unbiased effect estimates under feature collinearity. The remaining biases are explained by the specific algorithmic structures as well as by hyperparameter choice. Consequently, optimal hyperparameter settings differ when tuned for prediction or for inference, confirming the general expectation of a trade-off between prediction and explanation. However, the effect of this trade-off is small compared to the effect of a causally constrained feature selection. Thus, once the causal relationship between the features is accounted for, the difference between prediction and explanation may be much smaller than commonly assumed. We also show that such causally constrained models generalize better to new data with altered collinearity structures, suggesting that generalization failure may often be due to a lack of causal learning. Our results not only provide a perspective for using ML for the inference of (causal) effects but also help to improve the generalizability of fitted ML and DL models to new data.
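A hedged toy example of the feature-selection point, using ordinary least squares in place of a deep network and a synthetic confounder (all variable names and coefficients below are illustrative):

```python
# A confounder C drives both feature X and outcome Y; adjusting for C per the
# backdoor criterion recovers the causal effect of X, while feeding in the
# non-causal collinear proxy P does not help and adds collinearity.
import numpy as np

rng = np.random.default_rng(4)
n = 50000
C = rng.normal(size=n)                      # confounder
X = C + rng.normal(size=n)                  # feature of interest
P = X + 0.1 * rng.normal(size=n)            # collinear, non-causal proxy of X
Y = 2.0 * X + 3.0 * C + rng.normal(size=n)  # true effect of X on Y is 2.0

def ols(features, y):
    A = np.column_stack([np.ones(n)] + features)
    return np.linalg.lstsq(A, y, rcond=None)[0][1]  # coefficient on first feature

print(ols([X], Y))        # biased: confounding via C inflates the estimate
print(ols([X, C], Y))     # backdoor-adjusted: close to 2.0
print(ols([X, P, C], Y))  # collinearity with P inflates the estimate's variance
```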
This paper considers a single-trajectory system identification problem for linear systems under general nonlinear and/or time-varying policies with i.i.d. random excitation noises. The problem is motivated by safe learning-based control for constrained linear systems, where the safe policies during the learning process are usually nonlinear and time-varying in order to satisfy the state and input constraints. In this paper, we provide a non-asymptotic error bound for least-squares estimation when the data trajectory is generated by any nonlinear and/or time-varying policy, as long as the generated state and action trajectories are bounded. This significantly generalizes the existing non-asymptotic guarantees for linear system identification, which usually consider i.i.d. random inputs or linear policies. Interestingly, our error bound is consistent with that for linear policies in terms of its dependence on the trajectory length, system dimensions, and excitation levels. Lastly, we demonstrate the application of our results to safe learning with robust model predictive control and provide numerical analysis.
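A minimal sketch of the estimator being analyzed: estimate $(A, B)$ of $x_{t+1} = A x_t + B u_t + w_t$ from one trajectory, where the inputs come from an assumed nonlinear time-varying policy plus i.i.d. excitation (the specific policy below is illustrative):

```python
# Least-squares system identification from a single trajectory.
import numpy as np

rng = np.random.default_rng(5)
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
T, x = 2000, np.zeros(2)
xs, us = [], []
for t in range(T):
    u = -0.5 * np.tanh(x[0]) * np.cos(0.01 * t) + rng.normal()  # policy + excitation
    xs.append(x); us.append([u])
    x = A @ x + B @ np.array([u]) + 0.1 * rng.normal(size=2)    # process noise
xs.append(x)

# Stack regressors z_t = [x_t; u_t] and solve Z Theta ~= X_plus in least squares.
Z = np.hstack([np.array(xs[:-1]), np.array(us)])
X_plus = np.array(xs[1:])
Theta, *_ = np.linalg.lstsq(Z, X_plus, rcond=None)
A_hat, B_hat = Theta[:2].T, Theta[2:].T
print(np.round(A_hat, 2), np.round(B_hat, 2))  # close to the true (A, B)
```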
Machine learning methods are commonly evaluated and compared by their performance on data sets from public repositories. This allows multiple methods, oftentimes several thousand, to be evaluated under identical conditions and across time. The highest-ranked performance on a problem is referred to as state-of-the-art (SOTA) performance and is used, among other things, as a reference point for the publication of new methods. Using the highest-ranked performance as an estimate of SOTA is biased, giving overly optimistic results. The mechanisms at play are those of multiplicity, a topic that is well studied in the context of multiple comparisons and multiple testing but has, as far as the authors are aware, been nearly absent from the discussion of SOTA estimates. The optimistic state-of-the-art estimate is used as a standard for evaluating new methods, and methods with substantially inferior results are easily overlooked. In this article, we provide a probability distribution for the case of multiple classifiers so that known analysis methods can be applied and a better SOTA estimate can be provided. We demonstrate the impact of multiplicity through a simulated example with independent classifiers. We show how classifier dependency impacts the variance, but also that the impact is limited when the accuracy is high. Finally, we discuss a real-world example: a Kaggle competition from 2020.
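A short simulation in the spirit of the independent-classifier example described above (the test-set size and true accuracy are illustrative):

```python
# The maximum observed accuracy over many independent classifiers of equal
# true accuracy overstates the SOTA, and the bias grows with multiplicity.
import numpy as np

rng = np.random.default_rng(6)
n_test, true_acc = 1000, 0.85
for n_classifiers in [1, 10, 100, 1000]:
    # Each classifier's observed accuracy is Binomial(n_test, true_acc) / n_test.
    observed = rng.binomial(n_test, true_acc, size=(10000, n_classifiers)) / n_test
    print(n_classifiers, observed.max(axis=1).mean())  # E[max] rises above 0.85
```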
In privacy under continual observation, we study how to release differentially private estimates based on a dataset that evolves over time. The problem of releasing private prefix sums of $x_1,x_2,x_3,\dots \in\{0,1\}$ (where the value of each $x_i$ is to be private) is particularly well studied, and a generalized form is used in state-of-the-art methods for private stochastic gradient descent (SGD). The seminal binary mechanism privately releases the first $t$ prefix sums with noise of variance polylogarithmic in $t$. Recently, Henzinger et al. and Denisov et al. showed that it is possible to improve on the binary mechanism in two ways: the variance of the noise can be reduced by a (large) constant factor, and it can also be made more even across time steps. However, their algorithms for generating the noise distribution are not as efficient as one would like in terms of computation time and (in particular) space. We address the efficiency problem by presenting a simple alternative to the binary mechanism in which 1) generating the noise takes constant average time per value, 2) the variance is reduced by a factor of about 4 compared to the binary mechanism, and 3) the noise distribution at each step is identical. Empirically, a simple Python implementation of our approach outperforms the running time of the approach of Henzinger et al., as well as an attempt to improve their algorithm using high-performance algorithms for multiplication with Toeplitz matrices.
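For context, here is a sketch of the classic binary mechanism being improved upon (not the new constant-time alternative), assembling each prefix sum from $O(\log t)$ reusable noisy dyadic partial sums:

```python
# Binary mechanism sketch: each prefix sum is a sum of noisy dyadic intervals
# taken from the binary expansion of its index, and each interval's noisy
# value is generated once and reused.
import numpy as np

rng = np.random.default_rng(7)

def private_prefix_sums(xs, sigma=1.0):
    noisy = {}  # (start, length) -> noisy partial sum over xs[start:start+length]
    out = []
    for i in range(1, len(xs) + 1):
        total, pos = 0.0, 0
        for bit in range(i.bit_length() - 1, -1, -1):
            if i >> bit & 1:
                key = (pos, 1 << bit)
                if key not in noisy:
                    noisy[key] = sum(xs[pos:pos + (1 << bit)]) + rng.normal(scale=sigma)
                total += noisy[key]
                pos += 1 << bit
        out.append(total)
    return out

print(private_prefix_sums([1, 0, 1, 1, 0, 1, 1, 1]))
```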
We present an analysis of large-scale load balancing systems, where the processing time distribution of tasks depends on both the task and server types. Our study focuses on the asymptotic regime, where the number of servers and task types tend to infinity in proportion. In heterogeneous environments, commonly used load balancing policies such as Join Fastest Idle Queue and Join Fastest Shortest Queue exhibit poor performance and even shrink the stability region. Interestingly, prior to this work, finding a scalable policy with a provable performance guarantee in this setup remained an open question. To address this gap, we propose and analyze two asymptotically delay-optimal dynamic load balancing policies. The first policy efficiently reserves the processing capacity of each server for "good" tasks and routes tasks using the vanilla Join Idle Queue policy. The second policy, called the speed-priority policy, significantly increases the likelihood of assigning tasks to the respective "good" servers capable of processing them at high speeds. By leveraging a framework inspired by the graphon literature and employing the mean-field method and stochastic coupling arguments, we demonstrate that both policies achieve asymptotically zero queueing. Specifically, as the system scales, the probability of a typical task being assigned to an idle server approaches 1.
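As a toy illustration of the routing building block, here is the vanilla Join Idle Queue rule used by the first policy; the capacity reservation and speed-priority layers are not modeled, and the compatibility table is a stand-in for task/server types:

```python
# Join Idle Queue sketch: dispatch to an idle (and compatible) server if one
# exists, otherwise fall back to a random server.
import random

random.seed(0)
n_servers = 8
idle = set(range(n_servers))                      # ids of currently idle servers
compatible = {s: True for s in range(n_servers)}  # stand-in for task/server types

def route(task):
    candidates = [s for s in idle if compatible[s]]
    if candidates:
        s = random.choice(candidates)             # zero queueing for this task
        idle.discard(s)
        return s
    return random.randrange(n_servers)            # fallback when no idle server exists

print([route(t) for t in range(10)])
```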