In this paper, we propose a class of low-rank panel quantile regression models which allow for unobserved slope heterogeneity over both individuals and time. We estimate the heterogeneous intercept and slope matrices via nuclear norm regularization followed by sample splitting, row- and column-wise quantile regressions and debiasing. We show that the estimators of the factors and factor loadings associated with the intercept and slope matrices are asymptotically normally distributed. In addition, we develop two specification tests: one for the null hypothesis that the slope coefficient is a constant over time and/or individuals under the case that true rank of slope matrix equals one, and the other for the null hypothesis that the slope coefficient exhibits an additive structure under the case that the true rank of slope matrix equals two. We illustrate the finite sample performance of estimation and inference via Monte Carlo simulations and real datasets.
Online learning naturally arises in many statistical and machine learning problems. The most widely used methods in online learning are stochastic first-order algorithms. Among this family of algorithms, there is a recently developed algorithm, Recursive One-Over-T SGD (ROOT-SGD). ROOT-SGD is advantageous in that it converges at a non-asymptotically fast rate, and its estimator further converges to a normal distribution. However, this normal distribution has unknown asymptotic covariance; thus cannot be directly applied to measure the uncertainty. To fill this gap, we develop two estimators for the asymptotic covariance of ROOT-SGD. Our covariance estimators are useful for statistical inference in ROOT-SGD. Our first estimator adopts the idea of plug-in. For each unknown component in the formula of the asymptotic covariance, we substitute it with its empirical counterpart. The plug-in estimator converges at the rate $\mathcal{O}(1/\sqrt{t})$, where $t$ is the sample size. Despite its quick convergence, the plug-in estimator has the limitation that it relies on the Hessian of the loss function, which might be unavailable in some cases. Our second estimator is a Hessian-free estimator that overcomes the aforementioned limitation. The Hessian-free estimator uses the random-scaling technique, and we show that it is an asymptotically consistent estimator of the true covariance.
We study the bias of classical quantile regression and instrumental variable quantile regression estimators. While being asymptotically first-order unbiased, these estimators can have non-negligible second-order biases. We derive a higher-order stochastic expansion of these estimators using empirical process theory. Based on this expansion, we derive an explicit formula for the second-order bias and propose a feasible bias correction procedure that uses finite-difference estimators of the bias components. The proposed bias correction method performs well in simulations. We provide an empirical illustration using Engel's classical data on household expenditure.
We establish a simple connection between robust and differentially-private algorithms: private mechanisms which perform well with very high probability are automatically robust in the sense that they retain accuracy even if a constant fraction of the samples they receive are adversarially corrupted. Since optimal mechanisms typically achieve these high success probabilities, our results imply that optimal private mechanisms for many basic statistics problems are robust. We investigate the consequences of this observation for both algorithms and computational complexity across different statistical problems. Assuming the Brennan-Bresler secret-leakage planted clique conjecture, we demonstrate a fundamental tradeoff between computational efficiency, privacy leakage, and success probability for sparse mean estimation. Private algorithms which match this tradeoff are not yet known -- we achieve that (up to polylogarithmic factors) in a polynomially-large range of parameters via the Sum-of-Squares method. To establish an information-computation gap for private sparse mean estimation, we also design new (exponential-time) mechanisms using fewer samples than efficient algorithms must use. Finally, we give evidence for privacy-induced information-computation gaps for several other statistics and learning problems, including PAC learning parity functions and estimation of the mean of a multivariate Gaussian.
This paper proposes a Kolmogorov-Smirnov type statistic and a Cram\'er-von Mises type statistic to test linearity in semi-functional partially linear regression models. Our test statistics are based on a residual marked empirical process indexed by a randomly projected functional covariate,which is able to circumvent the "curse of dimensionality" brought by the functional covariate. The asymptotic properties of the proposed test statistics under the null, the fixed alternative, and a sequence of local alternatives converging to the null at the $n^{1/2}$ rate are established. A straightforward wild bootstrap procedure is suggested to estimate the critical values that are required to carry out the tests in practical applications. Results from an extensive simulation study show that our tests perform reasonably well in finite samples.Finally, we apply our tests to the Tecator and AEMET datasets to check whether the assumption of linearity is supported by these datasets.
Given two matroids $\mathcal{M}_1 = (V, \mathcal{I}_1)$ and $\mathcal{M}_2 = (V, \mathcal{I}_2)$ over an $n$-element integer-weighted ground set $V$, the weighted matroid intersection problem aims to find a common independent set $S^{*} \in \mathcal{I}_1 \cap \mathcal{I}_2$ maximizing the weight of $S^{*}$. In this paper, we present a simple deterministic algorithm for weighted matroid intersection using $\tilde{O}(nr^{3/4}\log{W})$ rank queries, where $r$ is the size of the largest intersection of $\mathcal{M}_1$ and $\mathcal{M}_2$ and $W$ is the maximum weight. This improves upon the best previously known $\tilde{O}(nr\log{W})$ algorithm given by Lee, Sidford, and Wong [FOCS'15], and is the first subquadratic algorithm for polynomially-bounded weights under the standard independence or rank oracle models. The main contribution of this paper is an efficient algorithm that computes shortest-path trees in weighted exchange graphs.
This paper considers identification and estimation of the causal effect of the time Z until a subject is treated on a survival outcome T. The treatment is not randomly assigned, T is randomly right censored by a random variable C and the time to treatment Z is right censored by min(T,C). The endogeneity issue is treated using an instrumental variable explaining Z and independent of the error term of the model. We study identification in a fully nonparametric framework. We show that our specification generates an integral equation, of which the regression function of interest is a solution. We provide identification conditions that rely on this identification equation. For estimation purposes, we assume that the regression function follows a parametric model. We propose an estimation procedure and give conditions under which the estimator is asymptotically normal. The estimators exhibit good finite sample properties in simulations. Our methodology is applied to find evidence supporting the efficacy of a therapy for burn-out.
Tests for structural breaks in time series should ideally be sensitive to breaks in the parameter of interest, while being robust to nuisance changes. Statistical analysis thus needs to allow for some form of nonstationarity under the null hypothesis of no change. In this paper, estimators for integrated parameters of locally stationary time series are constructed and a corresponding functional central limit theorem is established, enabling change-point inference for a broad class of parameters under mild assumptions. The proposed framework covers all parameters which may be expressed as nonlinear functions of moments, for example kurtosis, autocorrelation, and coefficients in a linear regression model. To perform feasible inference based on the derived limit distribution, a bootstrap variant is proposed and its consistency is established. The methodology is illustrated by means of a simulation study and by an application to high-frequency asset prices.
Motivated by questions from program transformations, eight notions of isomorphisms between term rewriting systems are defined, analysed, and classified. The notions include global isomorphisms, where the renaming of variables and function symbols is the same for all term rewriting rules of the system, local ones, where a single renaming for every rule is used, and a combination, where one symbol set is renamed globally while the other set is renamed locally. Preservation of semantic properties like convertibility and termination is analysed for the different isomorphism notions. The notions of templates and maximal normal forms of term rewriting systems are introduced and algorithms to efficiently compute them are presented. Equipped with these techniques, the complexity of the underlying decision problems of the isomorphisms are analysed and either shown to be efficiently solvable or proved to be complete for the graph isomorphism complexity class.
The matrix-based R\'enyi's entropy allows us to directly quantify information measures from given data, without explicit estimation of the underlying probability distribution. This intriguing property makes it widely applied in statistical inference and machine learning tasks. However, this information theoretical quantity is not robust against noise in the data, and is computationally prohibitive in large-scale applications. To address these issues, we propose a novel measure of information, termed low-rank matrix-based R\'enyi's entropy, based on low-rank representations of infinitely divisible kernel matrices. The proposed entropy functional inherits the specialty of of the original definition to directly quantify information from data, but enjoys additional advantages including robustness and effective calculation. Specifically, our low-rank variant is more sensitive to informative perturbations induced by changes in underlying distributions, while being insensitive to uninformative ones caused by noises. Moreover, low-rank R\'enyi's entropy can be efficiently approximated by random projection and Lanczos iteration techniques, reducing the overall complexity from $\mathcal{O}(n^3)$ to $\mathcal{O}(n^2 s)$ or even $\mathcal{O}(ns^2)$, where $n$ is the number of data samples and $s \ll n$. We conduct large-scale experiments to evaluate the effectiveness of this new information measure, demonstrating superior results compared to matrix-based R\'enyi's entropy in terms of both performance and computational efficiency.
Multivariate time series anomaly detection has been extensively studied under the semi-supervised setting, where a training dataset with all normal instances is required. However, preparing such a dataset is very laborious since each single data instance should be fully guaranteed to be normal. It is, therefore, desired to explore multivariate time series anomaly detection methods based on the dataset without any label knowledge. In this paper, we propose MTGFlow, an unsupervised anomaly detection approach for multivariate time series anomaly detection via dynamic graph and entity-aware normalizing flow, leaning only on a widely accepted hypothesis that abnormal instances exhibit sparse densities than the normal. However, the complex interdependencies among entities and the diverse inherent characteristics of each entity pose significant challenges on the density estimation, let alone to detect anomalies based on the estimated possibility distribution. To tackle these problems, we propose to learn the mutual and dynamic relations among entities via a graph structure learning model, which helps to model accurate distribution of multivariate time series. Moreover, taking account of distinct characteristics of the individual entities, an entity-aware normalizing flow is developed to describe each entity into a parameterized normal distribution, thereby producing fine-grained density estimation. Incorporating these two strategies, MTGFlow achieves superior anomaly detection performance. Experiments on five public datasets with seven baselines are conducted, MTGFlow outperforms the SOTA methods by up to 5.0 AUROC\%. Codes will be released at //github.com/zqhang/Detecting-Multivariate-Time-Series-Anomalies-with-Zero-Known-Label.