Given a dataset of $n$ i.i.d. samples from an unknown distribution $P$, we consider the problem of generating a sample from a distribution that is close to $P$ in total variation distance, under the constraint of differential privacy (DP). We study the problem when $P$ is a multi-dimensional Gaussian distribution, under different assumptions on the information available to the DP mechanism: known covariance, unknown bounded covariance, and unknown unbounded covariance. We present new DP sampling algorithms, and show that they achieve near-optimal sample complexity in the first two settings. Moreover, when $P$ is a product distribution on the binary hypercube, we obtain a pure-DP algorithm whereas only an approximate-DP algorithm (with slightly worse sample complexity) was previously known.
We consider the problem of estimating a nested structure of two expectations taking the form $U_0 = E[\max\{U_1(Y), \pi(Y)\}]$, where $U_1(Y) = E[X\ |\ Y]$. Terms of this form arise in financial risk estimation and option pricing. When $U_1(Y)$ requires approximation, but exact samples of $X$ and $Y$ are available, an antithetic multilevel Monte Carlo (MLMC) approach has been well-studied in the literature. Under general conditions, the antithetic MLMC estimator obtains a root mean squared error $\varepsilon$ with order $\varepsilon^{-2}$ cost. If, additionally, $X$ and $Y$ require approximate sampling, careful balancing of the various aspects of approximation is required to avoid a significant computational burden. Under strong convergence criteria on approximations to $X$ and $Y$, randomised multilevel Monte Carlo techniques can be used to construct unbiased Monte Carlo estimates of $U_1$, which can be paired with an antithetic MLMC estimate of $U_0$ to recover order $\varepsilon^{-2}$ computational cost. In this work, we instead consider biased multilevel approximations of $U_1(Y)$, which require less strict assumptions on the approximate samples of $X$. Extensions to the method consider an approximate and antithetic sampling of $Y$. Analysis shows the resulting estimator has order $\varepsilon^{-2}$ asymptotic cost under the conditions required by randomised MLMC and order $\varepsilon^{-2}|\log\varepsilon|^3$ cost under more general assumptions.
Equipping the rototranslation group $SE(2)$ with a sub-Riemannian structure inspired by the visual cortex V1, we propose algorithms for image inpainting and enhancement based on hypoelliptic diffusion. We innovate on previous implementations of the methods by Citti, Sarti and Boscain et al., by proposing an alternative that prevents fading and capable of producing sharper results in a procedure that we call WaxOn-WaxOff. We also exploit the sub-Riemannian structure to define a completely new unsharp using $SE(2)$, analogous of the classical unsharp filter for 2D image processing, with applications to image enhancement. We demonstrate our method on blood vessels enhancement in retinal scans.
In the committee selection problem, the goal is to choose a subset of size $k$ from a set of candidates $C$ that collectively gives the best representation to a set of voters. We consider this problem in Euclidean $d$-space where each voter/candidate is a point and voters' preferences are implicitly represented by Euclidean distances to candidates. We explore fault-tolerance in committee selection and study the following three variants: (1) given a committee and a set of $f$ failing candidates, find their optimal replacement; (2) compute the worst-case replacement score for a given committee under failure of $f$ candidates; and (3) design a committee with the best replacement score under worst-case failures. The score of a committee is determined using the well-known (min-max) Chamberlin-Courant rule: minimize the maximum distance between any voter and its closest candidate in the committee. Our main results include the following: (1) in one dimension, all three problems can be solved in polynomial time; (2) in dimension $d \geq 2$, all three problems are NP-hard; and (3) all three problems admit a constant-factor approximation in any fixed dimension, and the optimal committee problem has an FPT bicriterion approximation.
We develop a new method for selecting the penalty parameter for $\ell_1$-penalized M-estimators in high dimensions, which we refer to as bootstrapping after cross-validation. We derive rates of convergence for the corresponding $\ell_1$-penalized M-estimator and also for the post-$\ell_1$-penalized M-estimator, which refits the non-zero parameters of the former estimator without penalty in the criterion function. We demonstrate via simulations that our method is not dominated by cross-validation in terms of estimation errors and outperforms cross-validation in terms of inference. As an illustration, we revisit Fryer Jr (2019), who investigated racial differences in police use of force, and confirm his findings.
In this paper, we present explicit expressions for conforming finite element function spaces, basis functions, and degrees of freedom on the pentatope and tetrahedral prism elements. More generally, our objective is to construct finite element function spaces that maintain conformity with infinite-dimensional spaces of a carefully chosen de Rham complex. This paper is a natural extension of the companion paper entitled "Conforming Finite Element Function Spaces in Four Dimensions, Part I: Foundational Principles and the Tesseract" by Nigam and Williams, (2023). In contrast to Part I, in this paper we focus on two of the most popular elements which do not possess a full tensor-product structure in all four coordinate directions. We note that these elements appear frequently in existing space-time finite element methods. In order to build our finite element spaces, we utilize powerful techniques from the recently developed 'Finite Element Exterior Calculus'. Subsequently, we translate our results into the well-known language of linear algebra (vectors and matrices) in order to facilitate implementation by scientists and engineers.
In this paper, we present a new family of cross $Z$-complementary pairs (CZCPs) based on generalized Boolean functions and two roots of unity. Our key idea is to consider an arbitrary partition of the set $\{1,2,\cdots, n\}$ with two subsets corresponding to two given roots of unity for which two truncated sequences of new alphabet size determined by the two roots of unity are obtained. We show that these two truncated sequences form a new $q$-ary CZCP with flexible sequence length and large zero-correlation zone width. Furthermore, we derive an enumeration formula by considering the Stirling number of the second kind for the partitions and show that the number of constructed CZCPs increases significantly compared to the existing works.
Multilevel lattice codes, such as the associated to Constructions $C$, $\overline{D}$, D and D', have relevant applications in communications. In this paper, we investigate some properties of lattices obtained via Constructions D and D' from $q$-ary linear codes. Connections with Construction A, generator matrices, expressions and bounds for the lattice volume and minimum distances are derived. Extensions of previous results regarding construction and decoding of binary and $p$-ary linear codes ($p$ prime) are also presented.
For a set of $p$-variate data points $\boldsymbol y_1,\ldots,\boldsymbol y_n$, there are several versions of multivariate median and related multivariate sign test proposed and studied in the literature. In this paper we consider the asymptotic properties of the multivariate extension of the Hodges-Lehmann (HL) estimator, the spatial HL-estimator, and the related test statistic. The asymptotic behavior of the spatial HL-estimator and the related test statistic when $n$ tends to infinity are collected, reviewed, and proved, some for the first time though being used already for a longer time. We also derive the limiting behavior of the HL-estimator when both the sample size $n$ and the dimension $p$ tend to infinity.
In this article, we consider a $n$-dimensional random walk $X_t$, whose error terms are linear processes generated by $n$-dimensional noise vectors, and each component of these noise vectors is independent and may not be identically distributed with uniformly bounded 8th moment and densities. Given $T$ observations such that the dimension $n$ and sample size $T$ going to infinity proportionally, define $\boldsymbol{X}$ and $\hat{\boldsymbol{R}}$ as the data matrix and the sample correlation matrix of $\boldsymbol{X}$ respectively. This article establishes the central limit theorem (CLT) of the first $K$ largest eigenvalues of $n^{-1}\hat{\boldsymbol{R}}$. Subsequently, we propose a new unit root test for the panel high-dimensional nonstationary time series based on the CLT of the largest eigenvalue of $n^{-1}\hat{\boldsymbol{R}}$. A numerical experiment is undertaken to verify the power of our proposed unit root test.
Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.