We consider a matching problem in a bipartite graph $G=(A\cup B,E)$ where nodes in $A$ are agents having preferences in partial order over their neighbors, while nodes in $B$ are objects without preferences. We propose a polynomial-time combinatorial algorithm based on LP duality that finds a maximum matching, or assignment, in $G$ that is popular among all maximum matchings, if one exists. Our algorithm can also be used to achieve a trade-off between popularity and cardinality by imposing a penalty on unmatched nodes in $A$. We also provide an $O^*(|E|^k)$ algorithm that finds an assignment whose unpopularity margin is at most $k$; this algorithm is essentially optimal, since the problem is $\mathsf{NP}$-complete and $\mathsf{W}[1]$-hard with parameter $k$. We also prove that finding a popular assignment of minimum cost when each edge has an associated binary cost is $\mathsf{NP}$-hard, even if agents have strict preferences. By contrast, we propose a polynomial-time algorithm for the variant of the popular assignment problem with forced/forbidden edges. Finally, we present an application in the context of housing markets.
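For context, the unpopularity margin referenced above is the standard notion from the popular matching literature (our gloss, not text from the abstract): writing $\phi(M',M)$ for the number of agents who prefer matching $M'$ to matching $M$, the unpopularity margin of $M$ is
\[
\mu(M) \;=\; \max_{M'} \big(\phi(M',M) - \phi(M,M')\big),
\]
so that $M$ is popular precisely when $\mu(M)=0$.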
We propose a simple multivariate normality test based on the Kac-Bernstein characterization, which can be conducted by utilising existing statistical independence tests for sums and differences of data samples. An empirical investigation reveals that for high-dimensional data the proposed approach may be more efficient than alternative ones. The accompanying code repository is provided at \url{//shorturl.at/rtuy5}.
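A minimal sketch of such a test, assuming the Kac-Bernstein characterization (for independent identically distributed $X$ and $Y$, the sum $X+Y$ and difference $X-Y$ are independent if and only if the common distribution is Gaussian) and using a distance-covariance permutation test as the off-the-shelf independence test; the function names are ours, not the paper's:

```python
import numpy as np

def _dcenter(D):
    # double-center a pairwise-distance matrix
    return D - D.mean(axis=0) - D.mean(axis=1, keepdims=True) + D.mean()

def dcov2(u, v):
    # squared sample distance covariance (Szekely--Rizzo, V-statistic form)
    A = _dcenter(np.linalg.norm(u[:, None, :] - u[None, :, :], axis=-1))
    B = _dcenter(np.linalg.norm(v[:, None, :] - v[None, :, :], axis=-1))
    return (A * B).mean()

def kac_bernstein_test(data, n_perm=200, seed=0):
    """Permutation p-value; small values are evidence against normality."""
    rng = np.random.default_rng(seed)
    data = rng.permutation(data)          # random split into two halves
    half = len(data) // 2
    x, y = data[:half], data[half:2 * half]
    s, t = x + y, x - y                   # independent iff the data are Gaussian
    stat = dcov2(s, t)
    perm = [dcov2(s, t[rng.permutation(half)]) for _ in range(n_perm)]
    return (1 + sum(p >= stat for p in perm)) / (1 + n_perm)

rng = np.random.default_rng(1)
print(kac_bernstein_test(rng.normal(size=(300, 5))))            # large p-value
print(kac_bernstein_test(rng.standard_t(df=2, size=(300, 5))))  # small p-value
```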
For an odd prime $p$, we say $f(X) \in {\mathbb F}_p[X]$ computes square roots in $\mathbb F_p$ if, for all nonzero perfect squares $a \in \mathbb F_p$, we have $f(a)^2 = a$. When $p \equiv 3 \mod 4$, it is well known that $f(X) = X^{(p+1)/4}$ computes square roots. This degree is surprisingly low (and in fact lowest possible), since we have specified $(p-1)/2$ evaluations (up to sign) of the polynomial $f(X)$. On the other hand, for $p \equiv 1 \mod 4$ there was previously no nontrivial bound known on the lowest degree of a polynomial computing square roots in $\mathbb F_p$; it could have been anywhere between $\frac{p}{4}$ and $\frac{p}{2}$. We show that for all $p \equiv 1 \mod 4$, any polynomial computing square roots in $\mathbb F_p$ has degree at least $p/3$. Our main new ingredient is a general lemma which may be of independent interest: powers of a low degree polynomial cannot have too many consecutive zero coefficients. The proof method also yields a robust version: any polynomial that computes square roots for 99\% of the squares also has degree almost $p/3$. In the other direction, we also show that for infinitely many $p \equiv 1 \mod 4$, the degree of a polynomial computing square roots can be $(\frac{1}{2} - \Omega(1))p$.
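The $p \equiv 3 \mod 4$ case is easy to check directly: for a nonzero square $a$, Euler's criterion gives $a^{(p-1)/2} = 1$, so $(a^{(p+1)/4})^2 = a \cdot a^{(p-1)/2} = a$. A quick brute-force verification for one small prime:

```python
# Verify that f(X) = X^((p+1)/4) computes square roots mod p when p = 3 (mod 4).
p = 103                                      # any prime with p % 4 == 3
e = (p + 1) // 4
squares = {b * b % p for b in range(1, p)}   # the (p-1)/2 nonzero squares
assert all(pow(a, e, p) ** 2 % p == a for a in squares)
print(f"X^{e} computes square roots mod {p} on all {len(squares)} nonzero squares")
```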
We prove and collect numerous explicit and computable results for the fractional Laplacian $(-\Delta)^s f(x)$ with $s>0$ as well as its whole space inverse, the Riesz potential, $(-\Delta)^{-s}f(x)$ with $s\in\left(0,\frac{1}{2}\right)$. Choices of $f(x)$ include weighted classical orthogonal polynomials such as the Legendre, Chebyshev, Jacobi, Laguerre and Hermite polynomials, or Bessel functions of the first and second kind with or without sinusoid weights. We also discuss some higher-dimensional fractional Laplacians and Riesz potentials of generalized Zernike polynomials on the unit ball and its complement, as well as of whole-space generalized Laguerre polynomials. The aim of this paper is to aid in the continued development of numerical methods for problems involving the fractional Laplacian or the Riesz potential in bounded and unbounded domains -- both directly by providing useful basis or frame functions for spectral method approaches and indirectly by providing accessible ways to construct computable toy problems on which to test new numerical methods.
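For orientation, both operators act on the whole space as Fourier multipliers (the standard definition, restated here for reference):
\[
\mathcal{F}\big[(-\Delta)^{s} f\big](\xi) = |\xi|^{2s}\,\mathcal{F}[f](\xi),
\qquad
\mathcal{F}\big[(-\Delta)^{-s} f\big](\xi) = |\xi|^{-2s}\,\mathcal{F}[f](\xi),
\]
so the Riesz potential inverts the fractional Laplacian on suitable functions.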
We prove a tight parallel repetition theorem for $3$-message computationally secure quantum interactive protocols between an efficient challenger and an efficient adversary. We also prove under plausible assumptions that the security of $4$-message computationally secure protocols does not generally decrease under parallel repetition. These mirror the classical results of Bellare, Impagliazzo, and Naor [BIN97]. Finally, we prove that all quantum argument systems can be generically compiled to an equivalent $3$-message argument system, mirroring the transformation for quantum proof systems [KW00, KKMV07]. As immediate applications, we show how to derive hardness amplification theorems for quantum bit commitment schemes (answering a question of Yan [Yan22]), EFI pairs (answering a question of Brakerski, Canetti, and Qian [BCQ23]), public-key quantum money schemes (answering a question of Aaronson and Christiano [AC13]), and quantum zero-knowledge argument systems. We also derive an XOR lemma [Yao82] for quantum predicates as a corollary.
For a function $F: X \to Y$ between real Banach spaces, we show how continuation methods for solving $F(u) = g$ may benefit from a basic understanding of the critical set $C$ of $F$. The algorithm aims at special points with a large number of preimages, which in turn may be used as initial conditions for standard continuation methods applied to the solution of the desired equation. A geometric model based on the sets $C$ and $F^{-1}(F(C))$ substantiates our choice of curves in $X$ with abundant intersections with $C$. We consider three classes of examples. First we handle functions $F: \mathbb{R}^2 \to \mathbb{R}^2$, for which the reasoning behind the techniques is visualizable. The second set of examples, between spaces of dimension 15, is obtained by discretizing a nonlinear Sturm-Liouville problem for which special points admit a high number of solutions. Finally, we handle a semilinear elliptic operator by computing the six solutions of an equation of the form $-\Delta u - f(u) = g$ studied by Solimini.
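A minimal sketch of the first step in the planar case, with a hypothetical map $F$ (not one from the paper): scan a line $c(t) = u_0 + t v$ for sign changes of $\det DF$, which locate its intersections with the critical set $C$; the continuation step starting from those points is omitted.

```python
import numpy as np

def F(u):
    # hypothetical planar example, only for illustration
    x, y = u
    return np.array([x**2 - y, y**2 - x])

def detDF(u):
    # determinant of the Jacobian [[2x, -1], [-1, 2y]] of F above
    x, y = u
    return 4 * x * y - 1

u0, v = np.array([0.0, 0.0]), np.array([1.0, 0.5])
ts = np.linspace(-3.0, 3.0, 2001)
d = np.array([detDF(u0 + t * v) for t in ts])
crossings = ts[:-1][np.sign(d[:-1]) != np.sign(d[1:])]   # c(t) meets C here
print("parameters t where c(t) crosses the critical set:", crossings)
```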
Temporal data, obtained in settings where only one time point per trajectory can be observed, is widely used in different research fields, yet remains insufficiently addressed from the statistical point of view. Such data often contain observations of a large number of entities, in which case it is of interest to identify a small number of representative behavior types. In this paper, we propose a new method that performs clustering simultaneously with alignment of the temporal objects inferred from these data, providing insight into the relationships between the entities. A series of simulations confirms the ability of the proposed approach to leverage multiple properties of the complex data we target, such as accessible uncertainties, correlations and a small number of time points. We illustrate the method on real data encoding the cellular response to high-energy radiation treatment, supported by the results of an enrichment analysis.
We derive novel and sharp high-dimensional Berry--Esseen bounds for the sum of $m$-dependent random vectors over the class of hyper-rectangles, exhibiting only a poly-logarithmic dependence on the dimension. Our results hold under minimal assumptions, such as non-degenerate covariances and finite third moments, and yield a sample complexity of order $\sqrt{m/n}$, aside from logarithmic terms, matching the optimal rates established in the univariate case. When specialized to sums of independent non-degenerate random vectors, we obtain sharp rates under the weakest possible conditions. On the technical side, we develop an inductive relationship between anti-concentration inequalities and Berry--Esseen bounds, inspired by the classical Lindeberg swapping method and the concentration inequality approach for dependent data, which may be of independent interest.
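Schematically (our paraphrase of the stated rate, suppressing constants and the exact logarithmic factors), with $\mathcal{R}$ the class of hyper-rectangles in $\mathbb{R}^d$, $S_n$ the standardized sum and $Z$ a Gaussian vector with matching covariance, the bounds take the form
\[
\sup_{R \in \mathcal{R}} \big| \mathbb{P}(S_n \in R) - \mathbb{P}(Z \in R) \big| \;\lesssim\; \sqrt{\frac{m}{n}} \cdot \operatorname{polylog}(d, n).
\]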
Motivated by the construction of physical theories, we consider formal languages $S$ with variables and define a kind of distance between elements of the languages by the formula $d(x,y)= \ell(x \nabla y) - \ell(x) \wedge \ell(y)$, where $\ell$ is a length function and $x \nabla y$ denotes the united theory of $x$ and $y$. In fact, we mainly consider abstract abelian idempotent monoids $(S,\nabla)$ equipped with length functions $\ell$. The set of length functions can be projected to another set of length functions such that the distance $d$ is actually a pseudometric and satisfies $d(x\nabla a,y\nabla b) \le d(x,y) + d(a,b)$. We also propose a "signed measure" on the set of Boolean expressions of elements in $S$, and a Banach-Mazur-like distance between abelian idempotent monoids with length functions, or formal languages.
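As a concrete sanity check, here is a toy instantiation (our assumption, not the paper's construction): elements of $S$ are finite sets of "axioms", $\nabla$ is union and $\ell$ is cardinality; for this particular $\ell$ the distance is already a pseudometric, so no projection step is needed.

```python
# Toy model: x ∇ y = x | y (union), ℓ = len, d(x, y) = ℓ(x ∇ y) - min(ℓ(x), ℓ(y)).
import itertools, random

def d(x, y):
    return len(x | y) - min(len(x), len(y))

rng = random.Random(0)
sample = lambda: frozenset(rng.sample(range(10), rng.randint(0, 8)))
pts = [sample() for _ in range(60)]

for x, y, z in itertools.product(pts[:15], repeat=3):
    assert d(x, z) <= d(x, y) + d(y, z)            # triangle inequality
for x, y, a, b in zip(pts[:15], pts[15:30], pts[30:45], pts[45:60]):
    assert d(x | a, y | b) <= d(x, y) + d(a, b)    # subadditivity under ∇
print("pseudometric and subadditivity checks passed")
```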
Consider the family of power divergence statistics based on $n$ trials, each leading to one of $r$ possible outcomes. This includes the log-likelihood ratio and Pearson's statistic as important special cases. It is known that in certain regimes (e.g., when $r$ is of order $n^2$ and the allocation is asymptotically uniform as $n\to\infty$) the power divergence statistic converges in distribution to a linear transformation of a Poisson random variable. We establish explicit error bounds in the Kolmogorov (or uniform) metric to complement this convergence result; these bounds may be applied for any values of $n$, $r$ and the index parameter $\lambda$ for which such a finite-sample bound is meaningful. We further use this Poisson approximation result to derive error bounds for the Gaussian approximation of the power divergence statistics.
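For reference, this is the Cressie--Read family: with $N_j$ the number of trials resulting in outcome $j$ and $p_j$ the hypothesized probability of that outcome, the power divergence statistic with index $\lambda$ is
\[
T_\lambda = \frac{2}{\lambda(\lambda+1)} \sum_{j=1}^{r} N_j \left[ \left( \frac{N_j}{n p_j} \right)^{\lambda} - 1 \right],
\]
with the singular cases $\lambda \to 0$ and $\lambda \to -1$ defined by continuity; $\lambda = 1$ recovers Pearson's statistic and $\lambda \to 0$ the log-likelihood ratio.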
Research intended to estimate the effect of an action, as in randomized trials, often does not have random samples of the intended target population. Instead, estimates can be transported to the desired target population. Methods for transporting between populations are often premised on a positivity assumption, such that all relevant covariate patterns in one population are also present in the other. However, eligibility criteria, particularly in the case of trials, can result in violations of positivity. To address nonpositivity, a synthesis of statistical and mechanistic models was previously proposed in the context of violations by a single binary covariate. Here, we extend the synthesis approach to positivity violations involving a continuous covariate. For estimation, we propose two novel augmented inverse probability weighting estimators, one based on estimating the parameters of a marginal structural model and the other based on estimating the conditional average causal effect. Both estimators are compared to other common approaches for addressing nonpositivity via a simulation study. Finally, the competing approaches are illustrated with an example in the context of two-drug versus one-drug antiretroviral therapy on CD4 T cell counts among women with HIV.
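For orientation, the usual augmented inverse probability weighting (AIPW) building block for a binary action $A$, outcome $Y$ and covariates $X$ takes the form below; the paper's transport estimators extend this construction but are not reproduced here:
\[
\hat{\psi} = \frac{1}{n} \sum_{i=1}^{n} \left[ \frac{A_i \{Y_i - \hat{m}_1(X_i)\}}{\hat{\pi}(X_i)} - \frac{(1-A_i)\{Y_i - \hat{m}_0(X_i)\}}{1 - \hat{\pi}(X_i)} + \hat{m}_1(X_i) - \hat{m}_0(X_i) \right],
\]
where $\hat{\pi}$ is the estimated propensity score and $\hat{m}_a$ the estimated outcome model under action $a$.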