We quantify the minimax rate for a nonparametric regression model over a convex function class $\mathcal{F}$ with bounded diameter. We obtain a minimax rate of ${\varepsilon^{\ast}}^2\wedge\mathrm{diam}(\mathcal{F})^2$ where \[\varepsilon^{\ast} =\sup\{\varepsilon>0:n\varepsilon^2 \le \log M_{\mathcal{F}}^{\operatorname{loc}}(\varepsilon,c)\},\] where $M_{\mathcal{F}}^{\operatorname{loc}}(\cdot, c)$ is the local metric entropy of $\mathcal{F}$ and our loss function is the squared population $L_2$ distance over our input space $\mathcal{X}$. In contrast to classical works on the topic [cf. Yang and Barron, 1999], our results do not require functions in $\mathcal{F}$ to be uniformly bounded in sup-norm. In addition, we prove that our estimator is adaptive to the true point, and to the best of our knowledge this is the first such estimator in this general setting. This work builds on the Gaussian sequence framework of Neykov [2022] using a similar algorithmic scheme to achieve the minimax rate. Our algorithmic rate also applies with sub-Gaussian noise. We illustrate the utility of this theory with examples including multivariate monotone functions, linear functionals over ellipsoids, and Lipschitz classes.
We study a hypothesis testing problem in the context of high-dimensional changepoint detection. Given a matrix $X \in \mathbb{R}^{p \times n}$ with independent Gaussian entries, the goal is to determine whether or not a sparse, non-null fraction of rows in $X$ exhibits a shift in mean at a common index between $1$ and $n$. We focus on three aspects of this problem: the sparsity of non-null rows, the presence of a single, common changepoint in the non-null rows, and the signal strength associated with the changepoint. Within an asymptotic regime relating the data dimensions $n$ and $p$ to the signal sparsity and strength, we characterize the information-theoretic limits of the testing problem by a formula that determines whether the sum of Type I and II errors tends to zero or is bounded away from zero. The formula, called the \emph{detection boundary}, is a curve that separates the parameter space into a detectable region and an undetectable region. We show that a Berk--Jones type test statistic can detect the presence of a sparse non-null fraction of rows, and does so adaptively throughout the detectable region. Conversely, within the undetectable region, no test is able to consistently distinguish the signal from noise.
As an important part of genetic algorithms (GAs), mutation operators is widely used in evolutionary algorithms to solve $\mathcal{NP}$-hard problems because it can increase the population diversity of individual. Due to limitations in mathematical tools, the mutation probability of the mutation operator is primarily empirically set in practical applications. In this paper, we propose a novel reduction method for the 0-1 knapsack problem(0-1 KP) and an improved mutation operator (IMO) based on the assumption $\mathcal{NP}\neq\mathcal{P}$, along with the utilization of linear relaxation techniques and a recent result by Dey et al. (Math. Prog., pp 569-587, 2022). We employ this method to calculate an upper bound of the mutation probability in general instances of the 0-1 KP, and construct an instance where the mutation probability does not tend towards 0 as the problem size increases. Finally, we prove that the probability of the IMO hitting the optimal solution within only a single iteration in large-scale instances is superior to that of the traditional mutation operator.
The merit factor of a $\{-1, 1\}$ binary sequence measures the collective smallness of its non-trivial aperiodic autocorrelations. Binary sequences with large merit factor are important in digital communications because they allow the efficient separation of signals from noise. It is a longstanding open question whether the maximum merit factor is asymptotically unbounded and, if so, what is its limiting value. Attempts to answer this question over almost sixty years have identified certain classes of binary sequences as particularly important: skew-symmetric sequences, symmetric sequences, and anti-symmetric sequences. Using only elementary methods, we find an exact formula for the mean and variance of the reciprocal merit factor of sequences in each of these classes, and in the class of all binary sequences. This provides a much deeper understanding of the distribution of the merit factor in these four classes than was previously available. A consequence is that, for each of the four classes, the merit factor of a sequence drawn uniformly at random from the class converges in probability to a constant as the sequence length increases.
Properties of the additive differential probability $\mathrm{adp}^{\mathrm{XR}}$ of the composition of bitwise XOR and a bit rotation are investigated, where the differences are expressed using addition modulo $2^n$. This composition is widely used in ARX constructions consisting of additions modulo $2^n$, bit rotations and bitwise XORs. Differential cryptanalysis of such primitives may involve maximums of $\mathrm{adp}^{\mathrm{XR}}$, where some of its input or output differences are fixed. Although there is an efficient way to calculate this probability (Velichkov et al, 2011), many of its properties are still unknown. In this work, we find maximums of $\mathrm{adp}^{\mathrm{XR}}$, where the rotation is one bit left/right and one of its input differences is fixed. Some symmetries of $\mathrm{adp}^{\mathrm{XR}}$ are obtained as well. We provide all its impossible differentials in terms of regular expression patterns and estimate the number of them. This number turns out to be maximal for the one bit left rotation and noticeably less than the number of impossible differentials of bitwise XOR.
We establish an invariance principle for polynomial functions of $n$ independent high-dimensional random vectors, and also show that the obtained rates are nearly optimal. Both the dimension of the vectors and the degree of the polynomial are permitted to grow with $n$. Specifically, we obtain a finite sample upper bound for the error of approximation by a polynomial of Gaussians, measured in Kolmogorov distance, and extend it to functions that are approximately polynomial in a mean squared error sense. We give a corresponding lower bound that shows the invariance principle holds up to polynomial degree $o(\log n)$. The proof is constructive and adapts an asymmetrisation argument due to V. V. Senatov. As applications, we obtain a higher-order delta method with possibly non-Gaussian limits, and generalise a number of known results on high-dimensional and infinite-order U-statistics, and on fluctuations of subgraph counts.
The logics $\mathsf{CS4}$ and $\mathsf{IS4}$ are the two leading intuitionistic variants of the modal logic $\mathsf{S4}$. Whether the finite model property holds for each of these logics have been long-standing open problems. It was recently shown that $\mathsf{IS4}$ has the finite frame property and thus the finite model property. In this paper, we prove that $\mathsf{CS4}$ also enjoys the finite frame property. Additionally, we investigate the following three logics closely related to $\mathsf{IS4}$. The logic $\mathsf{GS4}$ is obtained by adding the G\"odel--Dummett axiom to $\mathsf{IS4}$; it is both a superintuitionistic and a fuzzy logic and has previously been given a real-valued semantics. We provide an alternative birelational semantics and prove strong completeness with respect to this semantics. The extension $\mathsf{GS4^c}$ of $\mathsf{GS4}$ corresponds to requiring a crisp accessibility relation on the real-valued semantics. We give a birelational semantics corresponding to an extra confluence condition on the $\mathsf{GS4}$ birelational semantics and prove strong completeness. Neither of these two logics have the finite model property with respect to their real-valued semantics, but we prove that they have the finite frame property for their birelational semantics. Establishing the finite birelational frame property immediately establishes decidability, which was previously open for these two logics. Our proofs yield NEXPTIME upper bounds. The logic $\mathsf{S4I}$ is obtained from $\mathsf{IS4}$ by reversing the roles of the modal and intuitionistic relations in the birelational semantics. We also prove the finite frame property, and thereby decidability, for $\mathsf{S4I}$
A class $\mathcal{G}$ of graphs is called hereditary if it is closed under taking induced subgraphs. We denote by $G^{epex}$ the class of graphs that are at most one edge away from being in $\mathcal{G}$. We note that $G^{epex}$ is hereditary and prove that if a hereditary class $\mathcal{G}$ has finitely many forbidden induced subgraphs, then so does $G^{epex}$. The hereditary class of cographs consists of all graphs $G$ that can be generated from $K_1$ using complementation and disjoint union. Cographs are precisely the graphs that do not have the $4$-vertex path as an induced subgraph. For the class of edge-apex cographs our main result bounds the order of such forbidden induced subgraphs by 8 and finds all of them by computer search.
We study empirical variants of the halfspace (Tukey) depth of a probability measure $\mu$, which are obtained by replacing $\mu$ with the corresponding weighted empirical measure. We prove analogues of the Marcinkiewicz--Zygmund strong law of large numbers and of the law of the iterated logarithm in terms of set inclusions and for the Hausdorff distance between the theoretical and empirical variants of depth trimmed regions. In the special case of $\mu$ being the uniform distribution on a convex body $K$, the depth trimmed regions are convex floating bodies of $K$, and we obtain strong limit theorems for their empirical estimators.
In this work, we derive a $\gamma$-robust a posteriori error estimator for finite element approximations of the Allen-Cahn equation with variable non-degenerate mobility. The estimator utilizes spectral estimates for the linearized steady part of the differential operator as well as a conditional stability estimate based on a weighted sum of Bregman distances, based on the energy and a functional related to the mobility. A suitable reconstruction of the numerical solution in the stability estimate leads to a fully computable estimator.
Given an undirected graph $G$, a quasi-clique is a subgraph of $G$ whose density is at least $\gamma$ $(0 < \gamma \leq 1)$. Two optimization problems can be defined for quasi-cliques: the Maximum Quasi-Clique (MQC) Problem, which finds a quasi-clique with maximum vertex cardinality, and the Densest $k$-Subgraph (DKS) Problem, which finds the densest subgraph given a fixed cardinality constraint. Most existing approaches to solve both problems often disregard the requirement of connectedness, which may lead to solutions containing isolated components that are meaningless for many real-life applications. To address this issue, we propose two flow-based connectedness constraints to be integrated into known Mixed-Integer Linear Programming (MILP) formulations for either MQC or DKS problems. We compare the performance of MILP formulations enhanced with our connectedness constraints in terms of both running time and number of solved instances against existing approaches that ensure quasi-clique connectedness. Experimental results demonstrate that our constraints are quite competitive, making them valuable for practical applications requiring connectedness.