We show that the graph property of having a (very) large $k$-th Betti number $\beta_k$ for constant $k$ is testable with a constant number of queries in the dense graph model. More specifically, we consider a clique complex defined by an underlying graph and prove that for any $\varepsilon>0$, there exists $\delta(\varepsilon,k)>0$ such that testing whether $\beta_k \geq (1-\delta) d_k$ for $\delta \leq \delta(\varepsilon,k)$ reduces to tolerantly testing $(k+2)$-clique-freeness, which is known to be testable. This complements a result by Elek (2010) showing that Betti numbers are testable in the bounded-degree model. Our result combines the Euler characteristic, matroid theory and the graph removal lemma.
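For orientation, recall the Euler–Poincaré identity for the clique complex $X_G$, where $f_j$ counts the $(j+1)$-cliques of $G$; this standard fact (stated here only for context, not taken from the paper beyond its mention of the Euler characteristic) is one bridge between clique statistics, which are estimable with constantly many queries in the dense model, and Betti numbers:
\[
\chi(X_G)\;=\;\sum_{j\ge 0}(-1)^j f_j\;=\;\sum_{j\ge 0}(-1)^j \beta_j .
\]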
Let $G$ be a simple graph with adjacency matrix $A(G)$, signless Laplacian matrix $Q(G)$, degree diagonal matrix $D(G)$, and let $l(G)$ be the line graph of $G$. In 2017, Nikiforov defined the $A_\alpha$-matrix of $G$ as the convex combination of $D(G)$ and $A(G)$ given by $A_\alpha(G):=\alpha D(G)+(1-\alpha)A(G)$, where $\alpha\in[0,1]$. In this paper, we present some bounds for the eigenvalues of $A_\alpha(G)$ and for the largest and smallest eigenvalues of $A_\alpha(l(G))$. Extremal graphs attaining some of these bounds are characterized.
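As a quick numerical illustration (not from the paper), one can assemble $A_\alpha(G)$ for a small graph and compute its spectrum; the $5$-cycle and the value of $\alpha$ below are arbitrary:

```python
# Minimal numerical illustration of the A_alpha matrix: build it for a small
# graph and compute its eigenvalues. The 5-cycle and alpha value are arbitrary.
import numpy as np

n, alpha = 5, 0.3
A = np.zeros((n, n))
for i in range(n):                     # adjacency matrix of the cycle C_5
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
D = np.diag(A.sum(axis=1))             # degree diagonal matrix
A_alpha = alpha * D + (1 - alpha) * A  # Nikiforov's convex combination
print(np.linalg.eigvalsh(A_alpha))     # A_alpha is symmetric, so eigvalsh applies
```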
Randomized matrix algorithms have become workhorse tools in scientific computing and machine learning. To use these algorithms safely in applications, they should be coupled with posterior error estimates to assess the quality of the output. To meet this need, this paper proposes two diagnostics: a leave-one-out error estimator for randomized low-rank approximations and a jackknife resampling method to estimate the variance of the output of a randomized matrix computation. Both of these diagnostics are rapid to compute for randomized low-rank approximation algorithms such as the randomized SVD and randomized Nystr\"om approximation, and they provide useful information for assessing the quality of the computed output and guiding algorithmic parameter choices.
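As a toy illustration of the jackknife idea (a naive recompute-everything version, not the paper's fast estimators), one can delete one column of the random test matrix at a time and measure the spread of the resulting randomized SVD approximations:

```python
# Naive jackknife for the randomized SVD: drop one column of the random test
# matrix at a time, recompute the approximation, and measure the spread of the
# replicates. The paper derives estimators that avoid this brute-force loop.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 100)) @ np.diag(0.8 ** np.arange(100)) \
    @ rng.standard_normal((100, 100))          # test matrix with decaying spectrum
k = 15
Omega = rng.standard_normal((100, k))          # Gaussian test matrix

def rand_svd_approx(Om):
    Q, _ = np.linalg.qr(A @ Om)                # orthonormal range basis
    return Q @ (Q.T @ A)                       # projection-based low-rank approx

reps = np.array([rand_svd_approx(np.delete(Omega, j, axis=1)) for j in range(k)])
mean_rep = reps.mean(axis=0)
jack_var = (k - 1) / k * sum(np.linalg.norm(R - mean_rep, "fro")**2 for R in reps)
print("jackknife std estimate:", np.sqrt(jack_var))
print("actual error          :", np.linalg.norm(A - rand_svd_approx(Omega), "fro"))
```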
The fixed length Levenshtein (FLL) distance between two words $\mathbf{x,y} \in \mathbb{Z}_m^n$ is the smallest integer $t$ such that $\mathbf{x}$ can be transformed into $\mathbf{y}$ by $t$ insertions and $t$ deletions. Determining the size of a ball in the FLL metric is a fundamental but challenging problem. Very recently, Bar-Lev, Etzion, and Yaakobi explicitly determined the minimum, maximum, and average sizes of the FLL balls with radius one. In this paper, building on these results, we further prove via Azuma's inequality that the size of a radius-one FLL ball is highly concentrated around its mean.
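A brute-force sketch (our illustration, at arbitrary small parameters) that enumerates the radius-one FLL ball and reports the minimum, maximum, and average sizes over all centers:

```python
# Enumerate the radius-one FLL ball around a word x in Z_m^n: all words
# reachable by at most one deletion followed by one insertion.
from itertools import product

def fll_ball_radius_one(x, m):
    ball = {x}                               # distance 0: the word itself
    n = len(x)
    for i in range(n):                       # delete the symbol at position i
        shorter = x[:i] + x[i+1:]
        for j in range(n):                   # insert symbol s at position j
            for s in range(m):
                ball.add(shorter[:j] + (s,) + shorter[j:])
    return ball

m, n = 3, 5                                  # small illustrative parameters
sizes = [len(fll_ball_radius_one(x, m)) for x in product(range(m), repeat=n)]
print("min/max/avg ball size:", min(sizes), max(sizes), sum(sizes) / len(sizes))
```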
We study the problem of selecting $k$ experiments from a larger candidate pool, where the goal is to maximize the mutual information (MI) between the selected subset and the underlying parameters. Finding the exact solution to this combinatorial optimization problem is computationally costly, not only because of the complexity of the combinatorial search but also because of the difficulty of evaluating MI in nonlinear/non-Gaussian settings. We propose greedy approaches based on new computationally inexpensive lower bounds for MI, constructed via log-Sobolev inequalities. We demonstrate that our method outperforms random selection strategies, Gaussian approximations, and nested Monte Carlo (NMC) estimators of MI in various settings, including optimal design for nonlinear models with non-additive noise.
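To make the greedy loop concrete, here is a sketch in the linear-Gaussian case, where MI has a closed form that stands in for the paper's log-Sobolev lower bounds; all parameter values below are arbitrary:

```python
# Greedy experiment selection in the linear-Gaussian setting: with prior
# theta ~ N(0, I) and observations y_i = x_i^T theta + N(0, sigma^2),
# MI(y_S; theta) = 0.5 * logdet(I + X_S X_S^T / sigma^2). This closed-form
# oracle plays the role that cheaper MI lower bounds play in general.
import numpy as np

rng = np.random.default_rng(0)
N, d, k, sigma2 = 50, 5, 10, 0.5
X = rng.standard_normal((N, d))                     # candidate experiments

def mi(S):
    Xs = X[list(S)]
    return 0.5 * np.linalg.slogdet(np.eye(len(S)) + Xs @ Xs.T / sigma2)[1]

S = set()
for _ in range(k):                                  # greedily add the best gain
    best = max(set(range(N)) - S, key=lambda i: mi(S | {i}))
    S.add(best)
print(sorted(S), mi(S))
```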
Let $1<t<n$ be integers, where $t$ is a divisor of $n$. An R-$q^t$-partially scattered polynomial is an $\mathbb F_q$-linearized polynomial $f$ in $\mathbb F_{q^n}[X]$ such that, for all $x,y\in\mathbb F_{q^n}^*$ with $x/y\in\mathbb F_{q^t}$, if $f(x)/x=f(y)/y$, then $x/y\in\mathbb F_q$; $f$ is called scattered if this implication holds for all $x,y\in\mathbb F_{q^n}^*$. Two polynomials in $\mathbb F_{q^n}[X]$ are said to be equivalent if their graphs are in the same orbit under the action of the group $\Gamma L(2,q^n)$. For $n>8$, only three families of scattered polynomials in $\mathbb F_{q^n}[X]$ are known: $(i)$~monomials of pseudoregulus type, $(ii)$~binomials of Lunardon-Polverino type, and $(iii)$~a family of quadrinomials defined in [9] and extended in [7,12]. In this paper we prove that the polynomial $\varphi_{m,q^J}=X^{q^{J(t-1)}}+X^{q^{J(2t-1)}}+m(X^{q^J}-X^{q^{J(t+1)}})\in\mathbb F_{q^{2t}}[X]$, with $q$ odd and $t\ge3$, is R-$q^t$-partially scattered for every value of $m\in\mathbb F_{q^t}^*$ and every $J$ coprime with $2t$. Moreover, for every $t>4$ and $q>5$ there exist values of $m$ for which $\varphi_{m,q}$ is scattered and new with respect to the polynomials mentioned in $(i)$, $(ii)$ and $(iii)$ above. The related linear sets are of $\Gamma L$-class at least two.
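The defining condition can be checked by brute force at small parameters; the following sketch (our illustration, using the third-party `galois` package, with arbitrary small parameters $q=3$, $t=3$, $J=1$, so $n=2t=6$) tests the R-$q^t$-partially scattered property for one choice of $m$:

```python
# Brute-force check of the R-q^t-partially scattered condition for phi_{m,q^J}.
# The paper's theorem covers q odd, t >= 3, all m in F_{q^t}^*, gcd(J, 2t) = 1.
import galois

q, t, J = 3, 3, 1
n = 2 * t
F = galois.GF(q**n)
g = F.primitive_element

def phi(x, m):
    # phi_{m,q^J}(x) = x^{q^{J(t-1)}} + x^{q^{J(2t-1)}} + m (x^{q^J} - x^{q^{J(t+1)}})
    return (x ** (q ** (J * (t - 1))) + x ** (q ** (J * (2 * t - 1)))
            + m * (x ** (q ** J) - x ** (q ** (J * (t + 1)))))

sub_gen = g ** ((q**n - 1) // (q**t - 1))           # generator of F_{q^t}^*
units_t = [sub_gen ** i for i in range(q**t - 1)]   # all of F_{q^t}^*
units_1 = {int(u) for u in units_t if u ** q == u}  # F_q^* inside F_{q^n}
m = units_t[1]                                      # one admissible choice of m

ratio = {int(x): phi(x, m) / x for x in (g ** i for i in range(q**n - 1))}

violations = 0
for xi, rx in ratio.items():
    x = F(xi)
    for u in units_t:              # y = u*x, so x/y = u^{-1} lies in F_{q^t}
        if ratio[int(u * x)] == rx and int(u) not in units_1:
            violations += 1
print("violations:", violations)   # 0 confirms the property for this m
```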
Let $A$ be a square matrix with a given structure (e.g. real matrix, sparsity pattern, Toeplitz structure, etc.) and assume that it is unstable, i.e. at least one of its eigenvalues lies in the complex right half-plane. The problem of stabilizing $A$ consists in the computation of a matrix $B$ whose eigenvalues have negative real part and such that the perturbation $\Delta=B-A$ has minimal norm. The structured stabilization further requires that the perturbation preserve the structural pattern of $A$. We solve this non-convex problem by a two-level procedure which involves the computation of the stationary points of a matrix ODE. We exploit the underlying low-rank features of the problem by using an adaptive-rank integrator that closely follows the rank of the solution. We show the benefits derived from the low-rank setting in several numerical examples, which also allow us to deal with high-dimensional problems.
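For orientation only, here is a deliberately naive baseline that states the optimization problem directly (a crude penalty method with a derivative-free optimizer; nothing below is the paper's two-level low-rank ODE procedure, and the sparsity pattern is invented for the example):

```python
# Naive structured stabilization baseline: minimize ||Delta||_F over a fixed
# sparsity pattern while penalizing a positive spectral abscissa of A + Delta.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + 0.3 * np.eye(n)   # generically unstable example
mask = rng.random((n, n)) < 0.6                     # hypothetical structure to preserve
idx = np.where(mask)

def abscissa(M):
    return np.max(np.linalg.eigvals(M).real)        # spectral abscissa

def objective(v, penalty=1e3, margin=1e-2):
    Delta = np.zeros((n, n)); Delta[idx] = v
    return v @ v + penalty * max(abscissa(A + Delta) + margin, 0.0)

res = minimize(objective, np.zeros(idx[0].size), method="Powell")
Delta = np.zeros((n, n)); Delta[idx] = res.x
print("abscissa(A)      :", abscissa(A))
print("abscissa(A+Delta):", abscissa(A + Delta))
print("||Delta||_F      :", np.linalg.norm(Delta))
```

The spectral abscissa is nonsmooth, which is why a derivative-free method is used here; the paper's ODE-based approach exploits the low-rank structure of the optimal perturbation instead.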
Machine learning (ML) algorithms can often differ in performance across domains. Understanding $\textit{why}$ their performance differs is crucial for determining what types of interventions (e.g., algorithmic or operational) are most effective at closing the performance gaps. Existing methods focus on $\textit{aggregate decompositions}$ of the total performance gap into the impact of a shift in the distribution of features $p(X)$ versus the impact of a shift in the conditional distribution of the outcome $p(Y|X)$; however, such coarse explanations offer only a few options for how one can close the performance gap. $\textit{Detailed variable-level decompositions}$ that quantify the importance of each variable to each term in the aggregate decomposition can provide a much deeper understanding and suggest much more targeted interventions. However, existing methods assume knowledge of the full causal graph or make strong parametric assumptions. We introduce a nonparametric hierarchical framework that provides both aggregate and detailed decompositions for explaining why the performance of an ML algorithm differs across domains, without requiring causal knowledge. We derive debiased, computationally efficient estimators and statistical inference procedures for asymptotically valid confidence intervals.
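As a minimal illustration of an aggregate decomposition (a standard importance-weighting construction, not the paper's hierarchical framework), one can split a performance gap into a $p(X)$-shift term and a $p(Y|X)$-shift term using a domain classifier; the data below are synthetic stand-ins:

```python
# Aggregate decomposition of a source-to-target performance gap via density
# ratios: covariate-shift term + conditional-shift term = total gap.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X_src = rng.normal(0.0, 1, (2000, 3)); X_tgt = rng.normal(0.5, 1, (2000, 3))
loss_src = (X_src[:, 0] > 0) * 0.3 + 0.1     # per-example losses (synthetic)
loss_tgt = (X_tgt[:, 0] > 0) * 0.5 + 0.1

# Density ratio w(x) = p_tgt(x) / p_src(x) estimated with a domain classifier.
clf = LogisticRegression().fit(
    np.vstack([X_src, X_tgt]),
    np.r_[np.zeros(len(X_src)), np.ones(len(X_tgt))])
p = clf.predict_proba(X_src)[:, 1]
w = p / (1 - p) * len(X_src) / len(X_tgt)

total_gap = loss_tgt.mean() - loss_src.mean()
covariate_term = np.mean(w * loss_src) - loss_src.mean()   # shift in p(X)
conditional_term = total_gap - covariate_term              # shift in p(Y|X)
print(total_gap, covariate_term, conditional_term)
```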
New low-order $H(\textrm{div})$-conforming finite elements for symmetric tensors are constructed in arbitrary dimension. The space of shape functions is defined by enriching the symmetric quadratic polynomial space with the $(d+1)$-order normal-normal face bubble space. The reduced counterpart has only $d(d+1)^2$ degrees of freedom. Basis functions are explicitly given in terms of barycentric coordinates. Low-order conforming finite element elasticity complexes, starting from the Bell element, are developed in two dimensions. These finite elements for symmetric tensors are applied to devise robust mixed finite element methods for the linear elasticity problem, which possess uniform error estimates with respect to the Lam\'{e} coefficient $\lambda$ and superconvergence for the displacement. Numerical results are provided to verify the theoretical convergence rates.
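For concreteness, a quick dimension count (standard arithmetic, not from the paper): the space of symmetric-tensor-valued quadratic polynomials in $d$ variables has dimension $\frac{d(d+1)}{2}\cdot\frac{(d+1)(d+2)}{2}$ ($18$ for $d=2$, $60$ for $d=3$), while the reduced element's count evaluates to
\[
d(d+1)^2 = \begin{cases} 18, & d = 2,\\ 48, & d = 3.\end{cases}
\]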
We study strong approximation of scalar additive noise driven stochastic differential equations (SDEs) at time point $1$ in the case that the drift coefficient is bounded and has Sobolev regularity $s\in(0,1)$. Recently, it has been shown in [arXiv:2101.12185v2 (2022)] that for such SDEs the equidistant Euler approximation achieves an $L^2$-error rate of at least $(1+s)/2$, up to an arbitrarily small $\varepsilon$, in terms of the number of evaluations of the driving Brownian motion $W$. In the present article we prove a matching lower error bound for $s\in(1/2,1)$. More precisely, we show that, for every $s\in(1/2,1)$, the $L^2$-error rate $(1+s)/2$ cannot, up to a logarithmic term, be improved in general by any numerical method based on finitely many evaluations of $W$ at fixed time points. Up to now, this result was known in the literature only for the cases $s=1/2-$ and $s=1-$. For the proof we employ the coupling of noise technique recently introduced in [arXiv:2010.00915 (2020)] to bound the $L^2$-error of an arbitrary approximation from below by the $L^2$-distance of two occupation time functionals, built from a specifically chosen drift coefficient with Sobolev regularity $s$ and two solutions of the corresponding SDE with coupled driving Brownian motions. For the analysis of the latter distance we employ a transformation of the original SDE to overcome the problem of correlated increments of the difference of the two coupled solutions, occupation time estimates to cope with the lack of regularity of the chosen drift coefficient around the point $0$, and scaling properties of the drift coefficient.
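For illustration, a minimal equidistant Euler scheme for an additive-noise SDE $dX_t = b(X_t)\,dt + dW_t$ on $[0,1]$, with an arbitrary bounded drift standing in for the paper's construction; the fine-grid scheme serves as a reference solution:

```python
# Equidistant Euler scheme for dX_t = b(X_t) dt + dW_t, comparing a coarse
# discretization against a fine-grid reference driven by the same Brownian path.
import numpy as np

rng = np.random.default_rng(0)
b = lambda x: np.clip(x, -1.0, 1.0)     # bounded drift (illustrative choice)

def euler(n_steps, dW):
    # dW: Brownian increments on the finest grid; coarsen them for the scheme.
    m = len(dW) // n_steps
    x = 0.0
    for k in range(n_steps):
        x += b(x) / n_steps + dW[k * m:(k + 1) * m].sum()
    return x

n_fine, n_mc = 2**12, 500
errs = []
for _ in range(n_mc):
    dW = rng.normal(0, np.sqrt(1 / n_fine), n_fine)
    errs.append((euler(n_fine, dW) - euler(2**4, dW)) ** 2)
print("empirical L2 distance at time 1:", np.sqrt(np.mean(errs)))
```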
We perturb a real matrix $A$ of full column rank and derive lower bounds for the smallest singular values of the perturbed matrix, in terms of normwise absolute perturbations. Our bounds, which extend existing lower-order expressions, demonstrate the potential increase in the smallest singular values, and represent a qualitative model for the increase in the small singular values after a matrix has been downcast to a lower arithmetic precision. Numerical experiments confirm the qualitative validity of this model and its ability to predict singular value changes in the presence of decreased arithmetic precision.
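A quick numerical illustration of the downcasting scenario (our sketch, not the paper's experiments): rounding a tall full-rank matrix through half precision perturbs it, and the smallest singular value can increase as a result.

```python
# Downcast a matrix with tiny trailing singular values to fp16 and compare
# the smallest singular values before and after the round trip.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 20)) @ np.diag(np.logspace(0, -6, 20))
A16 = A.astype(np.float16).astype(np.float64)   # round-trip through fp16

print("||Delta||_2    :", np.linalg.norm(A16 - A, 2))
print("sigma_min(A)   :", np.linalg.svd(A, compute_uv=False)[-1])
print("sigma_min(A16) :", np.linalg.svd(A16, compute_uv=False)[-1])
```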