In this work, for a given oriented graph $D$, we study its interval and hull numbers, denoted by ${in}(D)$ and ${hn}(D)$, respectively, in the geodetic, ${P_3}$ and ${P_3^*}$ convexities. We believe the last of these to be formally defined and first studied in this paper, although its undirected version is well known in the literature. Concerning bounds, for a strongly oriented graph $D$, we prove that ${hn_g}(D)\leq m(D)-n(D)+2$ and that there is a strongly oriented graph $D$ such that ${hn_g}(D) = m(D)-n(D)$. We also determine exact values for the hull numbers in these three convexities for tournaments, which imply polynomial-time algorithms to compute them. These results allow us to deduce polynomial-time algorithms to compute ${hn_{P_3}}(D)$ when the underlying graph of $D$ is split or cobipartite. Moreover, we provide a meta-theorem by proving that if deciding whether ${in_g}(D)\leq k$ or ${hn_g}(D)\leq k$ is NP-hard or W[i]-hard parameterized by $k$, for some $i\in\mathbb{Z}_+^*$, then the same holds even if the underlying graph of $D$ is bipartite. Next, we prove that deciding whether ${hn_{P_3}}(D)\leq k$ or ${hn_{P_3^*}}(D)\leq k$ is W[2]-hard parameterized by $k$, even if the underlying graph of $D$ is bipartite; that deciding whether ${in_{P_3}}(D)\leq k$ or ${in_{P_3^*}}(D)\leq k$ is NP-complete, even if $D$ has no directed cycles and the underlying graph of $D$ is a chordal bipartite graph; and that deciding whether ${in_{P_3}}(D)\leq k$ or ${in_{P_3^*}}(D)\leq k$ is W[2]-hard parameterized by $k$, even if the underlying graph of $D$ is split. We also argue that the interval and hull numbers in the oriented $P_3$ and $P_3^*$ convexities can be computed in polynomial time for graphs of bounded tree-width by using Courcelle's theorem.
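To illustrate the kind of closure operator involved, the sketch below computes the hull of a vertex set in the oriented $P_3$ convexity, assuming the usual interval operator in which a vertex is generated as soon as it has both an in-neighbour and an out-neighbour in the current set (the $P_3^*$ and geodetic cases would differ); the digraph and all names are illustrative only.

def p3_hull(vertices, arcs, S):
    """Close S under the oriented P_3 interval operator.

    A vertex z is added whenever some u in the current set has the arc (u, z)
    and some v in the current set has the arc (z, v), i.e. z is the middle
    vertex of a directed P_3 whose endpoints are already in the set.
    """
    hull = set(S)
    changed = True
    while changed:
        changed = False
        for z in vertices:
            if z in hull:
                continue
            has_in = any((u, z) in arcs for u in hull)
            has_out = any((z, v) in arcs for v in hull)
            if has_in and has_out:
                hull.add(z)
                changed = True
    return hull

# Directed 4-cycle 0 -> 1 -> 2 -> 3 -> 0: the hull of {0, 2} is every vertex.
arcs = {(0, 1), (1, 2), (2, 3), (3, 0)}
print(p3_hull(range(4), arcs, {0, 2}))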
A strong blocking set in a finite projective space is a set of points that intersects each hyperplane in a spanning set. We provide a new graph theoretic construction of such sets: combining constant-degree expanders with asymptotically good codes, we explicitly construct strong blocking sets in the $(k-1)$-dimensional projective space over $\mathbb{F}_q$ that have size $O( q k )$. Since strong blocking sets have recently been shown to be equivalent to minimal linear codes, our construction gives the first explicit construction of $\mathbb{F}_q$-linear minimal codes of length $n$ and dimension $k$, for every prime power $q$, for which $n = O (q k)$. This solves one of the main open problems on minimal codes.
Consider the Toeplitz matrix $T_n(f)$ generated by the symbol $f(\theta)=\hat{f}_r e^{\mathbf{i}r\theta}+\hat{f}_0+\hat{f}_{-s} e^{-\mathbf{i}s\theta}$, where $\hat{f}_r, \hat{f}_0, \hat{f}_{-s} \in \mathbb{C}$ and $0<r<n,~0<s<n$. For $r=s=1$ we have the classical tridiagonal Toeplitz matrices, for which the eigenvalues and eigenvectors are known. Similarly, the eigendecompositions are known for $1<r=s$, when the generated matrices are ``symmetrically sparse tridiagonal''. In the current paper we study the eigenvalues of $T_n(f)$ for $1\leq r<s$, in which case the matrices are ``non-symmetrically sparse tridiagonal''. We propose an algorithm which constructs one or two ad hoc matrices smaller than $T_n(f)$, whose eigenvalues are sufficient for determining the full spectrum of $T_n(f)$. The algorithm is explained through the use of a conjecture; examples and numerical experiments are reported to support it and to clarify the presentation. Open problems are briefly discussed.
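For concreteness, a brief numerical sketch of the matrices in question (not the reduction algorithm itself): the hypothetical helper below builds $T_n(f)$ for given $r<s$ and coefficients, using the standard convention that the $(j,k)$-entry of $T_n(f)$ is the Fourier coefficient $\hat{f}_{j-k}$, and computes its spectrum with a dense eigensolver, which can serve as a brute-force reference when checking the smaller ad hoc matrices.

import numpy as np

def toeplitz_sparse_tridiagonal(n, r, s, f_r, f_0, f_ms):
    """Build T_n(f) for f(theta) = f_r e^{i r theta} + f_0 + f_ms e^{-i s theta}.

    Only the diagonals with offsets j - k = r, 0 and j - k = -s are nonzero.
    """
    T = np.zeros((n, n), dtype=complex)
    np.fill_diagonal(T, f_0)
    T += np.diag(np.full(n - r, f_r), -r)   # sub-diagonal at offset r
    T += np.diag(np.full(n - s, f_ms), s)   # super-diagonal at offset s
    return T

# Reference spectrum by a dense O(n^3) eigensolver, e.g. r = 1, s = 2:
T = toeplitz_sparse_tridiagonal(12, 1, 2, 1.0, 0.0, 1.0)
print(np.sort_complex(np.linalg.eigvals(T)))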
We show that most structured prediction problems can be solved in linear time and space by considering them as partial orderings of the tokens in the input string. Our method computes real numbers for each token in an input string and sorts the tokens accordingly, resulting in as few as 2 total orders of the tokens in the string. Each total order possesses a set of edges oriented from smaller to greater tokens. The intersection of total orders results in a partial order over the set of input tokens, which is then decoded into a directed graph representing the desired structure. Experiments show that our method achieves 95.4 LAS and 96.9 UAS by using an intersection of 2 total orders, 95.7 LAS and 97.1 UAS with 4 on the English Penn Treebank dependency parsing benchmark. Our method is also the first linear-complexity coreference resolution model and achieves 79.2 F1 on the English OntoNotes benchmark, which is comparable with state of the art.
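As a toy illustration of the decoding idea (a sketch of the general recipe above, not the trained model): given one real score per token for each of $k$ total orders, an arc $(i,j)$ survives in the partial order exactly when token $i$ precedes token $j$ in every order. All names below are hypothetical.

import numpy as np

def partial_order_from_scores(score_matrix):
    """score_matrix[m, i] = real number assigned to token i in the m-th total order.

    Returns the directed edges (i, j) such that token i precedes token j
    in *every* total order, i.e. the intersection of the orders.
    """
    k, n = score_matrix.shape
    edges = set()
    for i in range(n):
        for j in range(n):
            if i != j and np.all(score_matrix[:, i] < score_matrix[:, j]):
                edges.add((i, j))
    return edges

# Two total orders over 4 tokens; their intersection leaves tokens 1 and 2 incomparable.
scores = np.array([[0.1, 0.5, 0.7, 0.9],
                   [0.2, 0.8, 0.4, 0.9]])
print(sorted(partial_order_from_scores(scores)))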
Reasoning about system dynamics is one of the most important analytical approaches in many scientific studies. With the initial state of a system as input, recent graph neural network (GNN)-based methods are capable of predicting, with high accuracy, a future state that is distant in time. Although these methods differ in how they model the coordinates and interacting forces of the system, we show that they actually share a common paradigm: learning the integral of the velocity over the interval between the initial and terminal coordinates, with an integrand that is constant w.r.t. time. Inspired by this observation, we propose a new approach that predicts the integral from several velocity estimates using Newton-Cotes formulas, and we prove its effectiveness theoretically. Extensive experiments on several benchmarks empirically demonstrate consistent and significant improvement compared with the state-of-the-art methods.
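To make the contrast with the constant-integrand baseline concrete, here is a small sketch (not the GNN itself, and the function names and 3-point rule are illustrative choices): displacement over a time interval estimated (i) from the initial velocity only and (ii) by weighting several intermediate velocity estimates with Newton-Cotes (here Simpson) weights.

import numpy as np

def displacement_constant(v0, dt):
    """Baseline: the integrand is constant in time."""
    return v0 * dt

def displacement_newton_cotes(velocities, dt):
    """Closed Newton-Cotes rule on equally spaced velocity estimates.

    `velocities` has shape (m + 1, dim): estimates at t0, t0 + dt/m, ..., t0 + dt.
    Only Simpson's rule (m = 2) is sketched; other orders would be analogous.
    """
    m = len(velocities) - 1
    if m != 2:
        raise NotImplementedError("only the 3-point rule is sketched here")
    weights = np.array([1.0, 4.0, 1.0]) * (dt / m) / 3.0   # Simpson: (h/3)(v0 + 4 v1 + v2)
    return np.einsum("i,id->d", weights, np.asarray(velocities))

# Particle with velocity v(t) = (t, 1): exact displacement over [0, 1] is (0.5, 1).
vel = np.array([[0.0, 1.0], [0.5, 1.0], [1.0, 1.0]])
print(displacement_constant(vel[0], 1.0))    # [0. 1.]   constant integrand misses the drift
print(displacement_newton_cotes(vel, 1.0))   # [0.5 1. ] Simpson's rule is exact here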
The success of over-parameterized neural networks trained to near-zero training error has caused great interest in the phenomenon of benign overfitting, where estimators are statistically consistent even though they interpolate noisy training data. While benign overfitting in fixed dimension has been established for some learning methods, current literature suggests that for regression with typical kernel methods and wide neural networks, benign overfitting requires a high-dimensional setting where the dimension grows with the sample size. In this paper, we show that the smoothness of the estimators, and not the dimension, is the key: benign overfitting is possible if and only if the estimator's derivatives are large enough. We generalize existing inconsistency results to non-interpolating models and more kernels to show that benign overfitting with moderate derivatives is impossible in fixed dimension. Conversely, we show that benign overfitting is possible for regression with a sequence of spiky-smooth kernels with large derivatives. Using neural tangent kernels, we translate our results to wide neural networks. We prove that while infinite-width networks do not overfit benignly with the ReLU activation, this can be fixed by adding small high-frequency fluctuations to the activation function. Our experiments verify that such neural networks, while overfitting, can indeed generalize well even on low-dimensional data sets.
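A hedged illustration of the kind of modification referred to above (the exact form, amplitude and frequency of the fluctuation are placeholder choices, not those analyzed in the paper): a ReLU activation perturbed by a small high-frequency oscillation, which barely changes the function values but makes the derivative large.

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def spiky_relu(x, amplitude=1e-2, frequency=1e3):
    """ReLU plus a small high-frequency fluctuation (illustrative parameters only).

    The values stay within `amplitude` of ReLU, but the derivative picks up a
    term of size amplitude * frequency, i.e. it becomes large.
    """
    return relu(x) + amplitude * np.sin(frequency * x)

x = np.linspace(-1.0, 1.0, 5)
print(relu(x))
print(spiky_relu(x))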
Subspace codes have important applications in random network coding. It is interesting to construct subspace codes whose sizes and minimum distances are both as large as possible. In particular, cyclic constant dimension subspace codes have additional properties which can be used to make encoding and decoding more efficient. In this paper, we construct large cyclic constant dimension subspace codes with minimum distances $2k-2$ and $2k$. These codes are contained in $\mathcal{G}_q(n, k)$, where $\mathcal{G}_q(n, k)$ denotes the set of all $k$-dimensional subspaces of $\mathbb{F}_{q^n}$. Consequently, some results in \cite{FW}, \cite{NXG}, and \cite{ZT} are extended.
We analyze to what extent final users can infer information about the level of protection of their data when the data obfuscation mechanism is a priori unknown to them (the so-called ``black-box'' scenario). In particular, we investigate two notions of local differential privacy (LDP), namely $\epsilon$-LDP and R\'enyi LDP. On one hand, we prove that, without any assumption on the underlying distributions, it is not possible to have an algorithm able to infer the level of data protection with provable guarantees; this result also holds for the central versions of the two notions of DP considered. On the other hand, we demonstrate that, under reasonable assumptions (namely, Lipschitzness of the involved densities on a closed interval), such guarantees exist and can be achieved by a simple histogram-based estimator. We validate our results experimentally and we note that, on a particularly well-behaved distribution (namely, the Laplace noise), our method gives even better results than expected, in the sense that in practice the number of samples needed to achieve the desired confidence is smaller than the theoretical bound, and the estimation of $\epsilon$ is more precise than predicted.
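A minimal sketch of a histogram-based estimator of $\epsilon$ in the black-box setting, assuming the mechanism can be queried repeatedly on two adjacent inputs; the sample size, number of bins and all names below are illustrative and are not the values from our theoretical analysis.

import numpy as np

def estimate_epsilon(mechanism, x0, x1, n_samples=100_000, bins=50, seed=None):
    """Estimate eps as the maximum over bins of |log P(M(x0) in bin) - log P(M(x1) in bin)|.

    Both output distributions are histogrammed on a common closed interval;
    bins in which either empirical probability vanishes are skipped.
    """
    rng = np.random.default_rng(seed)
    a = np.array([mechanism(x0, rng) for _ in range(n_samples)])
    b = np.array([mechanism(x1, rng) for _ in range(n_samples)])
    edges = np.linspace(min(a.min(), b.min()), max(a.max(), b.max()), bins + 1)
    pa = np.histogram(a, edges)[0] / n_samples
    pb = np.histogram(b, edges)[0] / n_samples
    mask = (pa > 0) & (pb > 0)
    return np.max(np.abs(np.log(pa[mask]) - np.log(pb[mask])))

# Laplace mechanism with scale 1 on inputs at distance 1: the true eps is 1.
lap = lambda x, rng: x + rng.laplace(scale=1.0)
print(estimate_epsilon(lap, 0.0, 1.0))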
Numerical differentiation of a function, contaminated with noise, over the unit interval $[0,1] \subset \mathbb{R}$ by inverting the simple integration operator $J:L^2([0,1]) \to L^2([0,1])$ defined as $[Jx](s):=\int_0^s x(t)\, dt$ is discussed extensively in the literature. The complete singular system of the compact operator $J$ is explicitly known, with singular values $\sigma_n(J)$ asymptotically proportional to $1/n$, which indicates a degree \emph{one} of ill-posedness for this inverse problem. We recall the concept of the degree of ill-posedness for linear operator equations with compact forward operators in Hilbert spaces. In contrast to the one-dimensional case with operator $J$, there is little material available about the analysis of the $d$-dimensional case, where the compact integral operator $J_d:L^2([0,1]^d) \to L^2([0,1]^d)$ defined as $[J_d\,x](s_1,\ldots,s_d):=\int_0^{s_1}\ldots\int_0^{s_d} x(t_1,\ldots,t_d)\, dt_d\ldots dt_1$ over the unit $d$-cube is to be inverted. This inverse problem of mixed differentiation $x(s_1,\ldots,s_d)=\frac{\partial^d}{\partial s_1 \ldots \partial s_d} y(s_1,\ldots ,s_d)$ is of practical interest, for example when in statistics copula densities have to be recovered from empirical copulas over $[0,1]^d \subset \mathbb{R}^d$. In this note, we prove that the non-increasingly ordered singular values $\sigma_n(J_d)$ of the operator $J_d$ have asymptotics of the form $\frac{(\log n)^{d-1}}{n}$, which shows that the degree of ill-posedness stays at one, even though an additional logarithmic factor occurs. Some further discussion refers to the special case $d=2$, characterizing the range $\mathcal{R}(J_2)$ of the operator $J_2$.
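For reference, the well-known singular system $\{(\sigma_n, v_n, u_n)\}_{n \in \mathbb{N}}$ of $J$, with $J v_n = \sigma_n u_n$, reads
\[
\sigma_n(J)=\frac{1}{\left(n-\tfrac{1}{2}\right)\pi},\qquad
v_n(t)=\sqrt{2}\,\cos\!\left(\left(n-\tfrac{1}{2}\right)\pi t\right),\qquad
u_n(s)=\sqrt{2}\,\sin\!\left(\left(n-\tfrac{1}{2}\right)\pi s\right),\qquad n=1,2,\ldots,
\]
so that indeed $\sigma_n(J)$ is asymptotically proportional to $1/n$.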
We introduce and analyze a new finite-difference scheme, relying on the theta-method, for solving monotone second-order mean field games. These games consist of a coupled system of a Fokker-Planck equation and a Hamilton-Jacobi-Bellman equation. The theta-method is used for discretizing the diffusion terms: we approximate them with a convex combination of an implicit and an explicit term. In contrast, we use an explicit centered scheme for the first-order terms. Assuming that the running cost is strongly convex and regular, we first prove the monotonicity and the stability of our theta-scheme, under a CFL condition. Taking advantage of the regularity of the solution of the continuous problem, we estimate the consistency error of the theta-scheme. Our main result is a convergence rate of order $\mathcal{O}(h^r)$ for the theta-scheme, where $h$ is the step length of the space variable and $r \in (0,1)$ is related to the H\"older continuity of the solution of the continuous problem and some of its derivatives.
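To fix ideas, a minimal sketch of the theta-method applied to the pure diffusion part only (the coupling with the Fokker-Planck equation, the explicit centered first-order terms, and the grid sizes below are omitted or chosen as placeholders): each step mixes an implicit and an explicit discrete Laplacian with weights theta and 1 - theta.

import numpy as np

def theta_step(u, nu, dt, h, theta):
    """One theta-method step for u_t = nu * u_xx on a periodic grid.

    The diffusion term is a convex combination of an implicit part
    (weight theta) and an explicit part (weight 1 - theta).
    """
    n = len(u)
    lap = (np.roll(np.eye(n), 1, axis=1) - 2 * np.eye(n)
           + np.roll(np.eye(n), -1, axis=1)) / h**2      # periodic discrete Laplacian
    lhs = np.eye(n) - dt * nu * theta * lap
    rhs = (np.eye(n) + dt * nu * (1 - theta) * lap) @ u
    return np.linalg.solve(lhs, rhs)

n, nu, theta = 50, 0.1, 0.5             # theta = 1/2 is Crank-Nicolson
h, dt = 1.0 / n, 0.5 * (1.0 / n) ** 2   # a CFL-type restriction for the explicit part
u = np.sin(2 * np.pi * np.arange(n) * h)
for _ in range(100):
    u = theta_step(u, nu, dt, h, theta)
print(np.max(np.abs(u)))                # the amplitude decays, as expected for diffusion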
We study the problem of identifying a linear dynamical system from a single trajectory, under isotropic Gaussian excitations. In stark contrast with previously reported results, the Ordinary Least Squares (OLS) estimator has a non-vanishing error in \emph{high dimensions} even for \emph{stable} dynamical systems; this stems from the fact that realizations of non-diagonalizable dynamics can have strong \emph{spatial correlations} and a variance of order $O(e^{n})$, where $n$ is the dimension of the underlying state space. Employing the \emph{concentration of measure phenomenon}, in particular a tensorization of \emph{Talagrand's inequality} for random dynamical systems, we show that an observed trajectory of the dynamical system of length $N$ can have a variance of order $O(e^{nN})$. Consequently, we show that some or most of the $n$ distances between an $N$-dimensional random vector and an $(n-1)$-dimensional hyperplane in $\mathbb{R}^{N}$ can be close to zero with positive probability, and these estimates become stronger in high dimensions and with more iterations via \emph{isoperimetry}. The \emph{negative second moment identity}, together with these distance estimates, gives control on all the singular values of the \emph{random matrix} of data, revealing limitations of OLS for stable non-diagonalizable and explosive diagonalizable systems.
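For context, the OLS estimator under discussion, sketched on a synthetic trajectory; the system matrix, dimensions and trajectory length below are placeholders chosen only to illustrate the setup (the point of the abstract is that for non-diagonalizable $A$ the error can fail to vanish in high dimensions).

import numpy as np

def ols_estimate(X):
    """OLS estimate of A from a trajectory x_{t+1} = A x_t + w_t.

    X has shape (N + 1, n); returns argmin_A sum_t ||x_{t+1} - A x_t||^2,
    i.e. A_hat = (sum x_{t+1} x_t^T)(sum x_t x_t^T)^{-1}, computed via least squares.
    """
    past, future = X[:-1], X[1:]
    return np.linalg.lstsq(past, future, rcond=None)[0].T

rng = np.random.default_rng(0)
n, N = 5, 2000
A = 0.5 * np.eye(n) + 0.9 * np.eye(n, k=1)   # stable (spectral radius 0.5) but non-diagonalizable
X = np.zeros((N + 1, n))
for t in range(N):
    X[t + 1] = A @ X[t] + rng.standard_normal(n)   # isotropic Gaussian excitation
print(np.linalg.norm(ols_estimate(X) - A))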