Aboulker et al. proved that a digraph with large enough dichromatic number contains any fixed digraph as a subdivision. The dichromatic number of a digraph is the smallest order of a partition of its vertex set into acyclic induced subdigraphs. A digraph is dicritical if the removal of any arc or vertex decreases its dichromatic number. In this paper, we give sufficient conditions on a dicritical digraph of large order or large directed girth to contain a given digraph as a subdivision. In particular, we prove that (i) for all integers $k,\ell$, large enough dicritical digraphs with dichromatic number $k$ contain an orientation of a cycle with at least $\ell$ vertices; (ii) there are functions $f,g$ such that for every subdivision $F^*$ of a digraph $F$, digraphs with directed girth at least $f(F^*)$ and dichromatic number at least $g(F)$ contain a subdivision of $F^*$, and if $F$ is a tree, then $g(F)=|V(F)|$; (iii) there is a function $f$ such that for every subdivision $F^*$ of $TT_3$ (the transitive tournament on three vertices), digraphs with directed girth at least $f(F^*)$ and minimum out-degree at least $2$ contain $F^*$ as a subdivision.
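To make the definition of the dichromatic number concrete, the following is a minimal brute-force sketch, assuming loopless digraphs on vertices `0..n-1` given as arc lists. The function names `is_acyclic` and `dichromatic_number` are illustrative choices, and the exhaustive search is only practical for very small digraphs; it is not a method from the abstract.

```python
from itertools import product

def is_acyclic(vertices, arcs):
    """Check acyclicity of the subdigraph induced by `vertices`.

    Repeatedly removes vertices with no incoming arc; a cycle exists
    exactly when some non-empty set of vertices has no such source.
    """
    vertices = set(vertices)
    arcs = {(u, v) for (u, v) in arcs if u in vertices and v in vertices}
    while vertices:
        sources = {v for v in vertices if not any(head == v for (_, head) in arcs)}
        if not sources:
            return False
        vertices -= sources
        arcs = {(u, v) for (u, v) in arcs if u in vertices and v in vertices}
    return True

def dichromatic_number(n, arcs):
    """Smallest k such that vertices 0..n-1 split into k acyclic induced parts."""
    if n == 0:
        return 0
    for k in range(1, n + 1):
        for coloring in product(range(k), repeat=n):
            classes = ([v for v in range(n) if coloring[v] == c] for c in range(k))
            if all(is_acyclic(cls, arcs) for cls in classes):
                return k
    raise ValueError("no valid partition exists (does the digraph contain a loop?)")

directed_triangle = [(0, 1), (1, 2), (2, 0)]
tt3 = [(0, 1), (0, 2), (1, 2)]  # transitive tournament on three vertices
print(dichromatic_number(3, directed_triangle), dichromatic_number(3, tt3))
```

The directed triangle needs two parts because the whole vertex set induces a directed cycle, whereas $TT_3$ is itself acyclic and needs only one.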
Randomized matrix algorithms have become workhorse tools in scientific computing and machine learning. To use these algorithms safely in applications, they should be coupled with posterior error estimates to assess the quality of the output. To meet this need, this paper proposes two diagnostics: a leave-one-out error estimator for randomized low-rank approximations and a jackknife resampling method to estimate the variance of the output of a randomized matrix computation. Both of these diagnostics are rapid to compute for randomized low-rank approximation algorithms such as the randomized SVD and randomized Nystr\"om approximation, and they provide useful information that can be used to assess the quality of the computed output and guide algorithmic parameter choices.
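As an illustration of the leave-one-out idea, the sketch below assumes a basic randomized range finder with a Gaussian test matrix and estimates the Frobenius-norm error by removing one test vector at a time and checking how well the reduced basis captures the held-out sample. It assumes NumPy is available. The function names and the test matrix are hypothetical, and the loop is deliberately naive: it recomputes a QR factorization per column and does not reproduce the rapid formulas the abstract refers to.

```python
import numpy as np

def randomized_range(A, Omega):
    """Orthonormal basis Q for range(A @ Omega), the core step of a randomized SVD."""
    Q, _ = np.linalg.qr(A @ Omega)
    return Q

def loo_error_estimate(A, Omega):
    """Naive leave-one-out estimate of the Frobenius error ||A - Q Q^T A||_F.

    For each column omega_i of the test matrix, rebuild the basis without it
    and measure the part of A @ omega_i that the reduced basis misses.
    """
    s = Omega.shape[1]
    total = 0.0
    for i in range(s):
        Q_i = randomized_range(A, np.delete(Omega, i, axis=1))
        y = A @ Omega[:, i]
        residual = y - Q_i @ (Q_i.T @ y)
        total += residual @ residual
    return np.sqrt(total / s)

rng = np.random.default_rng(0)
m, n, s = 60, 40, 10
# Assumed test matrix with geometrically decaying singular values.
U, _ = np.linalg.qr(rng.standard_normal((m, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (U * 0.5 ** np.arange(n)) @ V.T
Omega = rng.standard_normal((n, s))

Q = randomized_range(A, Omega)
true_err = np.linalg.norm(A - Q @ (Q.T @ A))
est = loo_error_estimate(A, Omega)
print(f"leave-one-out estimate {est:.3e}, true error {true_err:.3e}")
```

Because each held-out vector is independent of the basis built from the remaining columns, the estimate targets the error of an approximation using $s-1$ test vectors rather than $s$, so it tends to be slightly pessimistic.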
Digital credentials represent a cornerstone of digital identity on the Internet. To achieve privacy, certain functionalities in credentials should be implemented. One is selective disclosure, which allows users to disclose only the claims or attributes they want. This paper presents a novel approach to selective disclosure that combines Merkle hash trees and Boneh-Lynn-Shacham (BLS) signatures. Combining these approaches, we achieve selective disclosure of claims in a single credential and creation of a verifiable presentation containing selectively disclosed claims from multiple credentials signed by different parties. Besides selective disclosure, we enable issuing credentials signed by multiple issuers using this approach.
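The Merkle-tree half of such a scheme can be sketched as follows, assuming SHA-256, salted leaves, and duplication of the last node on odd levels; all of these are illustrative choices rather than details taken from the abstract. The BLS signature over the root is omitted entirely, and the function names and sample claims are hypothetical.

```python
import hashlib
import secrets

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def leaf_hash(claim: str, salt: str) -> bytes:
    # The salt keeps a verifier from guessing undisclosed claims by brute force.
    return h(b"\x00" + salt.encode() + b"|" + claim.encode())

def build_levels(leaves):
    """Return all tree levels, from the leaf hashes up to the single root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # simple padding choice for this sketch
        levels.append([h(b"\x01" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def prove(levels, index):
    """Sibling hashes needed to recompute the root from leaf `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        sib_hash = level[sibling] if sibling < len(level) else level[index]
        proof.append((sib_hash, index % 2 == 0))  # True: sibling sits on the right
        index //= 2
    return proof

def verify(root, claim, salt, proof):
    """Recompute the root from one disclosed claim and its proof."""
    node = leaf_hash(claim, salt)
    for sib_hash, sibling_on_right in proof:
        pair = node + sib_hash if sibling_on_right else sib_hash + node
        node = h(b"\x01" + pair)
    return node == root

claims = ["name=Alice", "birth_year=1990", "nationality=SI"]
salts = [secrets.token_hex(16) for _ in claims]
levels = build_levels([leaf_hash(c, s) for c, s in zip(claims, salts)])
root = levels[-1][0]  # the issuer would sign this root (signature not shown)

# The holder discloses only the second claim, with its salt and proof.
proof = prove(levels, 1)
print(verify(root, claims[1], salts[1], proof))
```

A verifier who trusts the signed root learns only the disclosed claim and a few hashes; the other claims stay hidden behind their salted leaf hashes.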
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice. However, increasing evidence shows that particularly in image analysis, metrics are often chosen inadequately in relation to the underlying research problem. This could be attributed to a lack of accessibility of metric-related knowledge: While taking into account the individual strengths, weaknesses, and limitations of validation metrics is a critical prerequisite to making educated choices, the relevant knowledge is currently scattered and poorly accessible to individual researchers. Based on a multi-stage Delphi process conducted by a multidisciplinary expert consortium as well as extensive community feedback, the present work provides the first reliable and comprehensive common point of access to information on pitfalls related to validation metrics in image analysis. Focusing on biomedical image analysis but with the potential of transfer to other fields, the addressed pitfalls generalize across application domains and are categorized according to a newly created, domain-agnostic taxonomy. To facilitate comprehension, illustrations and specific examples accompany each pitfall. As a structured body of information accessible to researchers of all levels of expertise, this work enhances global comprehension of a key topic in image analysis validation.
Three types of Parikh mapping are introduced, namely, alphabetic, alphabetic-basis and basis. Explicit expressions for attractors of the k-th order in bases n >= 8, including countable ones, are found. Properties of the alphabetic, alphabetic-basis and basis Parikh vectors are given at each step of the Parikh mapping. The maximum number of iterations leading to attractors of the k-th order in the basis n is determined.
We propose a simple empirical representation of expectations such that: For a number of samples above a certain threshold, drawn from any probability distribution with finite fourth-order statistic, the proposed estimator outperforms the empirical average when tested against the actual population, with respect to the quadratic loss. For datasets smaller than this threshold, the result still holds, but for a class of distributions determined by their first four statistics. Our approach leverages the duality between distributionally robust and risk-averse optimization.
We propose a novel way to describe numerical methods for ordinary differential equations via the notion of multi-indices. The main idea is to replace rooted trees in Butcher's B-series by multi-indices. The latter were introduced recently in the context of describing solutions of singular stochastic partial differential equations. The combinatorial shift away from rooted trees allows for a compressed description of numerical schemes. Moreover, these multi-indices B-series uniquely characterise the Taylor expansion of local and affine equivariant maps.
Genome assembly is a prominent problem studied in bioinformatics, which computes the source string using a set of its overlapping substrings. Classically, genome assembly uses assembly graphs built using this set of substrings to compute the source string efficiently, having a tradeoff between scalability and avoiding information loss. The scalable de Bruijn graphs come at the price of losing crucial overlap information. The complete overlap information is stored in overlap graphs using quadratic space. Hierarchical overlap graphs [IPL20] (HOG) overcome these limitations, avoiding information loss despite using linear space. After a series of suboptimal improvements, Khan and Park et al. simultaneously presented two optimal algorithms [CPM2021], of which only the former was seemingly practical. We empirically analyze all the practical algorithms for computing HOG, where the optimal algorithm [CPM2021] outperforms the previous algorithms as expected, though at the expense of extra memory. However, it uses a non-intuitive approach and non-trivial data structures. We present arguably the most intuitive algorithm, using only elementary arrays, which is also optimal. Empirically, our algorithm outperforms all the other algorithms in both time and memory, highlighting its significance in both theory and practice. We further explore the applications of hierarchical overlap graphs to solve various forms of suffix-prefix queries on a set of strings. Loukides et al. [CPM2023] recently presented state-of-the-art algorithms for these queries. However, these algorithms require complex black-box data structures and are seemingly impractical. Our algorithms, despite failing to match the state-of-the-art algorithms theoretically, answer different queries in 0.01 to 100 milliseconds for a data set having around a billion characters.
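For readers unfamiliar with suffix-prefix queries, the sketch below shows the quantity involved: the longest suffix of one string that is also a prefix of another. It is a naive quadratic-time baseline with hypothetical names and sample strings, not the HOG-based algorithm the abstract describes, which stores this information far more compactly.

```python
def longest_overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def all_pairs_overlaps(strings):
    """Overlap length for every ordered pair of distinct strings."""
    return {
        (i, j): longest_overlap(a, b)
        for i, a in enumerate(strings)
        for j, b in enumerate(strings)
        if i != j
    }

reads = ["aabaa", "aadbd", "dbdaa"]
for (i, j), k in sorted(all_pairs_overlaps(reads).items()):
    print(f"{reads[i]} -> {reads[j]}: overlap {k}")
```

An overlap graph keeps one such value per ordered pair, which is the quadratic space cost mentioned above; the hierarchical variant avoids materializing every pair.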
We propose a novel methodology to solve a key eigenvalue optimization problem which arises in the contractivity analysis of neural ODEs. When looking at contractivity properties of a one-layer weight-tied neural ODE $\dot{u}(t)=\sigma(Au(t)+b)$ (where $u,b \in {\mathbb R}^n$, $A$ is a given $n \times n$ matrix, $\sigma : {\mathbb R} \to {\mathbb R}^+$ denotes an activation function and, for a vector $z \in {\mathbb R}^n$, $\sigma(z) \in {\mathbb R}^n$ has to be interpreted entry-wise), we are led to study the logarithmic norm of a set of products of type $D A$, where $D$ is a diagonal matrix such that ${\mathrm{diag}}(D) \in \sigma'({\mathbb R}^n)$. Specifically, given a real number $c$ (usually $c=0$), the problem consists in finding the largest positive interval $\chi\subseteq [0,\infty)$ such that the logarithmic norm $\mu(DA) \le c$ for all diagonal matrices $D$ with $D_{ii}\in \chi$. We propose a two-level nested methodology: an inner level where, for a given $\chi$, we compute an optimizer $D^\star(\chi)$ by a gradient system approach, and an outer level where we tune $\chi$ so that the value $c$ is reached by $\mu(D^\star(\chi)A)$. We extend the proposed two-level approach to the general multilayer, and possibly time-dependent, case $\dot{u}(t) = \sigma( A_k(t) \ldots \sigma ( A_{1}(t) u(t) + b_{1}(t) ) \ldots + b_{k}(t) )$ and we present several numerical examples to illustrate its behaviour, including its stabilizing performance on a one-layer neural ODE applied to the classification of the MNIST handwritten digits dataset.
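The inner problem can be illustrated on a toy case. The sketch below assumes the logarithmic norm is the one induced by the Euclidean norm, $\mu_2(M)=\lambda_{\max}\big((M+M^\top)/2\big)$, and restricts to $n=2$ so that the eigenvalue has a closed form. Since $\mu_2(DA)$ is a convex function of the diagonal of $D$, its maximum over a box is attained at a corner, so the sketch simply enumerates corners. This brute-force check stands in for the gradient system approach named in the abstract; the matrix `A` and the function names are hypothetical.

```python
import math
from itertools import product

def log_norm_2x2(d, A):
    """mu_2(D A) for D = diag(d) and a 2x2 matrix A given as nested lists."""
    # M = D A scales row i of A by d[i].
    m = [[d[i] * A[i][j] for j in range(2)] for i in range(2)]
    # Entries of the symmetric part S = (M + M^T) / 2.
    p, r = m[0][0], m[1][1]
    off = 0.5 * (m[0][1] + m[1][0])
    # Largest eigenvalue of [[p, off], [off, r]] in closed form.
    return 0.5 * (p + r) + math.hypot(0.5 * (p - r), off)

def worst_case_log_norm(lo, hi, A):
    """Maximum of mu_2(D A) over all D with both diagonal entries in [lo, hi]."""
    return max(log_norm_2x2(d, A) for d in product((lo, hi), repeat=2))

A = [[-2.0, 1.0], [1.0, -2.0]]
print(worst_case_log_norm(0.5, 1.0, A))  # diagonal entries restricted to [0.5, 1]
print(worst_case_log_norm(0.0, 1.0, A))  # diagonal entries restricted to [0, 1]
```

For this `A`, the worst case is negative on $[0.5, 1]$ but positive on $[0, 1]$, so with $c=0$ the admissible interval cannot extend all the way down to zero. An outer level would then adjust the interval until the worst case equals $c$.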
By computing a feedback control via the linear quadratic regulator (LQR) approach and simulating a non-linear non-autonomous closed-loop system using this feedback, we combine two numerically challenging tasks. For the first task, the computation of the feedback control, we use the non-autonomous generalized differential Riccati equation (DRE), whose solution determines the time-varying feedback gain matrix. Regarding the second task, we want to be able to simulate non-linear closed-loop systems for which it is known that the regulator is only valid for sufficiently small perturbations. Thus, one easily runs into numerical issues in the integrators when the closed-loop control varies greatly. For these systems, the A-stable implicit Euler method, for example, fails. On the one hand, we implement non-autonomous versions of splitting schemes and BDF methods for the solution of our non-autonomous DREs. These are well-established DRE solvers in the autonomous case. On the other hand, to tackle the numerical issues in the simulation of the non-linear closed-loop system, we apply a fractional-step-theta scheme with time-adaptivity tuned specifically to this kind of challenge. That is, we additionally base the time-adaptivity on the activity of the control. We compare this approach to the more classical error-based time-adaptivity. We describe techniques to make these two tasks computable in a reasonable amount of time and are able to simulate closed-loop systems with strongly varying controls, while avoiding numerical issues. Our time-adaptivity approach requires fewer time steps than the error-based alternative and is more reliable.
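To show how a DRE solution yields a time-varying feedback gain, the sketch below treats the scalar case $-\dot p = 2a(t)p - b(t)^2 p^2/r(t) + q(t)$, $p(T)=s$, with gain $k(t)=b(t)p(t)/r(t)$ and control $u=-k(t)x$. It integrates backward in time with classical RK4, which is a simple stand-in chosen here and not one of the splitting or BDF solvers discussed in the abstract. The function name, coefficients, and step count are assumptions for illustration.

```python
import math

def solve_scalar_dre(a, b, q, r, s, T, steps):
    """Integrate the scalar DRE backward from p(T) = s using classical RK4.

    a, b, q, r are callables of time, so non-autonomous coefficients are allowed.
    Returns (ts, ps) ordered from t = 0 to t = T.
    """
    def f(t, p):
        # p'(t) = -(2 a p - b^2 p^2 / r + q)
        return -(2.0 * a(t) * p - (b(t) ** 2 / r(t)) * p * p + q(t))

    h = -T / steps  # negative step: march from T down to 0
    t, p = T, s
    ts, ps = [t], [p]
    for _ in range(steps):
        k1 = f(t, p)
        k2 = f(t + h / 2, p + h / 2 * k1)
        k3 = f(t + h / 2, p + h / 2 * k2)
        k4 = f(t + h, p + h * k3)
        p += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
        ts.append(t)
        ps.append(p)
    return ts[::-1], ps[::-1]

# Constant coefficients, used here only so the result can be checked by hand.
one = lambda t: 1.0
ts, ps = solve_scalar_dre(a=one, b=one, q=one, r=one, s=0.0, T=10.0, steps=2000)
gain_at_0 = one(0.0) * ps[0] / one(0.0)
print(f"p(0) = {ps[0]:.6f}, feedback gain k(0) = {gain_at_0:.6f}")
```

With these constant coefficients and a long horizon, $p(0)$ approaches the positive root of the algebraic Riccati equation $2p - p^2 + 1 = 0$, namely $1+\sqrt{2}$, which gives a quick sanity check on the integration.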
We present a unification and generalization of sequentially and hierarchically semi-separable (SSS and HSS) matrices called tree semi-separable (TSS) matrices. Our main result shows that any dense matrix can be expressed in a TSS format. Here, the dimensions of the generators are specified by the ranks of the Hankel blocks of the matrix. TSS matrices satisfy a graph-induced rank structure (GIRS) property. It is shown that TSS matrices generalize the algebraic properties of SSS and HSS matrices under addition, products, and inversion. Consequently, TSS matrices admit linear time matrix-vector multiply, matrix-matrix multiply, matrix-matrix addition, inversion, and solvers.