Using the notions of a $d$-embedding $\Gamma$ of a (canonical) subgeometry $\Sigma$ and of an exterior set with respect to the $h$-secant variety $\Omega_{h}(\mathcal{A})$ of a subset $\mathcal{A}$, $0 \leq h \leq n-1$, of the finite projective space $\mathrm{PG}(n-1,q^n)$, $n \geq 3$, in this article we construct a class of non-linear $(n,n,q;d)$-MRD codes for any $2 \leq d \leq n-1$. A code $\mathcal{C}_{\sigma,T}$ of this class, where $1\in T \subset \mathbb{F}_q^*$ and $\sigma$ is a generator of $\mathrm{Gal}(\mathbb{F}_{q^n}|\mathbb{F}_q)$, arises from a cone of $\mathrm{PG}(n-1,q^n)$ with vertex an $(n-d-2)$-dimensional subspace over a maximum exterior set $\mathcal{E}$ with respect to $\Omega_{d-2}(\Gamma)$. We prove that the codes introduced in [Cossidente, A., Marino, G., Pavese, F.: Non-linear maximum rank distance codes. Des. Codes Cryptogr. 79, 597--609 (2016); Durante, N., Siciliano, A.: Non-linear maximum rank distance codes in the cyclic model for the field reduction of finite geometries. Electron. J. Comb. (2017); Donati, G., Durante, N.: A generalization of the normal rational curve in $\mathrm{PG}(d,q^n)$ and its associated non-linear MRD codes. Des. Codes Cryptogr. 86, 1175--1184 (2018)] are suitable punctured codes of $\mathcal{C}_{\sigma,T}$, and we completely settle the equivalence issue for this class by showing that $\mathcal{C}_{\sigma,T}$ is neither equivalent nor adjointly equivalent to the non-linear MRD codes $\mathcal{C}_{n,k,\sigma,I}$, $I \subseteq \mathbb{F}_q$, obtained in [Otal, K., \"Ozbudak, F.: Some new non-additive maximum rank distance codes. Finite Fields Appl. 50, 293--303 (2018)].
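As a reminder of the metric these codes live in, the rank distance between two $n \times n$ matrices $A, B$ over $\mathbb{F}_q$ is $d(A,B) = \mathrm{rank}(A-B)$, and an $(n,n,q;d)$-MRD code attains the Singleton-like bound $|\mathcal{C}| = q^{n(n-d+1)}$. The following minimal Python sketch (our illustration, restricted to $q=2$ for simplicity; not part of the construction above) computes the rank distance and the minimum rank distance of a small code:

import itertools
import numpy as np

def gf2_rank(M):
    # Rank of a 0/1 matrix over GF(2) via Gaussian elimination.
    M = np.asarray(M, dtype=int) % 2
    rank = 0
    for c in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, c]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, c]:
                M[r] ^= M[rank]
        rank += 1
    return rank

def min_rank_distance(code):
    # d(A, B) = rank(A - B), minimised over all distinct pairs of codewords.
    return min(gf2_rank((A - B) % 2)
               for A, B in itertools.combinations(code, 2))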
Quantum machine learning has become an area of growing interest but has certain theoretical and hardware-specific limitations. Notably, the problem of vanishing gradients, or barren plateaus, renders training impossible for circuits with high qubit counts, imposing a limit on the number of qubits that data scientists can use to solve problems. Independently, angle-embedded supervised quantum neural networks were shown to produce truncated Fourier series whose degree depends on two factors: the depth of the encoding and the number of parallel qubits the encoding is applied to. The degree of the Fourier series limits the model's expressivity. This work introduces two new architectures whose Fourier degrees grow exponentially: the sequential and parallel exponential quantum machine learning architectures. This is achieved by using the available Hilbert space efficiently during encoding, increasing the expressivity of the quantum encoding. The exponential growth therefore allows one to stay in the low-qubit limit while creating highly expressive circuits that avoid barren plateaus. In practice, the parallel exponential architecture was shown to outperform existing linear architectures, reducing their final mean squared error by up to 44.7% on a one-dimensional test problem. Furthermore, the feasibility of this technique was also demonstrated on a trapped-ion quantum processing unit.
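The mechanism behind the exponential Fourier degree can be previewed with a minimal single-qubit numpy simulation (our illustration, not the paper's exact architectures): re-uploading the data rotation with the assumed scalings $1, 2, \dots, 2^{L-1}$ instead of all-ones raises the degree of the resulting truncated Fourier series from $L$ to $2^L - 1$.

import numpy as np

def rz(t):
    return np.diag([np.exp(-1j * t / 2), np.exp(1j * t / 2)])

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def model(x, weights, scales):
    # Data re-uploading: trainable RY layers interleaved with RZ(s * x) encodings.
    state = np.array([1.0, 0.0], dtype=complex)
    for w, s in zip(weights, scales):
        state = rz(s * x) @ ry(w) @ state
    state = ry(weights[-1]) @ state
    return float(np.abs(state[0]) ** 2)   # projector-valued observable

rng = np.random.default_rng(0)
L = 4
weights = rng.uniform(0, 2 * np.pi, L + 1)
xs = np.linspace(0, 2 * np.pi, 1024, endpoint=False)
for scales in ([1] * L, [2 ** l for l in range(L)]):       # linear vs exponential
    fx = np.array([model(x, weights, scales) for x in xs])
    spec = np.abs(np.fft.rfft(fx)) / len(xs)
    degree = max(k for k, c in enumerate(spec) if c > 1e-9)
    print(scales, "-> Fourier degree", degree)             # 4 vs 15 = 2**L - 1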
We establish universality and expression rate bounds for a class of neural Deep Operator Networks (DONs) emulating Lipschitz (or H\"older) continuous maps $\mathcal G:\mathcal X\to\mathcal Y$ between (subsets of) separable Hilbert spaces $\mathcal X$, $\mathcal Y$. The DON architecture considered uses linear encoders $\mathcal E$ and decoders $\mathcal D$ via (biorthogonal) Riesz bases of $\mathcal X$, $\mathcal Y$, and an approximator network for an infinite-dimensional, parametric coordinate map that is Lipschitz continuous on the sequence space $\ell^2(\mathbb N)$. Unlike previous works ([Herrmann, Schwab and Zech: Neural and spectral operator surrogates: construction and expression rate bounds, SAM Report, 2022], [Marcati and Schwab: Exponential convergence of deep operator networks for elliptic partial differential equations, SAM Report, 2022]), which required, for example, $\mathcal G$ to be holomorphic, the present expression rate results require mere Lipschitz (or H\"older) continuity of $\mathcal G$. Key to the proof of the present expression rate bounds is the use either of super-expressive activations (e.g. [Yarotsky: Elementary superexpressive activations, Int. Conf. on ML, 2021], [Shen, Yang and Zhang: Neural network approximation: Three hidden layers are enough, Neural Networks, 2021], and the references there), which are inspired by the Kolmogorov superposition theorem, or of nonstandard NN architectures with standard (ReLU) activations, as recently proposed in [Zhang, Shen and Yang: Neural Network Architecture Beyond Width and Depth, Adv. in Neural Inf. Proc. Sys., 2022]. We illustrate the abstract results by approximation rate bounds for the emulation of a) solution operators for parametric elliptic variational inequalities, and b) Lipschitz maps of Hilbert-Schmidt operators.
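The structure $\mathcal G \approx \mathcal D \circ A \circ \mathcal E$ can be sketched in a few lines of Python (our illustration; the sine basis and the random ReLU network standing in for the trained approximator are assumptions, not the paper's choices):

import numpy as np

rng = np.random.default_rng(1)
n_grid, m = 256, 16                      # grid size, coordinate truncation
x = np.linspace(0, 1, n_grid, endpoint=False)
basis = np.array([np.sqrt(2) * np.sin(np.pi * (k + 1) * x) for k in range(m)])

def encode(u):            # E: samples of u in L2(0,1) -> first m coefficients
    return basis @ u / n_grid

def decode(c):            # D: coefficient vector -> function samples
    return basis.T @ c

# Stand-in for the trained approximator network A (here: a random ReLU MLP).
W1, b1 = rng.normal(size=(64, m)), rng.normal(size=64)
W2, b2 = rng.normal(size=(m, 64)), rng.normal(size=m)
def approximator(c):
    return W2 @ np.maximum(W1 @ c + b1, 0.0) + b2

def don(u):               # G is emulated as D o A o E
    return decode(approximator(encode(u)))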
A key technique in machine learning and computer vision is to embed discrete weighted graphs into continuous spaces for further downstream processing. Embedding discrete hierarchical structures in hyperbolic geometry has proven very successful, since it was shown that any weighted tree can be embedded in that geometry with arbitrarily low distortion. Various optimization methods for hyperbolic embeddings based on common models of hyperbolic geometry have been studied. In this paper, we consider Hilbert geometry for the standard simplex, which is isometric to a vector space equipped with the variation polytope norm. We study the representation power of this Hilbert simplex geometry by embedding distance matrices of graphs. Our findings demonstrate that Hilbert simplex geometry is competitive with alternative geometries such as the Poincar\'e hyperbolic ball or Euclidean geometry for embedding tasks, while being fast and numerically robust.
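For reference, the Hilbert metric on the open standard simplex admits the closed form $d_{\mathrm{HG}}(p,q) = \max_i \log(p_i/q_i) - \min_i \log(p_i/q_i)$, i.e. the variation norm of $\log p - \log q$; a minimal Python sketch (our illustration):

import numpy as np

def hilbert_simplex_distance(p, q):
    # Hilbert distance between points of the open probability simplex:
    # the variation norm of the coordinate-wise log-ratio.
    r = np.log(np.asarray(p, float)) - np.log(np.asarray(q, float))
    return float(r.max() - r.min())

print(hilbert_simplex_distance([0.7, 0.2, 0.1], [0.2, 0.5, 0.3]))

Note that the value is unchanged if $p$ or $q$ is rescaled by a positive constant, as expected of a projective metric.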
Let $(X, d)$ be a metric space and $\mathcal{C} \subseteq 2^X$ a collection of special objects. In the $(X,d,\mathcal{C})$-chasing problem, an online player receives a sequence of requests $\{B_t\}_{t=1}^T \subseteq \mathcal{C}$ and responds with a trajectory $\{x_t\}_{t=1}^T$ such that $x_t \in B_t$. Starting from a given point $x_0$, this response incurs a movement cost $\sum_{t=1}^T d(x_t, x_{t-1})$, and the online player strives to minimize the competitive ratio: the worst-case ratio, over all input sequences, between the online movement cost and the optimal movement cost in hindsight. Under this setup, we call the $(X,d,\mathcal{C})$-chasing problem $\textit{chaseable}$ if there exists an online algorithm with finite competitive ratio. For Convex Body Chasing (CBC) over real normed vector spaces, (Bubeck et al. 2019) proved the chaseability of the problem. Furthermore, in the vector space setting, the dimension of the ambient space appears to be the factor controlling the size of the competitive ratio. Indeed, (Sellke 2020) recently provided a $d$-competitive online algorithm over arbitrary real normed vector spaces $(\mathbb{R}^d, \|\cdot\|)$, and we present a general strategy for obtaining novel lower bounds of the form $\Omega(d^c)$, $c > 0$, for CBC in the same setting. In this paper, we also prove that the $\textit{doubling}$ and $\textit{Assouad}$ dimensions of a metric space exert no control over the hardness of ball chasing in that metric space. More specifically, we show that for any sufficiently large $\rho \in \mathbb{R}$, there exists a metric space $(X,d)$ of doubling dimension $\Theta(\rho)$ and Assouad dimension $\rho$ such that no online selector can achieve a finite competitive ratio in the general ball chasing regime.
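The online protocol is easy to state in code; the following Python sketch (our illustration, not an algorithm from the works cited) plays the greedy projection strategy for Euclidean ball chasing and tracks the movement cost. Greedy projection is not competitive in general, which is precisely why the competitive-ratio question is nontrivial:

import numpy as np

def greedy_ball_chase(x0, balls):
    # Serve requests B_t = (center, radius) in (R^n, ||.||_2) by moving
    # to the nearest point of each ball; returns total movement cost.
    x, cost = np.asarray(x0, float), 0.0
    for center, radius in balls:
        c = np.asarray(center, float)
        gap = np.linalg.norm(x - c)
        if gap > radius:
            x = c + (x - c) * (radius / gap)   # metric projection onto B_t
            cost += gap - radius
    return cost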
Explicit bases for the subfield subcodes of projective Reed-Muller codes over the projective plane, and for their duals, are obtained. In particular, we provide a formula for the dimension of these codes. For the general case over the projective space, we generalize the necessary tools to handle this case as well: we obtain a universal Gr\"obner basis for the vanishing ideal of the set of standard representatives of the projective space, and we show how to reduce any monomial with respect to this Gr\"obner basis. Regarding the parameters of these codes, by considering subfield subcodes of projective Reed-Muller codes we obtain long linear codes with good parameters over a small finite field.
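For concreteness, under one common normalization the standard representatives of the points of $\mathrm{PG}(2,q)$ are the tuples whose first nonzero coordinate is $1$; a minimal Python sketch (our illustration, with $q=3$) enumerates them and confirms the count $q^2+q+1$:

def pg2_standard_representatives(field_elements):
    # Points of PG(2, q), one representative each: first nonzero coord is 1.
    F = list(field_elements)
    points = [(1, a, b) for a in F for b in F]   # q^2 affine points
    points += [(0, 1, a) for a in F]             # q points at infinity
    points += [(0, 0, 1)]                        # one remaining point
    return points

assert len(pg2_standard_representatives(range(3))) == 3 ** 2 + 3 + 1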
One of the distinct characteristics of radiologists' reading of multiparametric prostate MR scans, using reporting systems such as PI-RADS v2.1, is that they score individual MR modalities, T2-weighted, diffusion-weighted, and dynamic contrast-enhanced, and then combine these image-modality-specific scores using standardised decision rules to predict the likelihood of clinically significant cancer. This work aims to demonstrate that low-dimensional parametric models can represent such decision rules in the proposed Combiner networks without compromising the accuracy of predicting radiologic labels. First, we show that either a linear mixture model or a nonlinear stacking model is sufficient to model PI-RADS decision rules for localising prostate cancer. Second, the parameters of these (generalised) linear models are treated as hyperparameters that weigh the multiple networks independently representing individual image modalities during Combiner network training, as opposed to end-to-end modality ensembles. A HyperCombiner network is developed to train a single image segmentation network that can be conditioned on these hyperparameters at inference time, for much improved efficiency. Experimental results based on data from 850 patients, for the application of automating radiologist labelling of multiparametric MR, compare the proposed Combiner networks with other commonly adopted end-to-end networks. Using the added advantage of obtaining and interpreting the modality-combining rules, in terms of the linear weights or odds ratios on individual image modalities, three clinical applications are presented for prostate cancer segmentation: modality availability assessment, importance quantification and rule discovery.
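Both decision-rule models have transparent parametric forms; a minimal Python sketch of the two (our illustration; variable names and the clipping constant are assumptions, not the paper's implementation):

import numpy as np

def combiner_linear(prob_maps, w):
    # Linear mixture: voxel-wise weighted sum of per-modality cancer
    # probability maps (e.g. T2W, DWI, DCE); the weights play the role
    # of the modality-combining hyperparameters.
    w = np.asarray(w, float)
    return np.tensordot(w / w.sum(), np.stack(prob_maps), axes=1)

def combiner_stacking(prob_maps, w, b=0.0):
    # Nonlinear stacking: logistic regression on per-modality log-odds,
    # so exp(w) reads as an odds-ratio per modality.
    p = np.clip(np.stack(prob_maps), 1e-6, 1 - 1e-6)
    z = np.tensordot(np.asarray(w, float), np.log(p / (1 - p)), axes=1) + b
    return 1.0 / (1.0 + np.exp(-z))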
The classical way of extending an $[n, k, d]$ linear code $\mathcal{C}$ is to add an overall parity-check coordinate to each codeword of $\mathcal{C}$. The extended code, denoted by $\overline{\mathcal{C}}$ and called the standardly extended code of $\mathcal{C}$, is a linear code with parameters $[n+1, k, \bar{d}]$, where $\bar{d}=d$ or $\bar{d}=d+1$. This extending technique is one of the classical ways to construct a new linear code from a known one, and a way to study the original code $\mathcal{C}$ via its extended code $\overline{\mathcal{C}}$. The standardly extended codes of some families of binary linear codes have been studied to some extent. However, not much is known about the standardly extended codes of nonbinary codes. For example, the minimum distances of the standardly extended codes of the nonbinary Hamming codes have remained open for over 70 years. The first objective of this paper is to introduce the nonstandardly extended codes of a linear code and to develop some general theory of extended linear codes. The second objective is to study the extended codes of several families of linear codes, including cyclic codes, projective two-weight codes and nonbinary Hamming codes. Many families of distance-optimal linear codes are obtained with the extending technique.
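In code, the standard extension just appends the coordinate that makes each codeword sum to zero; a minimal Python sketch over a prime field $\mathbb{F}_p$ (our illustration):

import numpy as np

def standardly_extend(codewords, p):
    # Append the overall parity-check coordinate -(c_0 + ... + c_{n-1}) mod p,
    # so every extended codeword sums to 0 over GF(p).
    C = np.asarray(codewords) % p
    return np.column_stack([C, (-C.sum(axis=1)) % p])

# Binary example: the full [2,2,1] code extends to the [3,2,2] even-weight code.
print(standardly_extend([[0, 0], [0, 1], [1, 0], [1, 1]], 2))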
One of the most studied extensions of the famous Traveling Salesperson Problem (TSP) is the {\sc Multiple TSP}: a set of $m\geq 1$ salespersons collectively traverses a set of $n$ cities by $m$ non-trivial tours, so as to minimize the total length of their tours. This problem can also be viewed as a variant of {\sc Uncapacitated Vehicle Routing} in which the objective function is the sum of all tour lengths. When all $m$ tours start from a single common \emph{depot} $v_0$, the metric {\sc Multiple TSP} can be approximated as well as the standard metric TSP, as shown by Frieze (1983). The {\sc Multiple TSP} becomes significantly harder to approximate when there is a \emph{set} $D$ of $d \geq 1$ depots that form the starting and end points of the $m$ tours. For this case, only a $(2-1/d)$-approximation in polynomial time is known, as well as a $3/2$-approximation for \emph{constant} $d$ which requires a prohibitive run time of $n^{\Theta(d)}$ (Xu and Rodrigues, \emph{INFORMS J. Comput.}, 2015). A recent work of Traub, Vygen and Zenklusen (STOC 2020) gives another approximation algorithm for {\sc Multiple TSP} running in time $n^{\Theta(d)}$ and reducing the problem to approximating TSP. In this paper we overcome the $n^{\Theta(d)}$ time barrier: we give the first efficient approximation algorithm for {\sc Multiple TSP} with a \emph{variable} number $d$ of depots that yields a better-than-2 approximation. Our algorithm runs in time $(1/\varepsilon)^{\mathcal O(d\log d)}\cdot n^{\mathcal O(1)}$ and produces a $(3/2+\varepsilon)$-approximation with constant probability. For the graphic case, we obtain a deterministic $3/2$-approximation in time $2^d\cdot n^{\mathcal O(1)}$.
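For intuition in the single-depot case, the following Python sketch (our illustration only; neither Frieze's reduction nor the algorithms above) evaluates the elementary tour-splitting idea: cut one TSP tour through all cities into $m$ contiguous chunks and close each chunk through the depot, which by the triangle inequality relates the multi-tour cost to the single-tour cost:

def split_tour_cost(tour, depot, m, dist):
    # tour: city sequence (depot excluded); dist: the metric.
    # Assumes m <= len(tour), so every chunk is non-empty.
    size, rem = divmod(len(tour), m)
    total, i = 0.0, 0
    for j in range(m):
        chunk = tour[i:i + size + (1 if j < rem else 0)]
        i += len(chunk)
        total += dist(depot, chunk[0]) + dist(chunk[-1], depot)
        total += sum(dist(a, b) for a, b in zip(chunk, chunk[1:]))
    return total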
Positional and structural encodings (PSEs) enable better identifiability of nodes within a graph, since graphs in general lack a canonical node ordering. This makes PSEs essential tools for empowering modern GNNs, and in particular graph Transformers. However, designing PSEs that work optimally for a variety of graph prediction tasks is a challenging and unsolved problem. Here, we present the graph positional and structural encoder (GPSE), a first attempt to train a graph encoder that captures rich PSE representations for augmenting any GNN. GPSE effectively learns a common latent representation for multiple PSEs and is highly transferable: the encoder trained on a particular graph dataset can be used effectively on datasets drawn from significantly different distributions and even modalities. We show that across a wide range of benchmarks, GPSE-enhanced models significantly improve performance on certain tasks, while performing on par with models that employ explicitly computed PSEs in other cases. Our results pave the way for the development of large pre-trained models for extracting graph positional and structural information, and highlight their potential as a viable alternative to explicitly computed PSEs as well as to existing self-supervised pre-training approaches.
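Two of the explicit encodings that such an encoder is meant to subsume, Laplacian-eigenvector positional encodings and random-walk structural encodings, take only a few lines to compute; a minimal numpy sketch (our illustration, not the GPSE training code):

import numpy as np

def explicit_pses(A, k=4):
    # A: symmetric adjacency matrix of a simple graph.
    A = np.asarray(A, float)
    deg = np.maximum(A.sum(axis=1), 1e-12)
    d = 1.0 / np.sqrt(deg)
    L = np.eye(len(A)) - d[:, None] * A * d[None, :]   # normalized Laplacian
    _, vecs = np.linalg.eigh(L)
    lap_pe = vecs[:, 1:k + 1]            # k non-trivial eigenvectors (LapPE)
    P = A / deg[:, None]                 # random-walk matrix D^{-1} A
    Pk, rwse = np.eye(len(A)), []
    for _ in range(k):
        Pk = Pk @ P
        rwse.append(np.diag(Pk))         # return probabilities (RWSE)
    return lap_pe, np.stack(rwse, axis=1)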
We characterise the classes of tournaments with tractable first-order model checking. For every hereditary class of tournaments $\mathcal T$, first-order model checking is either fixed-parameter tractable or AW$[*]$-hard. This dichotomy coincides with the fact that $\mathcal T$ has either bounded or unbounded twin-width, and that the growth of $\mathcal T$ is either at most exponential or at least factorial. From the model-theoretic point of view, we show that NIP classes of tournaments coincide with those of bounded twin-width. Bounded twin-width is also characterised by three infinite families of obstructions: $\mathcal T$ has bounded twin-width if and only if it excludes at least one tournament from each family. This generalises results of Bonnet et al. on ordered graphs. The key to these results is a polynomial-time algorithm which takes as input a tournament $T$ and computes a linear order $<$ on $V(T)$ such that the twin-width of the birelation $(T,<)$ is at most some function of the twin-width of $T$. Since twin-width can be approximated in polynomial time for an ordered structure $(T,<)$, this yields a polynomial-time approximation of twin-width for tournaments. Our results extend to oriented graphs with stable sets of bounded size, which may moreover be augmented by arbitrary binary relations.