We show that any Lotka--Volterra $T$-system associated with an $n$-vertex tree $T$ as introduced in Quispel et al., J. Phys. A 56 (2023) 315201, preserves a rational measure. We also prove that the Kahan discretisation of these $T$-systems factorises and preserves the same measure. As a consequence, for the Kahan maps of Lotka--Volterra systems related to the subclass of $T$-systems corresponding to graphs with more than one $n$-vertex subtree, we are able to construct rational integrals.
In the Trivially Perfect Editing problem one is given an undirected graph $G = (V,E)$ and an integer $k$ and seeks to add or delete at most $k$ edges in $G$ to obtain a trivially perfect graph. In a recent work, Dumas, Perez and Todinca [Algorithmica 2023] proved that this problem admits a kernel with $O(k^3)$ vertices. This result heavily relies on the fact that the size of trivially perfect modules can be bounded by $O(k^2)$ as shown by Drange and Pilipczuk [Algorithmica 2018]. To obtain their cubic vertex-kernel, Dumas, Perez and Todinca [Algorithmica 2023] then showed that a more intricate structure, so-called \emph{comb}, can be reduced to $O(k^2)$ vertices. In this work we show that the bound can be improved to $O(k)$ for both aforementioned structures and thus obtain a kernel with $O(k^2)$ vertices. Our approach relies on the straightforward yet powerful observation that any large enough structure contains unaffected vertices whose neighborhood remains unchanged by an editing of size $k$, implying strong structural properties.
Given a graph $G=(V,E)$ and an integer $k$, the Cluster Editing problem asks whether we can transform $G$ into a union of vertex-disjoint cliques by at most $k$ modifications (edge deletions or insertions). In this paper, we study the following variant of Cluster Editing. We are given a graph $G=(V,E)$, a packing $\cal H$ of modification-disjoint induced $P_3$s (no pair of $P_3$s in $\cal H$ share an edge or non-edge) and an integer $\ell$. The task is to decide whether $G$ can be transformed into a union of vertex-disjoint cliques by at most $\ell+|\cal H|$ modifications (edge deletions or insertions). We show that this problem is NP-hard even when $\ell=0$ (in which case the problem asks to turn $G$ into a disjoint union of cliques by performing exactly one edge deletion or insertion per element of $\cal H$) and when each vertex is in at most 23 $P_3$s of the packing. This answers negatively a question of van Bevern, Froese, and Komusiewicz (CSR 2016, ToCS 2018), repeated by C. Komusiewicz at Shonan meeting no. 144 in March 2019. We then initiate the study to find the largest integer $c$ such that the problem remains tractable when restricting to packings such that each vertex is in at most $c$ packed $P_3$s. Here packed $P_3$s are those belonging to the packing $\cal H$. Van Bevern et al. showed that the case $c = 1$ is fixed-parameter tractable with respect to $\ell$ and we show that the case $c = 2$ is solvable in $|V|^{2\ell + O(1)}$ time.
Can a machine understand the meanings of natural language? Recent developments in the generative large language models (LLMs) of artificial intelligence have led to the belief that traditional philosophical assumptions about machine understanding of language need to be revised. This article critically evaluates the prevailing tendency to regard machine language performance as mere syntactic manipulation and the simulation of understanding, which is only partial and very shallow, without sufficient referential grounding in the world. The aim is to highlight the conditions crucial to attributing natural language understanding to state-of-the-art LLMs, where it can be legitimately argued that LLMs not only use syntax but also semantics, their understanding not being simulated but duplicated; and determine how they ground the meanings of linguistic expressions.
We address speech enhancement based on variational autoencoders, which involves learning a speech prior distribution in the time-frequency (TF) domain. A zero-mean complex-valued Gaussian distribution is usually assumed for the generative model, where the speech information is encoded in the variance as a function of a latent variable. In contrast to this commonly used approach, we propose a weighted variance generative model, where the contribution of each spectrogram time-frame in parameter learning is weighted. We impose a Gamma prior distribution on the weights, which would effectively lead to a Student's t-distribution instead of Gaussian for speech generative modeling. We develop efficient training and speech enhancement algorithms based on the proposed generative model. Our experimental results on spectrogram auto-encoding and speech enhancement demonstrate the effectiveness and robustness of the proposed approach compared to the standard unweighted variance model.
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of NLP. We find that the distribution shift settings in previous studies commonly lack adequate challenges, hindering the accurate evaluation of OOD robustness. To address these issues, we propose a benchmark construction protocol that ensures clear differentiation and challenging distribution shifts. Then we introduce BOSS, a Benchmark suite for Out-of-distribution robustneSS evaluation covering 5 tasks and 20 datasets. Based on BOSS, we conduct a series of experiments on pre-trained language models for analysis and evaluation of OOD robustness. First, for vanilla fine-tuning, we examine the relationship between in-distribution (ID) and OOD performance. We identify three typical types that unveil the inner learning mechanism, which could potentially facilitate the forecasting of OOD robustness, correlating with the advancements on ID datasets. Then, we evaluate 5 classic methods on BOSS and find that, despite exhibiting some effectiveness in specific cases, they do not offer significant improvement compared to vanilla fine-tuning. Further, we evaluate 5 LLMs with various adaptation paradigms and find that when sufficient ID data is available, fine-tuning domain-specific models outperform LLMs on ID examples significantly. However, in the case of OOD instances, prioritizing LLMs with in-context learning yields better results. We identify that both fine-tuned small models and LLMs face challenges in effectively addressing downstream tasks. The code is public at \url{//github.com/lifan-yuan/OOD_NLP}.
Agglomerative hierarchical clustering based on Ordered Weighted Averaging (OWA) operators not only generalises the single, complete, and average linkages, but also includes intercluster distances based on a few nearest or farthest neighbours, trimmed and winsorised means of pairwise point similarities, amongst many others. We explore the relationships between the famous Lance-Williams update formula and the extended OWA-based linkages with weights generated via infinite coefficient sequences. Furthermore, we provide some conditions for the weight generators to guarantee the resulting dendrograms to be free from unaesthetic inversions.
We study parallel fault-tolerant quantum computing for families of homological quantum low-density parity-check (LDPC) codes defined on 3-manifolds with constant or almost-constant encoding rate. We derive generic formula for a transversal $T$ gate of color codes on general 3-manifolds, which acts as collective non-Clifford logical CCZ gates on any triplet of logical qubits with their logical-$X$ membranes having a $\mathbb{Z}_2$ triple intersection at a single point. The triple intersection number is a topological invariant, which also arises in the path integral of the emergent higher symmetry operator in a topological quantum field theory: the $\mathbb{Z}_2^3$ gauge theory. Moreover, the transversal $S$ gate of the color code corresponds to a higher-form symmetry supported on a codimension-1 submanifold, giving rise to exponentially many addressable and parallelizable logical CZ gates. We have developed a generic formalism to compute the triple intersection invariants for 3-manifolds and also study the scaling of the Betti number and systoles with volume for various 3-manifolds, which translates to the encoding rate and distance. We further develop three types of LDPC codes supporting such logical gates: (1) A quasi-hyperbolic code from the product of 2D hyperbolic surface and a circle, with almost-constant rate $k/n=O(1/\log(n))$ and $O(\log(n))$ distance; (2) A homological fibre bundle code with $O(1/\log^{\frac{1}{2}}(n))$ rate and $O(\log^{\frac{1}{2}}(n))$ distance; (3) A specific family of 3D hyperbolic codes: the Torelli mapping torus code, constructed from mapping tori of a pseudo-Anosov element in the Torelli subgroup, which has constant rate while the distance scaling is currently unknown. We then show a generic constant-overhead scheme for applying a parallelizable universal gate set with the aid of logical-$X$ measurements.
We propose a new methodology to develop heuristic algorithms using tree decompositions. Traditionally, such algorithms construct an optimal solution of the given problem instance through a dynamic programming approach. We modify this procedure by introducing a parameter $W$ that dictates the number of dynamic programming states to consider. We drop the exactness guarantee in favour of a shorter running time. However, if $W$ is large enough such that all valid states are considered, our heuristic algorithm proves optimality of the constructed solution. In particular, we implement a heuristic algorithm for the Maximum Happy Vertices problem using this approach. Our algorithm more efficiently constructs optimal solutions compared to the exact algorithm for graphs of bounded treewidth. Furthermore, our algorithm constructs higher quality solutions than state-of-the-art heuristic algorithms Greedy-MHV and Growth-MHV for instances of which at least 40\% of the vertices are initially coloured, at the cost of a larger running time.
We study the stability of randomized Taylor schemes for ODEs. We consider three notions of probabilistic stability: asymptotic stability, mean-square stability, and stability in probability. We prove fundamental properties of the probabilistic stability regions and benchmark them against the absolute stability regions for deterministic Taylor schemes.
Let ${\mathcal P}$ be a family of probability measures on a measurable space $(S,{\mathcal A}).$ Given a Banach space $E,$ a functional $f:E\mapsto {\mathbb R}$ and a mapping $\theta: {\mathcal P}\mapsto E,$ our goal is to estimate $f(\theta(P))$ based on i.i.d. observations $X_1,\dots, X_n\sim P, P\in {\mathcal P}.$ In particular, if ${\mathcal P}=\{P_{\theta}: \theta\in \Theta\}$ is an identifiable statistical model with parameter set $\Theta\subset E,$ one can consider the mapping $\theta(P)=\theta$ for $P\in {\mathcal P}, P=P_{\theta},$ resulting in a problem of estimation of $f(\theta)$ based on i.i.d. observations $X_1,\dots, X_n\sim P_{\theta}, \theta\in \Theta.$ Given a smooth functional $f$ and estimators $\hat \theta_n(X_1,\dots, X_n), n\geq 1$ of $\theta(P),$ we use these estimators, the sample split and the Taylor expansion of $f(\theta(P))$ of a proper order to construct estimators $T_f(X_1,\dots, X_n)$ of $f(\theta(P)).$ For these estimators and for a functional $f$ of smoothness $s\geq 1,$ we prove upper bounds on the $L_p$-errors of estimator $T_f(X_1,\dots, X_n)$ under certain moment assumptions on the base estimators $\hat \theta_n.$ We study the performance of estimators $T_f(X_1,\dots, X_n)$ in several concrete problems, showing their minimax optimality and asymptotic efficiency. In particular, this includes functional estimation in high-dimensional models with many low dimensional components, functional estimation in high-dimensional exponential families and estimation of functionals of covariance operators in infinite-dimensional subgaussian models.