The general theory of greedy approximation with respect to arbitrary dictionaries is well developed in the case of real Banach spaces. Recently, some of the results proved for the Weak Chebyshev Greedy Algorithm (WCGA) in the real case were extended to complex Banach spaces. In this paper we extend to complex Banach spaces some results, known in the real case, for greedy algorithms other than the WCGA.
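For reference, the WCGA in a real Banach space $X$ with dictionary $\mathcal{D}$ and weakness parameter $t \in (0,1]$ proceeds as follows; this is the standard definition due to Temlyakov, recalled here for context rather than quoted from the paper:
\[
f_0 := f; \qquad \text{at step } m, \text{ choose } \varphi_m \in \mathcal{D} \text{ with } F_{f_{m-1}}(\varphi_m) \ge t \sup_{g \in \mathcal{D}} F_{f_{m-1}}(g),
\]
\[
G_m := \operatorname*{argmin}_{u \in \operatorname{span}\{\varphi_1, \dots, \varphi_m\}} \|f - u\|, \qquad f_m := f - G_m,
\]
where $F_{f_{m-1}}$ is a norming functional of $f_{m-1}$, i.e. $\|F_{f_{m-1}}\| = 1$ and $F_{f_{m-1}}(f_{m-1}) = \|f_{m-1}\|$. In the complex case the selection step uses $|F_{f_{m-1}}(g)|$ instead.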
Determining the difficulty of a text involves assessing various textual features that may affect the reader's comprehension, yet research on Vietnamese has so far focused only on statistical features. This paper introduces a new approach that integrates statistical and semantic methods for assessing text readability. Our research utilized three distinct datasets: the Vietnamese Text Readability Dataset (ViRead), OneStopEnglish, and RACE, with the latter two translated into Vietnamese. For the semantic aspect, we employed state-of-the-art language models such as PhoBERT, ViDeBERTa, and ViBERT, while statistical methods were used to extract syntactic and lexical features of the text. We conducted experiments with various machine learning models, including Support Vector Machine (SVM), Random Forest, and Extra Trees, and evaluated their performance using accuracy and F1 score. Our results indicate that a joint approach combining semantic and statistical features significantly enhances the accuracy of readability classification compared to using either method in isolation. The study emphasizes the importance of considering both statistical and semantic aspects for a more accurate assessment of text difficulty in Vietnamese, provides insights into the adaptability of advanced language models in this context, and lays the groundwork for future research in this area.
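As an illustration, the following is a minimal sketch of the kind of joint pipeline the abstract describes: pre-trained encoder embeddings concatenated with hand-crafted statistical features and fed to a classifier. The model name, feature choices, and toy data are placeholders, not the authors' actual configuration.

```python
# Sketch: semantic embeddings + statistical features for readability
# classification. Illustrative only; not the paper's exact setup.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

tokenizer = AutoTokenizer.from_pretrained("vinai/phobert-base")
encoder = AutoModel.from_pretrained("vinai/phobert-base")

def semantic_features(text: str) -> np.ndarray:
    # Mean-pooled last hidden states as a fixed-size text embedding.
    inputs = tokenizer(text, truncation=True, max_length=256, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0).numpy()

def statistical_features(text: str) -> np.ndarray:
    # Simple lexical/syntactic proxies: counts and average lengths.
    words = text.split()
    sentences = [s for s in text.split(".") if s.strip()]
    return np.array([
        len(words),
        len(sentences),
        len(words) / max(len(sentences), 1),              # words per sentence
        sum(len(w) for w in words) / max(len(words), 1),  # chars per word
    ])

def featurize(texts):
    return np.stack([
        np.concatenate([semantic_features(t), statistical_features(t)])
        for t in texts
    ])

# Toy stand-in for a readability corpus such as ViRead.
texts = ["Một câu ngắn.", "Một câu dài hơn và phức tạp hơn nhiều so với câu trước.",
         "Câu đơn giản khác.", "Một câu khác với cấu trúc phức tạp và từ vựng khó hơn."]
labels = np.array([0, 1, 0, 1])

X = featurize(texts)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.5,
                                          stratify=labels, random_state=0)
clf = ExtraTreesClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
print("accuracy:", accuracy_score(y_te, pred))
print("macro F1:", f1_score(y_te, pred, average="macro"))
```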
The purpose of this paper is to employ the language of Cartan moving frames to study the geometry of data manifolds and their Riemannian structure, via the data information metric and its curvature at data points. Using this framework, and through experiments, we explain the response of a neural network by pointing out the output classes that are easily reachable from a given input. This emphasizes how the proposed mathematical relationship between the output of the network and the geometry of its inputs can be exploited as an explainable artificial intelligence tool.
The sum-of-squares (SoS) hierarchy of semidefinite programs has become a common tool for algorithm design in theoretical computer science, including problems in quantum information. In this work we study a connection between a Hermitian version of the SoS hierarchy (HSoS), related to the quantum de Finetti theorem, and geometric quantization of compact K\"ahler manifolds (such as complex projective space $\mathbb{C}P^{d}$, the set of all pure states in a $(d + 1)$-dimensional Hilbert space). We show that previously known HSoS rounding algorithms can be recast as quantizing an objective function to obtain a finite-dimensional matrix, finding its top eigenvector, and then (possibly nonconstructively) rounding it by using a version of the Husimi quasiprobability distribution. Dually, we recover most known quantum de Finetti theorems by performing the same steps in the reverse order: a quantum state is first approximated by its Husimi distribution, and then quantized to obtain a separable state approximating the original one. In cases where there is a transitive group action on the manifold, we give some new proofs of existing de Finetti theorems, as well as some applications, including a new version of Renner's exponential de Finetti theorem proved using the Borel--Weil--Bott theorem, and hardness-of-approximation results and optimal degree-2 integrality gaps for the basic SDP relaxation of \textsc{Quantum Max-$d$-Cut} (for arbitrary $d$). We also describe how versions of these results can be proved when there is no transitive group action; in these cases we can deduce error bounds for the HSoS hierarchy on smooth complex projective varieties.
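To make the Husimi-based step concrete: for a state $\rho$ on the symmetric subspace $\mathrm{Sym}^n(\mathbb{C}^{d+1})$, the standard conventions (assumed here; the paper may normalize differently) give
\[
Q_\rho(z) = \binom{n+d}{d} \, \langle z^{\otimes n} | \rho | z^{\otimes n} \rangle, \qquad
\sigma = \int_{\mathbb{C}P^{d}} Q_\rho(z)\, \big(|z\rangle\langle z|\big)^{\otimes n} \, d\mu(z),
\]
where $\mu$ is the unitarily invariant probability measure on $\mathbb{C}P^{d}$. Here $Q_\rho$ is the Husimi distribution of $\rho$, and $\sigma$ is the measure-and-prepare (hence separable) state that approximates $\rho$ in de Finetti-type statements.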
The gradient bounds of generalized barycentric coordinates play an essential role in the $H^1$ norm approximation error estimate for generalized barycentric interpolations. Similarly, the $H^k$ norm estimate for $k>1$ requires upper bounds on higher-order derivatives, which are not available in the literature. In this paper, we derive such upper bounds for the Wachspress generalized barycentric coordinates on simple convex $d$-dimensional polytopes, $d\ge 1$. The result can be used to prove optimal convergence of Wachspress-based polytopal finite element approximations of, for example, fourth-order elliptic equations. Another contribution of this paper is to compare various shape-regularity conditions for simple convex polytopes and to clarify their relations using tools from convex geometry.
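For context, the classical two-dimensional instance of Wachspress coordinates on a convex polygon with vertices $v_1, \dots, v_n$ (a standard formula recalled here, not taken from the paper) is
\[
w_i(x) = \frac{A(v_{i-1}, v_i, v_{i+1})}{A(v_{i-1}, v_i, x)\, A(v_i, v_{i+1}, x)}, \qquad
\lambda_i(x) = \frac{w_i(x)}{\sum_{j=1}^{n} w_j(x)},
\]
where $A(a,b,c)$ denotes the signed area of the triangle $abc$. The bounds in the paper control derivatives $\nabla^k \lambda_i$ of such coordinates in the general $d$-dimensional setting.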
The study of diffeomorphism groups and their applications to problems in analysis and geometry has a long history. In geometric hydrodynamics, pioneered by V.~Arnold in the 1960s, one considers an ideal fluid flow as the geodesic motion on the infinite-dimensional group of volume-preserving diffeomorphisms of the fluid domain with respect to the metric defined by the kinetic energy. Similar considerations on the space of densities lead to a geometric description of optimal mass transport and the Kantorovich-Wasserstein metric. Likewise, information geometry associated with the Fisher-Rao metric and the Hellinger distance has an equally beautiful infinite-dimensional geometric description and can be regarded as a higher-order Sobolev analogue of optimal transportation. In this work we review various metrics on diffeomorphism groups relevant to this approach and introduce appropriate topologies, smooth structures, and dynamics on the corresponding infinite-dimensional manifolds. Our main goal is to demonstrate how, alongside topological hydrodynamics, Hamiltonian dynamics and optimal mass transport, information geometry with its elaborate toolbox has become yet another exciting field for applications of geometric analysis on diffeomorphism groups.
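To fix notation for two of the metrics mentioned: on the space of smooth probability densities on a compact manifold $M$, the Fisher--Rao metric and the Hellinger distance take the standard forms (normalization conventions vary between sources; the factor $1/4$ below is one common choice)
\[
G^{FR}_\rho(\alpha, \beta) = \frac{1}{4} \int_M \frac{\alpha\, \beta}{\rho}\, dx, \qquad
d_H(\rho_1, \rho_2)^2 = \int_M \big( \sqrt{\rho_1} - \sqrt{\rho_2}\, \big)^2 \, dx,
\]
so that under the square-root map $\rho \mapsto \sqrt{\rho}$ the Fisher--Rao metric becomes the $L^2$ metric on a sphere and the Hellinger distance becomes the induced chordal distance.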
The Harary reconstruction conjecture states that any graph with more than four edges can be uniquely reconstructed from its set of maximal edge-deleted subgraphs. In 1977, M\"uller verified the conjecture for graphs with $n$ vertices and more than $n \log_2(n)$ edges, improving on Lov\'asz's bound of $(n^2-n)/4$. Here, we show that the reconstruction conjecture holds for graphs which have exactly one cycle and three non-isomorphic subtrees.
A major open problem in computational complexity is the existence of a one-way function, namely a function from strings to strings which is easy to compute but computationally hard to invert. Levin (2023) formulated the notion of one-way functions from reals (infinite bit-sequences) to reals in terms of computability, and asked whether partial computable one-way functions exist. We give a strong positive answer: using the hardness of the halting problem, we exhibit a one-way function that is in fact total computable.
Coboundary expansion is a high-dimensional generalization of the Cheeger constant to simplicial complexes. Originally, this notion was motivated by the fact that it implies topological expansion, but nowadays a significant part of the motivation stems from its deep connection to problems in theoretical computer science such as agreement expansion in the low soundness regime. In this paper, we prove coboundary expansion with non-Abelian coefficients for the coset complex construction of Kaufman and Oppenheim. Our proof uses a novel global argument, as opposed to the local-to-global arguments that are used to prove cosystolic expansion.
We discover a novel connection between two classical mathematical notions, Eulerian orientations and Hadamard codes, by studying the problem of counting Eulerian orientations (\#EO) with local constraint functions imposed on vertices. We present two special classes of constraint functions and a chain reaction algorithm, and show that the \#EO problem defined by each class alone is polynomial-time solvable by this algorithm. These tractable classes of functions are defined inductively, and, quite remarkably, their base level is characterized exactly by the well-known Hadamard code. Thus, we establish a novel connection between counting Eulerian orientations and coding theory. We also prove a \#P-hardness result for the \#EO problem when constraint functions from the two tractable classes appear together.
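To make the coding-theory side concrete, here is a small sketch of the standard construction of Hadamard codewords as linear functions over GF(2); this is background material, not the paper's algorithm.

```python
# Hadamard code of length 2**k: one codeword per a in {0,1}^k, namely the
# evaluation table of the linear function x -> <a, x> mod 2.
from itertools import product

def hadamard_code(k: int) -> list[tuple[int, ...]]:
    points = list(product([0, 1], repeat=k))  # all x in {0,1}^k
    codewords = []
    for a in points:                          # one codeword per a
        word = tuple(sum(ai * xi for ai, xi in zip(a, x)) % 2 for x in points)
        codewords.append(word)
    return codewords

# Any two distinct codewords disagree on exactly half the positions,
# so the code has length 2**k and minimum distance 2**(k-1).
for w in hadamard_code(2):
    print(w)
```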
Although measuring held-out accuracy has been the primary approach to evaluating generalization, it often overestimates the performance of NLP models, while alternative approaches for evaluating models either focus on individual tasks or on specific behaviors. Inspired by principles of behavioral testing in software engineering, we introduce CheckList, a task-agnostic methodology for testing NLP models. CheckList includes a matrix of general linguistic capabilities and test types that facilitate comprehensive test ideation, as well as a software tool to generate a large and diverse number of test cases quickly. We illustrate the utility of CheckList with tests for three tasks, identifying critical failures in both commercial and state-of-the-art models. In a user study, a team responsible for a commercial sentiment analysis model found new and actionable bugs in an extensively tested model. In another user study, NLP practitioners with CheckList created twice as many tests and found almost three times as many bugs as users without it.
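As a flavor of what such generated test cases look like, here is a minimal template-expansion sketch in the spirit of CheckList; it does not use the actual CheckList library or its API, and the template, lexicons, and expected labels are illustrative.

```python
# Minimal template expansion for behavioral tests of a sentiment model.
# Mimics the idea of CheckList-style test generation; names and lexicons
# here are illustrative, not the CheckList library's API.
from itertools import product

TEMPLATE = "The {thing} was {adj}."
LEXICONS = {
    "thing": ["food", "service", "movie"],
    "adj": {"positive": ["great", "wonderful"], "negative": ["awful", "terrible"]},
}

def generate_tests():
    """Yield (sentence, expected_label) pairs expanded from the template."""
    for label, adjs in LEXICONS["adj"].items():
        for thing, adj in product(LEXICONS["thing"], adjs):
            yield TEMPLATE.format(thing=thing, adj=adj), label

def run_suite(predict):
    """Report the failure rate of `predict` on the generated cases."""
    cases = list(generate_tests())
    failures = [(s, e) for s, e in cases if predict(s) != e]
    print(f"failed {len(failures)}/{len(cases)} cases")
    for sentence, expected in failures:
        print(f"  expected {expected!r}: {sentence}")

# Example with a deliberately naive "model" to show failure reporting:
run_suite(lambda s: "positive" if "great" in s else "negative")
```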