Following Inoue et al., we define a word to be a repetition if it is a (fractional) power of exponent at least 2. A word has a repetition factorization if it can be written as a product of repetitions. We study repetition factorizations in several (generalized) automatic sequences, including the infinite Fibonacci word, the Thue-Morse word, paperfolding words, and the Rudin-Shapiro sequence.
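As a concrete rendering of these definitions (the function names below are ours, not the paper's): a word of length n with smallest period p has exponent n/p, it is a repetition when n >= 2p, and a simple dynamic program over prefixes tests for a repetition factorization.

    def smallest_period(w):
        # Smallest p with w[i] == w[i + p] for all valid i; p = len(w) always works.
        n = len(w)
        return next(p for p in range(1, n + 1)
                    if all(w[i] == w[i + p] for i in range(n - p)))

    def is_repetition(w):
        # A word is a repetition if its exponent len(w) / period is at least 2.
        return len(w) >= 2 * smallest_period(w)

    def has_repetition_factorization(w):
        # ok[j] is True iff the prefix w[:j] is a product of repetitions.
        ok = [True] + [False] * len(w)
        for j in range(1, len(w) + 1):
            ok[j] = any(ok[i] and is_repetition(w[i:j]) for i in range(j))
        return ok[len(w)]

    print(is_repetition("abaab"))                  # False: smallest period 3, exponent 5/3
    print(has_repetition_factorization("aaabab"))  # True: factors as "aa" times "abab"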
Anderson acceleration (AA) is a technique for accelerating the convergence of an underlying fixed-point iteration. AA is widely used within computational science, with applications ranging from electronic structure calculation to the training of neural networks. Despite AA's widespread use, relatively little is understood about it theoretically. An important and unanswered question in this context is: To what extent can AA actually accelerate convergence of the underlying fixed-point iteration? While simple enough to state, this question appears rather difficult to answer. For example, it remains unanswered even in the simplest (non-trivial) case, where the underlying fixed-point iteration consists of applying a two-dimensional affine function. In this note we consider a restarted variant of AA with a restart window of size one, applied to solve symmetric linear systems. Several results are derived from the analytical solution of a nonlinear eigenvalue problem characterizing residual propagation of the AA iteration. These include a complete characterization of the method applied to $2 \times 2$ linear systems, a rigorous quantification of how the asymptotic convergence factor depends on the initial iterate, and a quantification of how much AA accelerates the underlying fixed-point iteration. We also prove that even if the underlying fixed-point iteration diverges, the associated AA iteration may still converge.
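To make the setting concrete, the following is a minimal numerical sketch (our own, not the authors' code) of restarted AA with window size one, applied to the Richardson fixed-point iteration g(x) = x + b - Ax for a symmetric system Ax = b: each cycle takes one plain step, then one AA(1) step mixing the two most recent iterates, and then discards the history.

    import numpy as np

    def raa1(g, x0, cycles=25):
        # Restarted Anderson acceleration with a restart window of size one (sketch).
        x = x0
        for _ in range(cycles):
            gx = g(x)                       # plain fixed-point step: y = g(x)
            y = gx
            gy = g(y)
            r_x, r_y = gx - x, gy - y       # residuals of the two iterates
            dr = r_y - r_x
            if dr @ dr == 0.0:              # already converged
                return y
            gamma = (r_y @ dr) / (dr @ dr)  # 1-D least-squares coefficient
            x = gy - gamma * (gy - gx)      # AA(1) step; history then discarded
        return x

    # Example: a symmetric 2 x 2 system with g(x) = x + b - A x.
    A = np.array([[1.6, 0.4], [0.4, 1.2]])
    b = np.array([1.0, 2.0])
    x = raa1(lambda v: v + b - A @ v, np.zeros(2))
    print(x, np.linalg.solve(A, b))         # the two should agree closely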
The rise of AI in human contexts places new demands on automated systems to be transparent and explainable. We examine some anthropomorphic ideas and principles relevant to such accountability in order to develop a theoretical framework for thinking about digital systems in complex human contexts and the problem of explaining their behaviour. Structurally, systems are made of modular and hierarchical components, which we abstract in a new system model using notions of modes and mode transitions. A mode is an independent component of the system with its own objectives, monitoring data, and algorithms. The behaviour of a mode, including its transitions to other modes, is determined by functions that interpret each mode's monitoring data in the light of its objectives and algorithms. We show how these belief functions can help explain system behaviour by visualising their evaluation as trajectories in higher-dimensional geometric spaces. These ideas are formalised mathematically by abstract and concrete simplicial complexes. We offer three techniques: a framework for design heuristics, a general system theory based on modes, and a geometric visualisation, and we apply them to three types of human-centred systems.
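Purely as an illustration of the mode-based system model (the names and types below are our own, not the paper's formalism), a mode can be rendered as a component carrying objectives, monitoring data, and a belief function that scores candidate successor modes:

    from dataclasses import dataclass, field
    from typing import Any, Callable, Dict

    @dataclass
    class Mode:
        # An independent component with its own objectives, monitoring data,
        # and algorithms; transitions are driven by its belief function.
        name: str
        objectives: Dict[str, float]
        monitor: Dict[str, Any] = field(default_factory=dict)
        # Interprets the monitoring data in the light of the objectives and
        # returns a score per candidate successor mode (the "belief function").
        belief: Callable[["Mode"], Dict[str, float]] = lambda self: {}

    def step(current: Mode, modes: Dict[str, Mode]) -> Mode:
        # One mode transition: move to the successor the belief function rates
        # highest; stay put when it offers no alternative.
        scores = current.belief(current)
        return modes[max(scores, key=scores.get)] if scores else current

Recording the successive belief-function values as points then yields the trajectories in higher-dimensional spaces that the paper visualises via simplicial complexes.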
The power law is useful in describing count phenomena such as network degrees and word frequencies. With a single parameter, it captures the main feature that the frequencies are linear on the log-log scale. Nevertheless, there have been criticisms of the power law, for example that a threshold needs to be pre-selected without its uncertainty being quantified, that the power law is simply inadequate, and that subsequent hypothesis tests are required to determine whether the data could have come from the power law. We propose a modelling framework that combines two different generalisations of the power law, namely the generalised Pareto distribution and the Zipf-polylog distribution, to resolve these issues. The proposed mixture distributions are shown to fit the data well and quantify the threshold uncertainty in a natural way. A model selection step embedded in the Bayesian inference algorithm further answers the question of whether the power law is adequate.
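For concreteness, here is a short sketch (ours) of the Zipf-polylog probability mass function p(k) proportional to k^(-alpha) theta^k, whose normalising constant is the polylogarithm Li_alpha(theta); setting theta = 1 recovers the pure power law:

    import numpy as np
    from mpmath import polylog

    def zipf_polylog_pmf(k, alpha, theta):
        # p(k) = k**(-alpha) * theta**k / Li_alpha(theta) for k = 1, 2, ...
        # theta = 1 gives the pure power law (Zipf); theta < 1 bends the tail down.
        return k ** (-alpha) * theta ** k / float(polylog(alpha, theta))

    ks = np.arange(1, 6)
    print(zipf_polylog_pmf(ks, alpha=2.0, theta=1.0))   # linear on the log-log scale
    print(zipf_polylog_pmf(ks, alpha=2.0, theta=0.9))   # geometric damping of the tail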
The optimal one-sided parametric polynomial approximants of a circular arc are considered. More precisely, the approximant must lie entirely inside or entirely outside the underlying circle of the arc. We assume the natural restriction that approximants interpolate the boundary points of the arc, but we also study approximants that additionally interpolate the corresponding tangent directions and curvatures at these boundary points. Several low-degree polynomial approximants are studied in detail. When several solutions fulfilling the interpolation conditions exist, the optimal one is characterized, and a numerical algorithm for its construction is suggested. Theoretical results are demonstrated with several numerical examples, and a comparison with general (i.e., non-one-sided) approximants is provided.
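As a toy illustration of one-sidedness (ours, not the paper's optimal construction): the quadratic Bezier curve interpolating the endpoints and end tangent directions of a unit circular arc stays entirely outside the circle, which can be checked via its signed radial error.

    import numpy as np

    def quad_bezier(p0, p1, p2, t):
        # B(t) = (1 - t)^2 p0 + 2 t (1 - t) p1 + t^2 p2
        t = t[:, None]
        return (1 - t) ** 2 * p0 + 2 * t * (1 - t) * p1 + t ** 2 * p2

    phi = np.pi / 4                             # arc from angle -phi to phi
    p0 = np.array([np.cos(phi), -np.sin(phi)])  # boundary points on the unit circle
    p2 = np.array([np.cos(phi),  np.sin(phi)])
    p1 = np.array([1.0 / np.cos(phi), 0.0])     # intersection of the end tangents

    pts = quad_bezier(p0, p1, p2, np.linspace(0.0, 1.0, 1001))
    err = np.linalg.norm(pts, axis=1) - 1.0     # signed radial error
    print(err.min(), err.max())  # min ~ 0, max > 0: the curve never enters the circle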
Semantic similarity between natural language texts is typically measured either by looking at the overlap between subsequences (e.g., BLEU) or by using embeddings (e.g., BERTScore, S-BERT). In this paper, we argue that when we are only interested in measuring semantic similarity, it is better to predict the similarity directly with a model fine-tuned for that task. Using a model fine-tuned on the Semantic Textual Similarity Benchmark (STS-B) from the GLUE benchmark, we define the STSScore approach and show that the resulting similarity aligns better with our expectations of a robust semantic similarity measure than other approaches do.
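In practice, this amounts to a single forward pass through a cross-encoder fine-tuned on STS-B; the checkpoint name below is a publicly available example and an assumption on our part, not necessarily the exact model behind STSScore.

    from sentence_transformers import CrossEncoder

    # Cross-encoder fine-tuned on the STS-B regression task (assumed checkpoint).
    model = CrossEncoder("cross-encoder/stsb-roberta-base")
    score = model.predict([("A cat sits on the mat.", "A cat is sitting on a mat.")])
    print(score)  # predicted similarity, normalised to the [0, 1] STS-B scale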
Open sets are central to mathematics, especially analysis and topology, in ways few notions are. In most, if not all, computational approaches to mathematics, open sets are only studied indirectly via their 'codes' or 'representations'. In this paper, we study how hard it is to compute, given an arbitrary open set of reals, the most common representation, i.e. a countable set of open intervals. We work in Kleene's higher-order computability theory, which was historically based on the S1-S9 schemes and which now has an intuitive lambda calculus formulation due to the authors. We establish many computational equivalences between, on one hand, the 'structure' functional that converts open sets to the aforementioned representation, and, on the other hand, functionals arising from mainstream mathematics, like basic properties of semi-continuous functions, the Urysohn lemma, and the Tietze extension theorem. We also compare these functionals to known operations on regulated and bounded variation functions, and to the Lebesgue measure restricted to closed sets. We obtain a number of natural computational equivalences for the latter involving theorems from mainstream mathematics.
Recent years have witnessed an increasing number of artificial intelligence (AI) applications in transportation. Because AI is a new and emerging technology, neither its potential to advance transportation goals nor the full extent of its impacts on the transportation sector is yet well understood. As the transportation community explores these topics, it is critical to understand how transportation professionals, the driving force behind AI applications in transportation, perceive AI's potential efficiency and equity impacts. Toward this goal, we surveyed transportation professionals in the United States and collected a total of 354 responses. Based on the survey responses, we conducted both descriptive analysis and latent class cluster analysis (LCCA). The former provides an overview of prevalent attitudes among transportation professionals, while the latter allows the identification of distinct segments based on their latent attitudes toward AI. We find widespread optimism regarding AI's potential to improve many aspects of transportation (e.g., efficiency, cost reduction, and traveler experience); however, responses are mixed regarding AI's potential to advance equity. Moreover, many respondents are concerned that AI ethics are not well understood in the transportation community and that AI use in transportation could exacerbate existing inequalities. Through LCCA, we have identified four latent segments: AI Neutral, AI Optimist, AI Pessimist, and AI Skeptic. The latent class membership is significantly associated with respondents' age, education level, and AI knowledge level. Overall, the study results shed light on the extent to which the transportation community as a whole is ready to leverage AI systems to transform current practices and inform targeted education to improve the understanding of AI among transportation professionals.
We study the complexity of the following related computational tasks concerning a fixed countable graph G: 1. Does a countable graph H provided as input have a(n induced) subgraph isomorphic to G? 2. Given a countable graph H that has a(n induced) subgraph isomorphic to G, find such a subgraph. The framework for our investigations is given by effective Wadge reducibility and by Weihrauch reducibility. Our work follows on from "Reverse mathematics and Weihrauch analysis motivated by finite complexity theory" (Computability, 2021) by BeMent, Hirst and Wallace, and we answer several of their open questions.
Sequential models, such as Recurrent Neural Networks and Neural Ordinary Differential Equations, have long suffered from slow training due to their inherent sequential nature. For many years this bottleneck has persisted, as many thought sequential models could not be parallelized. We challenge this long-held belief with our parallel algorithm, which accelerates GPU evaluation of sequential models by up to three orders of magnitude without compromising output accuracy. The algorithm does not need any special structure in the sequential model's architecture, making it applicable to a wide range of architectures. Using our method, training sequential models can be more than 10 times faster than with the common sequential method, without any meaningful difference in the training results. Leveraging this accelerated training, we discovered the efficacy of the Gated Recurrent Unit in a long time-series classification problem with 17k time samples. By overcoming the training bottleneck, our work serves as the first step to unlocking the potential of non-linear sequential models for long sequence problems.
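The abstract does not spell out the algorithm, but the underlying idea can be sketched as recasting the recurrence h_t = f(h_{t-1}, x_t) as a fixed-point problem over the whole trajectory and updating every time step simultaneously. The deliberately simplified Jacobi-style sweep below (ours, and far slower-converging than the paper's method) illustrates only that principle:

    import numpy as np

    def f(h_prev, x):
        # Toy non-linear recurrent cell standing in for an RNN / neural ODE step.
        return np.tanh(0.5 * h_prev + x)

    def sequential_eval(x):
        h = np.zeros_like(x)
        for t in range(len(x)):                        # inherently sequential
            h[t] = f(h[t - 1] if t > 0 else 0.0, x[t])
        return h

    def parallel_eval(x, sweeps=40):
        # Treat the whole trajectory as the unknown of a fixed-point equation
        # and update all time steps at once; each sweep is parallel over t.
        h = np.zeros_like(x)
        for _ in range(sweeps):
            h = f(np.concatenate(([0.0], h[:-1])), x)
        return h

    x = np.random.default_rng(0).normal(size=32)
    print(np.max(np.abs(parallel_eval(x) - sequential_eval(x))))  # ~ 0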
In this paper, we propose a non-parametric score to evaluate the quality of the solution of an iterative algorithm for Independent Component Analysis (ICA) with arbitrary Gaussian noise. The novelty of this score stems from the fact that it assumes only a finite second moment of the data and uses the characteristic function to evaluate the quality of the estimated mixing matrix without any knowledge of the parameters of the noise distribution. We also provide a new characteristic-function-based contrast function for ICA and propose a fixed-point iteration to optimize the corresponding objective function. Finally, we propose a theoretical framework to obtain sufficient conditions for the local and global optima of a family of contrast functions for ICA. This framework uses quasi-orthogonalization inherently, and our results extend the classical analysis of cumulant-based objective functions to noisy ICA. We demonstrate the efficacy of our algorithms via experimental results on simulated datasets.
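One ingredient of such a score can be sketched as follows (our simplification: we check only the defining factorisation property of the characteristic function, whereas the paper's score additionally accounts for the unknown Gaussian noise, which we do not reproduce here):

    import numpy as np

    def independence_score(S, num_freqs=64, seed=0):
        # S: (d, n) matrix of estimated sources. For independent rows the joint
        # characteristic function factorises, phi(t) = prod_k phi_k(t_k); we
        # report the mean absolute factorisation gap over random frequencies t.
        rng = np.random.default_rng(seed)
        d = S.shape[0]
        gaps = []
        for _ in range(num_freqs):
            t = rng.normal(size=d)
            joint = np.exp(1j * (t @ S)).mean()
            marginals = np.prod([np.exp(1j * t[k] * S[k]).mean() for k in range(d)])
            gaps.append(abs(joint - marginals))
        return float(np.mean(gaps))

    # Example: independent sources give a near-zero score; mixing them raises it.
    S = np.random.default_rng(1).laplace(size=(2, 5000))
    print(independence_score(S))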