Palimpsests refer to historical manuscripts where erased writings have been partially covered by the superimposition of a second writing. By employing imaging techniques, e.g., multispectral imaging, it becomes possible to identify features that are imperceptible to the naked eye, including faded and erased inks. When dealing with overlapping inks, Artificial Intelligence techniques can be utilized to disentangle complex nodes of overlapping letters. In this work, we propose deep learning-based semantic segmentation as a method for identifying and segmenting individual letters in overlapping characters. The experiment was conceived as a proof of concept, focusing on the palimpsests of the Ars Grammatica by Prisciano as a case study. Furthermore, caveats and prospects of our approach combined with multispectral imaging are also discussed.
There are several tools available to infer phylogenetic trees, which depict the evolutionary relationships among biological entities such as viral and bacterial strains in infectious outbreaks, or cancerous cells in tumor progression trees. These tools rely on several inference methods available to produce phylogenetic trees, with resulting trees not being unique. Thus, methods for comparing phylogenies that are capable of revealing where two phylogenetic trees agree or differ are required. An approach is then to compute a similarity or dissimilarity measure between trees, with the Robinson- Foulds distance being one of the most used, and which can be computed in linear time and space. Nevertheless, given the large and increasing volume of phylogenetic data, phylogenetic trees are becoming very large with hundreds of thousands of leafs. In this context, space requirements become an issue both while computing tree distances and while storing trees. We propose then an efficient implementation of the Robinson-Foulds distance over trees succinct representations. Our implementation generalizes also the Robinson-Foulds distances to labelled phylogenetic trees, i.e., trees containing labels on all nodes, instead of only on leaves. Experimental results show that we are able to still achieve linear time while requiring less space. Our implementation is available as an open-source tool at //github.com/pedroparedesbranco/TreeDiff.
This paper considers the epistemic justification for a simplicity preference in inductive inference that may be obtained from the machine learning framework of statistical learning theory. Uniting elements from both earlier arguments suggesting and rejecting such a justification, the paper spells out a qualified means-ends and model-relative justificatory argument, built on statistical learning theory's central mathematical learning guarantee for the method of empirical risk minimization.
Rational best approximations (in a Chebyshev sense) to real functions are characterized by an equioscillating approximation error. Similar results do not hold true for rational best approximations to complex functions in general. In the present work, we consider unitary rational approximations to the exponential function on the imaginary axis, which map the imaginary axis to the unit circle. In the class of unitary rational functions, best approximations are shown to exist, to be uniquely characterized by equioscillation of a phase error, and to possess a super-linear convergence rate. Furthermore, the best approximations have full degree (i.e., non-degenerate), achieve their maximum approximation error at points of equioscillation, and interpolate at intermediate points. Asymptotic properties of poles, interpolation nodes, and equioscillation points of these approximants are studied. Three algorithms, which are found very effective to compute unitary rational approximations including candidates for best approximations, are sketched briefly. Some consequences to numerical time-integration are discussed. In particular, time propagators based on unitary best approximants are unitary, symmetric and A-stable.
In this note we prove almost sure unisolvence of RBF interpolation on randomly distributed sequences by a wide class of polyharmonic splines (including Thin-Plate Splines), without polynomial addition.
Any experiment with climate models relies on a potentially large set of spatio-temporal boundary conditions. These can represent both the initial state of the system and/or forcings driving the model output throughout the experiment. Whilst these boundary conditions are typically fixed using available reconstructions in climate modelling studies, they are highly uncertain, that uncertainty is unquantified, and the effect on the output of the experiment can be considerable. We develop efficient quantification of these uncertainties that combines relevant data from multiple models and observations. Starting from the coexchangeability model, we develop a coexchangable process model to capture multiple correlated spatio-temporal fields of variables. We demonstrate that further exchangeability judgements over the parameters within this representation lead to a Bayes linear analogy of a hierarchical model. We use the framework to provide a joint reconstruction of sea-surface temperature and sea-ice concentration boundary conditions at the last glacial maximum (19-23 ka) and use it to force an ensemble of ice-sheet simulations using the FAMOUS-Ice coupled atmosphere and ice-sheet model. We demonstrate that existing boundary conditions typically used in these experiments are implausible given our uncertainties and demonstrate the impact of using more plausible boundary conditions on ice-sheet simulation.
Interactions between genes and environmental factors may play a key role in the etiology of many common disorders. Several regularized generalized linear models (GLMs) have been proposed for hierarchical selection of gene by environment interaction (GEI) effects, where a GEI effect is selected only if the corresponding genetic main effect is also selected in the model. However, none of these methods allow to include random effects to account for population structure, subject relatedness and shared environmental exposure. In this paper, we develop a unified approach based on regularized penalized quasi-likelihood (PQL) estimation to perform hierarchical selection of GEI effects in sparse regularized mixed models. We compare the selection and prediction accuracy of our proposed model with existing methods through simulations under the presence of population structure and shared environmental exposure. We show that for all simulation scenarios, compared to other penalized methods, our proposed method enforced sparsity by controlling the number of false positives in the model while having the best predictive performance. Finally, we apply our method to a real data application using the Orofacial Pain: Prospective Evaluation and Risk Assessment (OPPERA) study, and found that our method retrieves previously reported significant loci.
At the staggering pace with which the capabilities of large language models (LLMs) are increasing, creating future-proof evaluation sets to assess their understanding becomes more and more challenging. In this paper, we propose a novel paradigm for evaluating LLMs which leverages the idea that correct world understanding should be consistent across different (Fregean) senses of the same meaning. Accordingly, we measure understanding not in terms of correctness but by evaluating consistency across multiple senses that are generated by the model itself. We showcase our approach by instantiating a test where the different senses are different languages, hence using multilingual self-consistency as a litmus test for the model's understanding and simultaneously addressing the important topic of multilinguality. Taking one of the latest versions of ChatGPT as our object of study, we evaluate multilingual consistency for two different tasks across three different languages. We show that its multilingual consistency is still lacking, and that its task and world understanding are thus not language-independent. As our approach does not require any static evaluation corpora in languages other than English, it can easily and cheaply be extended to different languages and tasks and could become an integral part of future benchmarking efforts.
In this contribution, we derive a consistent variational formulation for computational homogenization methods and show that traditional FE2 and IGA2 approaches are special discretization and solution techniques of this most general framework. This allows us to enhance dramatically the numerical analysis as well as the solution of the arising algebraic system. In particular, we expand the dimension of the continuous system, discretize the higher dimensional problem consistently and apply afterwards a discrete null-space matrix to remove the additional dimensions. A benchmark problem, for which we can obtain an analytical solution, demonstrates the superiority of the chosen approach aiming to reduce the immense computational costs of traditional FE2 and IGA2 formulations to a fraction of the original requirements. Finally, we demonstrate a further reduction of the computational costs for the solution of general non-linear problems.
Artificial neural networks thrive in solving the classification problem for a particular rigid task, acquiring knowledge through generalized learning behaviour from a distinct training phase. The resulting network resembles a static entity of knowledge, with endeavours to extend this knowledge without targeting the original task resulting in a catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions concern 1) a taxonomy and extensive overview of the state-of-the-art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks, considering Tiny Imagenet and large-scale unbalanced iNaturalist and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which the tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.
When and why can a neural network be successfully trained? This article provides an overview of optimization algorithms and theory for training neural networks. First, we discuss the issue of gradient explosion/vanishing and the more general issue of undesirable spectrum, and then discuss practical solutions including careful initialization and normalization methods. Second, we review generic optimization methods used in training neural networks, such as SGD, adaptive gradient methods and distributed methods, and theoretical results for these algorithms. Third, we review existing research on the global issues of neural network training, including results on bad local minima, mode connectivity, lottery ticket hypothesis and infinite-width analysis.