This paper analyzes the stability of the class of Time-Accurate and Highly-Stable Explicit Runge-Kutta (TASE-RK) methods, introduced in 2021 by Bassenne et al. (J. Comput. Phys.) for the numerical solution of stiff Initial Value Problems (IVPs). Such numerical methods are easy to implement and require the solution of a limited number of linear systems per step, whose coefficient matrices involve the exact Jacobian $J$ of the problem. To significantly reduce the computational cost of TASE-RK methods without altering their consistency properties, it is possible to replace $J$ with a matrix $A$ (not necessarily tied to $J$) in their formulation, for instance fixed for a certain number of consecutive steps or even constant. However, the stability properties of TASE-RK methods strongly depend on this choice, and so far have been studied assuming $A=J$. In this manuscript, we theoretically investigate the conditional and unconditional stability of TASE-RK methods by considering arbitrary $A$. To this end, we first split the Jacobian as $J=A+B$. Then, through the use of stability diagrams and their connections with the field of values, we analyze both the case in which $A$ and $B$ are simultaneously diagonalizable and not. Numerical experiments, conducted on Partial Differential Equations (PDEs) arising from applications, show the correctness and utility of the theoretical results derived in the paper, as well as the good stability and efficiency of TASE-RK methods when $A$ is suitably chosen.
By abstracting over well-known properties of De Bruijn's representation with nameless dummies, we design a new theory of syntax with variable binding and capture-avoiding substitution. We propose it as a simpler alternative to Fiore, Plotkin, and Turi's approach, with which we establish a strong formal link. We also show that our theory easily incorporates simple types and equations between terms.
We consider the statistical linear inverse problem of making inference on an unknown source function in an elliptic partial differential equation from noisy observations of its solution. We employ nonparametric Bayesian procedures based on Gaussian priors, leading to convenient conjugate formulae for posterior inference. We review recent results providing theoretical guarantees on the quality of the resulting posterior-based estimation and uncertainty quantification, and we discuss the application of the theory to the important classes of Gaussian series priors defined on the Dirichlet-Laplacian eigenbasis and Mat\'ern process priors. We provide an implementation of posterior inference for both classes of priors, and investigate its performance in a numerical simulation study.
We propose and analyse boundary-preserving schemes for the strong approximations of some scalar SDEs with non-globally Lipschitz drift and diffusion coefficients whose state-space is bounded. The schemes consists of a Lamperti transform followed by a Lie--Trotter splitting. We prove $L^{p}(\Omega)$-convergence of order $1$, for every $p \geq 1$, of the schemes and exploit the Lamperti transform to confine the numerical approximations to the state-space of the considered SDE. We provide numerical experiments that confirm the theoretical results and compare the proposed Lamperti-splitting schemes to other numerical schemes for SDEs.
The goal of Universal Cross-Domain Retrieval (UCDR) is to achieve robust performance in generalized test scenarios, wherein data may belong to strictly unknown domains and categories during training. Recently, pre-trained models with prompt tuning have shown strong generalization capabilities and attained noteworthy achievements in various downstream tasks, such as few-shot learning and video-text retrieval. However, applying them directly to UCDR may not sufficiently to handle both domain shift (i.e., adapting to unfamiliar domains) and semantic shift (i.e., transferring to unknown categories). To this end, we propose \textbf{Pro}mpting-to-\textbf{S}imulate (ProS), the first method to apply prompt tuning for UCDR. ProS employs a two-step process to simulate Content-aware Dynamic Prompts (CaDP) which can impact models to produce generalized features for UCDR. Concretely, in Prompt Units Learning stage, we introduce two Prompt Units to individually capture domain and semantic knowledge in a mask-and-align way. Then, in Context-aware Simulator Learning stage, we train a Content-aware Prompt Simulator under a simulated test scenarios to produce the corresponding CaDP. Extensive experiments conducted on three benchmark datasets show that our method achieves new state-of-the-art performance without bringing excessive parameters. Our method is publicly available at //github.com/fangkaipeng/ProS.
Neural Architecture Search (NAS) paves the way for the automatic definition of Neural Network (NN) architectures, attracting increasing research attention and offering solutions in various scenarios. This study introduces a novel NAS solution, called Flat Neural Architecture Search (FlatNAS), which explores the interplay between a novel figure of merit based on robustness to weight perturbations and single NN optimization with Sharpness-Aware Minimization (SAM). FlatNAS is the first work in the literature to systematically explore flat regions in the loss landscape of NNs in a NAS procedure, while jointly optimizing their performance on in-distribution data, their out-of-distribution (OOD) robustness, and constraining the number of parameters in their architecture. Differently from current studies primarily concentrating on OOD algorithms, FlatNAS successfully evaluates the impact of NN architectures on OOD robustness, a crucial aspect in real-world applications of machine and deep learning. FlatNAS achieves a good trade-off between performance, OOD generalization, and the number of parameters, by using only in-distribution data in the NAS exploration. The OOD robustness of the NAS-designed models is evaluated by focusing on robustness to input data corruptions, using popular benchmark datasets in the literature.
In this paper we consider a superlinear one-dimensional elliptic boundary value problem that generalizes the one studied by Moore and Nehari in [43]. Specifically, we deal with piecewise-constant weight functions in front of the nonlinearity with an arbitrary number $\kappa\geq 1$ of vanishing regions. We study, from an analytic and numerical point of view, the number of positive solutions, depending on the value of a parameter $\lambda$ and on $\kappa$. Our main results are twofold. On the one hand, we study analytically the behavior of the solutions, as $\lambda\downarrow-\infty$, in the regions where the weight vanishes. Our result leads us to conjecture the existence of $2^{\kappa+1}-1$ solutions for sufficiently negative $\lambda$. On the other hand, we support such a conjecture with the results of numerical simulations which also shed light on the structure of the global bifurcation diagrams in $\lambda$ and the profiles of positive solutions. Finally, we give additional numerical results suggesting that the same high multiplicity result holds true for a much larger class of weights, also arbitrarily close to situations where there is uniqueness of positive solutions.
Exceptionally elegant formulae exist for the fractional Laplacian operator applied to weighted classical orthogonal polynomials. We utilize these results to construct a solver, based on frame properties, for equations involving the fractional Laplacian of any power, $s \in (0,1)$, on an unbounded domain in one or two dimensions. The numerical method represents solutions in an expansion of weighted classical orthogonal polynomials as well as their unweighted counterparts with a specific extension to $\mathbb{R}^d$, $d \in \{1,2\}$. We examine the frame properties of this family of functions for the solution expansion and, under standard frame conditions, derive an a priori estimate for the stationary equation. Moreover, we prove one achieves the expected order of convergence when considering an implicit Euler discretization in time for the fractional heat equation. We apply our solver to numerous examples including the fractional heat equation (utilizing up to a $6^\text{th}$-order Runge--Kutta time discretization), a fractional heat equation with a time-dependent exponent $s(t)$, and a two-dimensional problem, observing spectral convergence in the spatial dimension for sufficiently smooth data.
In this work, we present new constructions for topological subsystem codes using semi-regular Euclidean and hyperbolic tessellations. They give us new families of codes, and we also provide a new family of codes obtained through an already existing construction, due to Sarvepalli and Brown. We also prove new results that allow us to obtain the parameters of these new codes.
Fully decentralized learning is gaining momentum for training AI models at the Internet's edge, addressing infrastructure challenges and privacy concerns. In a decentralized machine learning system, data is distributed across multiple nodes, with each node training a local model based on its respective dataset. The local models are then shared and combined to form a global model capable of making accurate predictions on new data. Our exploration focuses on how different types of network structures influence the spreading of knowledge - the process by which nodes incorporate insights gained from learning patterns in data available on other nodes across the network. Specifically, this study investigates the intricate interplay between network structure and learning performance using three network topologies and six data distribution methods. These methods consider different vertex properties, including degree centrality, betweenness centrality, and clustering coefficient, along with whether nodes exhibit high or low values of these metrics. Our findings underscore the significance of global centrality metrics (degree, betweenness) in correlating with learning performance, while local clustering proves less predictive. We highlight the challenges in transferring knowledge from peripheral to central nodes, attributed to a dilution effect during model aggregation. Additionally, we observe that central nodes exert a pull effect, facilitating the spread of knowledge. In examining degree distribution, hubs in Barabasi-Albert networks positively impact learning for central nodes but exacerbate dilution when knowledge originates from peripheral nodes. Finally, we demonstrate the formidable challenge of knowledge circulation outside of segregated communities.
This paper introduces a novel evaluation framework for Large Language Models (LLMs) such as Llama-2 and Mistral, focusing on the adaptation of Precision and Recall metrics from image generation to text generation. This approach allows for a nuanced assessment of the quality and diversity of generated text without the need for aligned corpora. By conducting a comprehensive evaluation of state-of-the-art language models, the study reveals significant insights into their performance on open-ended generation tasks, which are not adequately captured by traditional benchmarks. The findings highlight a trade-off between the quality and diversity of generated samples, particularly when models are fine-tuned with human feedback. This work extends the toolkit for distribution-based NLP evaluation, offering insights into the practical capabilities and challenges faced by current LLMs in generating diverse and high-quality text.