The convolution quadrature method originally developed for the Riemann-Liouville fractional calculus is extended in this work to the Hadamard fractional calculus by using exponential-type meshes. Local truncation error analysis is presented for singular solutions. By adopting the fractional BDF-$p$ $(1\leq p \leq 6)$ for the Caputo-Hadamard fractional derivative in solving the subdiffusion problem with singular source terms, and using the finite element method to discretize the space variable, we rigorously carry out a sharp error analysis and obtain optimal accuracy through a novel correction technique. Our correction method is a natural generalization of the one developed for subdiffusion problems with smooth source terms. Numerical tests confirm the correctness of our theoretical results.
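For orientation, the following is a minimal sketch of the standard definitions involved (the notation $\delta u = t\,u'(t)$ and the mesh parameter $\tau$ are our own choices, not taken from the paper): the Caputo-Hadamard derivative of order $\alpha\in(0,1)$ reads
\[
{}^{CH}\!D_{a}^{\alpha}u(t)=\frac{1}{\Gamma(1-\alpha)}\int_{a}^{t}\Bigl(\ln\frac{t}{s}\Bigr)^{-\alpha}\delta u(s)\,\frac{\mathrm{d}s}{s},\qquad \delta u(s)=s\,u'(s),
\]
and an exponential-type mesh is uniform in the logarithmic variable, for instance
\[
t_{j}=a\,\mathrm{e}^{\,j\tau},\qquad j=0,1,\dots,N,\qquad \tau=\frac{1}{N}\ln\frac{T}{a},
\]
so that the substitution $x=\ln(t/a)$ turns the Hadamard convolution into a Riemann-Liouville one on a uniform grid, where fractional BDF-$p$ convolution quadrature applies directly.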
Gradient-enhanced Kriging (GE-Kriging) is a well-established surrogate modelling technique for approximating expensive computational models. However, it tends to become impractical for high-dimensional problems due to the size of the inherent correlation matrix and the associated high-dimensional hyper-parameter tuning problem. To address these issues, a new method, called sliced GE-Kriging (SGE-Kriging), is developed in this paper to reduce both the size of the correlation matrix and the number of hyper-parameters. We first split the training sample set into multiple slices and invoke Bayes' theorem to approximate the full likelihood function by a sliced likelihood function, in which multiple small correlation matrices, rather than a single large one, describe the correlation of the sample set. Then, we replace the original high-dimensional hyper-parameter tuning problem with a low-dimensional counterpart by learning the relationship between the hyper-parameters and derivative-based global sensitivity indices. The performance of SGE-Kriging is finally validated by means of numerical experiments with several benchmarks and a high-dimensional aerodynamic modelling problem. The results show that the SGE-Kriging model achieves accuracy and robustness comparable to the standard model at a much lower training cost. The benefits are most evident for high-dimensional problems with tens of variables.
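As an illustration of the sliced-likelihood idea, the following is a minimal sketch (function names and the Gaussian correlation kernel are our own choices; gradient information, which GE-Kriging also exploits, is omitted for brevity): the full concentrated log-likelihood is approximated by the sum of per-slice log-likelihoods, so that only small correlation matrices need to be factorized.

import numpy as np

def gauss_corr(X, theta):
    # Gaussian correlation matrix R_ij = exp(-sum_k theta_k (x_ik - x_jk)^2)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2 * theta).sum(axis=-1)
    return np.exp(-d2)

def sliced_log_likelihood(X, y, theta, n_slices):
    # Sum of per-slice concentrated log-likelihoods; each slice only needs
    # the Cholesky factor of a small correlation matrix.
    ll = 0.0
    for s in np.array_split(np.arange(len(y)), n_slices):
        Xs, ys = X[s], y[s] - y[s].mean()
        L = np.linalg.cholesky(gauss_corr(Xs, theta) + 1e-10 * np.eye(len(s)))
        alpha = np.linalg.solve(L, ys)
        sigma2 = (alpha @ alpha) / len(s)
        ll += -0.5 * (len(s) * np.log(sigma2) + 2.0 * np.log(np.diag(L)).sum())
    return ll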
In this paper, we conduct a rigorous error analysis of the Lie-Trotter time-splitting Fourier spectral scheme for the nonlinear Schr\"odinger equation with a logarithmic nonlinear term $f(u)=u\ln|u|^2$ (LogSE) and periodic boundary conditions on a $d$-dimensional torus $\mathbb T^d$. Different from existing works based on regularisation of the nonlinear term $f(u)\approx f^\varepsilon(u)=u\ln (|u| + \varepsilon )^2,$ we directly discretize the LogSE with the understanding $f(0)=0.$ Remarkably, in the time-splitting scheme, the solution flow map of the nonlinear part, $g(u)= u\, {\rm e}^{-{\rm i} t \ln|u|^{2}},$ has higher regularity than $f(u)$ (which is not differentiable at $u=0$ but H\"older continuous): $g(u)$ is Lipschitz continuous and possesses a certain fractional Sobolev regularity with index $0<s<1$. Accordingly, we derive the $L^2$-error estimate $O\big((\tau^{s/2} + N^{-s})\ln\! N\big)$ of the proposed scheme for the LogSE with low-regularity solution $u\in C((0,T]; H^s( \mathbb{T}^d)\cap L^\infty( \mathbb{T}^d)).$ Moreover, we show that the estimate holds for $s=1$ with a more delicate analysis of the nonlinear term and the associated solution flow maps. Furthermore, we provide ample numerical results to demonstrate such a fractional-order convergence for initial data with low regularity. As far as we can tell, this work is the first devoted to the analysis of a splitting scheme for the LogSE without regularisation in the low-regularity setting.
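For concreteness, one Lie-Trotter step takes the following form (written here for the model equation $\mathrm{i}\partial_t u+\Delta u=\lambda\, u\ln|u|^2$; the sign and scaling conventions are ours and may differ from the paper's):
\[
\text{nonlinear substep:}\quad \Phi_N^{\tau}(v)=v\,\mathrm{e}^{-\mathrm{i}\lambda\tau\ln|v|^{2}}\ \ (\text{with } \Phi_N^{\tau}(0)=0),\qquad
\text{linear substep:}\quad \widehat{\Phi_L^{\tau}(v)}_{k}=\mathrm{e}^{-\mathrm{i}\tau|k|^{2}}\hat v_{k},
\]
and the fully discrete scheme reads $u^{n+1}=\Phi_L^{\tau}\bigl(\Phi_N^{\tau}(u^{n})\bigr)$ combined with truncation to the first $N$ Fourier modes; this is where the flow map $g$ above enters the analysis.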
This work introduces UstanceBR, a multimodal corpus in the Brazilian Portuguese Twitter domain for target-based stance prediction. The corpus comprises 86.8k labelled stances towards selected target topics, together with extensive network information about the users who published these stances on social media. In this article we describe the corpus' multimodal data and a number of usage examples in both in-domain and zero-shot stance prediction based on text- and network-related information, which are intended to provide initial baseline results for future studies in the field.
We present a unified framework for deriving PAC-Bayesian generalization bounds. Unlike most previous literature on this topic, our bounds are anytime-valid (i.e., time-uniform), meaning that they hold at all stopping times, not only for a fixed sample size. Our approach combines four tools in the following order: (a) nonnegative supermartingales or reverse submartingales, (b) the method of mixtures, (c) the Donsker-Varadhan formula (or other convex duality principles), and (d) Ville's inequality. Our main result is a PAC-Bayes theorem which holds for a wide class of discrete stochastic processes. We show how this result implies time-uniform versions of well-known classical PAC-Bayes bounds, such as those of Seeger, McAllester, Maurer, and Catoni, in addition to many recent bounds. We also present several novel bounds. Our framework also enables us to relax traditional assumptions; in particular, we consider nonstationary loss functions and non-i.i.d. data. In sum, we unify the derivation of past bounds and ease the search for future bounds: one may simply check if our supermartingale or submartingale conditions are met and, if so, be guaranteed a (time-uniform) PAC-Bayes bound.
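To sketch how these tools combine (a schematic of the general recipe, in our own notation): if for each hypothesis $h$ the process $M_t(h)\ge 0$ is a supermartingale with $\mathbb{E}[M_0(h)]\le 1$, then the mixture $\overline{M}_t=\mathbb{E}_{h\sim P}[M_t(h)]$ over a prior $P$ is again a nonnegative supermartingale, and Ville's inequality gives
\[
\Pr\Bigl(\exists\, t\ge 1:\ \mathbb{E}_{h\sim P}[M_t(h)]\ \ge\ 1/\delta\Bigr)\ \le\ \delta .
\]
On the complementary event, the Donsker-Varadhan formula converts the bound under the prior $P$ into one holding simultaneously for every posterior $Q$ and every time $t$,
\[
\mathbb{E}_{h\sim Q}\bigl[\log M_t(h)\bigr]\ \le\ \mathrm{KL}(Q\,\|\,P)+\log \mathbb{E}_{h\sim P}[M_t(h)]\ \le\ \mathrm{KL}(Q\,\|\,P)+\log(1/\delta),
\]
from which specific time-uniform PAC-Bayes bounds follow by choosing $M_t(h)$ appropriately.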
Minimizing data storage poses a significant challenge in large-scale metagenomic projects. In this paper, we present a new method for improving the encoding of FASTQ files generated by metagenomic sequencing. The method combines metagenomic classification with a recursive filter that clusters reads by DNA sequence similarity to improve the overall reference-free compression. In the results, we show an overall improvement in the compression of several datasets. As hypothesized, we show a progressive compression gain for higher coverage depth and larger numbers of identified species. Additionally, we provide an implementation that is freely available at https://github.com/cobilab/mizar and can be customized to work with other FASTQ compression tools.
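As a toy illustration of why grouping similar reads helps compression (a simplification for exposition only, not the paper's actual classification or recursive-filter algorithm; all names are ours):

def dominant_kmer(seq, k=8):
    # most frequent k-mer of a read, used as a crude similarity key
    counts = {}
    for i in range(len(seq) - k + 1):
        counts[seq[i:i + k]] = counts.get(seq[i:i + k], 0) + 1
    return max(counts, key=counts.get) if counts else seq

def reorder_reads(reads, k=8):
    # bucket reads by their dominant k-mer so that similar reads end up
    # adjacent, letting a downstream compressor exploit the redundancy
    buckets = {}
    for r in reads:
        buckets.setdefault(dominant_kmer(r, k), []).append(r)
    return [r for key in sorted(buckets) for r in buckets[key]]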
In recent years, social media has gained an unprecedented amount of attention, playing a pivotal role in shaping the contemporary landscape of communication and connection. However, Coordinated Inauthentic Behaviour (CIB), defined as orchestrated efforts by entities to deceive or mislead users about their identity and intentions, has emerged as a tactic to exploit the online discourse. In this study, we quantify the efficacy of CIB tactics by defining a general framework for evaluating the influence of a subset of nodes in a directed tree. We design two algorithms that provide optimal and greedy post-hoc placement strategies maximising the influence of the configuration. We then consider cascades from information spreading on Twitter to compare the observed behaviour with our algorithms. The results show that, according to our model, coordinated accounts are quite inefficient in terms of their network influence, suggesting that they may play a less pivotal role than expected. Moreover, these poor results may be traced to two separate aspects: a poor placement strategy and a scarcity of resources.
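As an illustration of the greedy flavour of such placement strategies, the sketch below assumes, purely for exposition, that the influence of a configuration is the number of distinct nodes reachable from the selected set in the directed tree (the paper's actual influence measure is not specified in this abstract; all names are ours):

def reachable(tree, root):
    # all nodes reachable from `root` in a tree given as {node: [children]}
    seen, stack = set(), [root]
    while stack:
        for child in tree.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def greedy_placement(tree, candidates, k):
    # repeatedly pick the node with the largest marginal gain in coverage
    covered, chosen = set(), []
    for _ in range(k):
        gains = {v: len((reachable(tree, v) | {v}) - covered)
                 for v in candidates if v not in chosen}
        best = max(gains, key=gains.get)
        chosen.append(best)
        covered |= reachable(tree, best) | {best}
    return chosen, len(covered)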
This article presents a priori error estimates for the miscible displacement of one incompressible fluid by another through a porous medium, modelled by a coupled system of nonlinear elliptic and parabolic equations. The study utilizes the $H(\mathrm{div})$-conforming virtual element method (VEM) for the approximation of the velocity, while a non-conforming virtual element approach is employed for the concentration. The pressure is discretized using standard piecewise discontinuous polynomial functions. These spatial discretization techniques are combined with a backward Euler difference scheme for time discretization. The article also includes numerical results that validate the theoretical estimates presented.
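For reference, the coupled system is typically of the following form (a standard model for incompressible miscible displacement, written in our notation; the precise source terms and boundary conditions follow the paper):
\[
\mathbf{u}=-\frac{\kappa(x)}{\mu(c)}\nabla p,\qquad \nabla\cdot\mathbf{u}=q \quad\text{(nonlinear elliptic, pressure/velocity)},
\]
\[
\phi\,\partial_t c+\mathbf{u}\cdot\nabla c-\nabla\cdot\bigl(D(\mathbf{u})\nabla c\bigr)=\hat c\,q \quad\text{(parabolic, concentration)},
\]
where $c$ is the concentration of the injected fluid, $\kappa$ the permeability, $\mu(c)$ the concentration-dependent viscosity, $\phi$ the porosity, and $D(\mathbf{u})$ the diffusion-dispersion tensor.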
In this paper, we propose an iterative convolution-thresholding method (ICTM) based on prediction-correction for solving the topology optimization problem for steady-state heat transfer equations. The problem is formulated as a constrained minimization of the complementary energy, incorporating a perimeter/surface-area regularization term, subject to a steady-state heat transfer equation. The decision variables of the optimization problem are the domains occupied by the different materials, represented by indicator functions. The perimeter/surface-area term of the domain is approximated by convolving the indicator functions with a Gaussian kernel. In each iteration, the indicator function is updated using a prediction-correction approach: the prediction step is based on the variation of the objective functional under the constraints, while the correction step ensures the monotone decrease of the objective functional. Numerical results demonstrate the efficiency and robustness of the proposed method, particularly when compared to classical ICTM-based approaches.
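The kernel-based perimeter approximation referred to above is, in its standard form (the normalizing constant is quoted from the usual heat-kernel approximation, not from the paper),
\[
\mathrm{Per}(\Omega)\ \approx\ \sqrt{\frac{\pi}{\tau}}\int_{\mathbb{R}^{d}}\chi_{\Omega}\,\bigl(G_{\tau}*(1-\chi_{\Omega})\bigr)\,\mathrm{d}x,
\qquad G_{\tau}(x)=\frac{1}{(4\pi\tau)^{d/2}}\,\mathrm{e}^{-|x|^{2}/(4\tau)},
\]
which converges to the perimeter of $\Omega$ as the kernel width $\tau\to 0^{+}$.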
In this paper, our objective is to present a constraining principle governing the spectral properties of the sample covariance matrix. This principle exhibits harmonious behavior across diverse limiting frameworks, eliminating the need for constraints on the relative rates of the dimension $p$ and the sample size $n$, as long as both tend to infinity. We accomplish this by employing a suitable normalization of the original sample covariance matrix. Following this, we establish a harmonic central limit theorem for linear spectral statistics within this expansive framework. This result removes the requirement of a bounded spectral norm on the population covariance matrix and relaxes constraints on the rates of the dimension $p$ and the sample size $n$, thereby significantly broadening the applicability of these results in high-dimensional statistics. We illustrate the power of the established results by considering a test for the covariance structure under high dimensionality, with no restriction on how $p$ and $n$ grow relative to each other.
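For readers unfamiliar with the terminology, a linear spectral statistic of the sample covariance matrix is, in standard notation (the particular normalization used in the paper is left unspecified here),
\[
S_n=\frac{1}{n}\sum_{i=1}^{n}x_i x_i^{*},\qquad \mathrm{LSS}_f(S_n)=\sum_{j=1}^{p}f\bigl(\lambda_j(S_n)\bigr)=p\int f(\lambda)\,\mathrm{d}F^{S_n}(\lambda),
\]
where $\lambda_1,\dots,\lambda_p$ are the eigenvalues of $S_n$ and $F^{S_n}$ is its empirical spectral distribution; the central limit theorem above concerns the fluctuations of such statistics.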
This paper explores the impact of biologically plausible neuron models on the performance of Spiking Neural Networks (SNNs) for regression tasks. While SNNs are widely recognized for classification tasks, their application to Scientific Machine Learning and regression remains underexplored. We focus on the membrane component of SNNs, comparing four neuron models: Leaky Integrate-and-Fire, FitzHugh-Nagumo, Izhikevich, and Hodgkin-Huxley. We investigate their effect on SNN accuracy and efficiency for function regression tasks, using Euler and fourth-order Runge-Kutta approximation schemes. We show how more biologically plausible neuron models improve the accuracy of SNNs while reducing the number of spikes in the system. The latter represents an energy gain on actual neuromorphic chips, since the spike count directly reflects the amount of energy required for the computations.
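As an example of the simplest of the four models, the following is a minimal leaky integrate-and-fire membrane integrated with explicit Euler (parameter values and function names are illustrative, not taken from the paper):

import numpy as np

def lif_euler(I, dt=1e-4, tau=0.02, v_rest=-65e-3, v_th=-50e-3, v_reset=-65e-3, R=1e7):
    # integrate tau*dV/dt = -(V - v_rest) + R*I(t); emit a spike and reset
    # whenever the membrane potential crosses the threshold v_th
    v, spikes, trace = v_rest, [], []
    for n, i_n in enumerate(I):
        v += dt / tau * (-(v - v_rest) + R * i_n)   # explicit Euler step
        if v >= v_th:
            spikes.append(n * dt)
            v = v_reset
        trace.append(v)
    return np.array(trace), spikes

# constant 2 nA input for 100 ms produces a regular spike train
trace, spikes = lif_euler(np.full(1000, 2e-9))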