We present a novel approach of discretizing variable coefficient diffusion operators in the context of meshfree generalized finite difference methods. Our ansatz uses properties of derived operators and combines the discrete Laplace operator with reconstruction functions approximating the diffusion coefficient. Provided that the reconstructions are of a sufficiently high order, we prove that the order of accuracy of the discrete Laplace operator transfers to the derived diffusion operator. We show that the new discrete diffusion operator inherits the diagonal dominance property of the discrete Laplace operator. Finally, we present the possibility of discretizing anisotropic diffusion operators with the help of derived operators. Our numerical results for Poisson's equation and the heat equation show that even low-order reconstructions preserve the order of the underlying discrete Laplace operator for sufficiently smooth diffusion coefficients. In experiments, we demonstrate the applicability of the new discrete diffusion operator to interface problems with point clouds not aligning to the interface and numerically show first-order convergence.
This paper investigates the asymptotic behavior of structural break tests in the harmonic domain for time dependent spherical random fields. In particular, we prove a functional central limit theorem result for the fluctuations over time of the sample spherical harmonic coefficients, under the null of isotropy and stationarity; furthermore, we prove consistency of the corresponding CUSUM test, under a broad range of alternatives, including deterministic trend, abrupt change, and a nontrivial power alternative. Our results are then applied to NCEP data on global temperature: our estimates suggest that Climate Change does not simply affect global average temperatures, but also the nature of spatial fluctuations at different scales.
Text plagiarism detection task is a common natural language processing task that aims to detect whether a given text contains plagiarism or copying from other texts. In existing research, detection of high level plagiarism is still a challenge due to the lack of high quality datasets. In this paper, we propose a plagiarized text data generation method based on GPT-3.5, which produces 32,927 pairs of text plagiarism detection datasets covering a wide range of plagiarism methods, bridging the gap in this part of research. Meanwhile, we propose a plagiarism identification method based on Faiss with BERT with high efficiency and high accuracy. Our experiments show that the performance of this model outperforms other models in several metrics, including 98.86\%, 98.90%, 98.86%, and 0.9888 for Accuracy, Precision, Recall, and F1 Score, respectively. At the end, we also provide a user-friendly demo platform that allows users to upload a text library and intuitively participate in the plagiarism analysis.
This article presents a general approximation-theoretic framework to analyze measure transport algorithms for probabilistic modeling. A primary motivating application for such algorithms is sampling -- a central task in statistical inference and generative modeling. We provide a priori error estimates in the continuum limit, i.e., when the measures (or their densities) are given, but when the transport map is discretized or approximated using a finite-dimensional function space. Our analysis relies on the regularity theory of transport maps and on classical approximation theory for high-dimensional functions. A third element of our analysis, which is of independent interest, is the development of new stability estimates that relate the distance between two maps to the distance~(or divergence) between the pushforward measures they define. We present a series of applications of our framework, where quantitative convergence rates are obtained for practical problems using Wasserstein metrics, maximum mean discrepancy, and Kullback--Leibler divergence. Specialized rates for approximations of the popular triangular Kn{\"o}the-Rosenblatt maps are obtained, followed by numerical experiments that demonstrate and extend our theory.
The clever hybridization of quantum computing concepts and evolutionary algorithms (EAs) resulted in a new field called quantum-inspired evolutionary algorithms (QIEAs). Unlike traditional EAs, QIEAs employ quantum bits to adopt a probabilistic representation of the state of a feature in a given solution. This unprecedented feature enables them to achieve better diversity and perform global search, effectively yielding a tradeoff between exploration and exploitation. We conducted a comprehensive survey across various publishers and gathered 56 papers. We thoroughly analyzed these publications, focusing on the novelty elements and types of heuristics employed by the extant quantum-inspired evolutionary algorithms (QIEAs) proposed to solve the feature subset selection (FSS) problem. Importantly, we provided a detailed analysis of the different types of objective functions and popular quantum gates, i.e., rotation gates, employed throughout the literature. Additionally, we suggested several open research problems to attract the attention of the researchers.
We introduce a general random model of a combinatorial optimization problem with geometric structure that encapsulates both linear programming and integer linear programming. Let $Q$ be a bounded set called the feasible set, $E$ be an arbitrary set called the constraint set, and $A$ be a random linear transform. We define and study the $\ell^q$-margin, $M_q := d_q(AQ, E)$. The margin quantifies the feasibility of finding $y \in AQ$ satisfying the constraint $y \in E$. Our contribution is to establish strong concentration of the margin for any $q \in (2,\infty]$, assuming only that $E$ has permutation symmetry. The case of $q = \infty$ is of particular interest in applications -- specifically to combinatorial ``balancing'' problems -- and is markedly out of the reach of the classical isoperimetric and concentration-of-measure tools that suffice for $q \le 2$. Generality is a key feature of this result: we assume permutation symmetry of the constraint set and nothing else. This allows us to encode many optimization problems in terms of the margin, including random versions of: the closest vector problem, integer linear feasibility, perceptron-type problems, $\ell^q$-combinatorial discrepancy for $2 \le q \le \infty$, and matrix balancing. Concentration of the margin implies a host of new sharp threshold results in these models, and also greatly simplifies and extends some key known results.
Contrastive Learning (CL)-based recommender systems have gained prominence in the context of Heterogeneous Graph (HG) due to their capacity to enhance the consistency of representations across different views. Nonetheless, existing frameworks often neglect the fact that user-item interactions within HG are governed by diverse latent intents (for instance, preferences towards specific brands or the demographic characteristics of item audiences), which are pivotal in capturing fine-grained relations. The exploration of these underlying intents, particularly through the lens of meta-paths in HGs, presents us with two principal challenges: i) How to integrate CL mechanisms with latent intents; ii) How to mitigate the noise associated with these complicated intents.To address these challenges, we propose an innovative framework termed Intent-Guided Heterogeneous Graph Contrastive Learning (IHGCL), which designed to enhance CL-based recommendation by capturing the intents contained within meta-paths. Specifically, the IHGCL framework includes: i) it employs a meta-path-based dual contrastive learning approach to effectively integrate intents into the recommendation, constructing meta-path contrast and view contrast; ii) it uses an bottlenecked autoencoder that combines mask propagation with the information bottleneck principle to significantly reduce noise perturbations introduced by meta-paths. Empirical evaluations conducted across six distinct datasets demonstrate the superior performance of our IHGCL framework relative to conventional baseline methods. Our model implementation is available at //github.com/wangyu0627/IHGCL.
Quantum computing holds immense potential for solving classically intractable problems by leveraging the unique properties of quantum mechanics. The scalability of quantum architectures remains a significant challenge. Multi-core quantum architectures are proposed to solve the scalability problem, arising a new set of challenges in hardware, communications and compilation, among others. One of these challenges is to adapt a quantum algorithm to fit within the different cores of the quantum computer. This paper presents a novel approach for circuit partitioning using Deep Reinforcement Learning, contributing to the advancement of both quantum computing and graph partitioning. This work is the first step in integrating Deep Reinforcement Learning techniques into Quantum Circuit Mapping, opening the door to a new paradigm of solutions to such problems.
Resilient cyber-physical systems comprise computing systems able to continuously interact with the physical environment in which they operate, despite runtime errors. The term resilience refers to the ability to cope with unexpected inputs while delivering correct service. Examples of resilient computing systems are Google's PageRank and the Bubblesort algorithm. Engineering for resilient cyber-physical systems requires a paradigm shift, prioritizing adaptability to dynamic environments. Software as a tool for self-management is a key instrument for dealing with uncertainty and embedding resilience in these systems. Yet, software engineers encounter the ongoing challenge of ensuring resilience despite environmental dynamic change. My thesis aims to pioneer an engineering discipline for resilient cyber-physical systems. Over four years, we conducted studies, built methods and tools, delivered software packages, and a website offering guidance to practitioners. This paper provides a condensed overview of the problems tackled, our methodology, key contributions, and results highlights. Seeking feedback from the community, this paper serves both as preparation for the thesis defense and as insight into future research prospects.
We introduce a novel class of algorithms to efficiently approximate the unknown return distributions in policy evaluation problems from distributional reinforcement learning (DRL). The proposed distributional dynamic programming algorithms are suitable for underlying Markov decision processes (MDPs) having an arbitrary probabilistic reward mechanism, including continuous reward distributions with unbounded support being potentially heavy-tailed. For a plain instance of our proposed class of algorithms we prove error bounds, both within Wasserstein and Kolmogorov--Smirnov distances. Furthermore, for return distributions having probability density functions the algorithms yield approximations for these densities; error bounds are given within supremum norm. We introduce the concept of quantile-spline discretizations to come up with algorithms showing promising results in simulation experiments. While the performance of our algorithms can rigorously be analysed they can be seen as universal black box algorithms applicable to a large class of MDPs. We also derive new properties of probability metrics commonly used in DRL on which our quantitative analysis is based.
We introduce a simple, stochastic, a-posteriori, turbulence closure model based on a reduced subgrid scale term. This subgrid scale term is tailor-made to capture the statistics of a small set of spatially-integrate quantities of interest (QoIs), with only one unresolved scalar time series per QoI. In contrast to other data-driven surrogates the dimension of the ``learning problem" is reduced from an evolving field to one scalar time series per QoI. We use an a-posteriori, nudging approach to find the distribution of the scalar series over time. This approach has the advantage of taking the interaction between the solver and the surrogate into account. A stochastic surrogate parametrization is obtained by random sampling from the found distribution for the scalar time series. Compared to an a-priori trained convolutional neural network, evaluating the new method is computationally much cheaper and gives similar long-term statistics.