The NP-hard Multiple Hitting Set problem is finding a minimum-cardinality set intersecting each of the sets in a given input collection a given number of times. Generalizing a well-known data reduction algorithm due to Weihe, we show a problem kernel for Multiple Hitting Set parameterized by the Dilworth number, a graph parameter introduced by Foldes and Hammer in 1978 yet seemingly so far unexplored in the context of parameterized complexity theory. Using matrix multiplication, we speed up the algorithm to quadratic sequential time and logarithmic parallel time. We experimentally evaluate our algorithms. By implementing our algorithm on GPUs, we show the feasability of realizing kernelization algorithms on SIMD (Single Instruction, Multiple Date) architectures.
With the emergence of Machine Learning, there has been a surge in leveraging its capabilities for problem-solving across various domains. In the code clone realm, the identification of type-4 or semantic clones has emerged as a crucial yet challenging task. Researchers aim to utilize Machine Learning to tackle this challenge, often relying on the BigCloneBench dataset. However, it's worth noting that BigCloneBench, originally not designed for semantic clone detection, presents several limitations that hinder its suitability as a comprehensive training dataset for this specific purpose. Furthermore, CLCDSA dataset suffers from a lack of reusable examples aligning with real-world software systems, rendering it inadequate for cross-language clone detection approaches. In this work, we present a comprehensive semantic clone and cross-language clone benchmark, GPTCloneBench by exploiting SemanticCloneBench and OpenAI's GPT-3 model. In particular, using code fragments from SemanticCloneBench as sample inputs along with appropriate prompt engineering for GPT-3 model, we generate semantic and cross-language clones for these specific fragments and then conduct a combination of extensive manual analysis, tool-assisted filtering, functionality testing and automated validation in building the benchmark. From 79,928 clone pairs of GPT-3 output, we created a benchmark with 37,149 true semantic clone pairs, 19,288 false semantic pairs(Type-1/Type-2), and 20,770 cross-language clones across four languages (Java, C, C#, and Python). Our benchmark is 15-fold larger than SemanticCloneBench, has more functional code examples for software systems and programming language support than CLCDSA, and overcomes BigCloneBench's qualities, quantification, and language variety limitations.
Gaussian elimination (GE) is the most used dense linear solver. Error analysis of GE with selected pivoting strategies on well-conditioned systems can focus on studying the behavior of growth factors. Although exponential growth is possible with GE with partial pivoting (GEPP), growth tends to stay much smaller in practice. Support for this behavior was provided last year by Huang and Tikhomirov's average-case analysis of GEPP, which showed GEPP growth factors stay at most polynomial with very high probability when using small Gaussian perturbations. GE with complete pivoting (GECP) has also seen a lot of recent interest, with recent improvements to lower bounds on worst-case GECP growth provided by Edelman and Urschel earlier this year. We are interested in studying how GEPP and GECP behave on the same linear systems as well as studying large growth on particular subclasses of matrices, including orthogonal matrices. We will also study systems when GECP leads to larger growth than GEPP, which will lead to new empirical lower bounds on how much worse GECP can behave compared to GEPP in terms of growth. We also present an empirical study on a family of exponential GEPP growth matrices whose polynomial behavior in small neighborhoods limits to the initial GECP growth factor.
The first linear programming bound of McEliece, Rodemich, Rumsey, and Welch is the best known asymptotic upper bound for binary codes, for a certain subrange of distances. Starting from the work of Friedman and Tillich, there are, by now, some arguably easier and more direct arguments for this bound. We show that this more recent line of argument runs into certain difficulties if one tries to go beyond this bound (say, towards the second linear programming bound of McEliece, Rodemich, Rumsey, and Welch).
This paper addresses a mathematically tractable model of the Prisoner's Dilemma using the framework of active inference. In this work, we design pairs of Bayesian agents that are tracking the joint game state of their and their opponent's choices in an Iterated Prisoner's Dilemma game. The specification of the agents' belief architecture in the form of a partially-observed Markov decision process allows careful and rigourous investigation into the dynamics of two-player gameplay, including the derivation of optimal conditions for phase transitions that are required to achieve certain game-theoretic steady states. We show that the critical time points governing the phase transition are linearly related to each other as a function of learning rate and the reward function. We then investigate the patterns that emerge when varying the agents' learning rates, as well as the relationship between the stochastic and deterministic solutions to the two-agent system.
The Pauli stabilizer formalism is perhaps the most thoroughly studied means of procuring quantum error-correcting codes, whereby the code is obtained through commutative Pauli operators and ``stabilized'' by them. In this work we will show that every quantum error-correcting code, including Pauli stabilizer codes and subsystem codes, has a similar structure, in that the code can be stabilized by commutative ``Paulian'' operators which share many features with Pauli operators and which form a \textbf{Paulian stabilizer group}. By facilitating a controlled gate we can measure these Paulian operators to acquire the error syndrome. Examples concerning codeword stabilized codes and bosonic codes will be presented; specifically, one of the examples has been demonstrated experimentally and the observable for detecting the error turns out to be Paulian, thereby showing the potential utility of this approach. This work provides a possible approach to implement error-correcting codes and to find new codes.
We propose and analyse an explicit boundary-preserving scheme for the strong approximations of some SDEs with non-globally Lipschitz drift and diffusion coefficients whose state-space is bounded. The scheme consists of a Lamperti transform followed by a Lie--Trotter splitting. We prove $L^{p}(\Omega)$-convergence of order $1$, for every $p \in \mathbb{N}$, of the scheme and exploit the Lamperti transform to confine the numerical approximations to the state-space of the considered SDE. We provide numerical experiments that confirm the theoretical results and compare the proposed Lamperti-splitting scheme to other numerical schemes for SDEs.
The QZ algorithm computes the Schur form of a matrix pencil. It is an iterative algorithm and at some point, it must decide that an eigenvalue has converged and move on with another one. Choosing a criterion that makes this decision is nontrivial. If it is too strict, the algorithm might waste iterations on already converged eigenvalues. If it is not strict enough, the computed eigenvalues might be inaccurate. Additionally, the criterion should not be computationally expensive to evaluate. This paper introduces a new criterion based on the size of and the gap between the eigenvalues. This is similar to the work of Ahues and Tissuer for the QR algorithm. Theoretical arguments and numerical experiments suggest that it outperforms the most popular criteria in terms of accuracy. Additionally, this paper evaluates some commonly used criteria for infinite eigenvalues.
Introduction: Oblique Target-rotation in the context of exploratory factor analysis is a relevant method for the investigation of the oblique independent clusters model. It was argued that minimizing single cross-loadings by means of target rotation may lead to large effects of sampling error on the target rotated factor solutions. Method: In order to minimize effects of sampling error on results of Target-rotation we propose to compute the mean cross-loadings for each block of salient loadings of the independent clusters model and to perform target rotation for the block-wise mean cross-loadings. The resulting transformation-matrix is than applied to the complete unrotated loading matrix in order to produce mean Target-rotated factors. Results: A simulation study based on correlated independent factor models revealed that mean oblique Target-rotation resulted in smaller negative bias of factor inter-correlations than conventional Target-rotation based on single loadings, especially when sample size was small and when the number of factors was large. An empirical example revealed that the similarity of Target-rotated factors computed for small subsamples with Target-rotated factors of the total sample was more pronounced for mean Target-rotation than for conventional Target-rotation. Discussion: Mean Target-rotation can be recommended in the context of oblique independent factor models, especially for small samples. An R-script and an SPSS-script for this form of Target-rotation are provided in the Appendix.
Assessments of algorithmic bias in large language models (LLMs) are generally catered to uncovering systemic discrimination based on protected characteristics such as sex and ethnicity. However, there are over 180 documented cognitive biases that pervade human reasoning and decision making that are routinely ignored when discussing the ethical complexities of AI. We demonstrate the presence of these cognitive biases in LLMs and discuss the implications of using biased reasoning under the guise of expertise. We call for stronger education, risk management, and continued research as widespread adoption of this technology increases. Finally, we close with a set of best practices for when and how to employ this technology as widespread adoption continues to grow.
Conjugate gradient is an efficient algorithm for solving large sparse linear systems. It has been utilized to accelerate the computation in Bayesian analysis for many large-scale problems. This article discusses the applications of conjugate gradient in Bayesian computation, with a focus on sparse regression and spatial analysis. A self-contained introduction of conjugate gradient is provided to facilitate potential applications in a broader range of problems.