Combinatorial designs are closely related to linear codes. In recent years, many $t$-designs have been constructed from certain linear codes. In this paper, we aim to construct $2$-designs from binary three-weight codes. For any binary three-weight code $\mathcal{C}$ of length $n$, let $A_{n}(\mathcal{C})$ denote the number of codewords in $\mathcal{C}$ with Hamming weight $n$. We show that $\mathcal{C}$ holds $2$-designs when $\mathcal{C}$ is projective and $A_{n}(\mathcal{C})=1$. Furthermore, by extending certain binary projective two-weight codes and by applying the defining-set method, we construct two classes of binary projective three-weight codes that are suitable for holding $2$-designs.
We propose a novel neural network architecture based on the Conformer transducer that adds a contextual information flow to ASR systems. Our method improves the recognition accuracy of uncommon words without harming the word error rate on regular words. We explore the improvement in uncommon-word accuracy when using the new model and/or shallow fusion with a context language model, and find that combining the two yields a cumulative gain in uncommon-word recognition accuracy.
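Shallow fusion, mentioned above, is commonly implemented as a log-linear interpolation between the ASR model's score and the external language model's score during beam search. The sketch below illustrates the idea only; the weight `lam`, the toy hypotheses, and their probabilities are invented for illustration and are not taken from the paper.

```python
import math

# Hedged sketch of shallow fusion: interpolate the transducer's log-probability
# with a context LM's log-probability. The weight `lam` and all numbers below
# are illustrative assumptions, not values from the paper.
def fused_score(asr_log_prob: float, lm_log_prob: float, lam: float = 0.3) -> float:
    """Log-linear interpolation used in shallow fusion."""
    return asr_log_prob + lam * lm_log_prob

# Toy beam: an uncommon word vs. a common misrecognition of it. The context LM
# favors the uncommon word, flipping the ranking the ASR model alone would give.
hyps = {
    "kubernetes": {"asr": math.log(0.30), "lm": math.log(0.60)},
    "cooper netties": {"asr": math.log(0.40), "lm": math.log(0.05)},
}
best = max(hyps, key=lambda h: fused_score(hyps[h]["asr"], hyps[h]["lm"]))
```

Note how the ASR model alone would prefer the higher-acoustic-score hypothesis, while the fused score recovers the uncommon word.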
Multi-product formulas (MPF) are linear combinations of Trotter circuits offering high-quality simulation of Hamiltonian time evolution with fewer Trotter steps. Here we report two contributions aimed at making multi-product formulas more viable for near-term quantum simulations. First, we extend the theory of Trotter error with commutator scaling developed by Childs, Su, Tran et al. to multi-product formulas. Our result implies that multi-product formulas can achieve a quadratic reduction of Trotter error in 1-norm (nuclear norm) on arbitrary time intervals compared with the regular product formulas without increasing the required circuit depth or qubit connectivity. The number of circuit repetitions grows only by a constant factor. Second, we introduce dynamic multi-product formulas with time-dependent coefficients chosen to minimize a certain efficiently computable proxy for the Trotter error. We use a minimax estimation method to make dynamic multi-product formulas robust to uncertainty from algorithmic errors, sampling and hardware noise. We call this method Minimax MPF and we provide a rigorous bound on its error.
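The error-cancellation mechanism behind multi-product formulas can be seen in a small classical analogue. The sketch below is our own illustration (not the paper's construction or code): it combines a second-order Strang splitting at Trotter numbers $k=1$ and $k=2$ with coefficients $(-1/3,\, 4/3)$ solving the Vandermonde conditions $\sum_j c_j = 1$, $\sum_j c_j/k_j^2 = 0$, which cancels the leading $O(1/k^2)$ error term.

```python
import numpy as np

# Illustrative sketch of a two-term multi-product formula (our toy analogue,
# using real symmetric matrices rather than quantum circuits).
def expm_sym(M: np.ndarray) -> np.ndarray:
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(M)
    return (V * np.exp(w)) @ V.T

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); A = (A + A.T) / 8
B = rng.standard_normal((4, 4)); B = (B + B.T) / 8
t = 0.5

def strang(k: int) -> np.ndarray:
    """k steps of the symmetric (Strang) product formula for e^{(A+B)t}."""
    step = expm_sym(A * t / (2 * k)) @ expm_sym(B * t / k) @ expm_sym(A * t / (2 * k))
    return np.linalg.matrix_power(step, k)

exact = expm_sym((A + B) * t)
# MPF with k = (1, 2): coefficients (-1/3, 4/3) cancel the O(1/k^2) error term.
mpf = -1.0 / 3.0 * strang(1) + 4.0 / 3.0 * strang(2)
err_trotter = np.linalg.norm(strang(2) - exact)
err_mpf = np.linalg.norm(mpf - exact)
```

The combined formula reuses the same circuits (here, matrix products) at the same depth, so the improvement comes purely from the linear combination, mirroring the constant-factor repetition overhead noted above.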
AntiCopyPaster is an IntelliJ IDEA plugin that interactively detects and refactors duplicate code as soon as a duplicate is introduced. The plugin recommends extracting a duplicate only when the extraction is judged worthwhile. In contrast to existing Extract Method refactoring approaches, our tool integrates seamlessly into the developer's workflow and actively provides refactoring recommendations. This work extends our tool to let developers customize the detection rules, i.e., metrics, based on their needs and preferences. The plugin and its source code are publicly available on GitHub at //github.com/refactorings/anti-copy-paster. The demonstration video can be found on YouTube: //youtu.be/Y1sbfpds2Ms.
The comparison of frequency distributions is a common statistical task with broad applications and a long history of methodological development. However, existing measures do not quantify the magnitude and direction by which one distribution is shifted relative to another. In the present study, we define distributional shift (DS) as the concentration of frequencies away from the greatest discrete class, e.g., a histogram's right-most bin. We derive a measure of DS based on the sum of cumulative frequencies, intuitively quantifying shift as a statistical moment. We then define relative distributional shift (RDS) as the difference in DS between distributions. Using simulated random sampling, we demonstrate that RDS closely tracks measures that are commonly used to compare frequency distributions. Focusing on a specific use case, i.e., simulated healthcare Evaluation and Management coding profiles, we show how RDS can be used to examine many pairs of empirical and expected distributions via shift-significance plots. In comparison to other measures, RDS has the unique advantage of being a signed (directional) measure based on a simple difference in an intuitive property.
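The DS and RDS quantities described above can be sketched in a few lines. Note the caveat: the paper defines DS via the sum of cumulative frequencies, but the specific normalization below (mapping DS into $[0,1]$) is our assumption for illustration, not necessarily the authors' exact formula.

```python
import numpy as np

# Hedged sketch: DS as a normalized sum of cumulative relative frequencies.
# The [0, 1] normalization is our assumption, not the paper's exact definition.
def ds(counts) -> float:
    """Distributional shift: concentration of mass away from the right-most bin.

    0 when all frequency sits in the greatest class,
    1 when it all sits in the smallest class.
    """
    f = np.asarray(counts, dtype=float)
    f = f / f.sum()              # relative frequencies
    F = np.cumsum(f)             # cumulative frequencies; F[-1] == 1
    k = len(f)
    return float((F.sum() - 1.0) / (k - 1.0))

def rds(counts_a, counts_b) -> float:
    """Relative distributional shift: signed difference in DS."""
    return ds(counts_a) - ds(counts_b)

left_heavy = [8, 1, 1, 0, 0]    # mass near the smallest class -> DS near 1
right_heavy = [0, 0, 1, 1, 8]   # mass near the greatest class -> DS near 0
```

The sign of `rds(left_heavy, right_heavy)` then indicates the direction of the shift, which is the directional property the abstract highlights.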
Frameproof codes have been extensively studied for many years due to their application in copyright protection and their connection to extremal set theory. In this paper, we investigate upper bounds on the cardinality of wide-sense $t$-frameproof codes. For $t=2$, we apply results from Sperner theory to obtain an upper bound that significantly improves a recent bound by Zhou and Zhou. For $t\geq 3$, we provide a general upper bound by establishing a relation between wide-sense frameproof codes and cover-free families. Finally, when the code length $n$ is at most $\frac{15+\sqrt{33}}{24}(t-1)^2$, we show that a wide-sense $t$-frameproof code has at most $n$ codewords, and the unique optimal code consists of all weight-one codewords. As byproducts, our results improve several of the best known results on binary $t$-frameproof codes.
This work presents a stable POD-Galerkin based reduced-order model (ROM) for two-dimensional Rayleigh-B\'enard convection in a square geometry for three Rayleigh numbers: $10^4$ (steady state), $3\times 10^5$ (periodic), and $6 \times 10^6$ (chaotic). Stability is obtained through a particular (staggered-grid) full-order model (FOM) discretization that leads to a ROM that is pressure-free and has skew-symmetric (energy-conserving) convective terms. This yields long-time stable solutions without requiring stabilizing mechanisms, even outside the training data range. The ROM's stability is validated for the different test cases by investigating the Nusselt and Reynolds number time series and the mean and variance of the vertical temperature profile. In general, these quantities converge to the FOM when increasing the number of modes and turn out to be a good measure of accuracy. However, for the chaotic case, convergence with increasing number of modes is slow, and many modes are required to resolve the low-energy structures that are important for the global dynamics.
The well-known Cluster Vertex Deletion problem (CVD) asks for a given graph $G$ and an integer $k$ whether it is possible to delete a set $S$ of at most $k$ vertices of $G$ such that the resulting graph $G-S$ is a cluster graph (a disjoint union of cliques). We give a complete characterization of graphs $H$ for which CVD on $H$-free graphs is polynomially solvable and for which it is NP-complete. Moreover, in the NP-completeness cases, CVD cannot be solved in sub-exponential time in the vertex number of the $H$-free input graphs unless the Exponential-Time Hypothesis fails. We also consider the connected variant of CVD, the Connected Cluster Vertex Deletion problem (CCVD), in which the set $S$ has to induce a connected subgraph of $G$. It turns out that CCVD admits the same complexity dichotomy for $H$-free graphs. Our results enlarge a list of rare dichotomy theorems for well-studied problems on $H$-free graphs.
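To make the problem statement concrete, the sketch below is a toy brute-force solver for CVD, exponential in $k$ and unrelated to the paper's dichotomy results. It uses the standard characterization that a graph is a cluster graph iff it contains no induced path on three vertices ($P_3$).

```python
from itertools import combinations

# Toy brute-force sketch of Cluster Vertex Deletion (illustrates the problem
# definition only; not an algorithm from the paper).
def is_cluster_graph(adj: dict, vertices) -> bool:
    """A graph is a cluster graph iff it has no induced P3."""
    for a, b, c in combinations(vertices, 3):
        edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if edges == 2:  # exactly two edges among {a, b, c} -> induced P3
            return False
    return True

def cvd(adj: dict, k: int):
    """Return a set S, |S| <= k, with G - S a cluster graph, or None."""
    V = list(adj)
    for size in range(k + 1):
        for S in combinations(V, size):
            rest = [v for v in V if v not in S]
            if is_cluster_graph(adj, rest):
                return set(S)
    return None

# A path on 4 vertices: deleting one inner vertex leaves a cluster graph.
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

The connected variant (CCVD) would additionally require the returned set `S` to induce a connected subgraph of $G$, which the toy solver above does not check.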
We construct a new family of permutationally invariant codes that correct $t$ Pauli errors for any $t\ge 1$. We also show that codes in the new family correct quantum deletion errors as well as spontaneous decay errors. Our construction contains some of the previously known permutationally invariant quantum codes as particular cases, which also admit transversal gates. In many cases, the codes in the new family are shorter than the best previously known explicit permutationally invariant codes for Pauli errors and deletions. Furthermore, our new code family includes a new $((4,2,2))$ optimal single-deletion-correcting code. As a separate result, we generalize the conditions for permutationally invariant codes to correct $t$ Pauli errors from the previously known results for $t=1$ to any number of errors. For small $t$, these conditions can be used to construct new examples of codes by computer.
Researchers would often like to leverage data from a collection of sources (e.g., primary studies in a meta-analysis) to estimate causal effects in a target population of interest. However, traditional meta-analytic methods do not produce causally interpretable estimates for a well-defined target population. In this paper, we present the CausalMetaR R package, which implements efficient and robust methods to estimate causal effects in a given internal or external target population using multi-source data. The package includes estimators of average and subgroup treatment effects for the entire target population. To produce efficient and robust estimates of causal effects, the package implements doubly robust and non-parametric efficient estimators, and supports flexible data-adaptive methods (e.g., machine learning techniques) and cross-fitting to estimate the nuisance models (e.g., the treatment model and the outcome model). We describe the key features of the package and demonstrate how to use it through an example.
In recent years, object detection has made impressive progress. Despite these improvements, there is still a significant performance gap between the detection of small and large objects. We analyze a current state-of-the-art model, Mask R-CNN, on the challenging MS COCO dataset. We show that the overlap between small ground-truth objects and the predicted anchors is much lower than the expected IoU threshold. We conjecture this is due to two factors: (1) only a few images contain small objects, and (2) small objects do not appear often enough even within the images that contain them. We therefore propose to oversample images with small objects and to augment each of them by copy-pasting small objects many times. This allows us to trade off the quality of the detector on large objects against that on small objects. We evaluate different pasting augmentation strategies, and ultimately achieve a 9.7\% relative improvement on instance segmentation and 7.1\% on object detection of small objects, compared to the current state-of-the-art method on MS COCO.
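The copy-paste idea can be sketched minimally as follows. This is our simplification for illustration: an axis-aligned patch pasted into a single-channel image, with no segmentation masks, overlap checks, or annotation updates that the actual augmentation pipeline would require.

```python
import numpy as np

# Minimal hedged sketch of copy-paste augmentation for small objects
# (our simplification; not the paper's implementation).
def paste_small_object(image: np.ndarray, obj_patch: np.ndarray,
                       n_copies: int, rng: np.random.Generator) -> np.ndarray:
    """Paste a small object patch at n_copies random locations."""
    out = image.copy()
    h, w = obj_patch.shape
    H, W = out.shape
    for _ in range(n_copies):
        y = rng.integers(0, H - h + 1)
        x = rng.integers(0, W - w + 1)
        out[y:y + h, x:x + w] = obj_patch
    return out

rng = np.random.default_rng(42)
img = np.zeros((64, 64), dtype=np.uint8)
patch = np.full((4, 4), 255, dtype=np.uint8)  # a tiny "object"
augmented = paste_small_object(img, patch, n_copies=5, rng=rng)
```

Repeating the paste increases the number of anchor locations that overlap a small object well, which is the mechanism the abstract attributes the improvement to.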