Graph burning is a graph process that models the spread of social contagion. Initially, all vertices of a graph $G$ are unburnt. At each step, an unburnt vertex is set on fire, and the fire from vertices burnt in previous steps spreads to their adjacent unburnt vertices. This process continues until all vertices are burnt. The burning number $b(G)$ of the graph $G$ is the minimum number of steps required to burn all the vertices of the graph. The burning number conjecture of Bonato et al. states that for a connected graph $G$ of order $n$, $b(G) \leq \lceil \sqrt{n} \rceil$. It is easy to observe that to burn a graph it is enough to burn one of its spanning trees; hence it suffices to prove that $b(T) \leq \lceil \sqrt{n} \rceil$ for every tree $T$ of order $n$. It was proved in 2018 that $b(T) \leq \lceil \sqrt{n + n_2 + 1/4} + 1/2 \rceil$ for a tree $T$, where $n_2$ is the number of degree-$2$ vertices in $T$. In this article, we give an algorithm to burn a tree and use it to improve the existing bound: we prove that $b(T) \leq \lceil \sqrt{n + n_2 + 8} \rceil - 1$. Moreover, under certain restrictions on the degree-$2$ vertices, we improve upon the 2021 result of Bonato et al. We also provide an algorithm to burn a binary tree and prove the burning number conjecture for this class.
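As a concrete illustration of the burning process itself (not of the paper's tree-burning algorithm), the following sketch simulates a burning sequence on an adjacency-list graph; the example graph and helper names are illustrative.

```python
# A minimal sketch of the graph-burning process described above.
# The graph, the burning sequence, and the helper names are illustrative;
# the paper's tree-burning algorithm itself is not reproduced here.

def burns(adjacency, sources):
    """Return True if igniting sources[t] at step t burns every vertex.

    At step t, fire spreads one hop from all previously burnt vertices,
    and then a new fire is started at sources[t].
    """
    burnt = set()
    for source in sources:
        # Fire from already-burnt vertices spreads to their neighbours.
        burnt |= {v for u in burnt for v in adjacency[u]}
        # A new fire is started at the chosen vertex.
        burnt.add(source)
    return len(burnt) == len(adjacency)

# Example: the path on 4 vertices has burning number 2.
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(burns(path4, [1, 3]))   # True: b(P4) <= 2
print(burns(path4, [0]))      # False: one step is not enough
```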
This paper introduces novel weighted conformal p-values and methods for model-free selective inference. The problem is as follows: given test units with covariates $X$ and missing responses $Y$, how do we select units whose responses $Y$ are larger than user-specified values while controlling the proportion of false positives? Can we achieve this without any modeling assumptions on the data and without any restriction on the model used to predict the responses? Lastly, the methods should remain applicable when there is a covariate shift between training and test data, which commonly occurs in practice. We answer these questions by first leveraging any prediction model to produce a class of well-calibrated weighted conformal p-values, which control the type-I error in detecting a large response. These p-values cannot be passed on to classical multiple testing procedures, since they may not obey a well-known positive dependence property. Hence, we introduce weighted conformalized selection (WCS), a new procedure that controls the false discovery rate (FDR) in finite samples. Beyond prediction-assisted candidate selection, WCS (1) allows us to infer multiple individual treatment effects, and (2) extends to outlier detection with inlier distribution shifts. We demonstrate performance via simulations and via applications to causal inference, drug discovery, and outlier detection datasets.
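The sketch below shows a generic weighted conformal p-value of the kind used under covariate shift, where calibration scores are reweighted by an estimated likelihood ratio; the score function, weights, and one-sided convention are assumptions for illustration, not the paper's exact WCS construction.

```python
# A minimal, generic sketch of a weighted conformal p-value under covariate
# shift (weights = estimated likelihood ratio of test vs. training covariates).
# The score function, weights, and one-sided convention are illustrative
# assumptions; the paper's exact WCS procedure is not reproduced here.
import numpy as np

def weighted_conformal_pvalue(cal_scores, cal_weights, test_score, test_weight):
    """Weighted conformal p-value for the hypothesis that the test score
    is not unusually large relative to the calibration scores."""
    numerator = np.sum(cal_weights * (cal_scores >= test_score)) + test_weight
    denominator = np.sum(cal_weights) + test_weight
    return numerator / denominator

rng = np.random.default_rng(0)
cal_scores = rng.normal(size=200)       # e.g. residuals from any fitted model
cal_weights = np.ones(200)              # uniform weights = no covariate shift
print(weighted_conformal_pvalue(cal_scores, cal_weights,
                                test_score=2.5, test_weight=1.0))
```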
The action of a noise operator on a code transforms it into a distribution on the respective space. Some common examples from information theory include Bernoulli noise acting on a code in the Hamming space and Gaussian noise acting on a lattice in the Euclidean space. We aim to characterize the cases when the output distribution is close to the uniform distribution on the space, as measured by R\'enyi divergence of order $\alpha \in (1,\infty]$. A version of this question is known as the channel resolvability problem in information theory, and it has implications for security guarantees in wiretap channels, error correction, discrepancy, worst-to-average case complexity reductions, and many other problems. Our work quantifies the requirements for asymptotic uniformity (perfect smoothing) and identifies explicit code families that achieve it under the action of the Bernoulli and ball noise operators on the code. We derive expressions for the minimum rate of codes required to attain asymptotically perfect smoothing. In proving our results, we leverage recent results from harmonic analysis of functions on the Hamming space. Another result pertains to the use of code families in Wyner's transmission scheme on the binary wiretap channel. We identify explicit families that guarantee strong secrecy when applied in this scheme, showing that nested Reed-Muller codes can transmit messages reliably and securely over a binary symmetric wiretap channel with a positive rate. Finally, we establish a connection between smoothing and error correction in the binary symmetric channel.
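For reference, the R\'enyi divergence of order $\alpha \in (1,\infty)$ between distributions $P$ and $Q$ (here, the output distribution of the code under noise and the uniform distribution on the space) takes the standard form $D_\alpha(P\|Q) = \frac{1}{\alpha-1}\log \sum_x P(x)^{\alpha} Q(x)^{1-\alpha}$, with the limiting case $D_\infty(P\|Q) = \log \max_x \frac{P(x)}{Q(x)}$; the base of the logarithm and any normalization conventions specific to the paper are not fixed here.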
Miura surfaces are the solutions of a constrained nonlinear elliptic system of equations. This system is derived by homogenization from the Miura fold, a type of origami fold with multiple applications in engineering. A previous study gave suboptimal conditions for the existence of solutions and proposed an $H^2$-conformal finite element method to approximate them. In this paper, the existence of Miura surfaces is studied using a mixed formulation. It is also proved that the constraints propagate from the boundary to the interior of the domain for well-chosen boundary conditions. Then, a numerical method based on a least-squares formulation, Taylor--Hood finite elements, and a Newton method is introduced to approximate Miura surfaces. The numerical method is proved to converge at order one in space, and numerical tests are performed to demonstrate its robustness.
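The following is a minimal sketch of a plain Newton iteration for a generic nonlinear system, of the kind applied to the discretized problem; the toy residual and Jacobian are placeholders rather than the paper's least-squares/Taylor--Hood discretization.

```python
# A minimal sketch of a Newton iteration for a generic nonlinear system
# F(x) = 0, as would be applied to a discretized nonlinear problem.
# The toy residual below is a placeholder, not the Miura-surface equations.
import numpy as np

def newton(residual, jacobian, x0, tol=1e-10, max_iter=50):
    x = x0.copy()
    for _ in range(max_iter):
        r = residual(x)
        if np.linalg.norm(r) < tol:
            break
        # Solve the linearized system J(x) dx = -F(x) and update.
        dx = np.linalg.solve(jacobian(x), -r)
        x += dx
    return x

# Toy example: solve x0^2 + x1 - 1 = 0, x0 - x1 = 0.
F = lambda x: np.array([x[0]**2 + x[1] - 1.0, x[0] - x[1]])
J = lambda x: np.array([[2*x[0], 1.0], [1.0, -1.0]])
print(newton(F, J, np.array([1.0, 1.0])))   # ~[0.618, 0.618]
```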
We present a method for computing the nearly singular integrals that occur when single or double layer surface integrals, for harmonic potentials or Stokes flow, are evaluated at points near the surface. Such values could be needed when solving an integral equation with one surface close to another or when evaluating the solution at grid points. We replace the singular kernel with a regularized version having a length parameter $\delta$ in order to control the discretization error. Analysis near the singularity leads to an expression for the regularization error consisting of terms with unknown coefficients multiplying known quantities. By computing the integral with three choices of $\delta$ we can solve for an extrapolated value whose regularization error is reduced to $O(\delta^5)$. In examples with $\delta/h$ constant and moderate resolution we observe total errors of about $O(h^5)$. For convergence as $h \to 0$ we can choose $\delta$ proportional to $h^q$ with $q < 1$ to ensure that the discretization error is dominated by the regularization error. With $q = 4/5$ we find errors of about $O(h^4)$. For harmonic potentials we extend the approach to a version with $O(\delta^7)$ regularization error; it typically has smaller errors, but the order of accuracy is less predictable.
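To illustrate the extrapolation step only, the sketch below fits the computed values at three choices of $\delta$ to a model with two error terms and solves a $3 \times 3$ linear system for the extrapolated value; the error-term functions used here are illustrative monomials, whereas in the actual method they are the known quantities arising from the near-singularity analysis.

```python
# A minimal sketch of the extrapolation step: evaluate the regularized
# integral with three values of delta and solve a 3x3 linear system for
# the extrapolated value.  The error-term functions e1, e2 below are
# illustrative monomials, not the known quantities from the paper's
# near-singularity analysis.
import numpy as np

def extrapolate(deltas, values, e1=lambda d: d, e2=lambda d: d**3):
    """Fit values[i] = I + c1*e1(deltas[i]) + c2*e2(deltas[i]); return I."""
    A = np.column_stack([np.ones(3),
                         [e1(d) for d in deltas],
                         [e2(d) for d in deltas]])
    I, c1, c2 = np.linalg.solve(A, values)
    return I

# Synthetic check: exact value 2.0 plus the two modeled error terms.
deltas = np.array([0.02, 0.03, 0.04])
values = 2.0 + 0.7 * deltas + 5.0 * deltas**3
print(extrapolate(deltas, values))   # ~2.0
```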
Finding a minimum is an essential component of mathematical models and plays an important role in many optimization problems. D\"urr and H\o{}yer proposed a quantum minimum searching algorithm (DHA) that achieves a quadratic speedup over classical methods, but only with a certain probability of success. In this paper, we propose an optimized quantum minimum searching algorithm with sure-success probability, which utilizes Grover--Long search to implement optimal exact searching and a dynamic strategy to reduce the number of iterations. In addition, we optimize the oracle circuit to reduce the number of gates using simplification rules. A performance evaluation, covering the theoretical success rate and the computational complexity, shows that our algorithm has higher accuracy and efficiency than the DHA algorithm. Finally, a simulation experiment based on Cirq is performed to verify its feasibility.
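For intuition, the sketch below shows the classical threshold-update loop that underlies D\"urr--H\o{}yer-style minimum finding; the quantum (Grover--Long) search and the oracle optimizations are replaced by a classical stand-in and are not simulated.

```python
# A minimal classical sketch of the threshold-update loop underlying
# Durr-Hoyer-style minimum finding.  The call to `search_below` stands in
# for the quantum search for an index whose value is smaller than the
# current threshold; the quantum parts are not simulated here.
import random

def search_below(values, threshold):
    """Stand-in for quantum search: return some index with value < threshold."""
    candidates = [i for i, v in enumerate(values) if v < threshold]
    return random.choice(candidates) if candidates else None

def find_minimum(values):
    index = random.randrange(len(values))   # random initial threshold index
    while True:
        better = search_below(values, values[index])
        if better is None:                  # no smaller element exists
            return index
        index = better                      # update the threshold

values = [7, 3, 9, 1, 5]
print(find_minimum(values))                 # index 3 (value 1)
```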
Multi-level modeling is an important approach for analyzing complex survey data collected under multi-stage sampling. However, estimation of multi-level models can be challenging when several datasets with distinct hierarchies and sampling weights are combined. This paper presents a method for combining multiple datasets with different hierarchical structures, arising from distinct informative sampling designs, for the same survey. To develop an approach with complete generality, we propose to define a pseudo-cluster, a cluster containing only a single observation, to unify the data structure and thereby enable estimation of multi-level models that incorporate sampling weights across the combined sample. We justify incorporating sampling weights at each level of the hierarchical model and, in doing so, define a pseudo-likelihood estimation procedure. Simulation studies are used to illustrate the effect of incorporating the sampling designs in this challenging multi-level modeling scenario. We demonstrate in the simulation studies that a linear mixed model with sampling weights provides unbiased estimates of the model parameters and enhances estimation of the variance components of the random effects. The proposed method is illustrated through a novel application from the National Survey of Healthcare Organizations and Systems that sought to determine which organizational characteristics or traits, as measured in the surveys, have the strongest average relationship to the percentage of depression and anxiety diagnoses in physician practices in the US.
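The sketch below illustrates the pseudo-cluster idea on toy data: observations from a single-stage sample are wrapped in singleton clusters so that both datasets share one cluster/unit structure; the column names, weight values, and pandas layout are illustrative assumptions.

```python
# A minimal sketch of the pseudo-cluster idea: observations sampled without
# a cluster stage are wrapped in singleton clusters so that both datasets
# share a single two-level (cluster / unit) structure.  Column names and
# weight values are illustrative assumptions.
import pandas as pd

clustered = pd.DataFrame({          # two-stage sample: clusters of units
    "cluster_id": ["c1", "c1", "c2"],
    "y": [1.2, 0.8, 2.1],
    "unit_weight": [10.0, 10.0, 8.0],
    "cluster_weight": [3.0, 3.0, 5.0],
})

unclustered = pd.DataFrame({        # single-stage sample: units only
    "y": [1.5, 0.4],
    "unit_weight": [12.0, 9.0],
})

# Each single-stage unit becomes its own pseudo-cluster; setting its
# cluster-level weight to 1 is a modeling choice made for this toy example.
unclustered = unclustered.assign(
    cluster_id=[f"pseudo_{i}" for i in range(len(unclustered))],
    cluster_weight=1.0,
)

combined = pd.concat([clustered, unclustered], ignore_index=True)
print(combined)
```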
We propose a new class of models for variable clustering called Asymptotic Independent block (AI-block) models, in which population-level clusters are defined through the independence, across clusters, of the maxima of a multivariate stationary mixing random process. This class of models is identifiable, meaning that there exists a maximal element under a partial order on partitions, which allows for statistical inference. We also present an algorithm for recovering the clusters of variables without specifying the number of clusters \emph{a priori}. Our work provides theoretical insights into the consistency of this algorithm, demonstrating that under certain conditions it can effectively identify clusters in the data with a computational complexity that is polynomial in the dimension. This implies that groups can be learned nonparametrically even in regimes where the block maxima of a dependent process are only sub-asymptotic. To further illustrate the significance of our work, we apply our method to real datasets from neuroscience and environmental science. These applications highlight the potential and versatility of the proposed approach.
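As a rough, generic stand-in for (not a reproduction of) the proposed algorithm, the sketch below clusters variables by thresholding an empirical dependence measure computed on block maxima and taking connected components; the dependence proxy, block size, and threshold are illustrative choices.

```python
# A generic stand-in (not the paper's algorithm): cluster variables by
# thresholding an empirical dependence measure computed on block maxima
# and taking connected components.  The dependence proxy, block size, and
# threshold are illustrative choices.
import numpy as np

def block_maxima(x, block_size):
    n_blocks = x.shape[0] // block_size
    return x[: n_blocks * block_size].reshape(n_blocks, block_size, -1).max(axis=1)

def cluster_variables(x, block_size=50, threshold=0.3):
    maxima = block_maxima(x, block_size)
    dep = np.abs(np.corrcoef(maxima, rowvar=False))   # crude dependence proxy
    d = dep.shape[0]
    adjacency = dep > threshold
    labels, current = -np.ones(d, dtype=int), 0
    for start in range(d):                 # connected components by DFS
        if labels[start] >= 0:
            continue
        stack = [start]
        while stack:
            v = stack.pop()
            if labels[v] >= 0:
                continue
            labels[v] = current
            stack.extend(np.flatnonzero(adjacency[v]))
        current += 1
    return labels

rng = np.random.default_rng(1)
z = rng.normal(size=(5000, 1))
x = np.hstack([z + 0.1 * rng.normal(size=(5000, 2)),   # dependent pair
               rng.normal(size=(5000, 2))])            # independent columns
print(cluster_variables(x))    # e.g. [0 0 1 2]
```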
Decision trees are widely used in machine learning due to their simplicity of construction and their interpretability. However, as data sizes grow, traditional methods for constructing and retraining decision trees become increasingly slow, scaling polynomially with the number of training examples. In this work, we introduce a novel quantum algorithm, named Des-q, for constructing and retraining decision trees in regression and binary classification tasks. Assuming the data stream produces small increments of new training examples, we demonstrate that our Des-q algorithm significantly reduces the time required for tree retraining, achieving a poly-logarithmic time complexity in the number of training examples, even accounting for the time needed to load the new examples into quantum-accessible memory. Our approach builds the decision tree by performing k-piecewise linear splits at each internal node. These splits simultaneously generate multiple hyperplanes, dividing the feature space into k distinct regions. To determine k suitable anchor points for these splits, we develop an efficient quantum-supervised clustering method, building upon the q-means algorithm of Kerenidis et al. Des-q first efficiently estimates each feature weight using a novel quantum technique for estimating the Pearson correlation. Subsequently, we employ weighted distance estimation to cluster the training examples into k disjoint regions and then proceed to expand the tree using the same procedure. We benchmark the performance of the simulated version of our algorithm against state-of-the-art classical decision trees for regression and binary classification on multiple datasets with numerical features. Further, we show that the proposed algorithm exhibits performance similar to that of the state-of-the-art decision tree while significantly speeding up periodic tree retraining.
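The sketch below mimics, classically and loosely, the node-splitting step described above: features are weighted by their Pearson correlation with the label and a small weighted k-means produces k anchor points that partition the node's examples; the quantum estimation subroutines are not reproduced.

```python
# A classical sketch of the node-splitting step: weight features by their
# Pearson correlation with the label, then run a small weighted k-means to
# obtain k anchor points partitioning the node's examples into k regions.
# This mirrors the simulated (classical) version only loosely; the quantum
# estimation subroutines are not reproduced.
import numpy as np

def feature_weights(X, y):
    """|Pearson correlation| between each feature and the label."""
    Xc, yc = X - X.mean(0), y - y.mean()
    return np.abs(Xc.T @ yc / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)))

def weighted_kmeans_split(X, y, k=3, n_iter=20, seed=0):
    w = feature_weights(X, y)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Feature-weighted squared distances to the k anchor points.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2 * w).sum(axis=2)
        assign = d2.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return centers, assign

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
y = (X[:, 0] > 0).astype(float)          # binary label driven by feature 0
centers, assign = weighted_kmeans_split(X, y, k=2)
print(np.bincount(assign))               # sizes of the 2 regions
```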
We propose an approach to compute inner- and outer-approximations of the sets of values satisfying constraints expressed as arbitrarily quantified formulas. Such formulas arise, for instance, when specifying important problems in control such as robustness, motion planning, or controller comparison. We propose an interval-based method which allows for tractable yet tight approximations. We demonstrate its applicability through a series of examples and benchmarks using a prototype implementation.
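As a toy illustration of interval-based inner/outer approximation for a universally quantified constraint, the sketch below paves a parameter interval for the set $\{p : \forall x \in X,\ f(p,x) \le 0\}$ with a hard-coded example constraint; the constraint, boxes, and bisection strategy are illustrative, not the paper's method.

```python
# A minimal sketch of interval-based inner/outer approximation of the set
# {p in P0 : for all x in X, f(p, x) <= 0}.  The constraint f, the boxes,
# and the bisection strategy are illustrative, not the paper's method.

def f_range(p, x):
    """Interval enclosure of f(p, x) = p - x^2 over intervals p and x."""
    x_sq_lo = 0.0 if x[0] <= 0.0 <= x[1] else min(x[0] ** 2, x[1] ** 2)
    x_sq_hi = max(x[0] ** 2, x[1] ** 2)
    return (p[0] - x_sq_hi, p[1] - x_sq_lo)

def pave(p, x, eps=1e-3, inner=None, unknown=None):
    inner = [] if inner is None else inner
    unknown = [] if unknown is None else unknown
    lo, hi = f_range(p, x)
    if hi <= 0.0:                              # f <= 0 for all p in box, x in X
        inner.append(p)
    elif f_range(p, (x[0], x[0]))[0] > 0.0:    # violated at a sample x for all p
        pass                                   # certainly outside: discard
    elif p[1] - p[0] < eps:
        unknown.append(p)                      # undecided thin box: outer only
    else:
        mid = 0.5 * (p[0] + p[1])
        pave((p[0], mid), x, eps, inner, unknown)
        pave((mid, p[1]), x, eps, inner, unknown)
    return inner, unknown

inner, unknown = pave(p=(-2.0, 2.0), x=(-1.0, 1.0))
print(max(b[1] for b in inner), len(unknown))  # inner boxes cover p <= 0
```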
The goal of explainable Artificial Intelligence (XAI) is to generate human-interpretable explanations, but there are no computationally precise theories of how humans interpret AI-generated explanations. The lack of theory means that validation of XAI must be done empirically, on a case-by-case basis, which prevents systematic theory-building in XAI. We propose a psychological theory of how humans draw conclusions from saliency maps, the most common form of XAI explanation, which for the first time allows for precise prediction of explainee inference conditioned on explanation. Our theory posits that, absent an explanation, humans expect the AI to make decisions similar to their own, and that they interpret an explanation by comparing it to the explanations they themselves would give. Comparison is formalized via Shepard's universal law of generalization in a similarity space, a classic theory from cognitive science. A pre-registered user study on AI image classifications with saliency map explanations demonstrates that our theory quantitatively matches participants' predictions of the AI.
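For reference, Shepard's universal law of generalization states that the probability of generalizing a response from stimulus $x$ to stimulus $y$ decays approximately exponentially with their distance in psychological similarity space, $g(x, y) \approx e^{-c\, d(x, y)}$ for some sensitivity constant $c > 0$; the particular distance $d$ and constant used in the paper's model are not specified here.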