The {\em binary deletion channel} with deletion probability $d$ ($\text{BDC}_d$) is a random channel that deletes each bit of the input message i.i.d. with probability $d$. It has been studied extensively as a canonical example of a channel with synchronization errors. Perhaps the most important question regarding the BDC is determining its capacity. Mitzenmacher and Drinea (ITIT 2006) and Kirsch and Drinea (ITIT 2009) show a method by which distributions on run lengths can be converted to codes for the BDC, yielding a lower bound of $\mathcal{C}(\text{BDC}_d) > 0.1185 \cdot (1-d)$. Fertonani and Duman (ITIT 2010), Dalai (ISIT 2011) and Rahmati and Duman (ITIT 2014) use computer-aided analyses based on the Blahut-Arimoto algorithm to prove an upper bound of $\mathcal{C}(\text{BDC}_d) < 0.4143\cdot(1-d)$ in the high deletion probability regime ($d > 0.65$). In this paper, we show that the Blahut-Arimoto algorithm can be implemented with lower space complexity, allowing us to extend the upper bound analyses and to prove an upper bound of $\mathcal{C}(\text{BDC}_d) < 0.3745 \cdot(1-d)$ for all $d \geq 0.68$. Furthermore, we show that an extension of the Blahut-Arimoto algorithm can also be used to select better run-length distributions for Mitzenmacher and Drinea's construction, yielding a lower bound of $\mathcal{C}(\text{BDC}_d) > 0.1221 \cdot (1 - d)$.
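The abstract does not include an implementation; as a point of reference, a minimal sketch of the standard Blahut-Arimoto iteration for a generic discrete memoryless channel (not the deletion channel itself, whose block channel law is what the paper's low-space variant targets) might look as follows.
\begin{verbatim}
import numpy as np

def blahut_arimoto(W, iters=500, tol=1e-12):
    """Capacity of a DMC with transition matrix W[x, y] = P(y | x).

    Returns (capacity in bits, capacity-achieving input distribution).
    Minimal textbook iteration; no attempt at the low-space techniques
    needed for large deletion-channel computations.
    """
    n_in, _ = W.shape
    p = np.full(n_in, 1.0 / n_in)          # start from the uniform input
    for _ in range(iters):
        q = p @ W                           # induced output distribution
        # D(W(.|x) || q) for each input symbol x, in bits
        with np.errstate(divide="ignore", invalid="ignore"):
            log_ratio = np.where(W > 0, np.log2(W / q), 0.0)
        d = np.sum(W * log_ratio, axis=1)
        new_p = p * np.exp2(d)
        new_p /= new_p.sum()
        if np.max(np.abs(new_p - p)) < tol:
            p = new_p
            break
        p = new_p
    q = p @ W
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(W > 0, np.log2(W / q), 0.0)
    capacity = np.sum(p * np.sum(W * log_ratio, axis=1))
    return capacity, p

# Sanity check: a binary symmetric channel with crossover 0.11
# has capacity 1 - h(0.11), roughly 0.5 bits.
eps = 0.11
W = np.array([[1 - eps, eps], [eps, 1 - eps]])
print(blahut_arimoto(W)[0])
\end{verbatim}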
Many isomorphism problems for tensors, groups, algebras, and polynomials were recently shown to be equivalent to one another under polynomial-time reductions, prompting the introduction of the complexity class TI (Grochow & Qiao, ITCS '21; SIAM J. Comp., '23). Using the tensorial viewpoint, Grochow & Qiao (CCC '21) then gave moderately exponential-time search- and counting-to-decision reductions for a class of $p$-groups. A significant issue was that the reductions usually incurred a quadratic increase in the length of the tensors involved. When the tensors represent $p$-groups, this corresponds to an increase in the order of the group of the form $|G|^{\Theta(\log |G|)}$, negating any asymptotic gains in the Cayley table model. In this paper, we present a new kind of tensor gadget that allows us to replace those quadratic-length reductions with linear-length ones, yielding the following consequences: 1. Combined with the recent breakthrough $|G|^{O((\log |G|)^{5/6})}$-time isomorphism test for $p$-groups of class 2 and exponent $p$ (Sun, STOC '23), our reductions extend this runtime to $p$-groups of class $c$ and exponent $p$ where $c<p$. 2. Our reductions show that Sun's algorithm solves several TI-complete problems over $F_p$, such as isomorphism problems for cubic forms, algebras, and tensors, in time $p^{O(n^{1.8} \log p)}$. 3. Polynomial-time search- and counting-to-decision reductions for testing isomorphism of $p$-groups of class $2$ and exponent $p$ in the Cayley table model. This answers questions of Arvind and T\'oran (Bull. EATCS, 2005) for this group class, which is thought to be one of the hardest cases of Group Isomorphism. 4. If Graph Isomorphism is in P, then testing equivalence of cubic forms and testing isomorphism of algebras over a finite field $F_q$ can both be solved in time $q^{O(n)}$, improving on the brute-force upper bound $q^{O(n^2)}$.
In this paper we report a new finding on the linear sampling and factorization methods: in addition to shape identification, the linear sampling and factorization methods are also capable of parameter identification. Our demonstration is for shape/parameter identification associated with a restricted Fourier integral operator, which arises from the multi-frequency inverse source problem with a fixed observation direction and from Born inverse scattering problems. Within the framework of the linear sampling method, we develop both a shape identification theory and a parameter identification theory, which are motivated, analyzed, and implemented with the help of the prolate spheroidal wave functions and their generalizations. Both the shape and the parameter identification theories are general, in that they allow any regularization scheme, such as Tikhonov or singular value cutoff regularization. We further propose a prolate-Galerkin formulation of the linear sampling method for implementation and provide numerical experiments demonstrating that the linear sampling method is capable of reconstructing both the shape and the parameter.
We give a new presentation of the main result of Arunachalam, Bri\"et and Palazuelos (SICOMP'19) and show that quantum query algorithms are characterized by a new class of polynomials which we call Fourier completely bounded polynomials. We conjecture that all such polynomials have an influential variable. This conjecture is weaker than the famous Aaronson-Ambainis (AA) conjecture (Theory of Computing'14), but has the same implications for classical simulation of quantum query algorithms. We prove a new case of the AA conjecture by showing that it holds for homogeneous Fourier completely bounded polynomials. This implies that if the output of a $d$-query quantum algorithm is a homogeneous polynomial $p$ of degree $2d$, then it has a variable with influence at least $\mathrm{Var}[p]^2$. In addition, we give an alternative proof of the results of Bansal, Sinha and de Wolf (CCC'22 and QIP'23) showing that block-multilinear completely bounded polynomials have influential variables. Our proof is simpler, obtains better constants and does not use randomness.
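In the statement above, variance and influence are the standard Fourier-analytic quantities for a function $p:\{-1,1\}^n \to \mathbb{R}$, recalled here for reference (the notation is standard rather than taken from the paper):
\[
\mathrm{Var}[p] \;=\; \sum_{S \neq \emptyset} \hat{p}(S)^2,
\qquad
\mathrm{Inf}_i[p] \;=\; \sum_{S \ni i} \hat{p}(S)^2,
\]
where $\hat{p}(S)$ are the Fourier coefficients of $p$; the claim is that some coordinate $i$ satisfies $\mathrm{Inf}_i[p] \geq \mathrm{Var}[p]^2$.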
Suppose we are given access to $n$ independent samples from distribution $\mu$ and we wish to output one of them with the goal of making the output distributed as close as possible to a target distribution $\nu$. In this work we show that the optimal total variation distance as a function of $n$ is given by $\tilde\Theta(\frac{D}{f'(n)})$ over the class of all pairs $\nu,\mu$ with a bounded $f$-divergence $D_f(\nu\|\mu)\leq D$. Previously, this question was studied only for the case when the Radon-Nikodym derivative of $\nu$ with respect to $\mu$ is uniformly bounded. We then consider an application in the seemingly very different field of smoothed online learning, where we show that recent results on the minimax regret and the regret of oracle-efficient algorithms still hold even under relaxed constraints on the adversary (bounded $f$-divergence, as opposed to bounded Radon-Nikodym derivative). Finally, we also study the efficacy of importance sampling for mean estimates uniform over a function class and compare importance sampling with rejection sampling.
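As a point of reference for the problem setup, the following is a minimal sketch of a rejection-style selection rule, applicable in the previously studied regime where the Radon-Nikodym derivative $d\nu/d\mu$ is known and bounded; the Gaussian pair, the bound M, and the fallback rule are illustrative choices, not the paper's optimal scheme.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

def select_one(samples, ratio, M, rng):
    """Select one of n i.i.d. samples from mu so the output is close to nu.

    ratio(x) = d(nu)/d(mu)(x), assumed bounded by M. Scan the samples and
    accept each with probability ratio(x)/M; fall back to the last sample
    if all are rejected, which is the source of the residual TV error.
    """
    for x in samples:
        if rng.random() < ratio(x) / M:
            return x
    return samples[-1]

# Toy example: mu = N(0, 1), nu = N(0.5, 1).
mu_sample = rng.normal(0.0, 1.0, size=(10_000, 50))
ratio = lambda x: np.exp(0.5 * x - 0.125)   # dnu/dmu for these Gaussians
M = np.exp(0.5 * 3.0 - 0.125)               # unbounded on R; truncated for illustration
outputs = np.array([select_one(row, ratio, M, rng) for row in mu_sample])
print(outputs.mean())   # should move toward 0.5, the mean under nu
\end{verbatim}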
Learning the graphical structure of Bayesian networks is key to describing data-generating mechanisms in many complex applications but poses considerable computational challenges. Observational data can only identify the equivalence class of the directed acyclic graph underlying a Bayesian network model, and a variety of methods exist to tackle the problem. Under certain assumptions, the popular PC algorithm can consistently recover the correct equivalence class by reverse-engineering the conditional independence (CI) relationships holding in the variable distribution. The dual PC algorithm is a novel scheme to carry out the CI tests within the PC algorithm by leveraging the inverse relationship between covariance and precision matrices. By exploiting block matrix inversions we can also perform tests on partial correlations of complementary (or dual) conditioning sets. The multiple CI tests of the dual PC algorithm proceed by first considering marginal and full-order CI relationships and progressively moving to central-order ones. Simulation studies show that the dual PC algorithm outperforms the classic PC algorithm both in terms of run time and in recovering the underlying network structure, even in the presence of deviations from Gaussianity. Additionally, we show that the dual PC algorithm applies to Gaussian copula models, and demonstrate its performance in that setting.
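A generic sketch (not the authors' implementation) of the algebraic ingredient highlighted above, namely reading full-order partial correlations off the precision matrix and testing them with Fisher's z-transform, might look as follows; the toy chain example is hypothetical.
\begin{verbatim}
import numpy as np
from scipy import stats

def full_order_partial_corr(S):
    """Partial correlations given all remaining variables.

    S is a sample covariance matrix. Inverting it once gives the precision
    matrix Omega, and rho_{ij | rest} = -Omega_ij / sqrt(Omega_ii * Omega_jj),
    which is the full-order end of the dual PC test schedule.
    """
    omega = np.linalg.inv(S)
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def ci_independent(pcor_ij, n, k, alpha=0.05):
    """Fisher z-test for zero partial correlation, |conditioning set| = k."""
    z = 0.5 * np.log((1 + pcor_ij) / (1 - pcor_ij))
    stat = np.sqrt(n - k - 3) * abs(z)
    return 2 * (1 - stats.norm.cdf(stat)) > alpha   # True: CI not rejected

# Toy chain X1 -> X2 -> X3: X1 and X3 are marginally dependent,
# but independent given X2.
rng = np.random.default_rng(1)
n = 2000
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)
x3 = 0.8 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
pcor = full_order_partial_corr(np.cov(X, rowvar=False))
print(ci_independent(pcor[0, 2], n, k=1))   # expected: True (CI given X2)
\end{verbatim}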
By an analogy to the duality between the recurrence time and the longest match length, we introduce a quantity dual to the maximal repetition length, which we call the repetition time. Using the generalized Kac lemma for successive recurrence times due to Chen Moy, we sandwich the repetition time in terms of min-entropies with no or relatively short conditioning. The only assumptions are stationarity and ergodicity. The proof is surprisingly short and the claim is fully general, in contrast to earlier approaches by Szpankowski and by D\k{e}bowski. We discuss the analogy of this result with the Wyner-Ziv/Ornstein-Weiss theorem, which sandwiches the recurrence time in terms of Shannon entropies. We formulate the respective sandwich bounds in a way that also applies to the case of stretched exponential growth observed empirically for natural language.
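For orientation, the two classical facts underlying this analogy can be recalled in their standard form (the paper's own sandwich bounds involve min-entropies and are not reproduced here):
\[
\mathbb{E}\bigl[\min\{k \ge 1 : T^k x \in A\} \bigm| x \in A\bigr] \;=\; \frac{1}{\mathbb{P}(A)}
\qquad \text{(Kac's lemma, $T$ the shift, $\mathbb{P}(A)>0$)},
\]
\[
\frac{1}{n}\log R_n \;\longrightarrow\; h \quad \text{almost surely}
\qquad \text{(Wyner--Ziv/Ornstein--Weiss)},
\]
where $R_n = \min\{k \ge 1 : X_{k+1}^{k+n} = X_1^n\}$ is the recurrence time of the first $n$ symbols of a stationary ergodic process and $h$ is its Shannon entropy rate.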
Over the last several decades, improvements in the fields of analytic combinatorics and computer algebra have made determining the asymptotic behaviour of sequences satisfying linear recurrence relations with polynomial coefficients largely a matter of routine, under assumptions that hold often in practice. The algorithms involved typically take a sequence, encoded by a recurrence relation and initial terms, and return the leading terms in an asymptotic expansion up to a big-O error term. Less studied, however, are effective techniques giving an explicit bound on asymptotic error terms. Among other things, such explicit bounds typically allow the user to automatically prove sequence positivity (an active area of enumerative and algebraic combinatorics) by exhibiting an index beyond which the positive leading asymptotic behaviour dominates any error terms. In this article, we present a practical algorithm for computing such asymptotic approximations with rigorous error bounds, under the assumption that the generating series of the sequence is a solution of a differential equation with regular (Fuchsian) dominant singularities. Our algorithm approximately follows the singularity analysis method of Flajolet and Odlyzko, except that all big-O terms involved in the derivation of the asymptotic expansion are replaced by explicit error terms. The computation of the error terms combines analytic bounds from the literature with effective techniques from rigorous numerics and computer algebra. We implement our algorithm in the SageMath computer algebra system and exhibit its use on a variety of applications (including our original motivating example, solution uniqueness in the Canham model for the shape of genus one biomembranes).
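This is not the paper's SageMath implementation, but the flavour of the leading-term output of singularity analysis can be illustrated numerically on a classical example, the Catalan numbers, whose generating function has a square-root singularity at $z=1/4$ and hence satisfies $C_n \sim 4^n/(\sqrt{\pi}\, n^{3/2})$.
\begin{verbatim}
from math import comb, pi, sqrt

def catalan(n):
    """Exact Catalan number C_n; its generating function
    (1 - sqrt(1 - 4z)) / (2z) has a square-root singularity at z = 1/4."""
    return comb(2 * n, n) // (n + 1)

def catalan_leading(n):
    """Leading term from singularity analysis: 4**n / (sqrt(pi) * n**1.5)."""
    return 4.0 ** n / (sqrt(pi) * n ** 1.5)

# The relative error shrinks like O(1/n); a rigorous version of this
# computation would replace the trailing O(1/n) with an explicit bound.
for n in (10, 100, 500):
    print(n, abs(catalan_leading(n) / catalan(n) - 1))
\end{verbatim}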
The deployment of deep learning models on edge devices for real-world classification problems is becoming increasingly prevalent. Alongside it, early classification, a technique that classifies the input after observing only an early portion of it, is gaining popularity, since it reduces the communication and computation requirements that are crucial in edge intelligence environments. While early classification in the field of time series analysis has been broadly researched, existing solutions for multivariate time series problems primarily focus on early classification along the temporal dimension, treating the multiple input channels in a collective manner. In this study, we propose a more flexible early classification pipeline that treats input channels at a finer granularity and extends the early classification paradigm to the channel dimension. To implement this method, we use reinforcement learning techniques and introduce constraints to ensure that the objective remains feasible and practical. We validate its effectiveness with experiments on synthetic data and evaluate its performance on real datasets. The results demonstrate that, on multiple datasets, our method improves on the early classification paradigm by achieving higher accuracy for equal input utilization.
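For context, a minimal sketch of the conventional temporal early-classification baseline that the abstract contrasts with (confidence-threshold stopping on prefixes of the series); the synthetic data and thresholds are illustrative, and this is not the paper's reinforcement-learning pipeline or its channel-wise extension.
\begin{verbatim}
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic 2-class multivariate series: 3 channels, length 32;
# class 1 has a mean shift on channel 0 from time step 16 onward.
def make_data(n):
    X = rng.normal(size=(n, 3, 32))
    y = rng.integers(0, 2, size=n)
    X[y == 1, 0, 16:] += 1.5
    return X, y

X_train, y_train = make_data(2000)
X_test, y_test = make_data(500)

# One classifier per prefix length (channels x prefix, flattened).
prefix_lengths = [8, 16, 24, 32]
models = {}
for t in prefix_lengths:
    models[t] = LogisticRegression(max_iter=1000).fit(
        X_train[:, :, :t].reshape(len(X_train), -1), y_train)

# At test time, stop at the first prefix whose top-class probability
# exceeds a confidence threshold.
threshold = 0.9
correct, used = 0, 0
for x, label in zip(X_test, y_test):
    for t in prefix_lengths:
        proba = models[t].predict_proba(x[:, :t].reshape(1, -1))[0]
        if proba.max() >= threshold or t == prefix_lengths[-1]:
            correct += int(proba.argmax() == label)
            used += t
            break
print("accuracy:", correct / len(y_test),
      "mean prefix length:", used / len(y_test))
\end{verbatim}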
Optimal design is a critical yet challenging task in many applications. The challenge arises from the need for extensive trial and error, often carried out through simulations or field experiments. Fortunately, sequential optimal design, also referred to as Bayesian optimization when using surrogates with a Bayesian flavor, has played a key role in accelerating the design process through efficient sequential sampling strategies. A key opportunity now arises from the increased connectivity of edge devices, which sets forth a new collaborative paradigm for Bayesian optimization: one in which different clients collaboratively borrow strength from each other by effectively distributing their experimentation efforts to improve and fast-track the optimal design process. To this end, we bring the notion of consensus to Bayesian optimization, where clients agree (i.e., reach a consensus) on their next-to-sample designs. Our approach provides a generic and flexible framework that can incorporate different collaboration mechanisms. Within this framework, we propose transitional collaborative mechanisms where clients initially rely more on each other to maneuver through the early stages with scant data, then, at the late stages, focus on their own objectives to get client-specific solutions. Theoretically, we show sub-linear growth in regret for the proposed framework. Empirically, through simulated datasets and a real-world collaborative material discovery experiment, we show that our framework can effectively accelerate and improve the optimal design process and benefit all participants.
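An illustrative sketch of the general consensus idea (two clients with scikit-learn GP surrogates, UCB acquisition on a grid, and a consensus weight that decays over rounds); the specific weighting schedule is a hypothetical stand-in rather than the paper's transitional mechanism, and the regret guarantees do not refer to this toy version.
\begin{verbatim}
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 201).reshape(-1, 1)

# Two related (but not identical) client objectives on [0, 1].
objectives = [
    lambda x: -(x - 0.62) ** 2,
    lambda x: -(x - 0.68) ** 2,
]

# Each client starts with a couple of its own noisy observations.
X = [rng.uniform(0, 1, (2, 1)) for _ in objectives]
y = [f(x).ravel() + 0.01 * rng.normal(size=2) for f, x in zip(objectives, X)]

for t in range(15):
    alpha = max(0.0, 1.0 - t / 10)          # consensus weight, decays to 0
    picks = []
    for k in range(len(objectives)):
        gp = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-4).fit(X[k], y[k])
        mean, std = gp.predict(grid, return_std=True)
        picks.append(grid[np.argmax(mean + 2.0 * std), 0])   # UCB maximiser
    shared = np.mean(picks)
    for k, f in enumerate(objectives):
        # Early rounds lean on the consensus point, late rounds on the own pick.
        x_next = alpha * shared + (1 - alpha) * picks[k]
        X[k] = np.vstack([X[k], [[x_next]]])
        y[k] = np.append(y[k], f(x_next) + 0.01 * rng.normal())

for k in range(len(objectives)):
    print(f"client {k}: best observed design x = {X[k][np.argmax(y[k]), 0]:.3f}")
\end{verbatim}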
Discretization of the uniform norm of functions from a given finite dimensional subspace of continuous functions is studied. Previously known results show that for any $N$-dimensional subspace of the space of continuous functions it is sufficient to use $e^{CN}$ sample points to bound the uniform norm accurately by the discrete norm, and that the exponential growth in the number of sampling points cannot be improved for a good discretization theorem in the uniform norm. In this paper we focus on two types of results, which allow us to obtain good discretization of the uniform norm with a number of points polynomial in $N$. First, we weaken the discretization inequality by allowing the uniform norm to be bounded by the discrete norm multiplied by an extra factor, which may depend on $N$. Second, we impose restrictions on the finite dimensional subspace under consideration. In particular, we prove a general result, which connects the upper bound on the number of sampling points in the discretization theorem for the uniform norm with the best $m$-term bilinear approximation of the Dirichlet kernel associated with the given subspace.
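Schematically, and in notation introduced here only for illustration, the two kinds of discretization inequalities discussed above read as follows for an $N$-dimensional subspace $X_N$ of continuous functions and sample points $\xi^1,\dots,\xi^m$:
\[
\|f\|_\infty \;\le\; C \max_{1\le j\le m} |f(\xi^j)| \quad \text{for all } f \in X_N,
\]
with $C$ an absolute constant and $m \le e^{CN}$ in the known general result, versus the relaxed version
\[
\|f\|_\infty \;\le\; R(N) \max_{1\le j\le m} |f(\xi^j)| \quad \text{for all } f \in X_N,
\]
where the extra factor $R(N)$ may grow with $N$ while $m$ is only polynomial in $N$.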