We study the number of non-zero terms in two specific families of ternary cyclotomic polynomial, we find formulas for the number of terms by writing the cyclotomic polynomial as a sum of smaller sub-polynomials and study the properties of these polynomial.
In causal inference on directed acyclic graphs, the orientation of edges is in general only recovered up to Markov equivalence classes. We study Markov equivalence classes of uniformly random directed acyclic graphs. Using a tower decomposition, we show that the ratio between the number of Markov equivalence classes and directed acyclic graphs approaches a positive constant when the number of sites goes to infinity. For a typical directed acyclic graph, the expected number of elements in its Markov equivalence class remains bounded. More precisely, we prove that for a uniformly chosen directed acyclic graph, the size of its Markov equivalence class has super-polynomial tails.
A power series being given as the solution of a linear differential equation with appropriate initial conditions, minimization consists in finding a non-trivial linear differential equation of minimal order having this power series as a solution. This problem exists in both homogeneous and inhomogeneous variants; it is distinct from, but related to, the classical problem of factorization of differential operators. Recently, minimization has found applications in Transcendental Number Theory, more specifically in the computation of non-zero algebraic points where Siegel's $E$-functions take algebraic values. We present algorithms for these questions and discuss implementation and experiments.
This article establishes novel strong uniform laws of large numbers for randomly weighted sums such as bootstrap means. By leveraging recent advances, these results extend previous work in their general applicability to a wide range of weighting procedures and in their flexibility with respect to the effective bootstrap sample size. In addition to the standard multinomial bootstrap and the $m$-out-of-$n$ bootstrap, our results apply to a large class of randomly weighted sums involving negatively orthant dependent (NOD) weights, including the Bayesian bootstrap, jackknife, resampling without replacement, simple random sampling with over-replacement, independent weights, and multivariate Gaussian weighting schemes. Weights are permitted to be non-identically distributed and possibly even negative. Our proof technique is based on extending a proof of the i.i.d.\ strong uniform law of large numbers to employ strong laws for randomly weighted sums; in particular, we exploit a recent Marcinkiewicz--Zygmund strong law for NOD weighted sums.
We extend results known for the randomized Gauss-Seidel and the Gauss-Southwell methods for the case of a Hermitian and positive definite matrix to certain classes of non-Hermitian matrices. We obtain convergence results for a whole range of parameters describing the probabilities in the randomized method or the greedy choice strategy in the Gauss-Southwell-type methods. We identify those choices which make our convergence bounds best possible. Our main tool is to use weighted l1-norms to measure the residuals. A major result is that the best convergence bounds that we obtain for the expected values in the randomized algorithm are as good as the best for the deterministic, but more costly algorithms of Gauss-Southwell type. Numerical experiments illustrate the convergence of the method and the bounds obtained. Comparisons with the randomized Kaczmarz method are also presented.
We consider the problem of jointly minimizing forms of two Boolean functions $f, g \colon \{0,1\}^J \to \{0,1\}$ such that $f + g \leq 1$ and so as to separate disjoint sets $A \cup B \subseteq \{0,1\}^J$ such that $f(A) = \{1\}$ and $g(B) = \{1\}$. We hypothesize that this problem is easier to solve or approximate than the well-understood problem of minimizing the form of one Boolean function $h: \{0,1\}^J \to \{0,1\}$ such that $h(A) = \{1\}$ and $h(B) = \{0\}$. For a large class of forms, including binary decision trees and ordered binary decision diagrams, we refute this hypothesis. For disjunctive normal forms, we show that the problem is at least as hard as MIN-SET-COVER. For all these forms, we establish that no $o(\ln (|A| + |B| -1))$-approximation algorithm exists unless P$=$NP.
Clustering is a usual unsupervised machine learning technique for grouping the data points into groups based upon similar features. We focus here on unsupervised clustering for contaminated data, i.e in the case where K-medians should be preferred to K-means because of its robustness. More precisely, we concentrate on a common question in clustering: how to chose the number of clusters? The answer proposed here is to consider the choice of the optimal number of clusters as the minimization of a risk function via penalization. In this paper, we obtain a suitable penalty shape for our criterion and derive an associated oracle-type inequality. Finally, the performance of this approach with different types of K-medians algorithms is compared on a simulation study with other popular techniques. All studied algorithms are available in the R package Kmedians on CRAN.
Artificial neural networks took a lot of inspiration from their biological counterparts in becoming our best machine perceptual systems. This work summarizes some of that history and incorporates modern theoretical neuroscience into experiments with artificial neural networks from the field of deep learning. Specifically, iterative magnitude pruning is used to train sparsely connected networks with 33x fewer weights without loss in performance. These are used to test and ultimately reject the hypothesis that weight sparsity alone improves image noise robustness. Recent work mitigated catastrophic forgetting using weight sparsity, activation sparsity, and active dendrite modeling. This paper replicates those findings, and extends the method to train convolutional neural networks on a more challenging continual learning task. The code has been made publicly available.
A nonlinear partial differential equation (PDE) that models the possible shapes that a periodic Miura tessellation can take in the homogenization limit has been established recently and solved only in specific cases. In this paper, the existence and uniqueness of a solution to the PDE is proved for general Dirichlet boundary conditions. Then a H^2-conforming discretization is introduced to approximate the solution of the PDE and a fixed point algorithm is proposed to solve the associated discrete problem. A convergence proof for the method is given as well as a convergence rate. Finally, numerical experiments show the robustness of the method and that non trivial shapes can be achieved using periodic Miura tessellations.
We consider the problem of solving the Min-Sum Submodular Cover problem using local search. The Min-Sum Submodular Cover problem generalizes the NP-complete Min-Sum Set Cover problem, replacing the input set cover instance with a monotone submodular set function. A simple greedy algorithm achieves an approximation factor of 4, which is tight unless P=NP [Streeter and Golovin, NeurIPS, 2008]. We complement the greedy algorithm with analysis of a local search algorithm. Building on work of Munagala et al. [ICDT, 2005], we show that, using simple initialization, a straightforward local search algorithm achieves a $(4+\epsilon)$-approximate solution in time $O(n^3\log(n/\epsilon))$, provided that the monotone submodular set function is also second-order supermodular. Second-order supermodularity has been shown to hold for a number of submodular functions of practical interest, including functions associated with set cover, matching, and facility location. We present experiments on two special cases of Min-Sum Submodular Cover and find that the local search algorithm can outperform the greedy algorithm on small data sets.
In this article, we present an extension of the splitting algorithm proposed in [22] to networks of conservation laws with piecewise linear discontinuous flux functions in the unknown. We start with the discussion of a suitable Riemann solver at the junction and then describe a strategy how to use the splitting algorithm on the network. In particular, we focus on two types of junctions, i.e., junctions where the number of outgoing roads does not exceed the number of incoming roads (dispersing type) and junctions with two incoming and one outgoing road (merging type). Finally, numerical examples demonstrate the accuracy of the splitting algorithm by comparisons to the exact solution and other approaches used in the literature.