We devise a polynomial-time algorithm for partitioning a simple polygon $P$ into a minimum number of star-shaped polygons. The question of whether such an algorithm exists has been open for more than four decades [Avis and Toussaint, Pattern Recognit., 1981] and has been raised repeatedly, for example in O'Rourke's famous book [Art Gallery Theorems and Algorithms, 1987]. In addition to its strong theoretical motivation, the problem is also relevant to practical domains such as CNC pocket milling, motion planning, and shape parameterization. The only previously known algorithm for a non-trivial special case applies when $P$ is both monotone and rectilinear [Liu and Ntafos, Algorithmica, 1991]. For general polygons, an algorithm was known only for the restricted version in which Steiner points are disallowed [Keil, SIAM J. Comput., 1985], meaning that each corner of a piece in the partition must also be a corner of $P$. Interestingly, the solution size for the restricted version may be linear for instances where the unrestricted solution has constant size. The covering variant in which the pieces are star-shaped but allowed to overlap, known as the Art Gallery Problem, was recently shown to be $\exists\mathbb R$-complete and is thus likely not in NP [Abrahamsen, Adamaszek and Miltzow, STOC 2018 & J. ACM 2022]; this is in stark contrast to our result. Arguably the work most closely related to ours is the polynomial-time algorithm for partitioning a simple polygon into a minimum number of convex pieces by Chazelle and Dobkin~[STOC, 1979 & Comp. Geom., 1985].
Disentangling model activations into meaningful features is a central problem in interpretability. However, the absence of ground-truth features in realistic scenarios makes it difficult to validate recent approaches such as sparse dictionary learning. To address this challenge, we propose a framework for evaluating feature dictionaries in the context of specific tasks, by comparing them against \emph{supervised} feature dictionaries. First, we demonstrate that supervised dictionaries achieve excellent approximation, control, and interpretability of model computations on the task. Second, we use the supervised dictionaries to develop and contextualize evaluations of unsupervised dictionaries along the same three axes. We apply this framework to the indirect object identification (IOI) task using GPT-2 Small, with sparse autoencoders (SAEs) trained on either the IOI or OpenWebText datasets. We find that these SAEs capture interpretable features for the IOI task, but they are less successful than supervised features in controlling the model. Finally, we observe two qualitative phenomena in SAE training: feature occlusion (where a causally relevant concept is robustly absent from the learned features because concepts of even slightly higher magnitude overshadow it), and feature over-splitting (where binary features split into many smaller, less interpretable features). We hope that our framework will serve as a useful step towards more objective and grounded evaluations of sparse dictionary learning methods.
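For reference, the sparse autoencoders evaluated here typically take the standard form used in the interpretability literature (the exact architecture and penalty in this paper may differ): an activation $x$ is encoded into sparse features $f(x) = \mathrm{ReLU}(W_e x + b_e)$ and reconstructed as $\hat{x} = W_d f(x) + b_d$, with training objective
\[
\mathcal{L}(x) \;=\; \bigl\lVert x - \hat{x} \bigr\rVert_2^2 \;+\; \lambda \bigl\lVert f(x) \bigr\rVert_1,
\]
where the $\ell_1$ penalty encourages each activation to be explained by only a few dictionary features (the columns of $W_d$).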
We propose a new notion of uniqueness for the adversarial Bayes classifier in the setting of binary classification. Analyzing this concept produces a simple procedure for computing all adversarial Bayes classifiers for a well-motivated family of one-dimensional data distributions. This characterization is then leveraged to show that, as the perturbation radius increases, certain regularity properties of adversarial Bayes classifiers improve. Various examples demonstrate that the boundary of the adversarial Bayes classifier frequently lies near the boundary of the Bayes classifier.
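As background (a standard formulation in this literature; the paper's exact notation may differ), a binary classifier can be identified with the set $A$ of points assigned to class $1$, and its adversarial risk under perturbations of radius $\epsilon$ is
\[
R_\epsilon(A) \;=\; \int \sup_{\lVert x' - x \rVert \le \epsilon} \mathbf{1}_{A^c}(x') \, dP_1(x) \;+\; \int \sup_{\lVert x' - x \rVert \le \epsilon} \mathbf{1}_{A}(x') \, dP_0(x),
\]
where $P_0$ and $P_1$ are the class-conditional measures weighted by the class priors; an adversarial Bayes classifier is any set $A$ minimizing $R_\epsilon$.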
Discourse relation classification is an especially difficult task in the absence of explicit context markers \cite{Prasad2008ThePD}. Current approaches to implicit relation prediction rely solely on the two targeted neighboring sentences, ignoring the broader context in which they appear \cite{Atwell2021WhereAW}. In this research, we propose three new methods for incorporating context into the task of sentence relation prediction: (1) Direct Neighbors (DNs), (2) Expanded Window Neighbors (EWNs), and (3) Part-Smart Random Neighbors (PSRNs). Our findings indicate that the inclusion of context beyond one discourse unit is harmful in the task of discourse relation classification.
In the literature on Kleene algebra, a number of variants have been proposed which impose additional structure specified by a theory, such as Kleene algebra with tests (KAT) and the recent Kleene algebra with observations (KAO), or make specific assumptions about certain constants, as for instance in NetKAT. Many of these variants fit within the unifying perspective offered by Kleene algebra with hypotheses, which comes with a canonical language model constructed from a given set of hypotheses. For the case of KAT, this model corresponds to the familiar interpretation of expressions as languages of guarded strings. A relevant question therefore is whether Kleene algebra together with a given set of hypotheses is complete with respect to its canonical language model. In this paper, we revisit, combine and extend existing results on this question to obtain tools for proving completeness in a modular way. We showcase these tools by giving new and modular proofs of completeness for KAT, KAO and NetKAT, and we prove completeness for new variants of KAT: KAT extended with a constant for the full relation, KAT extended with a converse operation, and a version of KAT where the collection of tests only forms a distributive lattice.
We implement Genetic Algorithms for triangulations of four-dimensional reflexive polytopes which induce Calabi-Yau threefold hypersurfaces via Batyrev's construction. We demonstrate that such algorithms efficiently optimize physical observables such as axion decay constants or axion-photon couplings in string theory compactifications. For our implementation, we choose a parameterization of triangulations that yields homotopy-inequivalent Calabi-Yau threefolds by extending fine, regular triangulations of two-faces, thereby eliminating exponentially large redundancy factors in the map from polytope triangulations to Calabi-Yau hypersurfaces. In particular, we discuss how this encoding renders the entire Kreuzer-Skarke list amenable to a variety of optimization strategies, including but not limited to Genetic Algorithms. To achieve optimal performance, we tune the hyperparameters of our Genetic Algorithm using Bayesian optimization. We find that our implementation vastly outperforms other sampling and optimization strategies such as Markov Chain Monte Carlo and Simulated Annealing. Finally, we showcase that our Genetic Algorithm efficiently performs optimization even for the maximal polytope with Hodge numbers $h^{1,1} = 491$, where we use it to maximize axion-photon couplings.
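To make the optimization loop concrete, the following is a schematic genetic algorithm in Python. This is a generic sketch on bitstrings with illustrative names only; it is not the paper's triangulation encoding, selection scheme, or tuned hyperparameters.
\begin{verbatim}
import random

def genetic_algorithm(fitness, random_individual, mutate, crossover,
                      pop_size=50, generations=100):
    """Generic GA loop: evolve a population toward higher fitness via
    selection, crossover, and mutation (schematic sketch only)."""
    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]        # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        population = parents + children
    return max(population, key=fitness)

# Toy usage: maximize the number of ones in a length-20 bitstring
n = 20
best = genetic_algorithm(
    fitness=sum,
    random_individual=lambda: [random.randint(0, 1) for _ in range(n)],
    mutate=lambda x: [b ^ (random.random() < 1 / n) for b in x],
    crossover=lambda a, b: [random.choice(p) for p in zip(a, b)],
)
print(sum(best))
\end{verbatim}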
We present a simple algorithm for differentiable rendering of surfaces represented by Signed Distance Fields (SDF), which makes it easy to integrate rendering into gradient-based optimization pipelines. To tackle visibility-related derivatives that make rendering non-differentiable, existing physically based differentiable rendering methods often rely on elaborate guiding data structures or reparameterization with a global impact on variance. In this article, we investigate an alternative that embraces nonzero bias in exchange for low variance and architectural simplicity. Our method expands the lower-dimensional boundary integral into a thin band that is easy to sample when the underlying surface is represented by an SDF. We demonstrate the performance and robustness of our formulation in end-to-end inverse rendering tasks, where it obtains results that are competitive with or superior to existing work.
We popularize the question whether, for $m$ large enough, all $m$-uniform shift-chain hypergraphs are properly $2$-colorable. On the other hand, we show that for every $m$ some $m$-uniform shift-chains are not polychromatic $3$-colorable.
We propose a novel Bayesian approach for the adaptive and online estimation of the frequency distribution of a finite number of categories under the local differential privacy (LDP) framework. The proposed algorithm performs Bayesian parameter estimation via posterior sampling and adapts the LDP randomization mechanism based on the obtained posterior samples. We propose a randomized mechanism for LDP that takes a subset of categories as input and whose performance depends on the selected subset and the true frequency distribution. Using the posterior sample as an estimate of the frequency distribution, the algorithm performs a computationally tractable subset selection step to maximize the utility of the privatized response of the next user. We propose several utility functions related to well-known information metrics, such as (but not limited to) the Fisher information matrix, total variation distance, and information entropy, and we compare these utility metrics in terms of their computational complexity. We employ stochastic gradient Langevin dynamics for posterior sampling, a computationally efficient approximate Markov chain Monte Carlo method. We provide a theoretical analysis showing that (i) the posterior distribution targeted by the algorithm converges to the true parameter even under approximate posterior sampling, and (ii) the algorithm selects the optimal subset with high probability if posterior sampling is performed exactly. We also provide numerical results that empirically demonstrate the estimation accuracy of our algorithm, comparing it with nonadaptive and semi-adaptive approaches under experimental settings with various combinations of privacy parameters and population distribution parameters.
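For background, a classical (non-adaptive) example of an LDP randomization mechanism for categorical data is $k$-ary randomized response; the subset-based mechanism proposed in the paper is a different design, and the Python sketch below (with illustrative names) is included only to show what an $\epsilon$-LDP response mechanism looks like.
\begin{verbatim}
import math, random

def k_rr(true_category, k, epsilon):
    """Standard k-ary randomized response: an epsilon-LDP mechanism that
    reports the true category with probability e^eps / (e^eps + k - 1)
    and each other category with probability 1 / (e^eps + k - 1)."""
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if random.random() < p_true:
        return true_category
    others = [c for c in range(k) if c != true_category]
    return random.choice(others)

# Example: privatize one report among k = 5 categories with epsilon = 1.0
print(k_rr(true_category=2, k=5, epsilon=1.0))
\end{verbatim}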
The notion of branch numbers of a linear transformation is crucial for both linear and differential cryptanalysis. The number of non-zero elements in a state difference or linear mask determines the number of active S-Boxes, and the differential or linear branch number gives the minimum number of active S-Boxes in two consecutive rounds of an SPN cipher, for differential and linear cryptanalysis respectively. This paper presents a new algorithm for computing the branch number of non-singular matrices over finite fields. The algorithm is based on the existing classical method but achieves improved computational complexity compared to its predecessor. We conduct a comparative study of the proposed algorithm and the classical approach, providing an analytical estimation of the algorithm's complexity. Our analysis reveals that the computational complexity of our algorithm is the square root of that of the classical approach.
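For concreteness, the differential branch number of a matrix $M$ is $\min_{x \neq 0}\bigl(\mathrm{wt}(x) + \mathrm{wt}(Mx)\bigr)$, where $\mathrm{wt}$ counts non-zero entries. The Python sketch below computes it by naive exhaustive search over $\mathbb{F}_2$; it only illustrates the definition and the classical brute-force baseline, not the improved algorithm of the paper.
\begin{verbatim}
import itertools

def weight(vec):
    """Hamming weight: number of non-zero entries."""
    return sum(1 for v in vec if v)

def mat_vec_gf2(M, x):
    """Multiply a binary matrix M (list of rows) by vector x over GF(2)."""
    return [sum(m & xi for m, xi in zip(row, x)) % 2 for row in M]

def differential_branch_number(M):
    """Exhaustive computation: min over all nonzero x of wt(x) + wt(Mx).
    Exponential in the dimension; shown only to illustrate the definition."""
    n = len(M[0])
    best = None
    for bits in itertools.product([0, 1], repeat=n):
        if not any(bits):
            continue
        w = weight(bits) + weight(mat_vec_gf2(M, list(bits)))
        best = w if best is None else min(best, w)
    return best

# Example: a non-singular 4x4 binary matrix (illustrative choice)
M = [[0, 1, 1, 1],
     [1, 0, 1, 1],
     [1, 1, 0, 1],
     [1, 1, 1, 0]]
print(differential_branch_number(M))  # prints 4 for this matrix
\end{verbatim}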
Runtime analysis, as a branch of the theory of AI, studies how the number of iterations an algorithm takes before finding a solution (its runtime) depends on the design of the algorithm and the problem structure. Drift analysis is a state-of-the-art tool for estimating the runtime of randomised algorithms, such as evolutionary and bandit algorithms. Drift refers roughly to the expected progress towards the optimum per iteration. This paper considers the problem of deriving concentration tail-bounds on the runtime/regret of algorithms. It provides a novel drift theorem that gives precise exponential tail-bounds given positive, weak, zero and even negative drift. Previously, such exponential tail bounds were missing in the case of weak, zero, or negative drift. Our drift theorem can be used to prove a strong concentration of the runtime/regret of algorithms in AI. For example, we prove that the regret of the \rwab bandit algorithm is highly concentrated, while previous analyses only considered the expected regret. This means that the algorithm obtains the optimum within a given time frame with high probability, i.e. a form of algorithm reliability. Moreover, our theorem implies that the time needed by the co-evolutionary algorithm RLS-PD to obtain a Nash equilibrium in a \bilinear max-min-benchmark problem is highly concentrated. However, we also prove that the algorithm forgets the Nash equilibrium, and the time until this occurs is highly concentrated. This highlights a weakness of RLS-PD that should be addressed in future work.
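For orientation, the simplest classical drift theorem (additive drift, due to He and Yao) illustrates the flavour of such results; it bounds only the expectation, whereas the contribution above concerns tail bounds. If $(X_t)_{t \ge 0}$ is a non-negative process measuring the distance to the target, $T = \min\{t : X_t = 0\}$, and
\[
\mathbb{E}\bigl[X_t - X_{t+1} \mid X_t = s\bigr] \;\ge\; \delta \quad \text{for all } s > 0 \text{ and some } \delta > 0,
\]
then $\mathbb{E}[T \mid X_0] \le X_0 / \delta$.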