Decision trees offer the benefit of easy interpretation because they allow the classification of input data based on if--then rules. However, as decision trees are constructed by an algorithm that achieves clear classification with minimum necessary rules, the trees possess the drawback of extracting only minimum rules, even when various latent rules exist in data. Approaches that construct multiple trees using randomly selected feature subsets do exist. However, the number of trees that can be constructed remains at the same scale because the number of feature subsets is a combinatorial explosion. Additionally, when multiple trees are constructed, numerous rules are generated, of which several are untrustworthy and/or highly similar. Therefore, we propose "MAABO-MT" and "GS-MRM" algorithms that strategically construct trees with high estimation performance among all possible trees with small computational complexity and extract only reliable and non-similar rules, respectively. Experiments are conducted using several open datasets to analyze the effectiveness of the proposed method. The results confirm that MAABO-MT can discover reliable rules at a lower computational cost than other methods that rely on randomness. Furthermore, the proposed method is confirmed to provide deeper insights than single decision trees commonly used in previous studies. Therefore, MAABO-MT and GS-MRM can efficiently extract rules from combinatorially exploded decision trees.
We address the problem of configuring a power distribution network with reliability and resilience objectives by satisfying the demands of the consumers and saturating each production source as little as possible. We consider power distribution networks containing source nodes producing electricity, nodes representing electricity consumers and switches between them. Configuring this network consists in deciding the orientation of the links between the nodes of the network. The electric flow is a direct consequence of the chosen configuration and can be computed in polynomial time. It is valid if it satisfies the demand of each consumer and capacity constraints on the network. In such a case, we study the problem of determining a feasible solution that balances the loads of the sources, that is their production rates. We use three metrics to measure the quality of a solution: minimizing the maximum load, maximizing the minimum load and minimizing the difference of the maximum and the minimum loads. This defines optimization problems called respectively min-M, max-m and min-R. In the case where the graph of the network is a tree, it is known that the problem of building a valid configuration is polynomial. We show the three optimization variants have distinct properties regarding the theoretical complexity and the approximability. Particularly, we show that min-M is polynomial, that max-m is NP-Hard but belongs to the class FPTAS and that min-R is NP-Hard, cannot 1 be approximated to within any exponential relative ratio but, for any $\epsilon$ > 0, there exists an algorithm for which the value of the returned solution equals the value of an optimal solution shifted by at most $\epsilon$.
We propose new linear combinations of compositions of a basic second-order scheme with appropriately chosen coefficients to construct higher order numerical integrators for differential equations. They can be considered as a generalization of extrapolation methods and multi-product expansions. A general analysis is provided and new methods up to order 8 are built and tested. The new approach is shown to reduce the latency problem when implemented in a parallel environment and leads to schemes that are significantly more efficient than standard extrapolation when the linear combination is delayed by a number of steps.
A fully discrete semi-convex-splitting finite-element scheme with stabilization for a degenerate Cahn-Hilliard cross-diffusion system is analyzed. The system consists of parabolic fourth-order equations for the volume fraction of the fiber phase and the solute concentration, modeling pre-patterning of lymphatic vessel morphology. The existence of discrete solutions is proved, and it is shown that the numerical scheme is energy stable up to stabilization, conserves the solute mass, and preserves the lower and upper bounds of the fiber phase fraction. Numerical experiments in two space dimensions using FreeFEM illustrate the phase segregation and pattern formation.
We consider the extension of two-variable guarded fragment logic with local Presburger quantifiers. These are quantifiers that can express properties such as ``the number of incoming blue edges plus twice the number of outgoing red edges is at most three times the number of incoming green edges'' and captures various description logics with counting, but without constant symbols. We show that the satisfiability of this logic is EXP-complete. While the lower bound already holds for the standard two-variable guarded fragment logic, the upper bound is established by a novel, yet simple deterministic graph theoretic based algorithm.
The strong convergence of the semi-implicit Euler-Maruyama (EM) method for stochastic differential equations with non-linear coefficients driven by a class of L\'evy processes is investigated. The dependence of the convergence order of the numerical scheme on the parameters of the class of L\'evy processes is discovered, which is different from existing results. In addition, the existence and uniqueness of numerical invariant measure of the semi-implicit EM method is studied and its convergence to the underlying invariant measure is also proved. Numerical examples are provided to confirm our theoretical results.
We use a labelled deduction system ( LND$_{ED-}$TRS ) based on the concept of computational paths (sequences of rewrites) as equalities between two terms of the same type, which allowed us to carry out in homotopic theory an approach using the concept of computational paths. From this, we show that the computational paths can be used to perform the proofs of the $LND_{EQ}-TRS_{2}$ rewriting system.
We prove a tight parallel repetition theorem for $3$-message computationally-secure quantum interactive protocols between an efficient challenger and an efficient adversary. We also prove under plausible assumptions that the security of $4$-message computationally secure protocols does not generally decrease under parallel repetition. These mirror the classical results of Bellare, Impagliazzo, and Naor [BIN97]. Finally, we prove that all quantum argument systems can be generically compiled to an equivalent $3$-message argument system, mirroring the transformation for quantum proof systems [KW00, KKMV07]. As immediate applications, we show how to derive hardness amplification theorems for quantum bit commitment schemes (answering a question of Yan [Yan22]), EFI pairs (answering a question of Brakerski, Canetti, and Qian [BCQ23]), public-key quantum money schemes (answering a question of Aaronson and Christiano [AC13]), and quantum zero-knowledge argument systems. We also derive an XOR lemma [Yao82] for quantum predicates as a corollary.
Given an integer partition of $n$, we consider the impartial combinatorial game LCTR in which moves consist of removing either the left column or top row of its Young diagram. We show that for both normal and mis\`ere play, the optimal strategy can consist mostly of mirroring the opponent's moves. We also establish that both LCTR and Downright are domestic as well as returnable, and on the other hand neither tame nor forced. For both games, those structural observations allow for computing the Sprague-Grundy value any position in $O(\log(n))$ time, assuming that the time unit allows for reading an integer, or performing a basic arithmetic operation. This improves on the previously known bound of $O(n)$ due to Ili\'c (2019). We also cover some other complexity measures of both games, such as state-space complexity, and number of leaves and nodes in the corresponding game tree.
We offer an alternative proof, using the Stein-Chen method, of Bollob\'{a}s' theorem concerning the distribution of the extreme degrees of a random graph. Our proof also provides a rate of convergence of the extreme degree to its asymptotic distribution. The same method also applies in a more general setting where the probability of every pair of vertices being connected by edges depends on the number of vertices.
Learning distance functions between complex objects, such as the Wasserstein distance to compare point sets, is a common goal in machine learning applications. However, functions on such complex objects (e.g., point sets and graphs) are often required to be invariant to a wide variety of group actions e.g. permutation or rigid transformation. Therefore, continuous and symmetric product functions (such as distance functions) on such complex objects must also be invariant to the product of such group actions. We call these functions symmetric and factor-wise group invariant (or SFGI functions in short). In this paper, we first present a general neural network architecture for approximating SFGI functions. The main contribution of this paper combines this general neural network with a sketching idea to develop a specific and efficient neural network which can approximate the $p$-th Wasserstein distance between point sets. Very importantly, the required model complexity is independent of the sizes of input point sets. On the theoretical front, to the best of our knowledge, this is the first result showing that there exists a neural network with the capacity to approximate Wasserstein distance with bounded model complexity. Our work provides an interesting integration of sketching ideas for geometric problems with universal approximation of symmetric functions. On the empirical front, we present a range of results showing that our newly proposed neural network architecture performs comparatively or better than other models (including a SOTA Siamese Autoencoder based approach). In particular, our neural network generalizes significantly better and trains much faster than the SOTA Siamese AE. Finally, this line of investigation could be useful in exploring effective neural network design for solving a broad range of geometric optimization problems (e.g., $k$-means in a metric space).