Let tw(G) denote the treewidth of graph G. Given a graph G and a positive integer k such that tw(G) <= k + 1, we are to decide if tw(G) <= k. We give a certifying algorithm RTW ("R" for recursive) for this task: it returns one or more tree-decompositions of G of width <= k if the answer is YES and a minimal contraction H of G such that tw(H) > k otherwise. RTW uses a heuristic variant of Tamaki's PID algorithm for treewidth (ESA 2017), which we call HPID. RTW, given G and k, interleaves the execution of HPID with recursive calls on G / e for edges e of G, where G / e denotes the graph obtained from G by contracting edge e. If we find that tw(G / e) > k, then we have tw(G) > k with the same certificate. If we find that tw(G / e) <= k, we "uncontract" the bags of the certifying tree-decompositions of G / e into bags of G and feed them to HPID to help it progress. If the question is not resolved after the recursive calls are made for all edges, we finish HPID in an exhaustive mode. If it turns out that tw(G) > k, then G is a certificate for tw(G') > k for every G' of which G is a contraction, because we have found tw(G / e) <= k for every edge e of G. This final round of HPID guarantees the correctness of the algorithm, while its practical efficiency derives from our methods of "uncontracting" bags of tree-decompositions of G / e to useful bags of G, as well as of exploiting those bags in HPID. Experiments show that our algorithm drastically extends the scope of practically solvable instances. In particular, when applied to the 100 instances in the PACE 2017 bonus set, the numbers of instances solved by our implementation on a typical laptop, with timeouts of 100, 1000, and 10000 seconds per instance, are 72, 92, and 98, respectively, while these numbers are 11, 38, and 68 for Tamaki's PID solver and 65, 82, and 85 for his new solver (SEA 2022).
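To make the notions of contracting an edge and "uncontracting" a bag concrete, here is a minimal Python sketch. It is not the RTW or HPID procedure: the function names `contract_edge` and `uncontract_bags` are hypothetical, and the uncontraction shown is only the naive one (put both endpoints of e wherever the merged vertex appeared), which yields a tree-decomposition of G of width at most one larger than that of G / e.

```python
def contract_edge(adj, u, v):
    """Adjacency sets of G / e for e = {u, v}; the merged vertex keeps the name u."""
    new_adj = {}
    for x, nbrs in adj.items():
        if x == v:
            continue
        # Redirect edges that pointed at v so that they point at u.
        new_adj[x] = {u if y == v else y for y in nbrs} - {x}
    # The merged vertex inherits the neighbours of both endpoints.
    new_adj[u] = (adj[u] | adj[v]) - {u, v}
    return new_adj


def uncontract_bags(bags, u, v):
    """Naive uncontraction: add v to every bag that contains the merged vertex u."""
    return [(bag | {v}) if u in bag else set(bag) for bag in bags]


def width(bags):
    """Width of a tree-decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags) - 1


# Hypothetical example: the 4-cycle a-b-c-d-a, contracting the edge {a, b}.
cycle = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c", "a"}}
triangle = contract_edge(cycle, "a", "b")
bags_of_contraction = [{"a", "c", "d"}]          # one bag covers the triangle
bags_of_g = uncontract_bags(bags_of_contraction, "a", "b")
```

In this example the contraction has a decomposition of width 2, while the naively uncontracted bag has width 3, that is, k + 1 for k = 2. Closing that gap of one is the part that the abstract attributes to HPID and to the more careful uncontraction methods.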
A Las Vegas randomized algorithm is given to compute the Hermite normal form of a nonsingular integer matrix $A$ of dimension $n$. The algorithm uses quadratic integer multiplication and cubic matrix multiplication and has running time bounded by $O(n^3 (\log n + \log ||A||)^2(\log n)^2)$ bit operations, where $||A||= \max_{ij} |A_{ij}|$ denotes the largest entry of $A$ in absolute value. A variant of the algorithm that uses pseudo-linear integer multiplication is given that has running time $(n^3 \log ||A||)^{1+o(1)}$ bit operations, where the exponent $"+o(1)"$ captures additional factors $c_1 (\log n)^{c_2} (\log \log ||A||)^{c_3}$ for positive real constants $c_1,c_2,c_3$.
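For readers unfamiliar with the object being computed, the following Python sketch produces a Hermite normal form of a small nonsingular integer matrix by textbook integer row operations. It is not the Las Vegas algorithm of the abstract and has none of its complexity guarantees; the function name is hypothetical, and the convention used (row operations, upper triangular, positive pivots, entries above each pivot reduced into [0, pivot)) is one of several in common use.

```python
def hermite_normal_form(a):
    """Row-style HNF of a nonsingular square integer matrix given as a list of rows."""
    h = [row[:] for row in a]
    n = len(h)
    for j in range(n):
        # Euclidean elimination: clear column j below the pivot row j.
        for i in range(j + 1, n):
            while h[i][j] != 0:
                q = h[j][j] // h[i][j]
                h[j] = [x - q * y for x, y in zip(h[j], h[i])]
                h[j], h[i] = h[i], h[j]
        # Make the pivot positive.
        if h[j][j] < 0:
            h[j] = [-x for x in h[j]]
        # Reduce the entries above the pivot into the range [0, pivot).
        for i in range(j):
            q = h[i][j] // h[j][j]
            h[i] = [x - q * y for x, y in zip(h[i], h[j])]
    return h


# Hypothetical example: det A = -2, so the pivots of H multiply to 2.
A = [[2, 3], [4, 5]]
H = hermite_normal_form(A)
```

Only unimodular row operations are used (adding an integer multiple of one row to another, swapping rows, negating a row), so the result spans the same row lattice as the input.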
In our approach, we consider the data as instances of a random field within a relevant Bochner space. Our key observation is that the classes can predominantly reside in two distinct subspaces. To uncover the separation between these classes, we employ the Karhunen-Loeve expansion and construct the appropriate subspaces. The novel features forming the above bases are constructed by applying a coordinate transformation based on the recent Functional Data Analysis theory for anomaly detection. The associated signal decomposition is an exact hierarchical tensor product expansion with known optimality properties for approximating stochastic processes (random fields) with finite dimensional function spaces. Using a hierarchical finite dimensional expansion of the nominal class, a series of orthogonal nested subspaces is constructed for detecting anomalous signal components. Projection coefficients of input data in these subspaces are then used to train a Machine Learning (ML) classifier. Owing to the split of the signal into nominal and anomalous projection components, clearer separation surfaces for the classes arise. In fact, we show that with a sufficiently accurate estimation of the covariance structure of the nominal class, a sharp classification can be obtained. This is particularly advantageous for large unbalanced datasets. We demonstrate it on a number of high-dimensional datasets. This approach yields significant increases in accuracy of ML methods compared to using the same ML algorithm with the original feature data. Our tests on the Alzheimer's Disease ADNI dataset show a dramatic increase in accuracy (from 48% to 89%). Furthermore, tests using unbalanced semi-synthetic datasets created from the benchmark GCM dataset confirm increased accuracy as the dataset becomes more unbalanced.
We present an algorithm for the solution of Sylvester equations with right-hand side of low rank. The method is based on projection onto a block rational Krylov subspace, with two key contributions with respect to the state-of-the-art. First, we show how to maintain the last pole equal to infinity throughout the iteration, by means of pole reordering. This allows for a cheap evaluation of the true residual at every step. Second, we extend the convergence analysis in [Beckermann B., An error analysis for rational Galerkin projection applied to the Sylvester equation, SINUM, 2011] to the block case. This extension allows us to link the convergence with the problem of minimizing the norm of a small rational matrix over the spectra or field-of-values of the involved matrices. This is in contrast with the non-block case, where the minimum problem is scalar, instead of matrix-valued. Replacing the norm of the objective function with an easier-to-evaluate function yields several adaptive pole selection strategies, providing a theoretical analysis for known heuristics, as well as effective novel techniques.
Due to their power and ease of use, tree-based machine learning models, such as random forests and gradient-boosted tree ensembles, have become very popular. To interpret them, local feature attributions based on marginal expectations, e.g. marginal (interventional) Shapley, Owen or Banzhaf values, may be employed. Such methods are true to the model and implementation invariant, i.e. dependent only on the input-output function of the model. We contrast this with the popular TreeSHAP algorithm by presenting two (statistically similar) decision trees that compute the exact same function for which the "path-dependent" TreeSHAP yields different rankings of features, whereas the marginal Shapley values coincide. Furthermore, we discuss how the internal structure of tree-based models may be leveraged to help with computing their marginal feature attributions according to a linear game value. One important observation is that these are simple (piecewise-constant) functions with respect to a certain grid partition of the input space determined by the trained model. Another crucial observation, showcased by experiments with XGBoost, LightGBM and CatBoost libraries, is that only a portion of all features appears in a tree from the ensemble. Thus, the complexity of computing marginal Shapley (or Owen or Banzhaf) feature attributions may be reduced. This remains valid for a broader class of game values which we shall axiomatically characterize. A prime example is the case of CatBoost models where the trees are oblivious (symmetric) and the number of features in each of them is no larger than the depth. We exploit the symmetry to derive an explicit formula, with improved complexity and only in terms of the internal model parameters, for marginal Shapley (and Banzhaf and Owen) values of CatBoost models. This results in a fast, accurate algorithm for estimating these feature attributions.
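The implementation invariance of marginal Shapley values can be illustrated with a brute-force computation on a toy model. The sketch below is not the paper's algorithm (it enumerates all coalitions, so it is exponential in the number of features), and the names `marginal_shapley`, `tree_a` and `tree_b`, as well as the tiny background dataset, are hypothetical. The two "trees" split on the features in opposite orders but compute the same function, so their marginal Shapley values must agree.

```python
from itertools import combinations
from math import factorial


def marginal_shapley(f, x, background):
    """Brute-force marginal (interventional) Shapley values of f at the point x."""
    n = len(x)

    def v(coalition):
        # Marginal game: features in the coalition are fixed to x,
        # the others are averaged over the background samples.
        total = 0.0
        for b in background:
            z = [x[i] if i in coalition else b[i] for i in range(n)]
            total += f(z)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        value = 0.0
        for r in range(n):
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for subset in combinations(others, r):
                s = set(subset)
                value += weight * (v(s | {i}) - v(s))
        phi.append(value)
    return phi


def tree_a(z):
    # Splits on feature 0 first, then on feature 1.
    if z[0] > 0.5:
        return 1.0 if z[1] > 0.5 else 0.0
    return 0.0


def tree_b(z):
    # Splits on feature 1 first, then on feature 0; same input-output function.
    if z[1] > 0.5:
        return 1.0 if z[0] > 0.5 else 0.0
    return 0.0


background = [[0, 0], [0, 1], [1, 0], [1, 1]]
x = [1, 1]
phi_a = marginal_shapley(tree_a, x, background)
phi_b = marginal_shapley(tree_b, x, background)
```

Here both features receive the attribution 0.375, and the attributions sum to f(x) minus the background mean (1 - 0.25). A path-dependent method that reads the split order and node statistics of the tree has no such guarantee, which is the contrast the abstract draws.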
Partite, $3$-uniform hypergraphs are $3$-uniform hypergraphs in which each hyperedge contains exactly one point from each of the $3$ disjoint vertex classes. We consider the degree sequence problem of partite, $3$-uniform hypergraphs, that is, to decide if such a hypergraph with prescribed degree sequences exists. We prove that this decision problem is NP-complete in general, and give a polynomial running time algorithm for third almost-regular degree sequences, that is, when each degree in one of the vertex classes is $k$ or $k-1$ for some fixed $k$, and there is no restriction for the other two vertex classes. We also consider the sampling problem, that is, to uniformly sample partite, $3$-uniform hypergraphs with prescribed degree sequences. We propose a Parallel Tempering method, where the hypothetical energy of the hypergraphs measures the deviation from the prescribed degree sequence. The method has been implemented and tested on synthetic and real data. It can also be applied for $\chi^2$ testing of contingency tables. We have shown that this hypergraph-based $\chi^2$ test is more sensitive than the standard $\chi^2$ test. The extra sensitivity is especially advantageous on small data sets, where the proposed Parallel Tempering method shows promising performance.
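As an illustration of the quantities involved, the following Python sketch computes the three degree sequences of a partite, $3$-uniform hypergraph and a possible energy for a tempering scheme. The abstract only says that the energy measures the deviation from the prescribed degree sequence; the specific choice below (sum of absolute deviations) and the function names are assumptions made for this example.

```python
def degree_sequences(hyperedges, class_sizes):
    """Degrees of all vertices; hyperedge (a, b, c) uses vertex a of class 0, b of class 1, c of class 2."""
    degrees = [[0] * size for size in class_sizes]
    for edge in set(hyperedges):          # ignore repeated hyperedges
        for cls, vertex in enumerate(edge):
            degrees[cls][vertex] += 1
    return degrees


def energy(hyperedges, target):
    """Assumed energy: total absolute deviation from the prescribed degree sequences."""
    sizes = [len(seq) for seq in target]
    degrees = degree_sequences(hyperedges, sizes)
    return sum(
        abs(d - t)
        for deg_cls, tgt_cls in zip(degrees, target)
        for d, t in zip(deg_cls, tgt_cls)
    )


# Hypothetical example with two vertices in each of the three classes.
edges = [(0, 0, 0), (1, 1, 0), (0, 1, 1)]
realized = degree_sequences(edges, [2, 2, 2])
```

A hypergraph realizes the prescribed sequences exactly when this energy is zero, so a sampler can move through non-realizing states at high temperature and keep only zero-energy states at the coldest level.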
We present new Dirichlet-Neumann and Neumann-Dirichlet algorithms with a time domain decomposition applied to unconstrained parabolic optimal control problems. After a spatial semi-discretization, we use the Lagrange multiplier approach to derive a coupled forward-backward optimality system, which can then be solved using a time domain decomposition. Due to the forward-backward structure of the optimality system, three variants can be found for the Dirichlet-Neumann and Neumann-Dirichlet algorithms. We analyze their convergence behavior and determine the optimal relaxation parameter for each algorithm. Our analysis reveals that the most natural algorithms are actually only good smoothers, and there are better choices which lead to efficient solvers. We illustrate our analysis with numerical experiments.
Partial differential equations (PDEs) have become an essential tool for modeling complex physical systems. Such equations are typically solved numerically via mesh-based methods, such as finite element methods, the outputs of which consist of the solutions on a set of mesh nodes over the spatial domain. However, these simulations are often too costly to allow a thorough survey of the input space. In this paper, we propose an efficient emulator that simultaneously predicts the outputs on a set of mesh nodes, with theoretical justification of its uncertainty quantification. The novelty of the proposed method lies in the incorporation of the mesh node coordinates into the statistical model. In particular, the proposed method segments the mesh nodes into multiple clusters via a Dirichlet process prior and fits a Gaussian process model in each. Most importantly, by revealing the underlying clustering structures, the proposed method can extract valuable flow physics present in the systems that can be used to guide further investigations. Real examples are demonstrated to show that our proposed method has smaller prediction errors than its main competitors, with competitive computation time, and identifies interesting clusters of mesh nodes that exhibit coherent input-output relationships and possess physical significance, such as satisfying boundary conditions. An R package for the proposed methodology is provided in an open repository.
We introduce a novel algebraic approach for attacking the McEliece cryptosystem. It consists in introducing a subspace of matrices representing quadratic forms. Those are associated with quadratic relationships for the component-wise product in the dual of the code used in the cryptosystem. Depending on the characteristic of the code field, this space of matrices consists only of symmetric matrices or skew-symmetric matrices. This matrix space is shown to contain unusually low-rank matrices (rank $2$ or $3$ depending on the characteristic) which reveal the secret polynomial structure of the code. Finding such matrices can then be used to recover the secret key of the scheme. We devise a dedicated approach in characteristic $2$ consisting in using a Gr\"obner basis modeling that a skew-symmetric matrix is of rank $2$. This allows us to analyze the complexity of solving the corresponding algebraic system with Gr\"obner basis techniques. This computation behaves differently when applied to the skew-symmetric matrix space associated with a random code rather than with a Goppa or an alternant code. This gives a distinguisher of the latter code family. We give a bound on its complexity which turns out to interpolate nicely between polynomial and exponential depending on the code parameters. A distinguisher for alternant/Goppa codes was already known [FGO+11]. It is of polynomial complexity but works only in a narrow parameter regime. This new distinguisher is also polynomial for the parameter regime necessary for [FGO+11] but, unlike the previous one, is able to operate for virtually all code parameters relevant to cryptography. Moreover, we use this matrix space to find a polynomial time attack on the McEliece cryptosystem provided that the Goppa code is distinguishable by the method of [FGO+11] and its degree is less than $q-1$, where $q$ is the alphabet size of the code.
In this article, we study the inconsistency of systems of $\min-\rightarrow$ fuzzy relational equations. We give analytical formulas for computing the Chebyshev distances $\nabla = \inf_{d \in \mathcal{D}} \Vert \beta - d \Vert$ associated to systems of $\min-\rightarrow$ fuzzy relational equations of the form $\Gamma \Box_{\rightarrow}^{\min} x = \beta$, where $\rightarrow$ is a residual implicator among the G\"odel implication $\rightarrow_G$, the Goguen implication $\rightarrow_{GG}$ or Lukasiewicz's implication $\rightarrow_L$ and $\mathcal{D}$ is the set of second members of consistent systems defined with the same matrix $\Gamma$. The main preliminary result that allows us to obtain these formulas is that the Chebyshev distance $\nabla$ is the lower bound of the solutions of a vector inequality, whatever the residual implicator used. Finally, we show that, in the case of the $\min-\rightarrow_{G}$ system, the Chebyshev distance $\nabla$ may be an infimum, while it is always a minimum for $\min-\rightarrow_{GG}$ and $\min-\rightarrow_{L}$ systems.
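The three residual implicators and the $\min-\rightarrow$ product can be written out in a few lines of Python. This sketch assumes the usual definitions on $[0,1]$ and assumes that the $i$-th component of $\Gamma \Box_{\rightarrow}^{\min} x$ is $\min_j (\gamma_{ij} \rightarrow x_j)$; the function names are hypothetical. It only evaluates the left-hand side of a system and a Chebyshev distance between two vectors, and does not implement the analytical formulas for $\nabla$ announced in the abstract.

```python
def godel(a, b):
    """Goedel implication: 1 if a <= b, otherwise b."""
    return 1.0 if a <= b else b


def goguen(a, b):
    """Goguen implication: 1 if a <= b, otherwise b / a."""
    return 1.0 if a <= b else b / a


def lukasiewicz(a, b):
    """Lukasiewicz implication: min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)


def min_arrow_product(gamma, x, implication):
    """Assumed min-> product: component i is the minimum over j of gamma[i][j] -> x[j]."""
    return [min(implication(g, xj) for g, xj in zip(row, x)) for row in gamma]


def chebyshev_distance(u, v):
    """Chebyshev (sup-norm) distance between two vectors of equal length."""
    return max(abs(a - b) for a, b in zip(u, v))


# Hypothetical data, chosen so that all values are exact in floating point.
gamma = [[0.5, 0.25], [1.0, 0.5]]
x = [0.25, 0.5]
beta_g = min_arrow_product(gamma, x, godel)
beta_gg = min_arrow_product(gamma, x, goguen)
beta_l = min_arrow_product(gamma, x, lukasiewicz)
```

By construction each of these right-hand sides belongs to the set $\mathcal{D}$ for the corresponding implicator, so for any other vector $\beta$ the value `chebyshev_distance(beta, beta_g)` is an upper bound on $\nabla$ for the Gödel system with this $\Gamma$.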
In recent years, object detection has experienced impressive progress. Despite these improvements, there is still a significant gap in the performance between the detection of small and large objects. We analyze the current state-of-the-art model, Mask-RCNN, on a challenging dataset, MS COCO. We show that the overlap between small ground-truth objects and the predicted anchors is much lower than the expected IoU threshold. We conjecture this is due to two factors: (1) only a few images contain small objects, and (2) small objects do not appear enough even within each image containing them. We thus propose to oversample those images with small objects and augment each of those images by copy-pasting small objects many times. This allows us to trade off the quality of the detector on large objects with that on small objects. We evaluate different pasting augmentation strategies, and ultimately, we achieve 9.7\% relative improvement on the instance segmentation and 7.1\% on the object detection of small objects, compared to the current state-of-the-art method on MS COCO.
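The copy-paste idea can be sketched on a toy image stored as a list of pixel rows. This is an illustration under simplifying assumptions, not the paper's pipeline: the function names are hypothetical, the object is a rectangular patch rather than a segmentation mask, positions are drawn uniformly at random, and no attempt is made to avoid overlap with existing objects, which is one of the points on which pasting strategies can differ.

```python
import random


def paste_patch(image, patch, top, left):
    """Return a copy of image with patch written at row `top`, column `left`."""
    out = [row[:] for row in image]
    for dy, patch_row in enumerate(patch):
        for dx, value in enumerate(patch_row):
            out[top + dy][left + dx] = value
    return out


def copy_paste_augment(image, patch, copies, rng):
    """Paste `copies` copies of a small object at random positions.

    Returns the augmented image and one (left, top, width, height) box per copy,
    so that the new instances can be added to the annotations.
    """
    height, width = len(image), len(image[0])
    patch_h, patch_w = len(patch), len(patch[0])
    boxes = []
    out = image
    for _ in range(copies):
        top = rng.randrange(height - patch_h + 1)
        left = rng.randrange(width - patch_w + 1)
        out = paste_patch(out, patch, top, left)
        boxes.append((left, top, patch_w, patch_h))
    return out, boxes


# Hypothetical 4x4 background image and a 1x2 "small object".
image = [[0] * 4 for _ in range(4)]
patch = [[7, 7]]
augmented, boxes = copy_paste_augment(image, patch, copies=3, rng=random.Random(0))
```

Oversampling in the sense of the abstract would then amount to including images that contain small objects, and their augmented variants, more often in each training epoch.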