Enumeration problems aim at outputting, without repetition, the set of solutions to a given problem instance. However, outputting the entire solution set may be prohibitively expensive if it is too big. In this case, outputting a small, sufficiently diverse subset of the solutions would be preferable. This leads to the Diverse-version of the original enumeration problem, where the goal is to achieve a certain level d of diversity by selecting k solutions. In this paper, we look at the Diverse-version of the query answering problem for Conjunctive Queries and extensions thereof. That is, we study the problem of deciding whether it is possible to achieve a certain level d of diversity by selecting k answers to the given query and, in the positive case, of actually computing such k answers.
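To make the notion concrete, here is a minimal sketch (not taken from the paper) in which the diversity of a set of answers is assumed to be the sum of their pairwise Hamming distances, and a brute-force search checks whether some k answers reach diversity level d; all names and the choice of measure are illustrative assumptions.

```python
# Minimal sketch of the Diverse-version of query answering, assuming diversity
# is the sum of pairwise Hamming distances between answer tuples.
from itertools import combinations

def hamming(a, b):
    """Number of positions in which two answer tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def diversity(answers):
    """Sum of pairwise Hamming distances of a set of answer tuples."""
    return sum(hamming(a, b) for a, b in combinations(answers, 2))

def diverse_k_subset(all_answers, k, d):
    """Brute-force check: is there a k-subset of answers with diversity >= d?
    Returns such a subset if one exists, otherwise None."""
    for subset in combinations(all_answers, k):
        if diversity(subset) >= d:
            return subset
    return None

# Toy example: answers to a hypothetical conjunctive query over two variables.
answers = [("a", 1), ("a", 2), ("b", 1), ("b", 2)]
print(diverse_k_subset(answers, k=3, d=4))
```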
Constraint Programming (CP) has been successfully used to model and solve complex combinatorial problems. However, modeling is often not trivial and requires expertise, which is a bottleneck to wider adoption. In Constraint Acquisition (CA), the goal is to assist the user by automatically learning the model. In (inter)active CA, this is done by interactively posing queries to the user, e.g., asking whether a partial solution satisfies their (unspecified) constraints or not. While interactive CA methods learn the constraints, the learning is related to symbolic concept learning, as the goal is to learn an exact representation. However, a large number of queries is still required to learn the model, which is a major limitation. In this paper, we aim to alleviate this limitation by tightening the connection between CA and Machine Learning (ML), exploiting statistical ML methods in interactive CA for the first time. We propose to use probabilistic classification models to guide interactive CA to generate more promising queries. We discuss how to train classifiers to predict whether a candidate expression from the bias is a constraint of the problem or not, using both relation-based and scope-based features. We then show how the predictions can be used in all layers of interactive CA: query generation, scope finding, and lowest-level constraint finding. We experimentally evaluate our proposed methods using different classifiers and show that they greatly outperform the state of the art, decreasing the number of queries needed to converge by up to 72%.
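As an illustration of the idea (not the paper's exact pipeline), the following sketch trains a probabilistic classifier on labeled candidate expressions and uses its predicted probabilities to rank the remaining bias during query generation; the feature set, the choice of a random forest, and all names are assumptions.

```python
# Illustrative sketch: predict whether a candidate expression from the bias
# is a constraint of the target problem, using relation- and scope-based features.
from sklearn.ensemble import RandomForestClassifier

def featurize(candidate):
    """Turn a candidate constraint into a feature vector:
    one-hot of the relation plus simple scope-based features."""
    relations = ["!=", "==", "<", ">"]
    rel_onehot = [1.0 if candidate["relation"] == r else 0.0 for r in relations]
    scope = candidate["scope"]
    scope_feats = [len(scope), abs(scope[0] - scope[1])]  # arity, variable distance
    return rel_onehot + scope_feats

# Labeled candidates collected so far (1 = learned constraint, 0 = pruned).
candidates = [
    {"relation": "!=", "scope": (0, 1), "label": 1},
    {"relation": "!=", "scope": (1, 2), "label": 1},
    {"relation": "<",  "scope": (0, 5), "label": 0},
    {"relation": "==", "scope": (2, 3), "label": 0},
]
X = [featurize(c) for c in candidates]
y = [c["label"] for c in candidates]
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Rank the remaining bias by predicted probability of being a constraint,
# so query generation can focus on the most promising expressions first.
bias = [{"relation": "!=", "scope": (2, 3)}, {"relation": "<", "scope": (1, 4)}]
probs = clf.predict_proba([featurize(c) for c in bias])[:, 1]
print(sorted(zip(probs, range(len(bias))), reverse=True))
```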
Many applications, such as optimization, uncertainty quantification and inverse problems, require repeatedly performing simulations of large-dimensional physical systems for different choices of parameters. This can be prohibitively expensive. In order to save computational cost, one can construct surrogate models by expressing the system in a low-dimensional basis, obtained from training data. This is referred to as model reduction. Past investigations have shown that, when performing model reduction of Hamiltonian systems, it is crucial to preserve the symplectic structure associated with the system in order to ensure long-term numerical stability. Up to this point, structure-preserving reductions have largely been limited to linear transformations. We propose a new neural network architecture in the spirit of autoencoders, which are established tools for dimension reduction and feature extraction in data science, to obtain more general mappings. In order to train the network, a non-standard gradient descent approach is applied that leverages the differential-geometric structure emerging from the network design. The new architecture is shown to significantly outperform existing designs in accuracy.
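For context, and with notation that is assumed here rather than taken from the abstract, the linear structure-preserving setting requires the reduction map $A$ to be symplectic, which guarantees that the reduced system is again Hamiltonian; a natural nonlinear generalization, presumably the one targeted by such autoencoder designs, asks the decoder $\varphi$ to satisfy the analogous Jacobian condition pointwise:
\[
\dot{x} = \mathbb{J}_{2N}\,\nabla_x H(x), \qquad
A^\top \mathbb{J}_{2N} A = \mathbb{J}_{2n}
\;\Longrightarrow\;
\dot{z} = \mathbb{J}_{2n}\,\nabla_z \big(H \circ A\big)(z),
\qquad
\Big(\tfrac{\partial \varphi}{\partial z}\Big)^{\!\top} \mathbb{J}_{2N}\,
\tfrac{\partial \varphi}{\partial z} = \mathbb{J}_{2n}.
\]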
A smooth T-surface can be thought of as a generalization of a surface of revolution in which the axis of rotation is not fixed at one point but rather traces a smooth path on the base plane. Furthermore, the action by which the aforementioned surface is obtained does not need to be a mere rotation but can be any ``suitable'' planar equiform transformation applied to the points of a certain smooth profile curve. In analogy to the smooth setting, if the axis footpoints sweep a polyline on the base plane and if the profile curve is chosen to be discrete, then a T-hedron (discrete T-surface) with trapezoidal faces is obtained. The goal of this article is to reconstruct a T-hedron from an already given point cloud of a T-surface. In doing so, a kinematic approach is taken: the algorithm first tries to find the axis direction associated with the point cloud. Then it determines the polygonal path along which the axis footpoint moves. Finally, by properly cutting the point cloud with the planes passing through the axis and its footpoints, it reconstructs the surface. The presented method is demonstrated by means of examples. From an applied point of view, the straightforward generation of these surfaces makes them well suited to building and design processes. In fact, one can find many built objects belonging to the sub-classes of T-surfaces, such as \emph{surfaces of revolution} and \emph{moulding surfaces}. Furthermore, the planarity of the faces of the discrete version paves the way for steel/glass construction in industry. Finally, these surfaces are also suitable for transformable designs as they allow an isometric deformation.
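The following heavily simplified sketch illustrates only the final cutting step, assuming the axis direction and the footpoint polyline have already been recovered; the use of NumPy, the tolerance, and all names are assumptions made purely for illustration.

```python
# Simplified sketch of the cutting step: each cutting plane contains the axis
# direction and passes through a footpoint, and points close to a plane form
# one discrete profile polyline of the reconstructed T-hedron.
import numpy as np

def profile_slices(points, axis_dir, footpoints, normals, tol=1e-2):
    """Group point-cloud points into planar slices through the axis."""
    slices = []
    for f, n in zip(footpoints, normals):
        assert abs(n @ axis_dir) < 1e-9      # plane must contain the axis direction
        dist = np.abs((points - f) @ n)      # unsigned distance to the plane
        slices.append(points[dist < tol])    # points lying on this profile plane
    return slices

# Toy usage with a vertical axis and two cutting planes.
pts = np.random.rand(1000, 3)
axis_dir = np.array([0.0, 0.0, 1.0])
feet = [np.array([0.5, 0.5, 0.0]), np.array([0.6, 0.5, 0.0])]
norms = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
print([len(s) for s in profile_slices(pts, axis_dir, feet, norms)])
```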
Current quantum computers can only solve optimization problems of a very limited size. For larger problems, decomposition methods are required, in which the original problem is broken down into several smaller sub-problems. These are then solved on the quantum computer, and their solutions are merged into a final solution for the original problem. Often, these decomposition methods do not take the specific problem structure into account. In this paper, we present a tailored method using a divide-and-conquer strategy to solve the number partitioning problem (NPP) with a large number of variables. The idea is to perform a specialized decomposition into smaller NPPs, which can be solved on a quantum computer, and then to recombine the results into another small auxiliary NPP. Solving this auxiliary problem yields an approximate solution of the original larger problem. We experimentally verify that our method makes it possible to solve NPPs with over a thousand variables using a D-Wave quantum annealer.
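One plausible reading of the decomposition is sketched below, purely classically and with brute force standing in for the quantum annealer: solve each block as a small NPP, collect the signed differences the blocks achieve, and solve an auxiliary NPP over these residuals to decide which block assignments to flip; the paper's actual recombination rule may differ.

```python
# Hedged classical sketch of a divide-and-conquer strategy for the NPP.
from itertools import product

def solve_npp_brute(nums):
    """Return signs s_i in {+1,-1} minimizing |sum s_i * nums_i| (brute force,
    standing in for the quantum solver)."""
    best = None
    for signs in product((1, -1), repeat=len(nums)):
        diff = abs(sum(s * x for s, x in zip(signs, nums)))
        if best is None or diff < best[0]:
            best = (diff, signs)
    return best[1]

def divide_and_conquer_npp(nums, block_size=8):
    blocks = [nums[i:i + block_size] for i in range(0, len(nums), block_size)]
    block_signs = [solve_npp_brute(b) for b in blocks]
    # Signed difference achieved by each sub-solution.
    residuals = [sum(s * x for s, x in zip(sg, b)) for sg, b in zip(block_signs, blocks)]
    # Auxiliary NPP over the residuals: decide which blocks to flip entirely.
    flip = solve_npp_brute(residuals)
    signs = []
    for f, sg in zip(flip, block_signs):
        signs.extend(f * s for s in sg)
    return signs

nums = [4, 7, 13, 2, 9, 5, 11, 3, 8, 6, 10, 1]
signs = divide_and_conquer_npp(nums, block_size=4)
print(abs(sum(s * x for s, x in zip(signs, nums))))  # approximate partition difference
```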
Proper scoring rules evaluate the quality of probabilistic predictions, playing an essential role in the pursuit of accurate and well-calibrated models. Every proper score decomposes into two fundamental components -- proper calibration error and refinement -- via a Bregman divergence. While uncertainty calibration has gained significant attention, the current literature lacks a general estimator for these quantities with known statistical properties. To address this gap, we propose a method that allows consistent and asymptotically unbiased estimation of all proper calibration errors and refinement terms. In particular, we introduce the Kullback--Leibler calibration error, induced by the commonly used cross-entropy loss. As part of our results, we prove a relation between refinement and f-divergences, which implies information monotonicity in neural networks, regardless of which proper scoring rule is optimized. Our experiments empirically validate the claimed properties of the proposed estimator and suggest that the choice of a post-hoc calibration method should be determined by the particular calibration error of interest.
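With notation assumed here rather than taken from the abstract, the decomposition underlying these quantities can be written in terms of the divergence $d$ and generalized entropy $e$ associated with the proper score $S$, where $C = \mathbb{E}[\,Y \mid f(X)\,]$ is the calibrated counterpart of the prediction $f(X)$:
\[
\mathbb{E}\big[S(f(X),Y)\big]
\;=\;
\underbrace{\mathbb{E}\big[d\big(f(X),\,C\big)\big]}_{\text{proper calibration error}}
\;+\;
\underbrace{\mathbb{E}\big[e(C)\big]}_{\text{refinement}}.
\]
For the cross-entropy (log) score, $d$ is the Kullback--Leibler divergence and $e$ the Shannon entropy, which yields the Kullback--Leibler calibration error mentioned above.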
Algorithmic predictions are increasingly used to inform the allocation of goods and interventions in the public sphere. In these domains, predictions serve as a means to an end: they provide stakeholders with insights into the likelihood of future events in order to improve the quality of decision making and enhance social welfare. However, if maximizing welfare is the ultimate goal, prediction is only a small piece of the puzzle. There are various other policy levers a social planner might pursue in order to improve bottom-line outcomes, such as expanding access to available goods or increasing the effect sizes of interventions. Given this broad range of design decisions, a basic question to ask is: What is the relative value of prediction in algorithmic decision making? How do the improvements in welfare arising from better predictions compare to those of other policy levers? The goal of our work is to initiate the formal study of these questions. Our main results are theoretical in nature. We identify simple, sharp conditions determining the relative value of prediction vis-\`a-vis expanding access, within several statistical models that are popular amongst quantitative social scientists. Furthermore, we illustrate how these theoretical insights may be used to guide the design of algorithmic decision-making systems in practice.
Maximum likelihood estimation (MLE) is a fundamental problem in statistics. Characteristics of the MLE problem for algebraic statistical models are reflected in the geometry of the likelihood correspondence, a variety that ties together data and their maximum likelihood estimators. We construct the ideal of the likelihood correspondence for the large class of toric models and find a Gr\"obner basis in the case of complete and joint independence models arising from multi-way contingency tables. These results provide insight into their properties and offer faster computational strategies for solving the MLE problem.
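For orientation, and using standard notation that is assumed here, the likelihood correspondence of a model $X \subseteq \mathbb{P}^n$ pairs each data vector $u$ with the critical points of the corresponding log-likelihood on $X$:
\[
\mathcal{L}_X \;=\; \overline{\big\{(p,u) \in X \times \mathbb{P}^n \;:\; p \ \text{is a critical point of}\ \ell_u(p)=\textstyle\sum_i u_i \log p_i \ \text{on the regular locus of}\ X\big\}}.
\]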
As artificial intelligence (AI) models continue to scale up, they are becoming more capable and integrated into various forms of decision-making systems. For models involved in moral decision-making, also known as artificial moral agents (AMA), interpretability provides a way to trust and understand the agent's internal reasoning mechanisms for effective use and error correction. In this paper, we provide an overview of this rapidly-evolving sub-field of AI interpretability, introduce the concept of the Minimum Level of Interpretability (MLI) and recommend an MLI for various types of agents, to aid their safe deployment in real-world settings.
Self-supervised learning (SSL), dubbed the dark matter of intelligence, is a promising path to advance machine learning. Yet, much like cooking, training SSL methods is a delicate art with a high barrier to entry. While many components are familiar, successfully training an SSL method involves a dizzying set of choices, from the pretext tasks to the training hyper-parameters. Our goal is to lower the barrier to entry into SSL research by laying out the foundations and the latest SSL recipes in the style of a cookbook. We hope to empower the curious researcher to navigate the terrain of methods, understand the role of the various knobs, and gain the know-how required to explore how delicious SSL can be.
We consider the problem of explaining the predictions of graph neural networks (GNNs), which are otherwise treated as black boxes. Existing methods invariably focus on explaining the importance of graph nodes or edges but ignore the substructures of graphs, which are more intuitive and human-intelligible. In this work, we propose a novel method, known as SubgraphX, to explain GNNs by identifying important subgraphs. Given a trained GNN model and an input graph, SubgraphX explains the model's predictions by efficiently exploring different subgraphs with Monte Carlo tree search. To make the tree search more effective, we propose to use Shapley values as a measure of subgraph importance, which can also capture interactions among different subgraphs. To expedite computations, we propose efficient approximation schemes to compute Shapley values for graph data. Our work represents the first attempt to explain GNNs by explicitly and directly identifying subgraphs. Experimental results show that SubgraphX achieves significantly improved explanations while keeping computations at a reasonable level.
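A minimal Monte Carlo sketch of the Shapley-style scoring idea (not the exact SubgraphX procedure): treat the candidate subgraph as a single player, sample coalitions from the remaining nodes, and average the subgraph's marginal contribution to the model score. The neighborhood restriction used to prune players is omitted, and model_score is a hypothetical stand-in for the GNN prediction on a node subset.

```python
# Monte Carlo estimate of a Shapley-style importance score for a subgraph.
import random

def shapley_subgraph(model_score, subgraph_nodes, other_nodes, n_samples=200):
    total = 0.0
    for _ in range(n_samples):
        k = random.randint(0, len(other_nodes))        # coalition size
        coalition = set(random.sample(other_nodes, k)) # coalition without the subgraph
        with_sub = model_score(coalition | set(subgraph_nodes))
        without_sub = model_score(coalition)
        total += with_sub - without_sub                # marginal contribution
    return total / n_samples

# Toy usage: a fake "model" that rewards covering nodes {0, 1, 2}.
def toy_score(nodes):
    return len(nodes & {0, 1, 2}) / 3.0

print(shapley_subgraph(toy_score, subgraph_nodes=[0, 1], other_nodes=[2, 3, 4, 5]))
```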