The branch-and-bound algorithm based on decision diagrams introduced by Bergman et al. in 2016 is a framework for solving discrete optimization problems with a dynamic programming formulation. It works by compiling a series of bounded-width decision diagrams that can provide lower and upper bounds for any given subproblem. Eventually, every part of the search space will be either explored or pruned by the algorithm, thus proving optimality. This paper presents new ingredients to speed up the search by exploiting the structure of dynamic programming models. The key idea is to prevent the repeated exploration of nodes corresponding to the same dynamic programming states by storing and querying thresholds in a data structure called the Barrier. These thresholds are based on dominance relations between previously found partial solutions and can be further strengthened by integrating the filtering techniques introduced by Gillard et al. in 2021. Computational experiments show that the pruning brought by the Barrier significantly reduces the number of nodes expanded by the algorithm. As a result, more benchmark instances of difficult optimization problems are solved in less time while using narrower decision diagrams.
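As an illustration of the Barrier idea, here is a minimal Python sketch under the assumptions of a maximization problem with hashable dynamic programming states; the names and the surrounding branch-and-bound loop are hypothetical and not the authors' implementation.

```python
class Barrier:
    """Maps each dynamic programming state to a threshold: the best objective
    value of a partial solution with which that state has already been expanded."""

    def __init__(self):
        self.thresholds = {}  # state -> threshold value

    def must_explore(self, state, value):
        """A node needs exploring only if the best known path reaching its state
        exceeds the stored threshold (a dominance check)."""
        return value > self.thresholds.get(state, float("-inf"))

    def update(self, state, value):
        """Record that this state has been expanded with the given value."""
        if value > self.thresholds.get(state, float("-inf")):
            self.thresholds[state] = value


# Sketch of how the pruning test would sit inside the branch-and-bound loop:
#   if not barrier.must_explore(node.state, node.value):
#       continue  # a dominating partial solution already reached this state
#   barrier.update(node.state, node.value)
```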
The information diagram and the I-measure are useful mnemonics in which random variables are treated as sets, and entropy and mutual information are treated as a signed measure over those sets. Although the I-measure has been successful in machine proofs of entropy inequalities, the theoretical underpinning of the ``random variables as sets'' analogy remained unclear until the recent works on mappings from random variables to sets by Ellerman (recovering order-$2$ Tsallis entropy over a general probability space) and by Down and Mediano (recovering Shannon entropy over a discrete probability space). We generalize these constructions by designing a mapping that recovers the Shannon entropy (and the information density) over a general probability space. Moreover, it has an intuitive interpretation based on arrival times in a Poisson process, allowing us to understand the union, intersection and difference between (the sets corresponding to) random variables and events. Cross entropy, KL divergence, and conditional entropy given an event can all be obtained as set intersections. We propose a generalization of the information diagram that also includes events, and demonstrate its usage with a diagrammatic proof of Fano's inequality.
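For reference, the classical I-measure correspondence that such set-theoretic constructions aim to justify (these identities are standard, not the new mapping itself), writing $\tilde{X}$ for the set associated with a random variable $X$ and $\mu$ for the signed measure, is
\begin{align*}
H(X) &= \mu(\tilde{X}), & H(X,Y) &= \mu(\tilde{X} \cup \tilde{Y}),\\
H(X \mid Y) &= \mu(\tilde{X} \setminus \tilde{Y}), & I(X;Y) &= \mu(\tilde{X} \cap \tilde{Y}),
\end{align*}
so that, for example, $I(X;Y) = H(X) + H(Y) - H(X,Y)$ is simply the inclusion-exclusion identity $\mu(\tilde{X} \cap \tilde{Y}) = \mu(\tilde{X}) + \mu(\tilde{Y}) - \mu(\tilde{X} \cup \tilde{Y})$.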
We study the Feedback Vertex Set and the Vertex Cover problems in a natural variant of the classical online model that allows for delayed decisions and reservations. Both problems can be characterized by an obstruction set of subgraphs that the online graph needs to avoid. For the Vertex Cover problem, the obstruction set consists of a single edge (i.e., the graph of two adjacent vertices), while for the Feedback Vertex Set problem, the obstruction set contains all cycles. In the delayed-decision model, an algorithm needs to maintain a valid partial solution after every request, which allows it to postpone decisions until the current partial solution is no longer valid for the current request. The reservation model additionally grants an online algorithm the option to pay a so-called reservation cost for any given element in order to delay the decision of adding or rejecting it until the end of the instance. For the Feedback Vertex Set problem, we first analyze the variant with only delayed decisions, proving a lower bound of $4$ and an upper bound of $5$ on the competitive ratio. We then turn to the variant with both delayed decisions and reservations. We show that bounds on the competitive ratio of a problem with delayed decisions imply lower and upper bounds for the same problem when the option of reservations is added. This observation allows us to give a lower bound of $\min{\{1+3\alpha,4\}}$ and an upper bound of $\min{\{1+5\alpha,5\}}$ for the Feedback Vertex Set problem. Finally, we show that the online Vertex Cover problem, when both delayed decisions and reservations are allowed, is $\min{\{1+2\alpha, 2\}}$-competitive, where $\alpha \in \mathbb{R}_{\geq 0}$ is the reservation cost per reserved vertex.
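A hedged sketch (not the paper's algorithm) of how a $\min{\{1+2\alpha,2\}}$-style guarantee for online Vertex Cover can arise, assuming that an edge counts as provisionally handled while one of its endpoints is reserved and that an optimal cover of the final graph can be bought when the instance ends:

```python
def online_vc_with_reservations(edge_stream, alpha, min_vertex_cover):
    """edge_stream: iterable of edges (u, v) revealed online.
    min_vertex_cover: offline oracle returning an optimal cover of a set of edges
    (competitive analysis ignores its computational cost)."""
    taken, reserved, seen = set(), set(), []
    cost = 0.0
    for u, v in edge_stream:
        seen.append((u, v))
        if u in taken or v in taken:
            continue                          # edge already covered
        if alpha >= 0.5:
            taken |= {u, v}                   # classical move: take both endpoints (2-competitive)
            cost += 2
        elif u not in reserved and v not in reserved:
            reserved |= {u, v}                # reserved edges form a matching, so at most
            cost += 2 * alpha                 # 2*OPT vertices are ever reserved
    if alpha < 0.5:
        cover = min_vertex_cover(seen)        # buy an optimal cover at the end
        taken |= set(cover)
        cost += len(cover)                    # total cost <= (1 + 2*alpha) * OPT
    return taken, cost
```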
Entropic risk (ERisk) is an established risk measure in finance, quantifying risk through an exponential re-weighting of rewards. We study ERisk for the first time in the context of turn-based stochastic games with the total reward objective. This gives rise to an objective function that demands the control of systems in a risk-averse manner. We show that the resulting games are determined and, in particular, admit optimal memoryless deterministic strategies. This contrasts with risk measures previously considered in the special case of Markov decision processes, which require randomization and/or memory. We provide several results on the decidability and the computational complexity of the threshold problem, i.e., whether the optimal value of ERisk exceeds a given threshold. In the most general case, the problem is decidable subject to Schanuel's conjecture. If all inputs are rational, the resulting threshold problem can be solved using algebraic numbers, leading to decidability via a polynomial-time reduction to the existential theory of the reals. Further restrictions on the encoding of the input allow the threshold problem to be solved in NP$\cap$coNP. Finally, we provide an approximation algorithm for the optimal value of ERisk.
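For concreteness, one common convention for the entropic risk of a total reward $R$ with risk-aversion parameter $\gamma > 0$ (whether the paper uses exactly this sign convention is an assumption here) is
\[
\mathrm{ERisk}_\gamma(R) \;=\; -\frac{1}{\gamma} \log \mathbb{E}\!\left[ e^{-\gamma R} \right],
\]
which tends to $\mathbb{E}[R]$ as $\gamma \to 0$ and, by a second-order expansion, behaves like $\mathbb{E}[R] - \frac{\gamma}{2}\operatorname{Var}(R)$ for small $\gamma$, so the exponential re-weighting acts as an explicit penalty on variability.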
This paper introduces a simplified variation of the PaDiM (Patch Distribution Modeling) method for anomaly detection in images, fitting a single multivariate Gaussian (MVG) distribution to the feature vectors extracted from a backbone convolutional neural network (CNN) and using their Mahalanobis distance as the anomaly score. We introduce an intermediate step in this framework by applying a whitening transformation to the feature vectors, which enables the generation of heatmaps capable of visually explaining the features learned by the MVG. The proposed technique is evaluated on the MVTec-AD dataset, and the results show the importance of visual model validation, providing insights into issues in this framework that were otherwise invisible. The visualizations generated for this paper are publicly available at https://doi.org/10.5281/zenodo.7937978.
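A minimal NumPy sketch of the scoring pipeline described above (illustrative, not the authors' code): fit a single MVG to the training feature vectors, whiten with $\Sigma^{-1/2}$, and score by Mahalanobis distance, which is simply the Euclidean norm in the whitened space; the whitened residual is also the quantity that per-feature heatmaps can visualize.

```python
import numpy as np

def fit_mvg(features, eps=1e-5):
    """features: (n_samples, d) matrix of CNN feature vectors from normal images."""
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + eps * np.eye(features.shape[1])  # regularized covariance
    eigvals, eigvecs = np.linalg.eigh(cov)
    whitening = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T              # Sigma^{-1/2}
    return mu, whitening

def anomaly_score(x, mu, whitening):
    """Mahalanobis distance of a feature vector x from the fitted MVG."""
    z = whitening @ (x - mu)   # whitened residual
    return np.linalg.norm(z)
```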
In many applications, when building linear regression models, it is important to account for the presence of outliers, i.e., corrupted input data points. Such problems can be formulated as mixed-integer optimization problems involving cubic terms, each given by the product of a binary variable and a quadratic term in the continuous variables. Existing approaches in the literature, typically relying on the linearization of the cubic terms using big-M constraints, suffer from weak relaxations and poor performance in practice. In this work we derive stronger second-order conic relaxations that do not involve big-M constraints. Our computational experiments indicate that the proposed formulations are several orders of magnitude faster than existing big-M formulations for this problem.
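One common way such a model is written (a sketch; the exact formulation in the paper may differ): with binary variables $z_i$ flagging observation $i$ as an outlier and at most $k$ outliers allowed,
\[
\min_{\beta,\; z \in \{0,1\}^n} \;\; \sum_{i=1}^{n} (1 - z_i)\,\bigl(y_i - x_i^\top \beta\bigr)^2
\quad \text{s.t.} \quad \sum_{i=1}^{n} z_i \le k,
\]
where each product $z_i\,(y_i - x_i^\top \beta)^2$ is exactly a binary variable multiplying a quadratic term in the continuous variables; a big-M approach linearizes these products through residual variables $r_i$ with constraints of the form $|y_i - x_i^\top \beta - r_i| \le M z_i$, whereas the conic relaxations proposed here avoid the constant $M$ altogether.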
Change point detection is a commonly used technique in time series analysis, capturing the dynamic nature in which many real-world processes function. With the ever-increasing troves of multivariate high-dimensional time series data, especially in neuroimaging and finance, there is a clear need for scalable and data-driven change point detection methods. Currently, change point detection methods for multivariate high-dimensional data are scarce, and even fewer are available in high-level, easily accessible software packages. To this end, we introduce the R package fabisearch, available on the Comprehensive R Archive Network (CRAN), which implements the factorized binary search (FaBiSearch) methodology. FaBiSearch is a novel statistical method for detecting change points in the network structure of multivariate high-dimensional time series; it employs non-negative matrix factorization (NMF), an unsupervised dimension reduction and clustering technique. Given the high computational cost of NMF, we implement the method in C++ and use parallelization to reduce computation time. Further, we utilize a new binary search algorithm to efficiently identify multiple change points and provide a new method for estimating the network for the data between change points. We show the functionality of the package and the practicality of the method by applying it to a neuroimaging and a finance data set. Lastly, we provide an interactive, 3-dimensional, brain-specific network visualization capability in a flexible, stand-alone function. This function can be conveniently used with any node coordinate atlas, and nodes can be color-coded according to community membership (if applicable). The output is an elegantly displayed network laid over a cortical surface, which can be rotated in 3-dimensional space.
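A conceptual Python sketch of the refitting idea behind FaBiSearch (this is not the fabisearch R interface, and scikit-learn's NMF stands in for the package's C++ implementation): a candidate split is supported when separate factorizations of the two segments reconstruct the data better than a single factorization of the whole block.

```python
import numpy as np
from sklearn.decomposition import NMF

def nmf_loss(X, rank, seed=0):
    """Frobenius reconstruction error of a rank-`rank` NMF fit; X must be non-negative."""
    model = NMF(n_components=rank, init="random", random_state=seed, max_iter=500)
    model.fit(X)
    return model.reconstruction_err_

def split_gain(X, t, rank=3):
    """Reduction in reconstruction error from splitting the (time x node) matrix X at row t.
    A large positive gain suggests a change in network structure near t."""
    return nmf_loss(X, rank) - (nmf_loss(X[:t], rank) + nmf_loss(X[t:], rank))
```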
Covariate-adjusted randomization (CAR) can reduce the risk of covariate imbalance and, when accounted for in the analysis, increase the power of a trial. Despite advances in CAR, stratified randomization remains the most common CAR method. Matched Randomization (MR) randomizes treatment assignment within optimally identified matched pairs based on covariates and a distance matrix. When participants enroll sequentially, Sequentially Matched Randomization (SMR) randomizes within matches found "on-the-fly" to meet a pre-specified matching threshold. However, pre-specifying the ideal threshold can be challenging, and SMR yields less optimal matches than MR. We extend SMR to allow multiple participants to be randomized simultaneously, to use a dynamic threshold, and to allow matches to break and rematch if a better match enrolls later (Sequential Rematched Randomization; SRR). In simplified settings and a real-world application, we assess whether these extensions improve covariate balance, estimator/study efficiency, and optimality of matches. We also investigate whether adjusting for more covariates can be detrimental to covariate balance and efficiency, as is the case with traditional stratified randomization. As secondary objectives, we use the case study to assess how SMR schemes compare side-by-side with common and related CAR schemes, and whether adjusting for covariates in the design can be as powerful as adjusting for covariates in a parametric model. We find that each SMR extension, individually and collectively, improves covariate balance, estimator efficiency, study power, and quality of matches. We provide a case study in which CAR schemes with randomization-based inference can be as powerful as, or more powerful than, non-CAR schemes with parametric adjustment for covariates.
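A simplified Python sketch of the sequential-matching step (illustrative only: Euclidean distance stands in for the study's distance matrix, and the dynamic threshold, simultaneous randomization, and rematching of SRR are omitted):

```python
import random
import numpy as np

def try_match(new_id, new_covariates, waiting_pool, threshold, rng=random):
    """waiting_pool: list of (participant_id, covariate_vector) pairs awaiting a match.
    Returns (assignment dict or None, updated waiting_pool)."""
    if not waiting_pool:
        waiting_pool.append((new_id, new_covariates))
        return None, waiting_pool
    dists = [np.linalg.norm(np.asarray(new_covariates) - np.asarray(c)) for _, c in waiting_pool]
    j = int(np.argmin(dists))
    if dists[j] > threshold:
        waiting_pool.append((new_id, new_covariates))   # no acceptable match yet; keep waiting
        return None, waiting_pool
    partner_id, _ = waiting_pool.pop(j)
    arms = ["treatment", "control"]
    rng.shuffle(arms)                                   # randomize treatment within the matched pair
    return {new_id: arms[0], partner_id: arms[1]}, waiting_pool
```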
Magnetic polarizability tensors (MPTs) provide an economical characterisation of conducting metallic objects and can aid in the solution of metal detection inverse problems, such as scrap metal sorting, searching for unexploded ordnance in areas of former conflict, and security screening at event venues and transport hubs. Previous work has established explicit formulae for their coefficients and a rigorous mathematical theory for the characterisation they provide. To assist with the efficient computation of MPT spectral signatures of different objects, and thereby enable the construction of large dictionaries of characterisations for classification approaches, this work proposes a new, highly efficient strategy for predicting MPT coefficients. This is achieved by solving an eddy current type problem using hp-finite elements in combination with a proper orthogonal decomposition reduced order modelling (ROM) methodology, and it offers considerable computational savings over our previous approach. Furthermore, an adaptive approach is described for generating new frequency snapshots to further improve the accuracy of the ROM. To improve the resolution of highly conducting and magnetic objects, a recipe is proposed for choosing the number and thicknesses of prismatic boundary layers needed to accurately resolve the thin skin depths encountered in such problems. The paper includes a series of challenging examples to demonstrate the success of the proposed methodologies.
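A generic snapshot-based POD sketch of the kind of reduced order model employed here (illustrative; it omits the eddy current solver, the hp-finite element discretization, and the adaptive snapshot selection):

```python
import numpy as np

def pod_basis(snapshots, tol=1e-6):
    """snapshots: (n_dof, n_snapshots) complex matrix of full-order solutions at selected frequencies.
    Returns an orthonormal reduced basis retaining modes with relative singular value above tol."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, s / s[0] > tol]

def reduced_solve(A_full, b_full, basis):
    """Galerkin projection of a full-order linear system onto the POD basis,
    then lifting the reduced solution back to the full space."""
    A_r = basis.conj().T @ A_full @ basis
    b_r = basis.conj().T @ b_full
    return basis @ np.linalg.solve(A_r, b_r)
```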
In this work we propose a two-step alternative clearing method for day-ahead electricity markets. In the first step, an approximate clearing is performed using an aggregation of the bids, and based on the outcome of this problem, estimates for the clearing prices of the individual periods are derived. These assumptions about the range of the clearing prices explicitly determine the acceptance indicators for a subset of the original bids. In the second step, another round of clearing is performed to determine the acceptance indicators of the remaining bids and the market clearing prices. We show that the bid-aggregation-based method may result in a suboptimal solution or in an infeasible problem in the second step, but we also point out that these pitfalls can be avoided by using a different aggregation pattern. We therefore propose to define multiple different aggregation patterns and to use parallel computing to enhance the performance of the algorithm. We test the proposed approach on setups of various problem sizes and conclude that, with parallel computing on 4 threads, a significant gain in computational speed may be achieved, with a high success rate.
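A small illustrative sketch of the bid-fixing rule in the first step, assuming simple single-period price-quantity bids and an estimated clearing-price interval $[p_{\mathrm{lo}}, p_{\mathrm{hi}}]$ for the period (block and multi-period products, which the actual method handles, are omitted):

```python
def classify_bids(bids, p_lo, p_hi):
    """bids: list of dicts with 'side' in {'buy', 'sell'} and a limit 'price'.
    Returns (accepted, rejected, undecided); undecided bids go to the second clearing step."""
    accepted, rejected, undecided = [], [], []
    for bid in bids:
        price = bid["price"]
        if bid["side"] == "buy":
            if price >= p_hi:
                accepted.append(bid)       # pays at least the highest plausible clearing price
            elif price < p_lo:
                rejected.append(bid)
            else:
                undecided.append(bid)
        else:  # sell bid
            if price <= p_lo:
                accepted.append(bid)       # asks no more than the lowest plausible clearing price
            elif price > p_hi:
                rejected.append(bid)
            else:
                undecided.append(bid)
    return accepted, rejected, undecided
```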
In this paper, we propose a one-stage online clustering method called Contrastive Clustering (CC) which explicitly performs instance- and cluster-level contrastive learning. Specifically, for a given dataset, positive and negative instance pairs are constructed through data augmentations and then projected into a feature space. Therein, instance- and cluster-level contrastive learning are conducted in the row and column space, respectively, by maximizing the similarities of positive pairs while minimizing those of negative ones. Our key observation is that the rows of the feature matrix can be regarded as soft labels of instances, and accordingly the columns can be further regarded as cluster representations. By simultaneously optimizing the instance- and cluster-level contrastive losses, the model jointly learns representations and cluster assignments in an end-to-end manner. Extensive experimental results show that CC remarkably outperforms 17 competitive clustering methods on six challenging image benchmarks. In particular, CC achieves an NMI of 0.705 (0.431) on the CIFAR-10 (CIFAR-100) dataset, which is an improvement of up to 19\% (39\%) over the best baseline.
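A condensed PyTorch sketch of the two losses (shapes and temperature are illustrative; the full CC model also includes the backbone, two projection heads, and an entropy term that prevents degenerate cluster assignments): the instance-level loss contrasts rows of the two augmented feature matrices, and the cluster-level loss contrasts their columns, i.e., the soft cluster assignments.

```python
import torch
import torch.nn.functional as F

def nt_xent(a, b, temperature=0.5):
    """Contrast matching rows of `a` and `b` (each (n, d)) against all other rows of both views."""
    n = a.shape[0]
    z = F.normalize(torch.cat([a, b], dim=0), dim=1)               # (2n, d)
    sim = z @ z.t() / temperature                                  # pairwise cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float("-inf"))                     # exclude self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)                           # positive = same index in the other view

def contrastive_clustering_loss(z_a, z_b, p_a, p_b):
    """z_*: (batch, feat) instance projections; p_*: (batch, clusters) softmax cluster assignments."""
    instance_loss = nt_xent(z_a, z_b)          # rows treated as instance representations
    cluster_loss = nt_xent(p_a.t(), p_b.t())   # columns treated as cluster representations
    return instance_loss + cluster_loss
```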