This article introduces a general mesh intersection algorithm that exactly computes the so-called Weiler model and that uses it to implement boolean operations with arbitrary multi-operand expressions, CSG (constructive solid geometry) and some mesh repair operations. From an input polygon soup, the algorithm first computes the co-refinement, with an exact representation of the intersection points. Then, the decomposition of 3D space into volumetric regions (Weiler model) is constructed, by sorting the facets around the non-manifold intersection edges (radial sort), using specialized exact predicates. Finally, based on the input boolean expression, the triangular facets that belong to the boundary of the result are classified. This is, to our knowledge, the first algorithm that computes an exact Weiler model. To implement all the involved predicates and constructions, two geometric kernels are proposed, tested and discussed (arithmetic expansions and multi-precision floating-point). As a guiding principle,the combinatorial information shared between each step is kept as simple as possible. It is made possible by treating all the particular cases in the kernel. In particular, triangles with intersections are remeshed using the (uniquely defined) Constrained Delaunay Triangulation, with symbolic perturbations to disambiguate configurations with co-cyclic points. It makes it easy to discard the duplicated triangles that appear when remeshing overlapping facets. The method is tested and compared with previous work, on the existing "thingi10K" dataset (to test co-refinement and mesh repair) and on a new "thingiCSG" dataset made publicly available (to test the full CSG pipeline) on a variety of interesting examples featuring different types of "pathologies"
We address the problem of the best uniform approximation of a continuous function on a convex domain. The approximation is by linear combinations of a finite system of functions (not necessarily Chebyshev) under arbitrary linear constraints. By modifying the concept of alternance and of the Remez iterative procedure we present a method, which demonstrates its efficiency in numerical problems. The linear rate of convergence is proved under some favourable assumptions. A special attention is paid to systems of complex exponents, Gaussian functions, lacunar algebraic and trigonometric polynomials. Applications to signal processing, linear ODE, switching dynamical systems, and to Markov-Bernstein type inequalities are considered.
An essential problem in statistics and machine learning is the estimation of expectations involving PDFs with intractable normalizing constants. The self-normalized importance sampling (SNIS) estimator, which normalizes the IS weights, has become the standard approach due to its simplicity. However, the SNIS has been shown to exhibit high variance in challenging estimation problems, e.g, involving rare events or posterior predictive distributions in Bayesian statistics. Further, most of the state-of-the-art adaptive importance sampling (AIS) methods adapt the proposal as if the weights had not been normalized. In this paper, we propose a framework that considers the original task as estimation of a ratio of two integrals. In our new formulation, we obtain samples from a joint proposal distribution in an extended space, with two of its marginals playing the role of proposals used to estimate each integral. Importantly, the framework allows us to induce and control a dependency between both estimators. We propose a construction of the joint proposal that decomposes in two (multivariate) marginals and a coupling. This leads to a two-stage framework suitable to be integrated with existing or new AIS and/or variational inference (VI) algorithms. The marginals are adapted in the first stage, while the coupling can be chosen and adapted in the second stage. We show in several examples the benefits of the proposed methodology, including an application to Bayesian prediction with misspecified models.
We propose a Fast Fourier Transform based Periodic Interpolation Method (FFT-PIM), a flexible and computationally efficient approach for computing the scalar potential given by a superposition sum in a unit cell of an infinitely periodic array. Under the same umbrella, FFT-PIM allows computing the potential for 1D, 2D, and 3D periodicities for dynamic and static problems, including problems with and without a periodic phase shift. The computational complexity of the FFT-PIM is of $O(N \log N)$ for $N$ spatially coinciding sources and observer points. The FFT-PIM uses rapidly converging series representations of the Green's function serving as a kernel in the superposition sum. Based on these representations, the FFT-PIM splits the potential into its near-zone component, which includes a small number of images surrounding the unit cell of interest, and far-zone component, which includes the rest of an infinite number of images. The far-zone component is evaluated by projecting the non-uniform sources onto a sparse uniform grid, performing superposition sums on this sparse grid, and interpolating the potential from the uniform grid to the non-uniform observation points. The near-zone component is evaluated using an FFT-based method, which is adapted to efficiently handle non-uniform source-observer distributions within the periodic unit cell. The FFT-PIM can be used for a broad range of applications, such as periodic problems involving integral equations in computational electromagnetic and acoustic, micromagnetic solvers, and density functional theory solvers.
In the study of extremes, the presence of asymptotic independence signifies that extreme events across multiple variables are probably less likely to occur together. Although well-understood in a bivariate context, the concept remains relatively unexplored when addressing the nuances of joint occurrence of extremes in higher dimensions. In this paper, we propose a notion of mutual asymptotic independence to capture the behavior of joint extremes in dimensions larger than two and contrast it with the classical notion of (pairwise) asymptotic independence. Furthermore, we define $k$-wise asymptotic independence which lies in between pairwise and mutual asymptotic independence. The concepts are compared using examples of Archimedean, Gaussian and Marshall-Olkin copulas among others. Notably, for the popular Gaussian copula, we provide explicit conditions on the correlation matrix for mutual asymptotic independence to hold; moreover, we are able to compute exact tail orders for various tail events.
This note discusses a simple modification of cross-conformal prediction inspired by recent work on e-values. The precursor of conformal prediction developed in the 1990s by Gammerman, Vapnik, and Vovk was also based on e-values and is called conformal e-prediction in this note. Replacing e-values by p-values led to conformal prediction, which has important advantages over conformal e-prediction without obvious disadvantages. The situation with cross-conformal prediction is, however, different: whereas for cross-conformal prediction validity is only an empirical fact (and can be broken with excessive randomization), this note draws the reader's attention to the obvious fact that cross-conformal e-prediction enjoys a guaranteed property of validity.
JAX is widely used in machine learning and scientific computing, the latter of which often relies on existing high-performance code that we would ideally like to incorporate into JAX. Reimplementing the existing code in JAX is often impractical and the existing interface in JAX for binding custom code either limits the user to a single Jacobian product or requires deep knowledge of JAX and its C++ backend for general Jacobian products. With JAXbind we drastically reduce the effort required to bind custom functions implemented in other programming languages with full support for Jacobian-vector products and vector-Jacobian products to JAX. Specifically, JAXbind provides an easy-to-use Python interface for defining custom, so-called JAX primitives. Via JAXbind, any function callable from Python can be exposed as a JAX primitive. JAXbind allows a user to interface the JAX function transformation engine with custom derivatives and batching rules, enabling all JAX transformations for the custom primitive.
This research presents a comprehensive approach to predicting the duration of traffic incidents and classifying them as short-term or long-term across the Sydney Metropolitan Area. Leveraging a dataset that encompasses detailed records of traffic incidents, road network characteristics, and socio-economic indicators, we train and evaluate a variety of advanced machine learning models including Gradient Boosted Decision Trees (GBDT), Random Forest, LightGBM, and XGBoost. The models are assessed using Root Mean Square Error (RMSE) for regression tasks and F1 score for classification tasks. Our experimental results demonstrate that XGBoost and LightGBM outperform conventional models with XGBoost achieving the lowest RMSE of 33.7 for predicting incident duration and highest classification F1 score of 0.62 for a 30-minute duration threshold. For classification, the 30-minute threshold balances performance with 70.84\% short-term duration classification accuracy and 62.72\% long-term duration classification accuracy. Feature importance analysis, employing both tree split counts and SHAP values, identifies the number of affected lanes, traffic volume, and types of primary and secondary vehicles as the most influential features. The proposed methodology not only achieves high predictive accuracy but also provides stakeholders with vital insights into factors contributing to incident durations. These insights enable more informed decision-making for traffic management and response strategies. The code is available by the link: //github.com/Future-Mobility-Lab/SydneyIncidents
This paper investigates the problem of regression model generation. A model is a superposition of primitive functions. The model structure is described by a weighted colored graph. Each graph vertex corresponds to some primitive function. An edge assigns a superposition of two functions. The weight of an edge equals the probability of superposition. To generate an optimal model one has to reconstruct its structure from its graph adjacency matrix. The proposed algorithm reconstructs the~minimum spanning tree from the~weighted colored graph. This paper presents a novel solution based on the prize-collecting Steiner tree algorithm. This algorithm is compared with its alternatives.
Graph-centric artificial intelligence (graph AI) has achieved remarkable success in modeling interacting systems prevalent in nature, from dynamical systems in biology to particle physics. The increasing heterogeneity of data calls for graph neural architectures that can combine multiple inductive biases. However, combining data from various sources is challenging because appropriate inductive bias may vary by data modality. Multimodal learning methods fuse multiple data modalities while leveraging cross-modal dependencies to address this challenge. Here, we survey 140 studies in graph-centric AI and realize that diverse data types are increasingly brought together using graphs and fed into sophisticated multimodal models. These models stratify into image-, language-, and knowledge-grounded multimodal learning. We put forward an algorithmic blueprint for multimodal graph learning based on this categorization. The blueprint serves as a way to group state-of-the-art architectures that treat multimodal data by choosing appropriately four different components. This effort can pave the way for standardizing the design of sophisticated multimodal architectures for highly complex real-world problems.
Machine-learning models have demonstrated great success in learning complex patterns that enable them to make predictions about unobserved data. In addition to using models for prediction, the ability to interpret what a model has learned is receiving an increasing amount of attention. However, this increased focus has led to considerable confusion about the notion of interpretability. In particular, it is unclear how the wide array of proposed interpretation methods are related, and what common concepts can be used to evaluate them. We aim to address these concerns by defining interpretability in the context of machine learning and introducing the Predictive, Descriptive, Relevant (PDR) framework for discussing interpretations. The PDR framework provides three overarching desiderata for evaluation: predictive accuracy, descriptive accuracy and relevancy, with relevancy judged relative to a human audience. Moreover, to help manage the deluge of interpretation methods, we introduce a categorization of existing techniques into model-based and post-hoc categories, with sub-groups including sparsity, modularity and simulatability. To demonstrate how practitioners can use the PDR framework to evaluate and understand interpretations, we provide numerous real-world examples. These examples highlight the often under-appreciated role played by human audiences in discussions of interpretability. Finally, based on our framework, we discuss limitations of existing methods and directions for future work. We hope that this work will provide a common vocabulary that will make it easier for both practitioners and researchers to discuss and choose from the full range of interpretation methods.