In this paper we characterize absolutely continuous symmetric copulas with square-integrable densities. This characterization is used to construct new copula families that are perturbations of the independence copula. We conduct a full study of the mixing properties of Markov chains generated by these copula families. An extension that includes the Farlie-Gumbel-Morgenstern family of copulas is proposed. We give examples of copulas that generate non-mixing Markov chains but whose convex combinations generate $\psi$-mixing Markov chains, along with some general results on $\psi$-mixing. Spearman's correlation $\rho_S$ and Kendall's $\tau$ are provided for the constructed copula families, together with some general remarks on $\rho_S$ and $\tau$. A central limit theorem is provided for parameter estimators in one example, and a simulation study supports the derived asymptotic distributions for some examples.
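For reference, the Farlie-Gumbel-Morgenstern family mentioned above has the following standard form (a well-known formula, stated here for context rather than taken from the abstract):

```latex
C_\theta(u,v) = uv\bigl[1 + \theta(1-u)(1-v)\bigr], \qquad \theta \in [-1,1],
\quad \text{with density} \quad
c_\theta(u,v) = 1 + \theta(1-2u)(1-2v).
```

The density is bounded, hence square-integrable, and symmetric, so the family is a perturbation of the independence copula $\Pi(u,v) = uv$ of exactly the type the characterization covers.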
The roto-translation group SE2 has been of active interest in image analysis due to methods that lift image data to multi-orientation representations defined on this Lie group. This has led to impactful applications of crossing-preserving flows for image denoising, geodesic tracking, and roto-translation equivariant deep learning. In this paper, we develop a computational framework for optimal transport over Lie groups, with a special focus on SE2. We make several theoretical contributions (generalizable to matrix Lie groups), such as the non-optimality of group actions as transport maps, invariance and equivariance of optimal transport, and the quality of the entropy-regularized optimal transport plan under geodesic distance approximations. We develop a Sinkhorn-like algorithm that can be implemented efficiently using fast and accurate distance approximations on the Lie group and GPU-friendly group convolutions. We report experiments on 1) image barycenters, 2) interpolation of planar orientation fields, and 3) Wasserstein gradient flows on SE2. We observe that our framework of lifting images to SE2 and applying optimal transport with left-invariant anisotropic metrics leads to equivariant transport along dominant contours and salient line structures in the image. This yields sharper and more meaningful interpolations compared to their counterparts on $\mathbb{R}^2$.
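The entropy-regularized transport plan at the core of such a Sinkhorn-style scheme can be sketched generically (this is the standard Sinkhorn iteration on a discrete cost matrix, not the SE2-specific implementation with group convolutions and distance approximations described above):

```python
import numpy as np

def sinkhorn(mu, nu, C, eps=0.1, n_iter=200):
    """Entropy-regularized optimal transport between histograms mu and nu
    with cost matrix C. Returns the transport plan P whose marginals are
    (approximately) mu and nu. Generic sketch: on a Lie group, C would hold
    (approximate) geodesic distances and the matrix products would be
    replaced by group convolutions."""
    K = np.exp(-C / eps)              # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(n_iter):
        v = nu / (K.T @ u)            # alternate diagonal scaling updates
        u = mu / (K @ v)
    return u[:, None] * K * v[None, :]
```

The two scaling updates enforce the row and column marginals in turn; iterating them converges to the unique entropic plan.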
Which polyominoes can be folded into a cube, using only creases along edges of the square lattice underlying the polyomino, with fold angles of $\pm 90^\circ$ and $\pm 180^\circ$, and allowing faces of the cube to be covered multiple times? Prior results studied tree-shaped polyominoes and polyominoes with holes and gave partial classifications for these cases. We show that there is an algorithm deciding whether a given polyomino can be folded into a cube. This algorithm essentially amounts to trying all possible ways of mapping faces of the polyomino to faces of the cube, but (perhaps surprisingly) checking whether such a mapping corresponds to a valid folding is equivalent to the unlink recognition problem from topology. We also give further results on classes of polyominoes which can or cannot be folded into cubes. Our results include (1) a full characterisation of all tree-shaped polyominoes that can be folded into a cube, (2) that any rectangular polyomino which contains only one simple hole (out of five different types) does not fold into a cube, (3) a complete characterisation of when a rectangular polyomino with two or more unit-square holes (but no other holes) can be folded into a cube, and (4) a sufficient condition for when a simply-connected polyomino can be folded into a cube. These results answer several open problems of previous work and close the cases of tree-shaped polyominoes and rectangular polyominoes with just one simple hole.
This paper analyzes the approximate control variate (ACV) approach to multifidelity uncertainty quantification in the case where weighted estimators are combined to form the components of the ACV. The weighted estimators enable one to precisely group models that share input samples to achieve improved variance reduction. We demonstrate that this viewpoint yields a generalized linear estimator that can assign any weight to any sample. This generalization shows that other linear estimators in the literature, particularly the multilevel best linear unbiased estimator (ML-BLUE) of Schaden and Ullmann (2020), are specific versions of the ACV estimator of Gorodetsky, Geraci, Jakeman, and Eldred (2020). Moreover, this connection enables numerous extensions and insights. For example, we empirically show that having non-independent groups can yield better variance reduction than the independent groups used by ML-BLUE. Furthermore, we show that such grouped estimators can use arbitrary weighted estimators, not just the simple Monte Carlo estimators used in ML-BLUE. Finally, the analysis enables the derivation of ML-BLUE directly from a variance-reduction perspective, rather than a regression perspective.
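The variance-reduction mechanism behind such estimators can be illustrated with the simplest two-model control variate (a hypothetical setup with two toy models sharing input samples, not the paper's general weighted-group estimator):

```python
import numpy as np

def acv_two_fidelity(f_hi, f_lo, n_hi, n_lo, rng):
    """Two-model approximate control variate sketch. The high-fidelity model
    is evaluated on n_hi shared samples; the low-fidelity model is evaluated
    on the shared samples plus n_lo extra samples, and the difference of its
    two sample means corrects the high-fidelity estimate."""
    x_shared = rng.normal(size=n_hi)          # samples shared by both models
    x_extra = rng.normal(size=n_lo)           # low-fidelity-only samples
    hi = f_hi(x_shared)
    lo_shared = f_lo(x_shared)
    lo_all = f_lo(np.concatenate([x_shared, x_extra]))
    # control-variate weight alpha = cov(hi, lo) / var(lo), estimated
    # from the shared samples
    alpha = np.cov(hi, lo_shared)[0, 1] / np.var(lo_shared, ddof=1)
    return hi.mean() + alpha * (lo_all.mean() - lo_shared.mean())
```

When the two models are strongly correlated, the correction term cancels most of the high-fidelity sampling noise, which is the effect the grouped weighted estimators generalize.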
Table-to-text generation involves generating appropriate textual descriptions given structured tabular data. It has attracted increasing attention in recent years thanks to the popularity of neural network models and the availability of large-scale datasets. A common feature across existing methods is their treatment of the input as a string, i.e., by employing linearization techniques that do not always preserve information in the table, are verbose, and lack space efficiency. We propose to rethink data-to-text generation as a visual recognition task, removing the need for rendering the input in a string format. We present PixT3, a multimodal table-to-text model that overcomes the challenges of linearization and input size limitations encountered by existing models. PixT3 is trained with a new self-supervised learning objective to reinforce table structure awareness and is applicable to open-ended and controlled generation settings. Experiments on the ToTTo and Logic2Text benchmarks show that PixT3 is competitive and, in some settings, superior to generators that operate solely on text.
Self-supervised learning methods based on data augmentations, such as SimCLR, BYOL, or DINO, allow obtaining semantically meaningful representations of image datasets and are widely used prior to supervised fine-tuning. A recent self-supervised learning method, $t$-SimCNE, uses contrastive learning to directly train a 2D representation suitable for visualisation. When applied to natural image datasets, $t$-SimCNE yields 2D visualisations with semantically meaningful clusters. In this work, we used $t$-SimCNE to visualise medical image datasets, including examples from dermatology, histology, and blood microscopy. We found that increasing the set of data augmentations to include arbitrary rotations improved the results in terms of class separability, compared to data augmentations used for natural images. Our 2D representations show medically relevant structures and can be used to aid data exploration and annotation, improving on common approaches for data visualisation.
We propose a novel way to describe numerical methods for ordinary differential equations via the notion of multi-indices. The main idea is to replace the rooted trees in Butcher's B-series by multi-indices, which were introduced recently in the context of describing solutions of singular stochastic partial differential equations. The combinatorial shift away from rooted trees allows for a compressed description of numerical schemes. Moreover, these multi-index B-series uniquely characterise the Taylor expansion of local and affine-equivariant maps.
Genome assembly is a prominent problem studied in bioinformatics, in which the source string is reconstructed from a set of its overlapping substrings. Classically, genome assembly uses assembly graphs built from this set of substrings to compute the source string efficiently, trading off scalability against information loss. The scalable de Bruijn graphs come at the price of losing crucial overlap information, whereas the complete overlap information is stored in overlap graphs using quadratic space. Hierarchical overlap graphs (HOG) [IPL20] overcome these limitations, avoiding information loss despite using linear space. After a series of suboptimal improvements, Khan and Park et al. simultaneously presented two optimal algorithms [CPM2021], of which only the former appeared practical. We empirically analyze all the practical algorithms for computing HOG; the optimal algorithm [CPM2021] outperforms the previous algorithms as expected, though at the expense of extra memory, and it relies on a non-intuitive approach and non-trivial data structures. We present what is arguably the most intuitive algorithm, using only elementary arrays, which is also optimal. Empirically, our algorithm performs even better in both time and memory than all the other algorithms, highlighting its significance in both theory and practice. We further explore applications of hierarchical overlap graphs to various forms of suffix-prefix queries on a set of strings. Loukides et al. [CPM2023] recently presented state-of-the-art algorithms for these queries; however, those algorithms require complex black-box data structures and appear impractical. Our algorithms, despite not matching the state-of-the-art algorithms theoretically, answer different queries in 0.01-100 milliseconds on a data set of around a billion characters.
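As a baseline illustration of the overlap information these graphs encode, the longest suffix-prefix overlap between two strings can be computed naively (a quadratic-time sketch for exposition only; the algorithms discussed above compute all-pairs overlaps far more efficiently via the hierarchical overlap graph):

```python
def longest_suffix_prefix(s, t):
    """Length of the longest suffix of s that is also a prefix of t.
    This is the edge weight between s and t in an overlap graph.
    Naive O(min(|s|,|t|)^2) check, trying overlap lengths longest-first."""
    for k in range(min(len(s), len(t)), 0, -1):
        if s[-k:] == t[:k]:
            return k
    return 0
```

Storing these values for all ordered pairs is what costs quadratic space in a plain overlap graph; the HOG shares common suffix-prefix strings across pairs to get down to linear space.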
There are several ways to establish the asymptotic normality of $L$-statistics, depending upon the choice of the weights-generating function and the cumulative distribution function of the underlying model. In this paper, it is shown that two of these asymptotic approaches are equivalent for a particular choice of the weights-generating function.
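Concretely, an $L$-statistic with weights-generating function $J$ has the standard form (textbook notation, stated here for context):

```latex
L_n = \frac{1}{n} \sum_{i=1}^{n} J\!\left(\frac{i}{n+1}\right) X_{(i)},
```

where $X_{(1)} \le \dots \le X_{(n)}$ are the order statistics of a sample from the underlying distribution $F$. Asymptotic normality results then take the form $\sqrt{n}\,(L_n - \mu(J,F)) \xrightarrow{d} \mathcal{N}\bigl(0, \sigma^2(J,F)\bigr)$, with the centering and variance depending on both $J$ and $F$, which is why different choices of $J$ and $F$ give rise to the different approaches compared above.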
Group lasso is a commonly used regularization method in statistical learning in which parameters are eliminated from the model according to predefined groups. However, when the groups overlap, optimizing the group lasso penalized objective can be time-consuming on large-scale problems because of the non-separability induced by the overlapping groups. This bottleneck has seriously limited the application of overlapping group lasso regularization in many modern problems, such as gene pathway selection and graphical model estimation. In this paper, we propose a separable penalty as an approximation of the overlapping group lasso penalty. Thanks to the separability, the computation of regularization based on our penalty is substantially faster than that of the overlapping group lasso, especially for large-scale and high-dimensional problems. We show that the penalty is the tightest separable relaxation of the overlapping group lasso norm within the family of $\ell_{q_1}/\ell_{q_2}$ norms. Moreover, we show that the estimator based on the proposed separable penalty is statistically equivalent to the one based on the overlapping group lasso penalty with respect to their error bounds and the rate-optimal performance under the squared loss. We demonstrate the faster computational time and statistical equivalence of our method compared with the overlapping group lasso in simulation examples and a classification problem of cancer tumors based on gene expression and multiple gene pathways.
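For concreteness, the overlapping group lasso penalty itself is easy to state (a minimal sketch with hypothetical groups; the separable surrogate penalty proposed above is not reproduced here):

```python
import numpy as np

def overlapping_group_lasso(beta, groups):
    """Group lasso penalty: the sum of l2 norms of beta restricted to each
    index group. When groups overlap (share coordinates), the penalty no
    longer separates across coordinates, which is the computational
    bottleneck that motivates a separable approximation."""
    return float(sum(np.linalg.norm(beta[list(g)]) for g in groups))
```

With disjoint groups the sum decomposes into independent subproblems, one per group; with overlapping groups (e.g. gene pathways sharing genes) the coordinates are coupled through every group that contains them.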
Due to their inherent capability of semantically aligning aspects with their context words, attention mechanisms and Convolutional Neural Networks (CNNs) are widely applied to aspect-based sentiment classification. However, these models lack a mechanism to account for relevant syntactic constraints and long-range word dependencies, and hence may mistakenly treat syntactically irrelevant context words as clues for judging aspect sentiment. To tackle this problem, we propose building a Graph Convolutional Network (GCN) over the dependency tree of a sentence to exploit syntactic information and word dependencies. On this basis, we propose a novel aspect-specific sentiment classification framework. Experiments on three benchmark collections show that our proposed model is comparable in effectiveness to a range of state-of-the-art models, and further demonstrate that both syntactic information and long-range word dependencies are properly captured by the graph-convolution structure.
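The graph-convolution update underlying such a model can be sketched minimally (a generic single GCN layer over a dependency adjacency matrix; the aspect-specific components of the proposed framework are not reproduced here):

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph-convolution layer over a sentence's dependency graph.
    H: (n_words, d_in) word features; A: (n_words, n_words) adjacency of the
    dependency tree with self-loops; W: (d_in, d_out) layer weights.
    Each word averages the features of its syntactic neighbours (normalised
    by degree), projects them through W, and applies a ReLU."""
    deg = A.sum(axis=1, keepdims=True)         # node degrees for normalisation
    return np.maximum(0.0, (A @ H) / deg @ W)  # aggregate, project, activate
```

Stacking such layers lets sentiment cues propagate along dependency edges, so a word several hops away in the surface string but adjacent in the parse tree can still influence the aspect representation.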