In this paper we study the orbit closure problem for a reductive group $G\subseteq GL(X)$ acting on a finite-dimensional vector space $V$ over ${\mathbb C}$. We assume that the center of $GL(X)$ lies within $G$ and acts on $V$ through a fixed non-trivial character. We study points $y,z\in V$ where (i) $z$ is obtained as the leading term of the action of a 1-parameter subgroup $\lambda (t)\subseteq G$ on $y$, and (ii) $y$ and $z$ have large distinctive stabilizers $K,H \subseteq G$. Let $O(z)$ (resp. $O(y)$) denote the $G$-orbit of $z$ (resp. $y$), and $\overline{O(z)}$ (resp. $\overline{O(y)}$) its closure; then (i) implies that $z\in \overline{O(y)}$. We address the question: under what conditions can (i) and (ii) be simultaneously satisfied, i.e., when does there exist a 1-PS $\lambda \subseteq G$ for which $z$ arises as a limit of $y$? Using $\lambda$, we develop a leading-term analysis which applies to $V$ as well as to ${\cal G}= Lie(G)$, the Lie algebra of $G$, and its subalgebras ${\cal K}$ and ${\cal H}$, the Lie algebras of $K$ and $H$ respectively. Through this we construct a Lie algebra $\hat{\cal K} \subseteq {\cal H}$ which connects $y$ and $z$ through their Lie algebras. We develop the properties of $\hat{\cal K}$ and relate it to the action of ${\cal H}$ on $\overline{N}=V/T_z O(z)$, the normal slice to the orbit $O(z)$. Next, we examine the possibility of {\em intermediate $G$-varieties} $W$ which lie strictly between the orbit closures of $z$ and $y$, i.e., $\overline{O(z)} \subsetneq W \subsetneq \overline{O(y)}$. These intermediate varieties are constructed using the grading that $\lambda$ induces on $V$ and ${\cal G}$. The paper hopes to contribute to the Geometric Complexity Theory approach to problems in computational complexity in theoretical computer science.
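As a minimal illustration of the leading-term construction above (standard notation, not specific to this paper): decomposing $y$ into $\lambda$-weight components,
\[
y = \sum_{m \ge m_0} y_m, \qquad \lambda(t)\cdot y_m = t^m y_m, \qquad y_{m_0}\neq 0,
\]
one has
\[
z \;=\; \lim_{t\to 0}\, t^{-m_0}\,\lambda(t)\cdot y \;=\; y_{m_0},
\]
so $z$ is the lowest-weight (leading) term of $y$ under $\lambda$; since the rescaling $t^{-m_0}$ can be absorbed into the action of the center via the fixed non-trivial character, this exhibits $z\in\overline{O(y)}$.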
We extend our formulation of Merge and Minimalism in terms of Hopf algebras to an algebraic model of a syntactic-semantic interface. We show that methods adopted in the formulation of renormalization (the extraction of meaningful physical values) in theoretical physics are relevant to describing the extraction of meaning from syntactic expressions. We show how this formulation relates to computational models of semantics, and we address some recent controversies about what the current functioning of large language models implies for generative linguistics.
A perfect $k$-coloring of the Boolean hypercube $Q_n$ is a function from the set of binary words of length $n$ onto a $k$-set of colors such that for any colors $i$ and $j$ every word of color $i$ has exactly $S(i,j)$ neighbors (at Hamming distance $1$) of color $j$, where the coefficients $S(i,j)$ depend only on $i$ and $j$ and not on the particular choice of the words. The $k$-by-$k$ table of all coefficients $S(i,j)$ is called the quotient matrix. We characterize perfect colorings of $Q_n$ of degree at most $3$, that is, with quotient matrix whose eigenvalues are all at least $n-6$, or, equivalently, such that every color corresponds to a Boolean function represented by a polynomial of degree at most $3$ over $\mathbb{R}$. Additionally, we characterize $(n-4)$-correlation-immune perfect colorings of $Q_n$, all of whose colors correspond to $(n-4)$-correlation-immune Boolean functions, or, equivalently, all of whose non-main (different from $n$) quotient-matrix eigenvalues are at most $6-n$. Keywords: perfect coloring, equitable partition, resilient function, correlation-immune function.
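As a small sanity check of the definition (an illustration of ours, not taken from the paper): the $2$-coloring of $Q_n$ by word parity is perfect with quotient matrix $\begin{pmatrix}0&n\\n&0\end{pmatrix}$, since every neighbor of a word has the opposite parity. A brute-force checker confirms this for small $n$:

```python
from itertools import product

def quotient_matrix(n, coloring, k):
    """Return the k-by-k quotient matrix S if `coloring` is a perfect
    coloring of the hypercube Q_n, or None otherwise."""
    S = [[None] * k for _ in range(k)]
    for word in product((0, 1), repeat=n):
        i = coloring(word)
        counts = [0] * k
        for pos in range(n):  # the n neighbors at Hamming distance 1
            nbr = word[:pos] + (1 - word[pos],) + word[pos + 1:]
            counts[coloring(nbr)] += 1
        for j in range(k):
            if S[i][j] is None:
                S[i][j] = counts[j]
            elif S[i][j] != counts[j]:
                return None  # counts depend on the word: not perfect
    return S

parity = lambda w: sum(w) % 2          # 2-coloring by word parity
print(quotient_matrix(4, parity, 2))   # [[0, 4], [4, 0]]
```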
This work considers the low-rank approximation of a matrix $A(t)$ depending on a parameter $t$ in a compact set $D \subset \mathbb{R}^d$. Application areas that give rise to such problems include computational statistics and dynamical systems. Randomized algorithms are an increasingly popular approach for performing low-rank approximation and they usually proceed by multiplying the matrix with random dimension reduction matrices (DRMs). Applying such algorithms directly to $A(t)$ would involve different, independent DRMs for every $t$, which is not only expensive but also leads to inherently non-smooth approximations. In this work, we propose to use constant DRMs, that is, $A(t)$ is multiplied with the same DRM for every $t$. The resulting parameter-dependent extensions of two popular randomized algorithms, the randomized singular value decomposition and the generalized Nystr\"{o}m method, are computationally attractive, especially when $A(t)$ admits an affine linear decomposition with respect to $t$. We perform a probabilistic analysis for both algorithms, deriving bounds on the expected value as well as failure probabilities for the approximation error when using Gaussian random DRMs. Both the theoretical results and the numerical experiments show that the use of constant DRMs does not impair their effectiveness; our methods reliably return quasi-best low-rank approximations.
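A minimal sketch of the constant-DRM randomized SVD (our illustration; `A` is a user-supplied matrix function, and the rank and oversampling values are hypothetical choices): the same Gaussian sketching matrix is reused for every $t$, so the sketches, and hence the approximations, inherit the smoothness of $A(t)$.

```python
import numpy as np

def randomized_svd_constant_drm(A, ts, rank, oversample=10, seed=0):
    """Rank-`rank` approximations of A(t) for each t in `ts`, reusing
    one Gaussian dimension reduction matrix Omega for all t."""
    rng = np.random.default_rng(seed)
    n = A(ts[0]).shape[1]
    Omega = rng.standard_normal((n, rank + oversample))  # constant DRM
    approx = {}
    for t in ts:
        At = A(t)
        Q, _ = np.linalg.qr(At @ Omega)   # orthonormal basis for the range sketch
        B = Q.T @ At                      # small projected matrix
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        approx[t] = (Q @ U[:, :rank], s[:rank], Vt[:rank])
    return approx

# toy parameter-dependent matrix: A(t) = A0 + t * A1 (affine in t)
rng = np.random.default_rng(1)
A0, A1 = rng.standard_normal((50, 40)), rng.standard_normal((50, 40))
approx = randomized_svd_constant_drm(lambda t: A0 + t * A1,
                                     [0.0, 0.5, 1.0], rank=5)
```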
We consider the general problem of Bayesian binary regression and we introduce a new class of distributions, the Perturbed Unified Skew-Normal (henceforth pSUN), which generalizes the Unified Skew-Normal (SUN) class. We show that the new class is conjugate to any binary regression model, provided that the link function may be expressed as a scale mixture of Gaussian densities. We discuss in detail the popular logit case, and we show that, when a logistic regression model is combined with a Gaussian prior, posterior summaries such as cumulants and normalizing constants can easily be obtained through an importance sampling approach, opening the way to straightforward variable selection procedures. For more general priors, the proposed methodology is based on a simple Gibbs sampler algorithm. We also show that, in the $p > n$ case, the proposed methodology performs better, both in terms of mixing and of accuracy, than existing methods. We illustrate the performance through several simulation studies and two data analyses.
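As a schematic illustration of the importance-sampling route in the Gaussian-prior logit case (our simplification, not the pSUN machinery itself): with the prior as proposal, the normalizing constant is the prior expectation of the likelihood.

```python
import numpy as np

def logistic_marginal_likelihood(X, y, prior_sd=1.0, n_draws=100_000, seed=0):
    """Importance-sampling estimate of the normalizing constant of a
    logistic-regression posterior, using the Gaussian prior as proposal."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = rng.normal(0.0, prior_sd, size=(n_draws, p))  # draws from the prior
    eta = beta @ X.T                                     # linear predictors
    # log-likelihood for y in {0,1}: sum_i [y_i*eta_i - log(1 + exp(eta_i))]
    loglik = (eta * y).sum(axis=1) - np.logaddexp(0.0, eta).sum(axis=1)
    m = loglik.max()
    return np.exp(m) * np.exp(loglik - m).mean()         # stabilized average

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 2))
y = (rng.random(30) < 1 / (1 + np.exp(-X[:, 0]))).astype(float)
print(logistic_marginal_likelihood(X, y))
```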
We describe the classification of orthogonal arrays OA$(2048,14,2,7)$, or, equivalently, completely regular $\{14;2\}$-codes in the $14$-cube ($30848$ equivalence classes). In particular, we find that there is exactly one almost-OA$(2048,14,2,7+1)$, up to equivalence. As derived objects, OA$(1024,13,2,6)$ ($202917$ classes) and completely regular $\{12,2;2,12\}$- and $\{14, 12, 2; 2, 12, 14\}$-codes in the $13$- and $14$-cubes, respectively, are also classified.
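For concreteness (an illustration of the definition only; the classification itself requires far more machinery): an OA$(N,k,2,t)$ is a multiset of $N$ binary words of length $k$ in which every choice of $t$ coordinates carries each $t$-tuple equally often. A brute-force checker, shown on a tiny example:

```python
from itertools import combinations, product

def is_orthogonal_array(rows, strength):
    """Check OA(N, k, 2, t): in every choice of `strength` columns,
    each binary t-tuple appears equally often among the N rows."""
    N, k = len(rows), len(rows[0])
    want = N // 2 ** strength
    for cols in combinations(range(k), strength):
        counts = {}
        for r in rows:
            key = tuple(r[c] for c in cols)
            counts[key] = counts.get(key, 0) + 1
        if any(counts.get(t, 0) != want
               for t in product((0, 1), repeat=strength)):
            return False
    return True

# tiny example: the even-weight words of length 3 form an OA(4, 3, 2, 2)
even = [w for w in product((0, 1), repeat=3) if sum(w) % 2 == 0]
print(is_orthogonal_array(even, 2))  # True
```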
We consider an unknown multivariate function representing a system (such as a complex numerical simulator) taking both deterministic and uncertain inputs. Our objective is to estimate the set of deterministic inputs leading to outputs whose probability (with respect to the distribution of the uncertain inputs) of belonging to a given set is less than a given threshold. This problem, which we call Quantile Set Inversion (QSI), occurs for instance in the context of robust (reliability-based) optimization problems, when looking for the set of solutions that satisfy the constraints with sufficiently large probability. To solve the QSI problem, we propose a Bayesian strategy based on Gaussian process modeling and the Stepwise Uncertainty Reduction (SUR) principle, to sequentially choose the points at which the function should be evaluated so as to efficiently approximate the set of interest. We illustrate the performance and practical value of the proposed SUR strategy through several numerical experiments.
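To make the QSI target set concrete, here is a plain Monte Carlo illustration on a toy problem (ours; the paper's contribution is the GP/SUR strategy that avoids this kind of exhaustive evaluation):

```python
import numpy as np

def qsi_set_estimate(f, xs, u_sampler, in_C, alpha, n_mc=10_000, seed=0):
    """Brute-force estimate of the QSI set
    {x : P(f(x, U) in C) <= alpha} on a finite grid `xs`."""
    rng = np.random.default_rng(seed)
    U = u_sampler(rng, n_mc)              # draws of the uncertain input
    keep = []
    for x in xs:
        prob = np.mean(in_C(f(x, U)))     # Monte Carlo probability estimate
        if prob <= alpha:
            keep.append(x)
    return keep

# toy problem: f(x, u) = x + u, U ~ N(0, 1), C = [1, inf), alpha = 0.05
f = lambda x, U: x + U
xs = np.linspace(-2.0, 2.0, 41)
sol = qsi_set_estimate(f, xs, lambda rng, n: rng.standard_normal(n),
                       lambda v: v >= 1.0, alpha=0.05)
print(sol)
```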
We study the problem of lossless feature selection for a $d$-dimensional feature vector $X=(X^{(1)},\dots ,X^{(d)})$ and label $Y$ for binary classification as well as nonparametric regression. For an index set $S\subset \{1,\dots ,d\}$, consider the selected $|S|$-dimensional feature subvector $X_S=(X^{(i)}, i\in S)$. If $L^*$ and $L^*(S)$ stand for the minimum risk based on $X$ and $X_S$, respectively, then $X_S$ is called lossless if $L^*=L^*(S)$. For classification, the minimum risk is the Bayes error probability, while in regression, the minimum risk is the residual variance. We introduce nearest-neighbor based test statistics to test the hypothesis that $X_S$ is lossless. For the threshold $a_n=\log n/\sqrt{n}$, the corresponding tests are proved to be consistent under conditions on the distribution of $(X,Y)$ that are significantly milder than in previous work. Also, our threshold is dimension-independent, in contrast to earlier methods where for large $d$ the threshold becomes too large to be useful in practice.
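A schematic version of such a test for regression (our simplification; the paper's statistics and conditions differ in detail): estimate the residual variance from $X$ and from $X_S$ with the classical $1$-nearest-neighbor estimator, and accept losslessness when the gap stays below $a_n=\log n/\sqrt{n}$.

```python
import numpy as np
from scipy.spatial import cKDTree

def nn_residual_variance(X, Y):
    """1-nearest-neighbor estimate of the residual variance
    E[(Y - E[Y|X])^2], via (1/2n) * sum_i (Y_i - Y_{NN(i)})^2."""
    _, idx = cKDTree(X).query(X, k=2)   # idx[:, 1]: nearest *other* point
    return 0.5 * np.mean((Y - Y[idx[:, 1]]) ** 2)

def lossless_test(X, S, Y):
    """Accept 'X_S is lossless' iff the NN risk gap is below log n / sqrt n."""
    n = len(Y)
    gap = nn_residual_variance(X[:, S], Y) - nn_residual_variance(X, Y)
    return gap <= np.log(n) / np.sqrt(n)

rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 3))
Y = X[:, 0] + 0.1 * rng.standard_normal(2000)   # only feature 0 matters
print(lossless_test(X, [0], Y), lossless_test(X, [1], Y))  # True, False (typically)
```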
Causal representation learning algorithms discover lower-dimensional representations of data that admit a decipherable interpretation of cause and effect; as achieving such interpretable representations is challenging, many causal learning algorithms rely on prior information, such as (linear) structural causal models, interventional data, or weak supervision. Unfortunately, in exploratory causal representation learning, such prior information may not be available or warranted. Alternatively, scientific datasets often have multiple modalities or physics-based constraints, and the use of such scientific, multimodal data has been shown to improve disentanglement in fully unsupervised settings. Consequently, we introduce a causal representation learning algorithm (causalPIMA) that can use multimodal data and known physics to discover important features with causal relationships. Our algorithm uses a new differentiable parametrization to learn a directed acyclic graph (DAG) together with a latent space of a variational autoencoder in an end-to-end differentiable framework via a single, tractable evidence lower bound loss function. We place a Gaussian mixture prior on the latent space and identify each mixture component with an outcome of a DAG node; this novel identification enables feature discovery with causal relationships. On a synthetic and a scientific dataset, our algorithm learns an interpretable causal structure while simultaneously discovering key features in a fully unsupervised setting.
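The paper's DAG parametrization is its own; as a well-known point of reference for differentiable DAG learning, the NOTEARS acyclicity function $h(W)=\operatorname{tr}(e^{W\circ W})-d$ (Zheng et al., 2018) vanishes exactly on adjacency matrices of DAGs and can serve as a smooth constraint inside an end-to-end loss:

```python
import numpy as np
from scipy.linalg import expm

def notears_acyclicity(W):
    """NOTEARS smooth acyclicity measure: h(W) = tr(exp(W * W)) - d,
    where * is the elementwise (Hadamard) product; h(W) = 0 iff W is a DAG."""
    d = W.shape[0]
    return np.trace(expm(W * W)) - d

W_dag = np.array([[0.0, 1.5], [0.0, 0.0]])   # edge 0 -> 1, acyclic
W_cyc = np.array([[0.0, 1.5], [0.7, 0.0]])   # a 2-cycle
print(notears_acyclicity(W_dag))  # ~0.0
print(notears_acyclicity(W_cyc))  # > 0
```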
We develop a method to compute the $H^2$-conforming finite element approximation to planar fourth-order elliptic problems without having to implement $C^1$ elements. The algorithm replaces the original $H^2$-conforming scheme with pre-processing and post-processing steps that require only an $H^1$-conforming Poisson-type solve, together with an inner Stokes-like problem that again requires at most $H^1$-conformity. We then demonstrate the method, applied with the Morgan-Scott elements, in three numerical examples.
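For background (standard material, not the paper's exact algorithm), the model problem is the clamped biharmonic equation
\[
\Delta^2 u = f \quad\text{in }\Omega, \qquad u = \partial_n u = 0 \quad\text{on }\partial\Omega.
\]
Setting $w=-\Delta u$ and solving $-\Delta w=f$, $-\Delta u=w$ with two $H^1$-conforming Poisson solves imposes $u=\Delta u=0$ (the simply supported problem) rather than the clamped conditions $u=\partial_n u=0$; roughly speaking, it is this boundary mismatch that an intermediate Stokes-like problem can correct while staying within $H^1$-conformity.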
We present a method for finding large primes of fixed size of the form $X^2+c$. We study the density of primes in the sets $E_c = \{\,X^2+c : X \in 2\mathbb{Z}+(c-1)\,\}$, $c \in \mathbb{N}^*$. We describe an algorithm for generating values of $c$ such that a given prime $p$ is the minimum of the set of prime divisors of the elements of $E_c$. We also present quadratic forms generating divisors of the elements of $E_c$ and study the prime divisors of these elements. This paper uses Dirichlet's theorem on arithmetic progressions [1] and the article [6] to rewrite a conjecture of Shanks [2] on the density of primes in $E_c$. Finally, based on these results, we discuss heuristics for the occurrence of large primes in the search set of our algorithm.
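A minimal sketch of the basic search (our illustration; the paper's algorithm additionally selects favorable values of $c$): scan over $X$ in the parity class $2\mathbb{Z}+(c-1)$, which makes $X^2+c$ odd, and test for primality.

```python
from sympy import isprime

def primes_in_Ec(c, x_min, count):
    """First `count` primes of the form X^2 + c with X in 2Z + (c - 1),
    i.e. X and c of opposite parity, so that X^2 + c is odd."""
    X = x_min + (x_min + c - 1) % 2   # align X with the parity of c - 1
    found = []
    while len(found) < count:
        n = X * X + c
        if isprime(n):
            found.append((X, n))
        X += 2                         # stay in the residue class mod 2
    return found

print(primes_in_Ec(1, 10**6, 3))  # primes X^2 + 1 with even X >= 10**6
```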