For positive integers $d$ and $p$ such that $d \ge p$, we obtain complete asymptotic expansions, for large $d$, of the normalizing constants for the matrix Bingham and matrix Langevin distributions on Stiefel manifolds. The accuracy of each truncated expansion is strictly increasing in $d$; also, for sufficiently large $d$, the accuracy is strictly increasing in $m$, the number of terms in the truncated expansion. We apply these results to obtain the rate of convergence of these asymptotic expansions when both $d$ and $p$ tend to infinity. Using values of $d$ and $p$ arising in various data sets, we illustrate the rate of convergence of the truncated approximations as $d$ or $m$ increases. These results extend our recent work on asymptotic expansions for the normalizing constants of the high-dimensional Bingham distributions.
Quantum algorithms for tasks such as factorization, search, and simulation rely on control flow such as branching and iteration that depends on the value of data in superposition. High-level programming abstractions for control flow, such as switches, loops, and higher-order functions, are ubiquitous in classical languages. By contrast, many quantum languages do not provide high-level abstractions for control flow in superposition, and instead require the use of hardware-level logic gates to implement such control flow. The reason for this gap is that whereas a classical computer supports control flow using a program counter that can depend on data, the typical architecture of a quantum computer does not provide a program counter that can depend on data in superposition. As a result, the complete set of control flow abstractions that can be correctly realized on a quantum computer has not yet been established. In this work, we provide a complete characterization of the properties of control flow abstractions that are correctly realizable on a quantum computer. First, we prove that even on a quantum computer whose program counter exists in superposition, one cannot correctly realize control flow in quantum algorithms by lifting the classical conditional jump instruction to work in superposition. This theorem rules out directly lifting general control flow abstractions, such as the $\lambda$-calculus, from classical to quantum programming. In response, we present the necessary and sufficient conditions for control flow to be correctly realizable on a quantum computer. We introduce the quantum control machine, an instruction set architecture featuring a conditional jump that is restricted to satisfy these conditions. We show how this design enables a developer to correctly express control flow in quantum algorithms using a program counter in place of logic gates.
Despite the considerable attention given to the questions of \textit{how much} and \textit{how to} explore in deep reinforcement learning, the question of \textit{when} to explore has received comparatively little attention. While more sophisticated exploration strategies can excel in specific, often sparse-reward environments, simpler approaches, such as $\epsilon$-greedy, continue to outperform them across a broader spectrum of domains. The appeal of these simpler strategies lies in their ease of implementation and their generality across a wide range of domains. The downside is that these methods are essentially blind switching mechanisms, which completely disregard the agent's internal state. In this paper, we propose to leverage the agent's internal state to decide \textit{when} to explore, addressing the shortcomings of blind switching mechanisms. We present Value Discrepancy and State Counts through homeostasis (VDSC), a novel approach for efficient exploration timing. Experimental results on the Atari suite demonstrate the superiority of our strategy over traditional methods such as $\epsilon$-greedy and Boltzmann exploration, as well as more sophisticated techniques like Noisy Nets.
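The contrast between blind and state-aware switching can be sketched in a few lines. The homeostatic trigger below is a hypothetical illustration of the general idea only, not VDSC's actual rule: the signal, setpoint, target rate, and learning rate are placeholder names introduced for this sketch.

```python
import random

def epsilon_greedy_trigger(epsilon, rng=random.random):
    """Blind switching: explore with a fixed probability,
    ignoring the agent's internal state entirely."""
    return rng() < epsilon

def homeostatic_trigger(signal, setpoint, target_rate, lr=0.1):
    """Informed switching (hypothetical sketch): explore when an internal
    signal (e.g., a value discrepancy) exceeds a setpoint, then adapt the
    setpoint so the long-run exploration rate drifts toward target_rate."""
    explore = signal > setpoint
    # homeostatic update: exploring raises the setpoint, not exploring
    # lowers it, keeping the empirical exploration rate near target_rate
    setpoint += lr * ((1.0 if explore else 0.0) - target_rate)
    return explore, setpoint
```

Unlike the fixed $\epsilon$ of the blind mechanism, the setpoint here reacts to the agent's own learning signal while still bounding how often exploration fires.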
The underlying data distributions of natural language, programming code, and mathematical symbols differ vastly, presenting a complex challenge for large language models (LLMs) that strive to achieve high performance across all three domains simultaneously. Achieving a very high level of proficiency within a specific domain often requires extensive training on relevant corpora, which typically comes at the cost of performance in other domains. In this paper, we propose to directly fuse models that are already highly specialized. The proposed fusing framework, UltraFuser, consists of three distinct specialists that are already sufficiently trained on language, coding, and mathematics. A token-level gating mechanism is introduced to blend the specialists' outputs, and a two-stage training strategy with balanced sampling is designed to ensure stability. To effectively train the fused model, we further construct a high-quality supervised instruction-tuning dataset, UltraChat 2, which includes text, code, and mathematical content. This dataset comprises approximately 300,000 instructions and covers a wide range of topics in each domain. Experiments show that our model achieves mastery of the three crucial domains simultaneously.
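A minimal sketch of the kind of token-level gating described above, assuming the gate produces one score per specialist per token; this is an illustration of the general mechanism, not UltraFuser's actual implementation.

```python
import numpy as np

def fuse_logits(specialist_logits, gate_logits):
    """Blend specialist outputs token by token (hypothetical sketch).

    specialist_logits: shape (num_specialists, seq_len, vocab)
    gate_logits:       shape (seq_len, num_specialists)
    Returns blended logits of shape (seq_len, vocab).
    """
    # softmax over specialists, independently for each token
    w = np.exp(gate_logits - gate_logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    # weighted sum of the specialists' logits at every token position
    return np.einsum('ts,stv->tv', w, specialist_logits)
```

With uniform gate scores the fused output reduces to the plain average of the specialists, so the gate only matters when it learns to prefer one specialist for a given token.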
We propose a predictor-corrector adaptive method for the simulation of hyperbolic partial differential equations (PDEs) on networks under general uncertainty in parameters, initial conditions, or boundary conditions. The approach is based on the stochastic finite volume (SFV) framework, which circumvents sampling schemes or simulation ensembles while also preserving fundamental properties, in particular hyperbolicity of the resulting systems and conservation of the discrete solutions. The initial boundary value problem (IBVP) on a set of network-connected one-dimensional domains representing a pipeline is discretized adaptively in both the physical and stochastic spaces, and we evaluate the propagation of uncertainty through network nodes by solving a junction Riemann problem. The adaptivity of our method in refining the discretization based on error metrics enables computationally tractable evaluation of intertemporal uncertainty, supporting decisions about the timing and quantity of pipeline operations to maximize delivery under transient and uncertain conditions. We illustrate our computational method using simulations of a representative network.
Let $\Gamma$ be a finite set of Jordan curves in the plane. For any curve $\gamma \in \Gamma$, we denote the bounded region enclosed by $\gamma$ as $\tilde{\gamma}$. We say that $\Gamma$ is a non-piercing family if for any two curves $\alpha , \beta \in \Gamma$, $\tilde{\alpha} \setminus \tilde{\beta}$ is a connected region. A non-piercing family of curves generalizes a family of $2$-intersecting curves, in which each pair of curves intersects in at most two points. Snoeyink and Hershberger (``Sweeping Arrangements of Curves'', SoCG '89) proved that if we are given a family $\mathcal{C}$ of $2$-intersecting curves and a fixed curve $C\in\mathcal{C}$, then the arrangement can be \emph{swept} by $C$, i.e., $C$ can be continuously shrunk to any point $p \in \tilde{C}$ in such a way that we have a family of $2$-intersecting curves throughout the process. In this paper, we generalize the result of Snoeyink and Hershberger to the setting of non-piercing curves. We show that given an arrangement of non-piercing curves $\Gamma$, and a fixed curve $\gamma\in \Gamma$, the arrangement can be swept by $\gamma$ so that the arrangement remains non-piercing throughout the process. We also give a shorter and simpler proof of the result of Snoeyink and Hershberger, and describe applications of their result where our result leads to a generalization.
Given the Fourier-Legendre expansions of $f$ and $g$, and mild conditions on $f$ and $g$, we derive the Fourier-Legendre expansion of their product in terms of their corresponding Fourier-Legendre coefficients. In this way, expansions of whole-number powers of $f$ may be obtained. We establish upper bounds on the rates of convergence. We then employ these expansions to solve semi-analytically a class of nonlinear PDEs with a polynomial nonlinearity of degree 2. The obtained numerical results illustrate the efficiency and accuracy of this Fourier-Legendre-based solution methodology for solving an important class of nonlinear PDEs.
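For truncated (finite) Legendre series, the product-expansion idea can be checked numerically with NumPy's Legendre-series utilities; this is only an illustration of the coefficient arithmetic, not the paper's semi-analytical method.

```python
import numpy as np
from numpy.polynomial import legendre as L

# Legendre-series coefficients of f = P0 + 2*P1 (= 1 + 2x) and g = P1 (= x)
f, g = [1.0, 2.0], [0.0, 1.0]

# legmul returns the Legendre coefficients of the pointwise product f*g;
# here f*g = x + 2x^2 = (2/3)P0 + P1 + (4/3)P2
fg = L.legmul(f, g)

# sanity check: the product series evaluates to f(x) * g(x) on [-1, 1]
x = np.linspace(-1.0, 1.0, 7)
assert np.allclose(L.legval(x, fg), L.legval(x, f) * L.legval(x, g))
```

The derivation in the paper plays the same role in closed form for infinite expansions, which is what makes quadratic nonlinearities tractable term by term.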
In this note, we give a linear-size translation from formulas of first-order logic into equations of the calculus of relations preserving validity and finite validity. Our translation also gives a linear-size conservative reduction from formulas of first-order logic into formulas of the three-variable fragment of first-order logic.
We study a general factor analysis framework where the $n$-by-$p$ data matrix is assumed to follow a general exponential family distribution entry-wise. While this model framework has been proposed before, we further relax its distributional assumption by using a quasi-likelihood setup. By parameterizing the mean-variance relationship on data entries, we additionally introduce a dispersion parameter and entry-wise weights to model large variations and missing values. The resulting model is thus not only robust to distribution misspecification but also more flexible and able to capture non-Gaussian covariance structures of the data matrix. Our main focus is on efficient computational approaches to perform the factor analysis. Previous modeling frameworks rely on simulated maximum likelihood (SML) to find the factorization solution, but this method was shown to suffer asymptotic bias when the simulated sample size grows slower than the square root of the sample size $n$, which renders it impractical for data matrices with large $n$. Borrowing ideas from expectation-maximization (EM) and stochastic gradient descent (SGD), we investigate three estimation procedures based on iterative factorization updates. Our proposed solutions exhibit no asymptotic bias and scale even better for large matrix factorizations, with error $O(1/p)$. To support our findings, we conduct simulation experiments and discuss applications in three case studies.
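As a minimal sketch of iterative factorization updates, consider the Gaussian-likelihood special case fit by full-gradient descent on $\|X - UV^\top\|_F^2$; the paper's quasi-likelihood estimators generalize the loss, the weighting, and the update schedule, none of which appear in this toy version.

```python
import numpy as np

def sgd_factorize(X, rank, steps=500, lr=0.01, seed=0):
    """Toy iterative factorization (Gaussian special case, hypothetical
    sketch): gradient descent on the squared Frobenius reconstruction
    error of the rank-constrained fit U @ V.T to the data matrix X."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # small random init breaks symmetry so the factors can grow
    U = 0.1 * rng.standard_normal((n, rank))
    V = 0.1 * rng.standard_normal((p, rank))
    for _ in range(steps):
        R = X - U @ V.T        # current residual
        U += lr * R @ V        # descent step on -0.5 * ||R||_F^2 w.r.t. U
        V += lr * R.T @ U      # ...and w.r.t. V
    return U, V
```

Replacing the squared-error residual with a quasi-likelihood score, and the full gradient with minibatch (SGD) or conditional-expectation (EM-style) updates, gives the family of procedures the abstract refers to.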
We study finding and listing $k$-cliques in a graph, for constant $k\geq 3$, a fundamental problem of both theoretical and practical importance. Our main contribution is a new output-sensitive algorithm for listing $k$-cliques in graphs, for arbitrary $k\geq 3$, coupled with lower bounds based on standard fine-grained assumptions, showing that our algorithm's running time is tight. Previously, the only known conditionally optimal output-sensitive algorithms were for the case of $3$-cliques by Bj\"{o}rklund, Pagh, Vassilevska W. and Zwick [ICALP'14]. Typical inputs to subgraph isomorphism or listing problems are measured by the number of nodes $n$ or the number of edges $m$. Our framework is very general in that it gives $k$-clique listing algorithms whose running times are measured in terms of the number of $\ell$-cliques $\Delta_\ell$ in the graph for any $1\leq \ell<k$. This generalizes the typical parameterization in terms of $n$ (the number of $1$-cliques) and $m$ (the number of $2$-cliques). If the matrix multiplication exponent $\omega$ is $2$, and if the size of the output, $\Delta_k$, is sufficiently large, then for every $\ell<k$, the running time of our algorithm for listing $k$-cliques is $$\tilde{O}\left(\Delta_\ell^{\frac{2}{\ell (k - \ell)}}\Delta_k^{1-\frac{2}{k(k-\ell)}}\right).$$ For sufficiently large $\Delta_k$, we prove that this runtime is in fact {\em optimal} for all $1 \leq \ell < k$ under the Exact $k$-Clique hypothesis. In the special cases of $k = 4$ and $5$, our algorithm in terms of $n$ is conditionally optimal for all values of $\Delta_k$ if $\omega = 2$. Moreover, our framework is powerful enough to provide an improvement upon the 19-year-old runtimes for $4$- and $5$-clique detection in $m$-edge graphs, as a function of $m$ [Eisenbrand and Grandoni, TCS'04].
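As a worked check of the stated bound, take $k = 3$ with $\omega = 2$: choosing $\ell = 1$ gives $\tilde{O}(n\,\Delta_3^{2/3})$, and $\ell = 2$ gives $\tilde{O}(m\,\Delta_3^{1/3})$, which (for sufficiently large output $\Delta_3$) are the known triangle-listing runtimes of Bj\"{o}rklund et al. The exponents can be verified in exact arithmetic:

```python
from fractions import Fraction

def exponents(k, ell):
    """Exponents (a, b) in the stated bound Delta_ell^a * Delta_k^b,
    assuming omega = 2 (as in the abstract's displayed running time)."""
    a = Fraction(2, ell * (k - ell))
    b = 1 - Fraction(2, k * (k - ell))
    return a, b

# k = 3, ell = 1: Delta_1 = n, bound n^1 * t^(2/3)
assert exponents(3, 1) == (Fraction(1), Fraction(2, 3))
# k = 3, ell = 2: Delta_2 = m, bound m^1 * t^(1/3)
assert exponents(3, 2) == (Fraction(1), Fraction(1, 3))
```

The same function yields the exponents for any $1 \leq \ell < k$, e.g. $k = 4$, $\ell = 1$ gives $\Delta_1^{2/3}\Delta_4^{5/6}$.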
This work discusses the benefits of having multiple simulated environments with different degrees of realism for developing algorithms in scenarios populated by autonomous nodes capable of communication and mobility. This approach improves the development experience and yields more robust algorithms. It also proposes GrADyS-SIM NextGen as a solution that enables development in a single programming language and toolset over multiple environments with varying levels of realism. Finally, we illustrate the usefulness of this approach with a toy problem that uses the simulation framework, taking advantage of the proposed environments to iteratively develop a robust solution.