The theory of classical realizability is a framework for the Curry-Howard correspondence that makes it possible to associate a program with each proof in Zermelo-Fraenkel set theory. However, almost all applications of mathematics in physics, probability, statistics, etc. use analysis, i.e., the axiom of dependent choice (DC) or even the (full) axiom of choice (AC). It is therefore important to find explicit programs for these axioms. Various solutions have been found for DC, for instance the lambda-term called "bar recursion" or the LISP instruction "quote". We present here the first program for AC.
With the rapid global spread of COVID-19, more and more data related to this virus is becoming available, including genomic sequence data. The total number of genomic sequences publicly available on platforms such as GISAID is currently several million and increases every day. The availability of such \textit{Big Data} creates a new opportunity for researchers to study this virus in detail. This is particularly important given the dynamics of the COVID-19 variants that emerge and circulate. This rich data source will give us insights into the best ways to perform genomic surveillance for this and future pandemic threats, with the ultimate goal of mitigating or eliminating such threats. Analyzing and processing several million genomic sequences is a challenging task. Although traditional methods for sequence classification have proven effective, they are not designed to deal with these specific types of genomic sequences. Moreover, most of the existing methods also face the issue of scalability. Previous studies tailored to coronavirus genomic data proposed to use spike sequences (corresponding to a subsequence of the genome), rather than the complete genome, to perform different machine learning (ML) tasks such as classification and clustering. However, those methods suffer from scalability issues. In this paper, we propose an approach called Spike2Vec, an efficient and scalable feature vector representation for each spike sequence that can be used for downstream ML tasks. Through experiments, we show that Spike2Vec is not only scalable to several million spike sequences, but also outperforms the baseline models in terms of prediction accuracy, F1-score, etc.
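As a rough illustration of this kind of featurization (a minimal sketch only; the exact Spike2Vec construction, including the choice of $k$ and the encoding, is as defined in the paper), a spike sequence can be mapped to a fixed-length vector of $k$-mer counts over the amino-acid alphabet:

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

def kmer_feature_vector(sequence, k=3):
    """Map a spike (amino-acid) sequence to a fixed-length k-mer count vector."""
    # Fix an ordering over all |alphabet|^k possible k-mers (20^3 = 8000 here).
    index = {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=k))}
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    vec = [0] * len(index)
    for kmer, c in counts.items():
        if kmer in index:  # skip k-mers containing ambiguous symbols such as 'X'
            vec[index[kmer]] = c
    return vec

# A sequence of length L yields L - k + 1 overlapping k-mers.
print(sum(kmer_feature_vector("MFVFLVLLPLVSS")))  # -> 11
```

Vectors of this form can be fed directly to standard classifiers or clustering algorithms, which is what makes such a representation convenient for downstream ML tasks.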
Support constrained generator matrices for linear codes have found applications in multiple access networks and weakly secure document exchange. Necessary and sufficient conditions for the existence of Reed-Solomon codes with support constrained generator matrices were conjectured by Dau, Song, Yuen and Hassibi. This conjecture, known as the GM-MDS conjecture, was recently proved in independent works of Lovett and of Yildiz and Hassibi. It follows that support constrained generator matrices for MDS codes exist over fields of linear size. In this paper we propose a natural generalized conjecture for support constrained matrices based on the generalized Hamming weights (the SCGM-GHW conjecture). The GM-MDS conjecture can be thought of as a very special case of our SCGM-GHW conjecture for linear $1$-MDS codes. We investigate support constrained generator matrices for several families of linear codes, such as $2$-MDS codes, first-order Reed-Muller codes, and algebraic-geometric codes from elliptic curves, from the viewpoint of our SCGM-GHW conjecture. In particular, the direct generalization of the GM-MDS conjecture from $1$-MDS codes to $2$-MDS codes is not true: for linear $2$-MDS codes, cardinality-based constraints on subset systems alone are not sufficient to guarantee that these subsets lie in the zero-coordinate position sets of the rows of a generator matrix.
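For context, the GM-MDS condition can be stated as follows (as we recall it; see the cited works for the precise statement): given subsets $Z_{1}, \ldots, Z_{k} \subseteq \{1, \ldots, n\}$, there is an $[n,k]$ MDS code with a generator matrix $G$ satisfying $G_{ij}=0$ for all $j \in Z_{i}$, over every sufficiently large field, if and only if
\[
  \Bigl|\bigcap_{i \in \Omega} Z_{i}\Bigr| + |\Omega| \le k
  \qquad \text{for every nonempty } \Omega \subseteq \{1, \ldots, k\}.
\]
These are exactly the cardinality-based constraints whose direct analogue fails for $2$-MDS codes.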
Fermat-Euler quotients arose from the study of the first case of Fermat's Last Theorem and have numerous applications in number theory. Recently they have been studied from a cryptographic perspective through the construction of many pseudorandom binary sequences, whose linear complexities and trace representations have been calculated. In this work, we further study the correlation measures of these sequences using an approach based on Dirichlet characters, Ramanujan sums and Gauss sums. Our results show that the $4$th-order correlation measures of these sequences are very large. Therefore they may not be suitable for cryptographic use.
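Recall that for a prime $p$ and an integer $a$ with $\gcd(a,p)=1$, the Fermat quotient $q_{p}(a)$ is defined by
\[
  q_{p}(a) \equiv \frac{a^{p-1}-1}{p} \pmod{p}, \qquad 0 \le q_{p}(a) \le p-1,
\]
and a common way to derive a binary sequence from it is the threshold construction (the sequences studied here may differ in detail):
\[
  e_{a} =
  \begin{cases}
    0 & \text{if } 0 \le q_{p}(a)/p < \tfrac{1}{2},\\
    1 & \text{if } \tfrac{1}{2} \le q_{p}(a)/p < 1.
  \end{cases}
\]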
The Minimum Linear Arrangement problem (MLA) consists of finding a mapping $\pi$ from the vertices of a graph to distinct integers that minimizes $\sum_{\{u,v\}\in E}|\pi(u) - \pi(v)|$. In this setting, vertices are often assumed to lie on a horizontal line and edges are drawn as semicircles above that line. For trees, various algorithms are available to solve the problem in time polynomial in $n=|V|$. There exist variants of the MLA in which the arrangements are constrained. Iordanskii, and later Hochberg and Stallmann (HS), put forward $O(n)$-time algorithms that solve the problem when arrangements are constrained to be planar (also known as one-page book embeddings). We also consider linear arrangements of rooted trees that are constrained to be projective (planar embeddings where the root is not covered by any edge). Gildea and Temperley (GT) sketched an algorithm for projective arrangements which they claimed runs in $O(n)$ time, but provided no justification of this cost. In contrast, Park and Levy claimed that GT's algorithm runs in $O(n \log d_{max})$ time, where $d_{max}$ is the maximum degree, but did not provide sufficient detail. Here we correct an error in HS's algorithm for the planar case, show its relationship with the projective case, and derive simple algorithms for the projective and planar cases that provably run in $O(n)$ time.
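To make the objective concrete, here is a minimal sketch of ours (not the linear-time algorithms discussed above) that evaluates the cost of an arrangement and minimizes it exhaustively:

```python
from itertools import permutations

def arrangement_cost(edges, pi):
    """Cost of a linear arrangement: sum of |pi(u) - pi(v)| over all edges."""
    return sum(abs(pi[u] - pi[v]) for u, v in edges)

def brute_force_mla(vertices, edges):
    """Exhaustive minimum linear arrangement; exponential, for tiny graphs only."""
    best = None
    for perm in permutations(range(1, len(vertices) + 1)):
        pi = dict(zip(vertices, perm))
        cost = arrangement_cost(edges, pi)
        if best is None or cost < best[0]:
            best = (cost, pi)
    return best

# Example: for the path a-b-c the identity arrangement is optimal, with cost 2.
print(brute_force_mla(["a", "b", "c"], [("a", "b"), ("b", "c")]))
```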
Optimal control problems including partial differential equation (PDE) constraints as well as integer constraints merge the combinatorial difficulties of integer programming with the challenges of large-scale systems arising from discretized PDEs. So far, the branch-and-bound framework has been the most common solution strategy for such problems. To provide an alternative solution approach, especially in a large-scale context, this article investigates penalization techniques. Taking inspiration from a well-known family of existing exact penalty algorithms, a novel improved penalty algorithm is derived, whose key ingredients are a basin hopping strategy and an interior point method, both of which are specialized for the problem class. A thorough numerical investigation is carried out for a standard stationary test problem. Extensions to a convection-diffusion problem as well as a nonlinear test problem demonstrate the versatility of the approach.
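Schematically, a penalty approach of this kind replaces the integrality constraint by a penalty term and solves a sequence of relaxations with growing penalty weight. The sketch below is illustrative only: it substitutes a generic smooth penalty and an off-the-shelf bound-constrained solver for the paper's specialized basin hopping strategy and interior point method, and a simple quadratic stands in for the reduced PDE-constrained objective.

```python
import numpy as np
from scipy.optimize import minimize

def integrality_penalty(u):
    """Smooth penalty that vanishes exactly at binary controls u in {0, 1}."""
    return np.sum(u * (1.0 - u))

def penalty_homotopy(objective, u0, gamma0=1.0, growth=2.0, tol=1e-8, max_outer=20):
    """Solve penalized relaxations, increasing gamma until u is (numerically) binary."""
    u, gamma = np.asarray(u0, dtype=float), gamma0
    for _ in range(max_outer):
        res = minimize(lambda v: objective(v) + gamma * integrality_penalty(v),
                       u, bounds=[(0.0, 1.0)] * u.size, method="L-BFGS-B")
        u = res.x
        if integrality_penalty(u) < tol:  # u is essentially binary: stop
            break
        gamma *= growth
    return u

# Toy stand-in for the reduced objective: track a non-binary target.
target = np.array([0.2, 0.9, 0.4])
u = penalty_homotopy(lambda v: np.sum((v - target) ** 2), np.full(3, 0.5))
print(np.round(u))  # -> [0. 1. 0.]
```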
Jacobi's results on the computation of the order and of the normal forms of a differential system are translated into the formalism of differential algebra. In the quasi-regular case, we give complete proofs following Jacobi's arguments. The main result is {\it Jacobi's bound}, still conjectural in the general case: the order of a differential system $P_{1}, \ldots, P_{n}$ is not greater than the maximum $\mathcal{O}$ of the sums $\sum_{i=1}^{n} a_{i,\sigma(i)}$, over all permutations $\sigma$ of the indices, where $a_{i,j}:={\rm ord}_{x_{j}}P_{i}$, \emph{viz.}\ the \emph{tropical determinant of the matrix $(a_{i,j})$}. The order is precisely equal to $\mathcal{O}$ iff Jacobi's \emph{truncated determinant} does not vanish. Jacobi also gave a polynomial-time algorithm to compute $\mathcal{O}$, similar to Kuhn's ``Hungarian method'' and to some variants of shortest-path algorithms, related to the computation of integers $\ell_{i}$ such that a normal form may be obtained, in the generic case, by differentiating each equation $P_{i}$ $\ell_{i}$ times. Fundamental results about changes of orderings and the various normal forms a system may have, including differential resolvents, are also provided.
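For illustration, here is a brute-force computation of the tropical determinant (Jacobi's algorithm, like the Hungarian method, computes the same quantity in polynomial time, as does, e.g., scipy.optimize.linear_sum_assignment with maximize=True):

```python
from itertools import permutations

def tropical_determinant(A):
    """Jacobi's bound: the maximum over permutations sigma of sum_i A[i][sigma(i)].
    Brute force for clarity only; polynomial-time algorithms exist (see above)."""
    n = len(A)
    return max(sum(A[i][sigma[i]] for i in range(n))
               for sigma in permutations(range(n)))

# A[i][j] = ord_{x_j} P_i, the order of equation P_i in the unknown x_j.
A = [[2, 1],
     [0, 3]]
print(tropical_determinant(A))  # -> 5, attained by the identity permutation
```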
A key component of mathematical reasoning is the ability to formulate interesting conjectures about the problem domain at hand. In this paper, we give a brief overview of a theory exploration system called QuickSpec, which can automatically discover interesting conjectures about a given set of functions. QuickSpec works by interleaving term generation with random testing to form candidate conjectures. This is made tractable by starting from small term sizes and by considering only terms that are irreducible with respect to already discovered conjectures. QuickSpec has been successfully applied to generate lemmas for automated inductive theorem proving as well as to generate specifications of functional programs. We give an overview of typical use cases of QuickSpec and demonstrate how to easily connect it to a theorem prover of the user's choice.
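To convey the flavor of this loop, here is an illustrative Python analogue of ours (QuickSpec itself is a Haskell tool; for brevity this sketch also omits the irreducibility-based pruning that makes the real system tractable):

```python
import random
from itertools import product

def discover_conjectures(funcs, max_depth=2, n_tests=50, seed=0):
    """Enumerate small terms over the given binary functions, fingerprint them
    on random inputs, and conjecture that terms with equal fingerprints are equal."""
    rng = random.Random(seed)
    tests = [(rng.randint(-50, 50), rng.randint(-50, 50)) for _ in range(n_tests)]
    terms = {"x": lambda x, y: x, "y": lambda x, y: y}  # the depth-0 terms
    for _ in range(max_depth):
        new = {}
        for name, f in funcs.items():
            for (s, g), (t, h) in product(terms.items(), repeat=2):
                new[f"{name}({s}, {t})"] = \
                    lambda x, y, f=f, g=g, h=h: f(g(x, y), h(x, y))
        terms.update(new)
    buckets = {}
    for expr, f in terms.items():
        fingerprint = tuple(f(x, y) for x, y in tests)
        buckets.setdefault(fingerprint, []).append(expr)
    return [exprs for exprs in buckets.values() if len(exprs) > 1]

for law in discover_conjectures({"add": lambda a, b: a + b}, max_depth=1):
    print(" = ".join(law))  # -> add(x, y) = add(y, x)
```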
This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook confounding effects, and hence their estimation error can be substantial. We therefore propose an alternative way to construct estimators such that this error is greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical and the proposed estimators in estimating the causal quantities. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural-network-based models, under different simulated datasets that exhibit different levels of causality, degrees of nonlinearity, and distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly substantial when the causal effects are correctly accounted for.
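The abstract-level description leaves the construction open; purely as an illustration of why ignoring confounding misleads classical estimators, the synthetic sketch below contrasts a naive difference in means with inverse-propensity weighting, one standard confounding-aware estimator (not necessarily the one proposed here):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 100_000
credit = rng.normal(size=n)                                  # confounder
approve = rng.binomial(1, 1 / (1 + np.exp(-2 * credit)))     # lender's decision
repay = 0.5 * credit + 1.0 * approve + rng.normal(size=n)    # true effect = 1.0

# Naive estimator: biased upward, since approved borrowers are more creditworthy.
naive = repay[approve == 1].mean() - repay[approve == 0].mean()

# Inverse-propensity weighting: reweight observations to undo the confounding.
e = LogisticRegression().fit(credit[:, None], approve).predict_proba(credit[:, None])[:, 1]
ipw = np.mean(approve * repay / e) - np.mean((1 - approve) * repay / (1 - e))

print(f"naive: {naive:.2f}, IPW: {ipw:.2f}, truth: 1.00")  # naive is far off
```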
Over the past few years, we have seen fundamental breakthroughs in core problems in machine learning, largely driven by advances in deep neural networks. At the same time, the amount of data collected in a wide array of scientific domains is dramatically increasing in both size and complexity. Taken together, this suggests many exciting opportunities for deep learning applications in scientific settings. But a significant challenge is simply knowing where to start. The sheer breadth and diversity of deep learning techniques makes it difficult to determine which scientific problems might be most amenable to these methods, or which specific combination of methods might offer the most promising first approach. In this survey, we focus on addressing this central issue by providing an overview of many widely used deep learning models, spanning visual, sequential and graph-structured data, their associated tasks and different training methods, along with techniques for using deep learning with less data and for better interpreting these complex models: two central considerations for many scientific use cases. We also include overviews of the full design process, implementation tips, and links to a wealth of tutorials, research summaries, and open-sourced deep learning pipelines and pretrained models developed by the community. We hope that this survey will help accelerate the use of deep learning across different scientific domains.
Neural networks have succeeded in many reasoning tasks. Empirically, these tasks require specialized network structures: for example, Graph Neural Networks (GNNs) perform well on many such tasks while less structured networks fail. Theoretically, there is limited understanding of why and when one network structure generalizes better than another, even when they have equal expressive power. In this paper, we develop a framework to characterize which reasoning tasks a network can learn well, by studying how well its computation structure aligns with the algorithmic structure of the relevant reasoning process. We formally define this algorithmic alignment and derive a sample complexity bound that decreases with better alignment. This framework offers an explanation for the empirical success of popular reasoning models and suggests their limitations. As an example, we unify seemingly different reasoning tasks, such as intuitive physics, visual question answering, and shortest paths, through the lens of a powerful algorithmic paradigm, dynamic programming (DP). We show that GNNs align with DP and thus are expected to solve these tasks. Empirical results on several reasoning tasks support our theory.
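The alignment can be made concrete with a minimal sketch of ours: one round of Bellman-Ford relaxation and one round of GNN message passing share the same computation structure, differing only in whether the aggregation and message functions are fixed or learned.

```python
def bellman_ford_step(d, in_edges):
    """One DP relaxation: d[v] <- min(d[v], min over incoming (u, w) of d[u] + w)."""
    return {v: min([d[v]] + [d[u] + w for u, w in inc])
            for v, inc in in_edges.items()}

def gnn_step(h, in_edges, message, aggregate):
    """One message-passing round: h[v] <- AGG(h[v], {message(h[u], w)})."""
    return {v: aggregate([h[v]] + [message(h[u], w) for u, w in inc])
            for v, inc in in_edges.items()}

# in_edges[v] lists the incoming edges (u, weight) of v.
in_edges = {"s": [], "a": [("s", 2)], "b": [("s", 5), ("a", 1)]}
d = {"s": 0, "a": float("inf"), "b": float("inf")}
for _ in range(2):
    d = bellman_ford_step(d, in_edges)
print(d)  # -> {'s': 0, 'a': 2, 'b': 3}
# Choosing aggregate=min and message=lambda hu, w: hu + w makes gnn_step
# reproduce bellman_ford_step exactly; a GNN need only learn these functions.
```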