This paper discusses the formalization of proofs "by diagram chasing", a standard technique for proving properties in abelian categories. We explain how the essence of diagram chases can be captured by a simple many-sorted first-order theory, and we study the models and decidability of this theory. The longer-term motivation of this work is the design of a computer-aided tool for writing reliable proofs in homological algebra, based on interactive theorem provers.
This work considers the asymptotic behavior of the distance between two sample covariance matrices (SCMs). A general result is provided for a class of functionals that can be expressed as sums of traces of functions that are separately applied to each covariance matrix. This class includes conventional metrics, such as the Euclidean distance or the Jeffreys divergence, as well as a number of more sophisticated distances recently derived from Riemannian geometry considerations, such as the log-Euclidean metric. We analyze the asymptotic behavior of this class of functionals by establishing a central limit theorem that describes their asymptotic statistical law. In order to account for the fact that the sample sizes of the two SCMs are of the same order of magnitude as the observation dimension, results are provided under the assumption that these parameters grow to infinity while their quotients converge to fixed quantities. Numerical results illustrate how this type of result can be used to predict the performance of these metrics in practical machine learning algorithms, such as clustering of SCMs.
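As a minimal illustration of the metrics mentioned above, the following Python sketch computes the squared Euclidean distance, the Jeffreys divergence, and the squared log-Euclidean distance between two SCMs; the dimensions, data generation, and function names are purely illustrative assumptions and are not the setting analyzed in the paper.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 200                          # observation dimension, sample size
X1 = rng.standard_normal((m, n))        # observations of the first population
X2 = rng.standard_normal((m, n))        # observations of the second population

R1 = X1 @ X1.T / n                      # sample covariance matrices (SCMs)
R2 = X2 @ X2.T / n

def spd_log(R):
    """Matrix logarithm of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(R)
    return (V * np.log(w)) @ V.T

# Squared Euclidean distance: tr[(R1 - R2)^2]
d_euclid = np.trace((R1 - R2) @ (R1 - R2))

# Jeffreys (symmetrized KL) divergence between zero-mean Gaussian models
d_jeffreys = 0.5 * (np.trace(np.linalg.solve(R1, R2))
                    + np.trace(np.linalg.solve(R2, R1))) - m

# Squared log-Euclidean distance: ||log R1 - log R2||_F^2
L = spd_log(R1) - spd_log(R2)
d_logeuclid = np.trace(L @ L)

print(d_euclid, d_jeffreys, d_logeuclid)
\end{verbatim}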
This paper explores the intricacies of traffic behavior at unsignalized intersections through the lens of a novel dataset that combines manual video data labeling with advanced traffic simulation in SUMO. The research involved recording traffic at several unsignalized intersections in Memphis, TN, at different times of the day. After manually labeling the video data to capture specific variables, we reconstructed the traffic scenarios in the SUMO simulation environment. The output data from these simulations enabled a comprehensive analysis, including time-space diagrams of vehicle movement, travel time frequency distributions, and speed-position plots for identifying bottleneck points. This approach enhances our understanding of traffic dynamics and provides crucial insights for effective traffic management and infrastructure improvements.
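To make the analysis pipeline concrete, the following Python sketch shows one plausible way to build a time-space diagram from SUMO's floating-car-data (FCD) output; the file name fcd.xml, the simulation command, and the single-lane position assumption are illustrative and not taken from the paper.
\begin{verbatim}
import xml.etree.ElementTree as ET
from collections import defaultdict
import matplotlib.pyplot as plt

# Assumes the scenario was run with something like:
#   sumo -c intersection.sumocfg --fcd-output fcd.xml
trajectories = defaultdict(list)   # vehicle id -> list of (time, lane position)
for timestep in ET.parse("fcd.xml").getroot().iter("timestep"):
    t = float(timestep.get("time"))
    for veh in timestep.iter("vehicle"):
        trajectories[veh.get("id")].append((t, float(veh.get("pos"))))

# Time-space diagram: one trajectory per vehicle
for points in trajectories.values():
    times, positions = zip(*sorted(points))
    plt.plot(times, positions, linewidth=0.5)
plt.xlabel("time [s]")
plt.ylabel("position along lane [m]")
plt.show()
\end{verbatim}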
This paper establishes the equivalence between Local Differential Privacy (LDP) and a global limit on learning any knowledge about an object. However, an output from an LDP query does not necessarily provide an amount of knowledge equal to the upper bound of the learning limit. Since the amount of knowledge gained should be proportional to the incurred privacy loss, the traditional approach of using the DP guarantee to measure privacy loss can occasionally overestimate the actual privacy loss. This is especially problematic in privacy accounting under LDP, where privacy loss is computed by accumulating the DP guarantees. To address this issue, this paper introduces the concept of \textit{realized privacy loss}, which measures the actual knowledge gained by the analyst after a query, as a more accurate measure of privacy loss. The realized privacy loss is integrated into the privacy accounting of fully adaptive composition, where an adversary adaptively selects queries based on previous results. A Bayesian Privacy Filter is implemented to continually accept queries until the realized privacy loss of the composed queries equals the DP guarantee of the composition, allowing full utilization of the privacy budget. Tracking the realized privacy loss during the composition is achieved through a Bayesian Privacy Odometer, and the gap between the privacy budget and the realized privacy loss measures the leeway the DP guarantee leaves for future queries. A branch-and-bound method is devised to enable the Bayesian Privacy Filter to safeguard objects with continuous values. The Bayesian Privacy Filter is proven to be at least as efficient as the basic composition, and more efficient if the queries are privacy-loss compactible. Experimental results indicate that the Bayesian Privacy Filter outperforms the basic composition by a factor of one to four when composing linear and logistic regressions.
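The accept-until-exhausted mechanics of such a filter can be sketched schematically as follows; this is only an illustration of charging realized privacy loss against a budget, with realized_loss supplied by the caller as a placeholder, and it does not implement the Bayesian posterior computation or the branch-and-bound method described above.
\begin{verbatim}
class PrivacyFilterSketch:
    """Schematic filter/odometer: accept adaptively chosen queries as long as
    the accumulated realized privacy loss stays within the DP guarantee."""

    def __init__(self, epsilon_budget):
        self.budget = epsilon_budget
        self.spent = 0.0                      # odometer: realized loss so far

    def try_query(self, query_epsilon, realized_loss):
        # realized_loss is a placeholder for the knowledge actually gained by
        # the analyst, assumed not to exceed the query's DP guarantee.
        charge = min(realized_loss, query_epsilon)
        if self.spent + charge > self.budget:
            return False                      # reject: budget would be exceeded
        self.spent += charge                  # accept and record the loss
        return True

    def remaining(self):
        return self.budget - self.spent

flt = PrivacyFilterSketch(epsilon_budget=1.0)
print(flt.try_query(query_epsilon=0.5, realized_loss=0.2), flt.remaining())
print(flt.try_query(query_epsilon=0.5, realized_loss=0.4), flt.remaining())
\end{verbatim}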
Variable curvature modeling tools provide an accurate means of controlling deformable bodies and structures with infinitely many degrees of freedom. However, their forward and inverse Newton-Euler dynamics incur high computational costs. Assuming piecewise constant strains across discretized Cosserat rods imposed on the soft material, we introduce a composite two time-scale singularly perturbed nonlinear backstepping control scheme to alleviate the long computational times of the recursive Newton-Euler dynamics for soft structures. Our contribution is three-pronged: (i) we decompose the system's Newton-Euler dynamics into two coupled sub-dynamics by introducing a perturbation parameter; (ii) we prescribe a set of stabilizing controllers for regulating each subsystem's dynamics; and (iii) we study the interconnected singularly perturbed system and analyze its stability.
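For reference, the two time-scale decomposition in (i) follows the standard singular perturbation form (the symbols below are generic placeholders, not the paper's specific soft-structure dynamics):
\[
\dot{x} = f(x, z, u, \varepsilon), \qquad \varepsilon\,\dot{z} = g(x, z, u, \varepsilon),
\]
where $x$ collects the slow states, $z$ the fast states, $u$ the control input, and $\varepsilon > 0$ is the perturbation parameter. Setting $\varepsilon = 0$ yields a reduced slow model and a boundary-layer fast model, each of which receives its own stabilizing controller as in (ii), before the interconnection of the two subsystems is analyzed for stability as in (iii).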
This article presents a general solution to the problem of computational complexity. First, it gives a historical introduction to the problem since the revival of the foundational problems of mathematics at the end of the 19th century. Second, building on the theory of functional relations in mathematics, it provides a theoretical framework in which we can rigorously distinguish two pairs of concepts: between solving a problem and verifying a solution to a problem, and between a deterministic and a non-deterministic model of computation. Third, it presents the theory of computational complexity and the difficulties in solving the P versus NP problem. Finally, it gives a complete proof that a certain decision problem in NP has an exponential algorithmic lower bound, thus firmly establishing that P is different from NP. The proof presents a new way of approaching the subject: neither by entering into the unmanageable difficulties of proving this type of lower bound for the known NP-complete problems, nor by entering into the difficulties regarding the properties of the many complexity classes established since the mid-1970s.
This paper surveys research in the quickly advancing field of instruction tuning (IT), a crucial technique for enhancing the capabilities and controllability of large language models (LLMs). Instruction tuning refers to the process of further training LLMs on a dataset of \textsc{(instruction, output)} pairs in a supervised fashion, which bridges the gap between the next-word prediction objective of LLMs and the users' objective of having LLMs adhere to human instructions. In this work, we present a systematic review of the literature, including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities and domains, along with an analysis of aspects that influence the outcome of IT (e.g., the generation of instruction outputs and the size of the instruction dataset). We also review the potential pitfalls of IT and the criticism against it, point out current deficiencies of existing strategies, and suggest some avenues for fruitful research.
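As a minimal, hypothetical sketch of the supervised (instruction, output) training step described above, the following Python snippet masks the instruction tokens and computes a next-token loss on the output tokens only; the prompt template, the small base model, and the example pair are illustrative assumptions, not the recipe of any particular surveyed method.
\begin{verbatim}
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tok = AutoTokenizer.from_pretrained("gpt2")          # small model for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")

example = {"instruction": "List three primary colors.",
           "output": "Red, yellow, and blue."}
prompt = f"### Instruction:\n{example['instruction']}\n### Response:\n"
full_text = prompt + example["output"] + tok.eos_token

enc = tok(full_text, return_tensors="pt")
labels = enc["input_ids"].clone()
prompt_len = len(tok(prompt)["input_ids"])
labels[:, :prompt_len] = -100        # ignore the loss on the instruction tokens

loss = model(**enc, labels=labels).loss   # next-token loss on the output only
loss.backward()                           # gradient for one fine-tuning step
\end{verbatim}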
Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models propose new architectures or tweak existing ones, refine training strategies, increase context length, use high-quality training data, and increase training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey comprehensively analyzes LLM architectures and their categorization, training strategies, training datasets, and performance evaluations, and discusses future research directions. Moreover, the paper discusses the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to regularly update this paper by incorporating new sections and featuring the latest LLM models.
With the rapid development of deep learning, training Big Models (BMs) for multiple downstream tasks has become a popular paradigm. Researchers have achieved various outcomes in the construction of BMs and their application in many fields. At present, there is a lack of research that sorts out the overall progress of BMs and guides follow-up research. In this paper, we cover not only the BM technologies themselves but also the prerequisites for BM training and the applications of BMs, dividing the review into four parts: Resource, Models, Key Technologies, and Application. We introduce 16 specific BM-related topics across these four parts: Data, Knowledge, Computing System, Parallel Training System, Language Model, Vision Model, Multi-modal Model, Theory&Interpretability, Commonsense Reasoning, Reliability&Security, Governance, Evaluation, Machine Translation, Text Generation, Dialogue, and Protein Research. For each topic, we clearly summarize the current studies and propose some future research directions. At the end of the paper, we discuss the further development of BMs from a more general viewpoint.
This paper offers a comprehensive review of research on Natural Language Generation (NLG) over the past two decades, especially in relation to data-to-text and text-to-text generation deep learning methods, as well as new applications of NLG technology. The survey aims to (a) give the latest synthesis of deep learning research on the core NLG tasks, as well as the architectures adopted in the field; (b) detail meticulously and comprehensively the various NLG tasks and datasets, and draw attention to the challenges in NLG evaluation, focusing on different evaluation methods and their relationships; and (c) highlight some areas of future emphasis and relatively recent research issues that arise due to the increasing synergy between NLG and other artificial intelligence areas, such as computer vision, text, and computational creativity.
Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. For example, we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and applying the chain rule of MI to the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring a modest chunk of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI that can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and that it learns better representations in a vision domain and for dialogue generation.
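The decomposition underlying DEMI can be written generically as the chain rule of mutual information (the notation below is generic and only mirrors the description above):
\[
I(x; y) = I(x; y_1) + \sum_{k=2}^{K} I\left(x; y_k \mid y_1, \ldots, y_{k-1}\right),
\]
where the view $y$ is split into progressively more informed subviews $y_1, \ldots, y_K$. Each unconditional or conditional term accounts for only a modest chunk of the total MI, so each is more amenable to a contrastive lower bound, and the sum of these bounds is what DEMI maximizes.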