Version Control Systems, such as Git and Mercurial, manage the history of a project as a Directed Acyclic Graph encoding the various divergences and synchronizations happening in its life cycle. A popular workflow in the industry, called the feature branch workflow, constrains these graphs to be of a particular shape: a unique main branch, and non-interfering feature branches. Here we focus on the uniform random generation of those graphs with n vertices, including k on the main branch, for which we provide three algorithms, for three different use-cases. The first, based on rejection, is efficient when aiming for small values of k (more precisely whenever k = O($\sqrt$ n)). The second takes as input any number k of commits in the main branch, but requires costly precalculation. The last one is a Boltzmann generator and enables us to generate very large graphs while targeting a constant k/n ratio. All these algorithms are linear in the size of their outputs.
The explosive growth of online content demands robust Natural Language Processing (NLP) techniques that can capture nuanced meanings and cultural context across diverse languages. Semantic Textual Relatedness (STR) goes beyond superficial word overlap, considering linguistic elements and non-linguistic factors like topic, sentiment, and perspective. Despite its pivotal role, prior NLP research has predominantly focused on English, limiting its applicability across languages. Addressing this gap, our paper dives into capturing deeper connections between sentences beyond simple word overlap. Going beyond English-centric NLP research, we explore STR in Marathi, Hindi, Spanish, and English, unlocking the potential for information retrieval, machine translation, and more. Leveraging the SemEval-2024 shared task, we explore various language models across three learning paradigms: supervised, unsupervised, and cross-lingual. Our comprehensive methodology gains promising results, demonstrating the effectiveness of our approach. This work aims to not only showcase our achievements but also inspire further research in multilingual STR, particularly for low-resourced languages.
Language models trained on internet-scale data sets have shown an impressive ability to solve problems in Natural Language Processing and Computer Vision. However, experience is showing that these models are frequently brittle in unexpected ways, and require significant scaffolding to ensure that they operate correctly in the larger systems that comprise "language-model agents." In this paper, we argue that behavior trees provide a unifying framework for combining language models with classical AI and traditional programming. We introduce Dendron, a Python library for programming language model agents using behavior trees. We demonstrate the approach embodied by Dendron in three case studies: building a chat agent, a camera-based infrastructure inspection agent for use on a mobile robot or vehicle, and an agent that has been built to satisfy safety constraints that it did not receive through instruction tuning or RLHF.
We study the CHAIN communication problem introduced by Cormode et al. [ICALP 2019]. It is a generalization of the well-studied INDEX problem. For $k\geq 1$, in CHAIN$_{n,k}$, there are $k$ instances of INDEX, all with the same answer. They are shared between $k+1$ players as follows. Player 1 has the first string $X^1 \in \{0,1\}^n$, player 2 has the first index $\sigma^1 \in [n]$ and the second string $X^2 \in \{0,1\}^n$, player 3 has the second index $\sigma^2 \in [n]$ along with the third string $X^3 \in \{0,1\}^n$, and so on. Player $k+1$ has the last index $\sigma^k \in [n]$. The communication is one way from each player to the next, starting from player 1 to player 2, then from player 2 to player 3 and so on. Player $k+1$, after receiving the message from player $k$, has to output a single bit which is the answer to all $k$ instances of INDEX. It was proved that the CHAIN$_{n,k}$ problem requires $\Omega(n/k^2)$ communication by Cormode et al., and they used it to prove streaming lower bounds for approximation of maximum independent sets. Subsequently, it was used by Feldman et al. [STOC 2020] to prove lower bounds for streaming submodular maximization. However, these works do not get optimal bounds on the communication complexity of CHAIN$_{n,k}$, and in fact, it was conjectured by Cormode et al. that $\Omega(n)$ bits are necessary, for any $k$. As our main result, we prove the optimal lower bound of $\Omega(n)$ for CHAIN$_{n,k}$. This settles the open conjecture of Cormode et al. in the affirmative. The key technique is to use information theoretic tools to analyze protocols over the Jensen-Shannon divergence measure, as opposed to total variation distance. As a corollary, we get an improved lower bound for approximation of maximum independent set in vertex arrival streams through a reduction from CHAIN directly.
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success of LLMs has led to a large influx of research contributions in this direction. These works encompass diverse topics such as architectural innovations, better training strategies, context length improvements, fine-tuning, multi-modal LLMs, robotics, datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Considering the rapidly emerging plethora of literature on LLMs, it is imperative that the research community is able to benefit from a concise yet comprehensive overview of the recent developments in this field. This article provides an overview of the existing literature on a broad range of LLM-related concepts. Our self-contained comprehensive overview of LLMs discusses relevant background concepts along with covering the advanced topics at the frontier of research in LLMs. This review article is intended to not only provide a systematic survey but also a quick comprehensive reference for the researchers and practitioners to draw insights from extensive informative summaries of the existing works to advance the LLM research.
Dynamic networks represent the complex and evolving interrelationships between real-world entities. Given the scale and variability of these networks, finding an optimal slicing interval is essential for meaningful analysis. Nonuniform timeslicing, which adapts to density changes within the network, is drawing attention as a solution to this problem. In this research, we categorized existing algorithms into two domains -- data mining and visualization -- according to their approach to the problem. Data mining approach focuses on capturing temporal patterns of dynamic networks, while visualization approach emphasizes lessening the burden of analysis. We then introduce a novel nonuniform timeslicing method that synthesizes the strengths of both approaches, demonstrating its efficacy with a real-world data. The findings suggest that combining the two approaches offers the potential for more effective network analysis.
A major challenge when describing the origin of life is to explain how instructional information control systems emerge naturally and spontaneously from mere molecular dynamics. So far, no one has clarified how information control emerged ab initio and how primitive control mechanisms in life might have evolved, becoming increasingly refined. Based on recent experimental results showing that chemical computation does not require the presence of life-related chemistry, we elucidate the origin and early evolution of information handling by chemical automata, from information processing (computation) to information storage (memory) and information transmission (communication). In contrast to other theories that assume the existence of initial complex structures, our narrative starts from trivial self-replicators whose interaction leads to the arising of more powerful molecular machines. By describing precisely the primordial transitions in chemistry-based computation, our metaphor is capable of explaining the above-mentioned gaps and can be translated to other models of computation, which allow us to explore biological phenomena at multiple spatial and temporal scales. At the end of our manuscript, we propose some ways to extend our ideas, including experimental validation of our theory (both in vitro and in silico).
Current Deep Neural Networks are vulnerable to adversarial examples, which alter their predictions by adding carefully crafted noise. Since human eyes are robust to such inputs, it is possible that the vulnerability stems from the standard way of processing inputs in one shot by processing every pixel with the same importance. In contrast, neuroscience suggests that the human vision system can differentiate salient features by (1) switching between multiple fixation points (saccades) and (2) processing the surrounding with a non-uniform external resolution (foveation). In this work, we advocate that the integration of such active vision mechanisms into current deep learning systems can offer robustness benefits. Specifically, we empirically demonstrate the inherent robustness of two active vision methods - GFNet and FALcon - under a black box threat model. By learning and inferencing based on downsampled glimpses obtained from multiple distinct fixation points within an input, we show that these active methods achieve (2-3) times greater robustness compared to a standard passive convolutional network under state-of-the-art adversarial attacks. More importantly, we provide illustrative and interpretable visualization analysis that demonstrates how performing inference from distinct fixation points makes active vision methods less vulnerable to malicious inputs.
The International Phonetic Alphabet (IPA) serves to systematize phonemes in language, enabling precise textual representation of pronunciation. In Bengali phonology and phonetics, ongoing scholarly deliberations persist concerning the IPA standard and core Bengali phonemes. This work examines prior research, identifies current and potential issues, and suggests a framework for a Bengali IPA standard, facilitating linguistic analysis and NLP resource creation and downstream technology development. In this work, we present a comprehensive study of Bengali IPA transcription and introduce a novel IPA transcription framework incorporating a novel dataset with DL-based benchmarks.
Deep Convolutional Neural Networks (CNNs) are a special type of Neural Networks, which have shown state-of-the-art results on various competitive benchmarks. The powerful learning ability of deep CNN is largely achieved with the use of multiple non-linear feature extraction stages that can automatically learn hierarchical representation from the data. Availability of a large amount of data and improvements in the hardware processing units have accelerated the research in CNNs and recently very interesting deep CNN architectures are reported. The recent race in deep CNN architectures for achieving high performance on the challenging benchmarks has shown that the innovative architectural ideas, as well as parameter optimization, can improve the CNN performance on various vision-related tasks. In this regard, different ideas in the CNN design have been explored such as use of different activation and loss functions, parameter optimization, regularization, and restructuring of processing units. However, the major improvement in representational capacity is achieved by the restructuring of the processing units. Especially, the idea of using a block as a structural unit instead of a layer is gaining substantial appreciation. This survey thus focuses on the intrinsic taxonomy present in the recently reported CNN architectures and consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature map exploitation, channel boosting and attention. Additionally, it covers the elementary understanding of the CNN components and sheds light on the current challenges and applications of CNNs.
Deep Convolutional Neural Networks have pushed the state-of-the art for semantic segmentation provided that a large amount of images together with pixel-wise annotations is available. Data collection is expensive and a solution to alleviate it is to use transfer learning. This reduces the amount of annotated data required for the network training but it does not get rid of this heavy processing step. We propose a method of transfer learning without annotations on the target task for datasets with redundant content and distinct pixel distributions. Our method takes advantage of the approximate content alignment of the images between two datasets when the approximation error prevents the reuse of annotation from one dataset to another. Given the annotations for only one dataset, we train a first network in a supervised manner. This network autonomously learns to generate deep data representations relevant to the semantic segmentation. Then the images in the new dataset, we train a new network to generate a deep data representation that matches the one from the first network on the previous dataset. The training consists in a regression between feature maps and does not require any annotations on the new dataset. We show that this method reaches performances similar to a classic transfer learning on the PASCAL VOC dataset with synthetic transformations.