The Bar-Hillel construction is a classic result in formal language theory. It shows, by a simple construction, that the intersection of a context-free language and a regular language is itself context-free. In the construction, the regular language is specified by a finite-state automaton. However, neither the original construction (Bar-Hillel et al., 1961) nor its weighted extension (Nederhof and Satta, 2003) can handle finite-state automata with $\varepsilon$-arcs. While it is possible to remove $\varepsilon$-arcs from a finite-state automaton efficiently without changing the language it accepts, such an operation modifies the automaton's set of paths. We give a construction that generalizes the Bar-Hillel construction to the case where the input automaton contains $\varepsilon$-arcs, and we further prove that our generalized construction yields a grammar that encodes the structure of both the input automaton and the input grammar while retaining the asymptotic size of the original construction.
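For orientation, the following is a minimal, unweighted sketch of the classical triple construction for an $\varepsilon$-free automaton and a grammar in Chomsky normal form; the data representations and the function name are illustrative assumptions rather than the paper's notation.

```python
# Minimal sketch of the classical (unweighted) Bar-Hillel construction for
# intersecting a CFG in Chomsky normal form with an epsilon-free FSA.
# Representations and names are illustrative, not the paper's notation.

def bar_hillel(binary_rules, terminal_rules, arcs, states, start, initial, finals):
    """binary_rules: list of (A, B, C) for productions A -> B C
       terminal_rules: list of (A, a) for productions A -> a
       arcs: list of (q, a, r) for FSA transitions q --a--> r"""
    rules = []
    # A -> B C  yields  (q, A, r) -> (q, B, s) (s, C, r) for all states q, s, r.
    for (A, B, C) in binary_rules:
        for q in states:
            for s in states:
                for r in states:
                    rules.append(((q, A, r), [(q, B, s), (s, C, r)]))
    # A -> a  yields  (q, A, r) -> a whenever the FSA has an arc q --a--> r.
    for (A, a) in terminal_rules:
        for (q, b, r) in arcs:
            if a == b:
                rules.append(((q, A, r), [a]))
    # A fresh start symbol rewrites to (q0, S, qf) for every final state qf.
    for qf in finals:
        rules.append(("S'", [(initial, start, qf)]))
    return rules
```

The nonterminals of the intersection grammar are triples (state, nonterminal, state), which is why the result simultaneously records a derivation of the grammar and a path of the automaton.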
Conductivity reconstruction in an inverse eddy current problem is considered in the present paper. Given electric field measurements on part of the domain boundary, we formulate the reconstruction problem as a constrained optimization problem with total variation regularization. Existence and stability are proved for the solution of the optimization problem. The finite element method is employed to discretize the optimization problem, and the gradient Lipschitz property of the objective functional is established for the discrete optimization problem. We propose an alternating direction method of multipliers (ADMM) to solve the discrete problem and, based on the gradient Lipschitz property, prove its convergence by extending the admissible set to the whole finite element space. Finally, we present numerical experiments to illustrate the efficiency of the proposed methods.
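To make the algorithmic skeleton concrete, below is a generic ADMM iteration for a one-dimensional TV-regularized least-squares model problem; it only sketches the splitting-and-shrinkage structure and is not the paper's eddy current objective, and all parameter values and names are illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, kappa):
    # Elementwise shrinkage operator used in the z-update.
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_tv_denoise(b, lam=1.0, rho=1.0, iters=200):
    """Generic ADMM for  min_x 0.5*||x - b||^2 + lam*||D x||_1,
    where D is the 1-D forward-difference operator (a stand-in for a
    discretized forward operator plus total variation term)."""
    n = b.size
    D = np.diff(np.eye(n), axis=0)           # forward-difference matrix, (n-1) x n
    A = np.eye(n) + rho * D.T @ D            # x-update system matrix
    x = b.copy()
    z = D @ x
    u = np.zeros(n - 1)                      # scaled dual variable
    for _ in range(iters):
        x = np.linalg.solve(A, b + rho * D.T @ (z - u))   # x-update (linear solve)
        z = soft_threshold(D @ x + u, lam / rho)          # z-update (shrinkage)
        u = u + D @ x - z                                 # dual update
    return x
```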
This paper is a collection of results on combinatorial properties of codes for the Z-channel. A Z-channel with error fraction $\tau$ takes as input a length-$n$ binary codeword and injects, in an adversarial manner, up to $n\tau$ asymmetric errors, i.e., errors that only zero out bits but never flip $0$'s to $1$'s. It is known that the largest $(L-1)$-list-decodable code for the Z-channel with error fraction $\tau$ has exponential size (in $n$) if $\tau$ is less than a critical value that we call the $(L-1)$-list-decoding Plotkin point, and has constant size if $\tau$ is larger than this threshold. The $(L-1)$-list-decoding Plotkin point is known to be $ L^{-\frac{1}{L-1}} - L^{-\frac{L}{L-1}} $, which equals $1/4$ for unique decoding, i.e., $ L-1=1 $. In this paper, we derive various results on the size of the largest codes above and below the list-decoding Plotkin point. In particular, we show that the largest $(L-1)$-list-decodable code $\epsilon$-above the Plotkin point, for any sufficiently small positive constant $ \epsilon>0 $, has size $\Theta_L(\epsilon^{-3/2})$ for any $L-1\ge1$. We also devise upper and lower bounds on the exponential size of codes below the list-decoding Plotkin point.
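For concreteness, instantiating the stated formula at $L=2$ (unique decoding) recovers the classical Plotkin point:
\[
\tau^{*}(2) \;=\; 2^{-\frac{1}{1}} - 2^{-\frac{2}{1}} \;=\; \frac{1}{2} - \frac{1}{4} \;=\; \frac{1}{4}.
\]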
Automated Driving Systems (ADS) hold great potential to increase safety, mobility, and equity. However, without public acceptance, none of these promises can be fulfilled. To engender public trust, many entities in the ADS community participate in standards development organizations (SDOs) with the goal of enhancing safety for the entire industry through a collaborative approach. The breadth and depth of the ADS safety standardization landscape are vast and constantly changing, as is often the case for novel technologies in rapid evolution. The pace of development of the ADS industry makes it hard for the public and interested parties to keep track of ongoing SDO efforts, including the topics touched by each standard and the committees addressing each topic, as well as to make sense of the wealth of documentation produced. Therefore, the authors present here a simplified framework for abstracting and organizing the current landscape of ADS safety standards into high-level, long-term themes. This framework is then used to develop and organize associated research questions that have not yet reached widely adopted industry positions, and to identify potential gaps where further research and standardization are needed.
Advances in information technology have increased the availability of time-stamped relational data, such as those produced by email exchanges or interactions on social media. Whereas the associated information flows could be aggregated into cross-sectional panels, the temporal ordering of the events frequently contains information that requires new models for the analysis of continuous-time interactions subject to both endogenous and exogenous influences. The introduction of the Relational Event Model (REM) has been a major development that has led to further methodological improvements, stimulated by the new questions that REMs made possible. In this review, we track the intellectual history of the REM, define its core properties, and discuss why and how it has been considered useful in empirical research. We describe how the demands of novel applications have stimulated methodological, computational, and inferential advancements.
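As a point of reference, the core of a relational event model is usually written as a history-dependent event rate; in one common formulation (the notation below is illustrative, not specific to this review), the rate of a directed event from sender $s$ to receiver $r$ at time $t$ is
\[
\lambda(s, r, t \mid \mathcal{H}_t) \;=\; \exp\!\bigl(\theta^{\top} u(s, r, \mathcal{H}_t)\bigr),
\]
where $\mathcal{H}_t$ is the history of events observed before $t$, $u(\cdot)$ is a vector of history-dependent statistics, and $\theta$ collects the parameters estimated from the observed event sequence.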
Parallel software codes in high performance computing (HPC) continue to grow in complexity and scale as we enter the exascale era. A diverse set of emerging hardware and programming paradigms makes developing, optimizing, and maintaining parallel software burdensome for developers. One way to alleviate some of these burdens is with automated development and analysis tools. Such tools can perform complex and/or remedial tasks for developers, increasing their productivity and decreasing the chance for error. So far, such tools for code development and performance analysis have been limited in the complexity of tasks they can perform. However, with recent advancements in language modeling and the wealth of code-related data now available online, these tools have started to utilize predictive language models to automate more complex tasks. In this paper, we show how large language models (LLMs) can be applied to tasks specific to high performance and scientific codes. We train LLMs using code and performance data specific to parallel codes. We compare several recent LLMs on HPC-related tasks and introduce a new model, HPC-Coder, trained on parallel code. In our experiments, we show that this model can auto-complete HPC functions where general models cannot, decorate for loops with OpenMP pragmas, and model performance changes in two scientific application repositories.
Car detection, particularly through camera vision, has become a major focus in the field of computer vision and has gained widespread adoption. While current car detection systems are capable of good detection, reliable detection can still be challenging due to factors such as proximity between cars, light intensity, and environmental visibility. To address these issues, we propose a cross-domain Car Detection Model with an integrated convolutional block Attention mechanism (CDMA) that we apply to car recognition for autonomous driving and other areas. CDMA includes several novelties: 1) building a complete cross-domain target detection framework; 2) developing an unpaired target-domain picture generation module with an integrated convolutional attention mechanism that specifically emphasizes the car headlight features; 3) adopting Generalized Intersection over Union (GIOU) as the loss function of the target detection framework; 4) designing an object detection model integrated with a two-headed Convolutional Block Attention Module (CBAM); and 5) utilizing an effective data augmentation method. To evaluate the model's effectiveness, we applied a resolution-reduction process to the data in the SSLAD dataset and used it as the benchmark dataset for our task. Experimental results show that the performance of the cross-domain car target detection model improves by 40% over the model without our framework, and our improvements have a significant impact on cross-domain car recognition.
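For reference, the Generalized IoU on which the loss in item 3) is based has a standard closed form for axis-aligned boxes; the sketch below follows that standard definition and is not the paper's implementation (the function names and box format are illustrative).

```python
def giou(box_a, box_b):
    """Generalized IoU for axis-aligned boxes given as (x1, y1, x2, y2).
    Standard definition: GIoU = IoU - |C \ (A u B)| / |C|,
    where C is the smallest box enclosing both A and B."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection area.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union
    # Smallest enclosing box C.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    area_c = cw * ch
    return iou - (area_c - union) / area_c

def giou_loss(box_a, box_b):
    # The corresponding regression loss is 1 - GIoU.
    return 1.0 - giou(box_a, box_b)
```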
Autonomic computing investigates how systems can achieve (user-)specified control outcomes on their own, without the intervention of a human operator. Autonomic computing fundamentals have been substantially influenced by those of control theory for closed- and open-loop systems. In practice, complex systems may exhibit a number of concurrent and inter-dependent control loops. Despite research into autonomic models for managing computer resources, ranging from individual resources (e.g., web servers) to resource ensembles (e.g., multiple resources within a data center), integrating Artificial Intelligence (AI) and Machine Learning (ML) to improve resource autonomy and performance at scale remains a fundamental challenge. The integration of AI/ML to achieve such autonomic and self-management of systems can occur at different levels of granularity, from full automation to human-in-the-loop automation. In this article, leading academics, researchers, practitioners, engineers, and scientists in the fields of cloud computing, AI/ML, and quantum computing join to discuss current research and potential future directions for these fields. Further, we discuss challenges and opportunities for leveraging AI and ML in next-generation computing for emerging computing paradigms, including cloud, fog, edge, serverless, and quantum computing environments.
Deep neural networks have achieved remarkable success in computer vision tasks. Existing neural networks mainly operate in the spatial domain with fixed input sizes. For practical applications, images are usually large and have to be downsampled to the predetermined input size of a neural network. Even though downsampling reduces computation and the required communication bandwidth, it removes both redundant and salient information indiscriminately, which results in accuracy degradation. Inspired by digital signal processing theory, we analyze the spectral bias from the frequency perspective and propose a learning-based frequency selection method to identify the trivial frequency components that can be removed without accuracy loss. The proposed method of learning in the frequency domain leverages identical structures of well-known neural networks, such as ResNet-50, MobileNetV2, and Mask R-CNN, while accepting frequency-domain information as the input. Experimental results show that learning in the frequency domain with static channel selection can achieve higher accuracy than the conventional spatial downsampling approach while further reducing the input data size. Specifically, for ImageNet classification with the same input size, the proposed method achieves 1.41% and 0.66% top-1 accuracy improvements on ResNet-50 and MobileNetV2, respectively. Even with half the input size, the proposed method still improves the top-1 accuracy on ResNet-50 by 1%. In addition, we observe a 0.8% average precision improvement on Mask R-CNN for instance segmentation on the COCO dataset.
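To illustrate the kind of frequency-domain input such a pipeline consumes, here is a minimal block-DCT rearrangement for a single-channel image; the block size, the selected channel indices, and the function name are illustrative assumptions, not the paper's exact preprocessing.

```python
import numpy as np
from scipy.fft import dctn

def blockwise_dct_channels(img, block=8):
    """Rearrange an HxW single-channel image into per-frequency channels:
    apply a 2-D DCT to each block x block tile, then group coefficients that
    share the same frequency index into one channel each, producing a
    (H/block, W/block, block*block) tensor.  A static channel-selection mask
    can then drop low-importance frequency channels before the CNN."""
    h, w = img.shape
    hb, wb = h // block, w // block
    out = np.empty((hb, wb, block * block), dtype=np.float64)
    for i in range(hb):
        for j in range(wb):
            tile = img[i * block:(i + 1) * block, j * block:(j + 1) * block]
            out[i, j, :] = dctn(tile, norm="ortho").ravel()
    return out

# Example usage (channel indices are hypothetical):
# img = np.random.rand(224, 224)
# freq = blockwise_dct_channels(img)   # shape (28, 28, 64)
# keep = [0, 1, 8, 9]                  # statically selected frequency channels
# reduced_input = freq[:, :, keep]     # smaller input fed to the network
```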
Contextual word representations derived from pre-trained bidirectional language models (biLMs) have recently been shown to provide significant improvements to the state of the art for a wide range of NLP tasks. However, many questions remain as to how and why these models are so effective. In this paper, we present a detailed empirical study of how the choice of neural architecture (e.g., LSTM, CNN, or self-attention) influences both end-task accuracy and qualitative properties of the representations that are learned. We show that there is a tradeoff between speed and accuracy, but all architectures learn high-quality contextual representations that outperform word embeddings on four challenging NLP tasks. Additionally, all architectures learn representations that vary with network depth, from exclusively morphological information at the word embedding layer, through local syntax in the lower contextual layers, to longer-range semantics such as coreference at the upper layers. Together, these results suggest that unsupervised biLMs, independent of architecture, are learning much more about the structure of language than previously appreciated.
Many natural language processing tasks rely solely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies through soft probabilities between every pair of tokens, but they are neither effective nor efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for their mutual benefit. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", which selects tokens in parallel and is trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", based solely on ReSA. It achieves state-of-the-art performance on both the Stanford Natural Language Inference (SNLI) and Sentences Involving Compositional Knowledge (SICK) datasets.
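As a rough illustration of combining a parallel hard token-selection step with soft self-attention (a simplification, not the paper's RSS/ReSA architecture; all names, shapes, and the sampling scheme are illustrative assumptions):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def masked_self_attention(H, keep_logits, rng):
    """Hard token selection followed by soft self-attention over kept tokens.
    `keep_logits` parameterize independent per-token keep/drop decisions,
    sampled in parallel (in a full model this policy would be trained with
    policy gradient using downstream reward).  H has shape (n, d)."""
    n, d = H.shape
    keep_prob = 1.0 / (1.0 + np.exp(-keep_logits))     # (n,) keep probabilities
    keep = rng.random(n) < keep_prob                   # hard, parallel sampling
    scores = H @ H.T / np.sqrt(d)                      # (n, n) soft attention scores
    scores[:, ~keep] = -1e9                            # dropped tokens receive no attention
    A = softmax(scores, axis=-1)
    return A @ H, keep                                 # fused context, sampled mask

# Example usage:
# rng = np.random.default_rng(0)
# H = rng.standard_normal((6, 4))
# context, mask = masked_self_attention(H, np.zeros(6), rng)
```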