We present a novel multi-parent crossover operator in genetic algorithms (GAs) called ``Deep Neural Crossover'' (DNC). Unlike conventional GA crossover operators that rely on a random selection of parental genes, DNC leverages the capabilities of deep reinforcement learning (DRL) and an encoder-decoder architecture to select the genes. Specifically, we use DRL to learn a policy for selecting promising genes. The policy is stochastic, to maintain the stochastic nature of GAs, representing a distribution for selecting genes with a higher probability of improving fitness. Our architecture features a recurrent neural network (RNN) to encode the parental genomes into latent memory states, and a decoder RNN that utilizes an attention-based pointing mechanism to generate a distribution over the next selected gene in the offspring. To improve the training time, we present a pre-training approach, wherein the architecture is initially trained on a single problem within a specific domain and then applied to solving other problems of the same domain. We compare DNC to known operators from the literature over two benchmark domains -- bin packing and graph coloring. We compare with both two- and three-parent crossover, outperforming all baselines. DNC is domain-independent and can be easily applied to other problem domains.
In this work, we propose a novel discriminative framework for dexterous grasp generation, named Dexterous Grasp TRansformer (DGTR), capable of predicting a diverse set of feasible grasp poses by processing the object point cloud with only one forward pass. We formulate dexterous grasp generation as a set prediction task and design a transformer-based grasping model for it. However, we identify that this set prediction paradigm encounters several optimization challenges in the field of dexterous grasping and results in restricted performance. To address these issues, we propose progressive strategies for both the training and testing phases. First, the dynamic-static matching training (DSMT) strategy is presented to enhance the optimization stability during the training phase. Second, we introduce the adversarial-balanced test-time adaptation (AB-TTA) with a pair of adversarial losses to improve grasping quality during the testing phase. Experimental results on the DexGraspNet dataset demonstrate the capability of DGTR to predict dexterous grasp poses with both high quality and diversity. Notably, while keeping high quality, the diversity of grasp poses predicted by DGTR significantly outperforms previous works in multiple metrics without any data pre-processing. Codes are available at //github.com/iSEE-Laboratory/DGTR .
Mobile Edge Computing (MEC) has emerged as a solution to the high latency and suboptimal Quality of Experience (QoE) associated with Mobile Cloud Computing (MCC). By processing data near the source, MEC reduces the need to send information to distant data centers, resulting in faster response times and lower latency. This paper explores the differences between MEC and traditional cloud computing, emphasizing architecture, data flow, and resource allocation. Key technologies like Network Function Virtualization (NFV) and Software-Defined Networking (SDN) are discussed for their role in achieving scalability and flexibility. Additionally, security and privacy challenges are addressed, underscoring the need for robust frameworks. We conclude with an examination of various edge computing applications and suggest future research directions to enhance the effectiveness and adoption of MEC in the evolving technological landscape.
Our main result is a polynomial time algorithm for deciding realizability for the GXU sublogic of linear temporal logic. This logic is particularly suitable for the specification of embedded control systems, and it is more expressive than GR(1). Reactive control programs for GXU specifications are represented as Mealy machines, which are extended by the monitoring of input events. Now, realizability for GXU specifications is shown to be equivalent to solving a certain subclass of 2QBF satisfiability problems. These logical problems can be solved in cubic time in the size of GXU specifications. For unrealizable GXU specifications, stronger environment assumptions are mined from failed consistency checks based on Padoa's characterization of definability and Craig interpolation.
Computer programs containing calls to linear solvers are a known challenge for automatic differentiation. Previous publications advise against differentiating through the low-level solver implementation, and instead advocate for high-level approaches that express the derivative in terms of a modified linear system that can be solved with a separate solver call. Despite this ubiquitous advice, we are not aware of prior work comparing the accuracy of both approaches. With this article we thus empirically study a simple question: What happens if we ignore common wisdom, and differentiate through linear solvers?
Interactive Natural Language Processing (iNLP) has emerged as a novel paradigm within the field of NLP, aimed at addressing limitations in existing frameworks while aligning with the ultimate goals of artificial intelligence. This paradigm considers language models as agents capable of observing, acting, and receiving feedback iteratively from external entities. Specifically, language models in this context can: (1) interact with humans for better understanding and addressing user needs, personalizing responses, aligning with human values, and improving the overall user experience; (2) interact with knowledge bases for enriching language representations with factual knowledge, enhancing the contextual relevance of responses, and dynamically leveraging external information to generate more accurate and informed responses; (3) interact with models and tools for effectively decomposing and addressing complex tasks, leveraging specialized expertise for specific subtasks, and fostering the simulation of social behaviors; and (4) interact with environments for learning grounded representations of language, and effectively tackling embodied tasks such as reasoning, planning, and decision-making in response to environmental observations. This paper offers a comprehensive survey of iNLP, starting by proposing a unified definition and framework of the concept. We then provide a systematic classification of iNLP, dissecting its various components, including interactive objects, interaction interfaces, and interaction methods. We proceed to delve into the evaluation methodologies used in the field, explore its diverse applications, scrutinize its ethical and safety issues, and discuss prospective research directions. This survey serves as an entry point for researchers who are interested in this rapidly evolving area and offers a broad view of the current landscape and future trajectory of iNLP.
Technology ecosystems often undergo significant transformations as they mature. For example, telephony, the Internet, and PCs all started with a single provider, but in the United States each is now served by a competitive market that uses comprehensive and universal technology standards to provide compatibility. This white paper presents our view on how the cloud ecosystem, barely over fifteen years old, could evolve as it matures.
Graph Neural Networks (GNNs) draw their strength from explicitly modeling the topological information of structured data. However, existing GNNs suffer from limited capability in capturing the hierarchical graph representation which plays an important role in graph classification. In this paper, we innovatively propose hierarchical graph capsule network (HGCN) that can jointly learn node embeddings and extract graph hierarchies. Specifically, disentangled graph capsules are established by identifying heterogeneous factors underlying each node, such that their instantiation parameters represent different properties of the same entity. To learn the hierarchical representation, HGCN characterizes the part-whole relationship between lower-level capsules (part) and higher-level capsules (whole) by explicitly considering the structure information among the parts. Experimental studies demonstrate the effectiveness of HGCN and the contribution of each component.
As a field of AI, Machine Reasoning (MR) uses largely symbolic means to formalize and emulate abstract reasoning. Studies in early MR have notably started inquiries into Explainable AI (XAI) -- arguably one of the biggest concerns today for the AI community. Work on explainable MR as well as on MR approaches to explainability in other areas of AI has continued ever since. It is especially potent in modern MR branches, such as argumentation, constraint and logic programming, planning. We hereby aim to provide a selective overview of MR explainability techniques and studies in hopes that insights from this long track of research will complement well the current XAI landscape. This document reports our work in-progress on MR explainability.
We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
We introduce an effective model to overcome the problem of mode collapse when training Generative Adversarial Networks (GAN). Firstly, we propose a new generator objective that finds it better to tackle mode collapse. And, we apply an independent Autoencoders (AE) to constrain the generator and consider its reconstructed samples as "real" samples to slow down the convergence of discriminator that enables to reduce the gradient vanishing problem and stabilize the model. Secondly, from mappings between latent and data spaces provided by AE, we further regularize AE by the relative distance between the latent and data samples to explicitly prevent the generator falling into mode collapse setting. This idea comes when we find a new way to visualize the mode collapse on MNIST dataset. To the best of our knowledge, our method is the first to propose and apply successfully the relative distance of latent and data samples for stabilizing GAN. Thirdly, our proposed model, namely Generative Adversarial Autoencoder Networks (GAAN), is stable and has suffered from neither gradient vanishing nor mode collapse issues, as empirically demonstrated on synthetic, MNIST, MNIST-1K, CelebA and CIFAR-10 datasets. Experimental results show that our method can approximate well multi-modal distribution and achieve better results than state-of-the-art methods on these benchmark datasets. Our model implementation is published here: //github.com/tntrung/gaan