A great part of software development involves conceptualizing or communicating the underlying procedures and logic that needs to be expressed in programs. One major difficulty of programming is turning concept into code, especially when dealing with the APIs of unfamiliar libraries. Recently, there has been a proliferation of machine learning methods for code generation and retrieval from natural language queries, but these have primarily been evaluated purely based on retrieval accuracy or overlap of generated code with developer-written code, and the actual effect of these methods on the developer workflow is surprisingly unattested. We perform the first comprehensive investigation of the promise and challenges of using such technology inside the IDE, asking "at the current state of technology does it improve developer productivity or accuracy, how does it affect the developer experience, and what are the remaining gaps and challenges?" We first develop a plugin for the IDE that implements a hybrid of code generation and code retrieval functionality, and orchestrate virtual environments to enable collection of many user events. We ask developers with various backgrounds to complete 14 Python programming tasks ranging from basic file manipulation to machine learning or data visualization, with or without the help of the plugin. While qualitative surveys of developer experience are largely positive, quantitative results with regards to increased productivity, code quality, or program correctness are inconclusive. Analysis identifies several pain points that could improve the effectiveness of future machine learning based code generation/retrieval developer assistants, and demonstrates when developers prefer code generation over code retrieval and vice versa. We release all data and software to pave the road for future empirical studies and development of better models.
With breakthrough of AlphaGo, AI in human-computer game has become a very hot topic attracting researchers all around the world, which usually serves as an effective standard for testing artificial intelligence. Various game AI systems (AIs) have been developed such as Libratus, OpenAI Five and AlphaStar, beating professional human players. In this paper, we survey recent successful game AIs, covering board game AIs, card game AIs, first-person shooting game AIs and real time strategy game AIs. Through this survey, we 1) compare the main difficulties among different kinds of games for the intelligent decision making field ; 2) illustrate the mainstream frameworks and techniques for developing professional level AIs; 3) raise the challenges or drawbacks in the current AIs for intelligent decision making; and 4) try to propose future trends in the games and intelligent decision making techniques. Finally, we hope this brief review can provide an introduction for beginners, inspire insights for researchers in the filed of AI in games.
The dataset was collected to examine and identify possible key topics within these texts. Data preparation such as data cleaning, transformation, tokenization, removal of stop words from both English and Filipino, and word stemming was employed in the dataset before feeding it to sentiment analysis and the LDA model. The topmost occurring word within the dataset is "development" and there are three (3) likely topics from the speeches of Philippine presidents: economic development, enhancement of public services, and addressing challenges. The dataset was able to provide valuable insights contained among official documents. While the study showed that presidents have used their annual address to express their visions for the country. It also presented that the presidents from 1935 to 2016 faced the same problems during their term. Future researchers may collect other speeches made by presidents during their term; combine them to the dataset used in this study to further investigate these important texts by subjecting them to the same methodology used in this study. The dataset may be requested from the authors and it is recommended for further analysis. For example, determine how the speeches of the president reflect the preamble or foundations of the Philippine constitution.
Recommender systems exploit interaction history to estimate user preference, having been heavily used in a wide range of industry applications. However, static recommendation models are difficult to answer two important questions well due to inherent shortcomings: (a) What exactly does a user like? (b) Why does a user like an item? The shortcomings are due to the way that static models learn user preference, i.e., without explicit instructions and active feedback from users. The recent rise of conversational recommender systems (CRSs) changes this situation fundamentally. In a CRS, users and the system can dynamically communicate through natural language interactions, which provide unprecedented opportunities to explicitly obtain the exact preference of users. Considerable efforts, spread across disparate settings and applications, have been put into developing CRSs. Existing models, technologies, and evaluation methods for CRSs are far from mature. In this paper, we provide a systematic review of the techniques used in current CRSs. We summarize the key challenges of developing CRSs into five directions: (1) Question-based user preference elicitation. (2) Multi-turn conversational recommendation strategies. (3) Dialogue understanding and generation. (4) Exploitation-exploration trade-offs. (5) Evaluation and user simulation. These research directions involve multiple research fields like information retrieval (IR), natural language processing (NLP), and human-computer interaction (HCI). Based on these research directions, we discuss some future challenges and opportunities. We provide a road map for researchers from multiple communities to get started in this area. We hope this survey helps to identify and address challenges in CRSs and inspire future research.
Meta-learning, or learning to learn, has gained renewed interest in recent years within the artificial intelligence community. However, meta-learning is incredibly prevalent within nature, has deep roots in cognitive science and psychology, and is currently studied in various forms within neuroscience. The aim of this review is to recast previous lines of research in the study of biological intelligence within the lens of meta-learning, placing these works into a common framework. More recent points of interaction between AI and neuroscience will be discussed, as well as interesting new directions that arise under this perspective.
A code completion system suggests future code elements to developers given a partially-complete code snippet. Code completion is one of the most useful features in Integrated Development Environments (IDEs). Currently, most code completion techniques predict a single token at a time. In this paper, we take a further step and discuss the probability of directly completing a whole line of code instead of a single token. We believe suggesting longer code sequences can further improve the efficiency of developers. Recently neural language models have been adopted as a preferred approach for code completion, and we believe these models can still be applied to full-line code completion with a few improvements. We conduct our experiments on two real-world python corpora and evaluate existing neural models based on source code tokens or syntactical actions. The results show that neural language models can achieve acceptable results on our tasks, with significant room for improvements.
Generative adversarial networks (GANs) have been extensively studied in the past few years. Arguably the revolutionary techniques are in the area of computer vision such as plausible image generation, image to image translation, facial attribute manipulation and similar domains. Despite the significant success achieved in computer vision field, applying GANs over real-world problems still have three main challenges: (1) High quality image generation; (2) Diverse image generation; and (3) Stable training. Considering numerous GAN-related research in the literature, we provide a study on the architecture-variants and loss-variants, which are proposed to handle these three challenges from two perspectives. We propose loss and architecture-variants for classifying most popular GANs, and discuss the potential improvements with focusing on these two aspects. While several reviews for GANs have been presented, there is no work focusing on the review of GAN-variants based on handling challenges mentioned above. In this paper, we review and critically discuss 7 architecture-variant GANs and 9 loss-variant GANs for remedying those three challenges. The objective of this review is to provide an insight on the footprint that current GANs research focuses on the performance improvement. Code related to GAN-variants studied in this work is summarized on //github.com/sheqi/GAN_Review.
Lack of repeatability and generalisability are two significant threats to continuing scientific development in Natural Language Processing. Language models and learning methods are so complex that scientific conference papers no longer contain enough space for the technical depth required for replication or reproduction. Taking Target Dependent Sentiment Analysis as a case study, we show how recent work in the field has not consistently released code, or described settings for learning methods in enough detail, and lacks comparability and generalisability in train, test or validation data. To investigate generalisability and to enable state of the art comparative evaluations, we carry out the first reproduction studies of three groups of complementary methods and perform the first large-scale mass evaluation on six different English datasets. Reflecting on our experiences, we recommend that future replication or reproduction experiments should always consider a variety of datasets alongside documenting and releasing their methods and published code in order to minimise the barriers to both repeatability and generalisability. We have released our code with a model zoo on GitHub with Jupyter Notebooks to aid understanding and full documentation, and we recommend that others do the same with their papers at submission time through an anonymised GitHub account.
Explainable Recommendation refers to the personalized recommendation algorithms that address the problem of why -- they not only provide the user with the recommendations, but also make the user aware why such items are recommended by generating recommendation explanations, which help to improve the effectiveness, efficiency, persuasiveness, and user satisfaction of recommender systems. In recent years, a large number of explainable recommendation approaches -- especially model-based explainable recommendation algorithms -- have been proposed and adopted in real-world systems. In this survey, we review the work on explainable recommendation that has been published in or before the year of 2018. We first high-light the position of explainable recommendation in recommender system research by categorizing recommendation problems into the 5W, i.e., what, when, who, where, and why. We then conduct a comprehensive survey of explainable recommendation itself in terms of three aspects: 1) We provide a chronological research line of explanations in recommender systems, including the user study approaches in the early years, as well as the more recent model-based approaches. 2) We provide a taxonomy for explainable recommendation algorithms, including user-based, item-based, model-based, and post-model explanations. 3) We summarize the application of explainable recommendation in different recommendation tasks, including product recommendation, social recommendation, POI recommendation, etc. We devote a chapter to discuss the explanation perspectives in the broader IR and machine learning settings, as well as their relationship with explainable recommendation research. We end the survey by discussing potential future research directions to promote the explainable recommendation research area.
Music recommender systems (MRS) have experienced a boom in recent years, thanks to the emergence and success of online streaming services, which nowadays make available almost all music in the world at the user's fingertip. While today's MRS considerably help users to find interesting music in these huge catalogs, MRS research is still facing substantial challenges. In particular when it comes to build, incorporate, and evaluate recommendation strategies that integrate information beyond simple user--item interactions or content-based descriptors, but dig deep into the very essence of listener needs, preferences, and intentions, MRS research becomes a big endeavor and related publications quite sparse. The purpose of this trends and survey article is twofold. We first identify and shed light on what we believe are the most pressing challenges MRS research is facing, from both academic and industry perspectives. We review the state of the art towards solving these challenges and discuss its limitations. Second, we detail possible future directions and visions we contemplate for the further evolution of the field. The article should therefore serve two purposes: giving the interested reader an overview of current challenges in MRS research and providing guidance for young researchers by identifying interesting, yet under-researched, directions in the field.
This paper surveys the current state of the art in Natural Language Generation (NLG), defined as the task of generating text or speech from non-linguistic input. A survey of NLG is timely in view of the changes that the field has undergone over the past decade or so, especially in relation to new (usually data-driven) methods, as well as new applications of NLG technology. This survey therefore aims to (a) give an up-to-date synthesis of research on the core tasks in NLG and the architectures adopted in which such tasks are organised; (b) highlight a number of relatively recent research topics that have arisen partly as a result of growing synergies between NLG and other areas of artificial intelligence; (c) draw attention to the challenges in NLG evaluation, relating them to similar challenges faced in other areas of Natural Language Processing, with an emphasis on different evaluation methods and the relationships between them.