As artificial intelligence (AI) becomes more widespread, one question that arises is how human-AI interaction might affect human-human interaction. Chatbots, for example, are increasingly used as social companions, but little is known about how their use affects human relationships. A common hypothesis is that these companion bots are detrimental to social health because they harm or replace human interaction. To understand how companion bots affect social health, we studied people who used companion bots and people who did not. Contrary to expectations, companion bot users indicated that these relationships were beneficial to their social health, whereas nonusers viewed them as harmful. Another common assumption is that people perceive conscious, humanlike AI as disturbing and threatening. Among both users and nonusers, however, we found the opposite: perceiving companion bots as more conscious and humanlike correlated with more positive opinions and greater social health benefits. Humanlike bots may aid social health by supplying reliable and safe interactions, without necessarily harming human relationships.
Learning from human feedback (LHF) -- and in particular learning from pairwise preferences -- has recently become a crucial ingredient in training large language models (LLMs), and has been the subject of much research. Most recent works frame it as a reinforcement learning problem, where a reward function is learned from pairwise preference data and the LLM is treated as a policy that is adapted to maximize the rewards, often under additional regularization constraints. We propose an alternative interpretation which centers on the generative process for pairwise preferences and treats LHF as a density estimation problem. We provide theoretical and empirical results showing that for a family of generative processes defined via preference behavior distribution equations, training a reward function on pairwise preferences effectively models an annotator's implicit preference distribution. Finally, we discuss and present findings on "annotator misspecification" -- failure cases where wrong modeling assumptions are made about annotator behavior, resulting in poorly adapted models -- suggesting that approaches that learn from pairwise human preferences could have trouble learning from a population of annotators with diverse viewpoints.
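For concreteness, one common instantiation of reward learning from pairwise preferences is the Bradley-Terry model, under which maximizing the preference likelihood is exactly a density estimation problem over preference outcomes. The sketch below is a minimal, hypothetical version of that setup; `RewardNet`, the feature dimension, and the toy data are illustrative assumptions, not the paper's method.

```python
import torch
import torch.nn as nn

# Minimal sketch of Bradley-Terry-style reward learning from pairwise
# preferences (a common instantiation; the paper's generative processes
# may differ). The network, dimensions, and toy data are assumptions.

class RewardNet(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)  # scalar reward per input

dim = 8
model = RewardNet(dim)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy batch: features of the preferred (chosen) and rejected responses.
chosen, rejected = torch.randn(32, dim), torch.randn(32, dim)

# Bradley-Terry negative log-likelihood:
# P(chosen > rejected) = sigmoid(r(chosen) - r(rejected)).
# Minimizing it is maximum-likelihood estimation of the annotator's
# implicit preference distribution.
opt.zero_grad()
loss = -nn.functional.logsigmoid(model(chosen) - model(rejected)).mean()
loss.backward()
opt.step()
```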
We present a new perspective on bridging the generalization gap between biological and computer vision -- mimicking the human visual diet. While computer vision models rely on internet-scraped datasets, humans learn from limited 3D scenes under diverse real-world transformations with objects in natural context. Our results demonstrate that incorporating variations and contextual cues ubiquitous in the human visual training data (visual diet) significantly improves generalization to real-world transformations such as lighting, viewpoint, and material changes. This improvement also extends to generalizing from synthetic to real-world data -- all models trained with a human-like visual diet outperform specialized architectures by large margins when tested on natural image data. These experiments are enabled by our two key contributions: a novel dataset capturing scene context and diverse real-world transformations to mimic the human visual diet, and a transformer model tailored to leverage these aspects of the human visual diet. All data and source code can be accessed at //github.com/Spandan-Madan/human_visual_diet.
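The paper's dataset and model are not reproduced here; purely as a rough, hypothetical illustration of the kinds of nuisance transformations a "visual diet" emphasizes, standard 2D augmentations can approximate lighting and viewpoint changes (true material changes would require 3D rendering, which this sketch omits).

```python
import torch
from torchvision import transforms

# Illustrative only: crude 2D stand-ins for the real-world
# transformations studied in the paper (lighting, viewpoint).
visual_diet_like = transforms.Compose([
    transforms.ColorJitter(brightness=0.4, contrast=0.4),       # lighting proxy
    transforms.RandomPerspective(distortion_scale=0.3, p=1.0),  # viewpoint proxy
    transforms.RandomResizedCrop(224, scale=(0.6, 1.0)),        # context crop
])

img = torch.rand(3, 256, 256)      # toy image tensor
augmented = visual_diet_like(img)  # one sample from the augmented "diet"
```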
Learning from large amounts of unsupervised data and a small amount of supervision is an important open problem in computer vision. We propose a new semi-supervised learning method, Semantic Positives via Pseudo-Labels (SemPPL), that combines labelled and unlabelled data to learn informative representations. Our method extends self-supervised contrastive learning -- where representations are shaped by distinguishing whether two samples represent the same underlying datum (positives) or not (negatives) -- with a novel approach to selecting positives. To enrich the set of positives, we leverage the few existing ground-truth labels to predict the missing ones through a $k$-nearest neighbours classifier that uses the learned embeddings of the labelled data. We thus extend the set of positives with datapoints sharing the same pseudo-label and call these semantic positives. We jointly learn the representation and predict bootstrapped pseudo-labels, creating a reinforcing cycle: strong initial representations enable better pseudo-label predictions, which then improve the selection of semantic positives and lead to even better representations. SemPPL outperforms competing semi-supervised methods, setting a new state of the art of $68.5\%$ and $76\%$ top-$1$ accuracy with a ResNet-$50$ trained on $1\%$ and $10\%$ of ImageNet labels, respectively. Furthermore, when using selective kernels, SemPPL significantly outperforms the previous state of the art, achieving $72.3\%$ and $78.3\%$ top-$1$ accuracy on ImageNet with $1\%$ and $10\%$ labels, respectively -- an absolute improvement of $+7.8\%$ and $+6.2\%$ over previous work. SemPPL also exhibits state-of-the-art performance with larger ResNet models, as well as strong robustness, out-of-distribution, and transfer performance. We release the checkpoints and the evaluation code at //github.com/deepmind/semppl.
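A minimal sketch of the pseudo-labelling step described above: a $k$-nearest neighbours classifier over the embeddings of the labelled data assigns pseudo-labels to unlabelled points, which can then supply semantic positives. The shapes, $k$, and random data below are illustrative assumptions, not the paper's configuration.

```python
import torch

# Sketch of SemPPL's pseudo-labelling step as described in the
# abstract: k-NN over learned embeddings of labelled data.
def knn_pseudo_labels(emb_lab, y_lab, emb_unlab, k=5):
    # Cosine similarity between unlabelled and labelled embeddings.
    sim = torch.nn.functional.normalize(emb_unlab, dim=1) @ \
          torch.nn.functional.normalize(emb_lab, dim=1).T
    nn_idx = sim.topk(k, dim=1).indices   # (N_unlab, k) nearest labelled points
    votes = y_lab[nn_idx]                 # labels of the k neighbours
    return votes.mode(dim=1).values       # majority vote = pseudo-label

emb_lab, y_lab = torch.randn(100, 128), torch.randint(0, 10, (100,))
emb_unlab = torch.randn(1000, 128)
pseudo = knn_pseudo_labels(emb_lab, y_lab, emb_unlab)
# Datapoints sharing a (pseudo-)label are then treated as
# "semantic positives" in the contrastive objective.
```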
We give a full classification of continuous flexible discrete axial cone-nets, which are called axial C-hedra. The obtained result can also be used to construct their semi-discrete analogs. Moreover, we identify a novel subclass within the determined class of (semi-)discrete axial cone-nets, whose members are named axial P-nets as they fulfill the proportion (P) of the intercept theorem. Known special cases of these axial P-nets are the smooth and discrete conic crease patterns with reflecting rule lines. By using a parallelism operation, one can generalize axial P-nets even further. The resulting general P-nets constitute a rich novel class of continuous flexible (semi-)discrete surfaces, which allow direct access to their spatial shapes by means of three control polylines. This intuitive method makes them suitable for transformable design tasks using interactive tools.
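For reference, the proportion (P) of the classical intercept theorem that axial P-nets fulfill reads as follows (how it enters the definition of axial P-nets is specified in the paper itself): if two rays emanating from an apex $S$ are cut by two parallel lines through points $A, B$ and $A', B'$, then
\[
  \frac{|SA|}{|SA'|} \;=\; \frac{|SB|}{|SB'|} \;=\; \frac{|AB|}{|A'B'|}.
\]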
This chapter provides a comprehensive overview of the pragmatic aspects involved in organizing AI competitions. We begin by discussing strategies to incentivize participation, touching upon effective communication techniques, alignment with trending topics in the field, the structuring of awards, potential recruitment opportunities, and more. We then turn to community engagement, organizational best practices, and effective means of disseminating challenge outputs. Lastly, the chapter addresses logistics, detailing the costs, manpower, and resource allocation required to manage and execute a challenge effectively. By examining these practical matters, readers will gain actionable insights for navigating the multifaceted landscape of AI competition organization, from inception to completion.
Large language models (LLMs) are a class of deep-learning-based artificial intelligence models that achieve strong performance on a wide range of tasks, especially in natural language processing (NLP). Large language models typically consist of artificial neural networks with numerous parameters, trained on large amounts of unlabeled data using self-supervised or semi-supervised learning. However, their potential for solving bioinformatics problems may even exceed their proficiency in modeling human language. In this review, we summarize the prominent large language models used in natural language processing, such as BERT and GPT, and focus on their applications at different omics levels in bioinformatics, including genomics, transcriptomics, proteomics, drug discovery, and single-cell analysis. Finally, we summarize the potential and prospects of large language models for solving bioinformatics problems.
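As a hedged illustration of the self-supervised pretraining these models share, and of how it carries over to omics data, the sketch below applies a toy masked-token objective to a four-letter "DNA" vocabulary; the tiny transformer and vocabulary are illustrative assumptions, not any specific bioinformatics LLM.

```python
import torch
import torch.nn as nn

# Toy masked-token (BERT-style) pretraining objective; the 5-symbol
# "DNA" vocabulary and tiny encoder are illustrative assumptions.
vocab = {"A": 0, "C": 1, "G": 2, "T": 3, "[MASK]": 4}
embed = nn.Embedding(len(vocab), 32)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2,
)
head = nn.Linear(32, len(vocab))

tokens = torch.randint(0, 4, (8, 64))       # batch of toy DNA sequences
mask = torch.rand(tokens.shape) < 0.15      # mask ~15% of positions
inputs = tokens.masked_fill(mask, vocab["[MASK]"])

logits = head(encoder(embed(inputs)))
# Self-supervision: predict the original token at each masked position.
loss = nn.functional.cross_entropy(logits[mask], tokens[mask])
loss.backward()
```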
The prevalence of artificial intelligence (AI) has raised the prospect of an era of healthcare democratisation that promises every stakeholder a new and better way of life. However, the advancement of clinical AI research is significantly hindered by the lack of data democratisation in healthcare. To truly democratise data for AI studies, two challenges must be met: 1. sensitive information in clinical data must be anonymised appropriately, and 2. AI-oriented clinical knowledge should flow freely across organisations. This paper considers a recent deep-learning advance, dataset condensation (DC), as a way to kill these two birds with one stone in democratising healthcare data. The condensed data produced by DC, which can be viewed as statistical metadata, abstracts the original clinical records and irreversibly conceals sensitive information at the individual level; nevertheless, it still preserves adequate knowledge for training deep neural networks (DNNs). Moreover, the reduced volume and accelerated model training afforded by condensed data make for a more efficient system of clinical knowledge sharing, as data democratisation requires. We underline DC's prospects for democratising clinical data, specifically electronic health records (EHRs), for AI research through experimental results and analysis across three healthcare datasets of varying data types.
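Dataset condensation comes in several variants; one common formulation is gradient matching, in which synthetic records are optimized so that the gradients they induce in a network match those induced by real data. The sketch below is a minimal, hypothetical version (data shapes, the tiny model, and hyperparameters are toy assumptions; practical DC additionally re-samples network initializations).

```python
import torch
import torch.nn as nn

# Minimal gradient-matching sketch of dataset condensation (one common
# DC variant; the paper's exact method may differ).
real_x, real_y = torch.randn(256, 20), torch.randint(0, 2, (256,))
syn_x = torch.randn(10, 20, requires_grad=True)   # learnable condensed data
syn_y = torch.randint(0, 2, (10,))                # fixed condensed labels

net = nn.Linear(20, 2)
opt = torch.optim.SGD([syn_x], lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for _ in range(100):
    g_real = torch.autograd.grad(loss_fn(net(real_x), real_y), net.parameters())
    g_syn = torch.autograd.grad(loss_fn(net(syn_x), syn_y),
                                net.parameters(), create_graph=True)
    # Optimize the synthetic data so its induced gradients match
    # those of the real data.
    match = sum(((gr - gs) ** 2).sum() for gr, gs in zip(g_real, g_syn))
    opt.zero_grad()
    match.backward()
    opt.step()
```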
Accent conversion aims to convert the accent of source speech to a target accent while preserving the speaker's identity. This paper introduces a novel non-autoregressive framework for accent conversion that learns accent-agnostic linguistic representations and employs them to convert the accent of the source speech. Specifically, the proposed system aligns speech representations with linguistic representations obtained from Text-to-Speech (TTS) systems, enabling the accent conversion model to be trained on non-parallel data. Furthermore, we investigate the effectiveness of a pretraining strategy on native data and of different acoustic features within our framework. We conduct a comprehensive evaluation using both subjective and objective metrics. The results highlight the benefits of the pretraining strategy and of incorporating richer semantic features, which yield significantly enhanced audio quality and intelligibility.
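The abstract leaves the alignment objective unspecified; purely as a hypothetical sketch, one plausible form matches frame-level speech-encoder outputs to time-aligned TTS linguistic embeddings with an L1 loss (all module choices and shapes below are assumptions, not the paper's architecture).

```python
import torch
import torch.nn as nn

# Hypothetical alignment sketch: pull accent-bearing speech
# representations toward accent-agnostic TTS linguistic features.
speech_enc = nn.GRU(input_size=80, hidden_size=256, batch_first=True)

mels = torch.randn(4, 120, 80)    # toy mel-spectrogram frames
ling = torch.randn(4, 120, 256)   # toy TTS linguistic features,
                                  # assumed already time-aligned
speech_repr, _ = speech_enc(mels)
align_loss = nn.functional.l1_loss(speech_repr, ling)
align_loss.backward()
```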
Insights:
- The human-centered AI (HCAI) approach and sociotechnical systems (STS) theory share the same goal: ensuring that new technologies such as AI best serve humans in a sociotechnical environment.
- HCAI practice needs to fully embrace sociotechnical systems thinking, while traditional STS needs to evolve to address the emerging characteristics of AI technology.
- We propose a conceptual framework for intelligent sociotechnical systems (iSTS) to enhance traditional STS theory in the AI era.
- Based on iSTS, we further propose a sociotechnical-based hierarchical HCAI approach as a paradigmatic extension that advances existing HCAI practice.
Artificial neural networks thrive at solving classification problems for a particular, rigid task, acquiring knowledge through generalized learning behaviour in a distinct training phase. The resulting network resembles a static entity of knowledge, and endeavours to extend this knowledge without targeting the original task result in catastrophic forgetting. Continual learning shifts this paradigm towards networks that can continually accumulate knowledge over different tasks without the need to retrain from scratch. We focus on task-incremental classification, where tasks arrive sequentially and are delineated by clear boundaries. Our main contributions are 1) a taxonomy and extensive overview of the state of the art, 2) a novel framework to continually determine the stability-plasticity trade-off of the continual learner, and 3) a comprehensive experimental comparison of 11 state-of-the-art continual learning methods and 4 baselines. We empirically scrutinize method strengths and weaknesses on three benchmarks: Tiny ImageNet, the large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We study the influence of model capacity, weight decay and dropout regularization, and the order in which tasks are presented, and qualitatively compare methods in terms of required memory, computation time, and storage.
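As a concrete example of the stability-plasticity tension such a framework tunes, the sketch below shows an elastic weight consolidation (EWC)-style penalty, one of the classic regularization-based methods compared in surveys of this kind; the model, data, and Fisher estimate are toy assumptions, not the paper's framework.

```python
import torch
import torch.nn as nn

# EWC-style penalty: a classic knob on the stability-plasticity
# trade-off (large lam = stable/rigid, small lam = plastic/forgetful).
model = nn.Linear(10, 2)
old_params = {n: p.detach().clone() for n, p in model.named_parameters()}
fisher = {n: torch.ones_like(p) for n, p in model.named_parameters()}  # toy Fisher
lam = 10.0  # stability knob

x, y = torch.randn(16, 10), torch.randint(0, 2, (16,))  # new-task batch
task_loss = nn.functional.cross_entropy(model(x), y)
# Penalize drifting away from parameters important to earlier tasks.
penalty = sum((fisher[n] * (p - old_params[n]) ** 2).sum()
              for n, p in model.named_parameters())
(task_loss + lam * penalty).backward()
```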