Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks. This paper reports the findings of a workshop held at Google (co-organized by Stanford University and the University of Wisconsin-Madison) on the dual-use dilemma posed by GenAI. This paper is not meant to be comprehensive, but is rather an attempt to synthesize some of the interesting findings from the workshop. We discuss short-term and long-term goals for the community on this topic. We hope this paper provides both a launching point for a discussion on this important topic as well as interesting problems that the research community can work to address.
To analyze the worst-case running time of branching algorithms, the majority of work in exponential time algorithms focuses on designing complicated branching rules over developing better analysis methods for simple algorithms. In the mid-$2000$s, Fomin et al. [2005] introduced measure & conquer, an advanced general analysis method, sparking widespread adoption for obtaining tighter worst-case running time upper bounds for many fundamental NP-complete problems. Yet, much potential in this direction remains untapped, as most subsequent work applied it without further advancement. Motivated by this, we present piecewise analysis, a new general method that analyzes the running time of branching algorithms. Our approach is to define a similarity ratio that divides instances into groups and then analyze the running time within each group separately. The similarity ratio is a scale between two parameters of an instance I. Instead of relying on a single measure and a single analysis for the whole instance space, our method allows to take advantage of different intrinsic properties of instances with different similarity ratios. To showcase its potential, we reanalyze two $17$-year-old algorithms from Fomin et al. [2007] that solve $4$-Coloring and #$3$-Coloring respectively. The original analysis in their paper gave running times of $O(1.7272^n)$ and $O(1.6262^n)$ respectively for these algorithms, our analysis improves these running times to $O(1.7215^n)$ and $O(1.6232^n)$. Among the two improvements, our new running time $O(1.7215^n)$ is the first improvement in the best known running time for the 4-Coloring problem since 2007.
Various methods for designing input features have been proposed for fault recognition in rotating machines using one-dimensional raw sensor data. The available methods are complex, rely on empirical approaches, and may differ depending on the condition monitoring data used. Therefore, this article proposes a novel algorithm to design input features that unifies the feature extraction process for different time-series sensor data. This new insight for designing/extracting input features is obtained through the lens of histogram theory. The proposed algorithm extracts discriminative input features, which are suitable for a simple classifier to deep neural network-based classifiers. The designed input features are given as input to the classifier with end-to-end training in a single framework for machine conditions recognition. The proposed scheme has been validated through three real-time datasets: a) acoustic dataset, b) CWRU vibration dataset, and c) IMS vibration dataset. The real-time results and comparative study show the effectiveness of the proposed scheme for the prediction of the machine's health states.
A reasonable confidence interval should have a confidence coefficient no less than the given nominal level and a small expected length to reliably and accurately estimate the parameter of interest, and the bootstrap interval is considered to be an efficient interval estimation technique. In this paper, we offer a first attempt at computing the coverage probability and expected length of a parametric or percentile bootstrap interval by exact probabilistic calculation for any fixed sample size. This method is applied to the basic bootstrap intervals for functions of binomial proportions and a normal mean. None of these intervals, however, are found to have a correct confidence coefficient, which leads to illogical conclusions including that the bootstrap interval is narrower than the z-interval when estimating a normal mean. This raises a general question of how to utilize bootstrap intervals appropriately in practice since the sample size is typically fixed.
Decentralized bilevel optimization has been actively studied in the past few years since it has widespread applications in machine learning. However, existing algorithms suffer from large communication complexity caused by the estimation of stochastic hypergradient, limiting their application to real-world tasks. To address this issue, we develop a novel decentralized stochastic bilevel gradient descent algorithm under the heterogeneous setting, which enjoys a small communication cost in each round and a small number of communication rounds. As such, it can achieve a much better communication complexity than existing algorithms without any strong assumptions regarding heterogeneity. To the best of our knowledge, this is the first stochastic algorithm achieving these theoretical results under the heterogeneous setting. At last, the experimental results confirm the efficacy of our algorithm.
To apply the latest computer vision techniques that require a large computational cost in real industrial applications, knowledge distillation methods (KDs) are essential. Existing logit-based KDs apply the constant temperature scaling to all samples in dataset, limiting the utilization of knowledge inherent in each sample individually. In our approach, we classify the dataset into two categories (i.e., low energy and high energy samples) based on their energy score. Through experiments, we have confirmed that low energy samples exhibit high confidence scores, indicating certain predictions, while high energy samples yield low confidence scores, meaning uncertain predictions. To distill optimal knowledge by adjusting non-target class predictions, we apply a higher temperature to low energy samples to create smoother distributions and a lower temperature to high energy samples to achieve sharper distributions. When compared to previous logit-based and feature-based methods, our energy-based KD (Energy KD) achieves better performance on various datasets. Especially, Energy KD shows significant improvements on CIFAR-100-LT and ImageNet datasets, which contain many challenging samples. Furthermore, we propose high energy-based data augmentation (HE-DA) for further improving the performance. We demonstrate that meaningful performance improvement could be achieved by augmenting only 20-50% of dataset, suggesting that it can be employed on resource-limited devices. To the best of our knowledge, this paper represents the first attempt to make use of energy function in knowledge distillation and data augmentation, and we believe it will greatly contribute to future research.
Computing power, or "compute," is crucial for the development and deployment of artificial intelligence (AI) capabilities. As a result, governments and companies have started to leverage compute as a means to govern AI. For example, governments are investing in domestic compute capacity, controlling the flow of compute to competing countries, and subsidizing compute access to certain sectors. However, these efforts only scratch the surface of how compute can be used to govern AI development and deployment. Relative to other key inputs to AI (data and algorithms), AI-relevant compute is a particularly effective point of intervention: it is detectable, excludable, and quantifiable, and is produced via an extremely concentrated supply chain. These characteristics, alongside the singular importance of compute for cutting-edge AI models, suggest that governing compute can contribute to achieving common policy objectives, such as ensuring the safety and beneficial use of AI. More precisely, policymakers could use compute to facilitate regulatory visibility of AI, allocate resources to promote beneficial outcomes, and enforce restrictions against irresponsible or malicious AI development and usage. However, while compute-based policies and technologies have the potential to assist in these areas, there is significant variation in their readiness for implementation. Some ideas are currently being piloted, while others are hindered by the need for fundamental research. Furthermore, naive or poorly scoped approaches to compute governance carry significant risks in areas like privacy, economic impacts, and centralization of power. We end by suggesting guardrails to minimize these risks from compute governance.
Implicit neural representations have emerged as a powerful technique for encoding complex continuous multidimensional signals as neural networks, enabling a wide range of applications in computer vision, robotics, and geometry. While Adam is commonly used for training due to its stochastic proficiency, it entails lengthy training durations. To address this, we explore alternative optimization techniques for accelerated training without sacrificing accuracy. Traditional second-order optimizers like L-BFGS are suboptimal in stochastic settings, making them unsuitable for large-scale data sets. Instead, we propose stochastic training using curvature-aware diagonal preconditioners, showcasing their effectiveness across various signal modalities such as images, shape reconstruction, and Neural Radiance Fields (NeRF).
Counterfactual explanations provide a popular method for analyzing the predictions of black-box systems, and they can offer the opportunity for computational recourse by suggesting actionable changes on how to change the input to obtain a different (i.e. more favorable) system output. However, recent work highlighted their vulnerability to different types of manipulations. This work studies the vulnerability of counterfactual explanations to data poisoning. We formalize data poisoning in the context of counterfactual explanations for increasing the cost of recourse on three different levels: locally for a single instance, or a sub-group of instances, or globally for all instances. We demonstrate that state-of-the-art counterfactual generation methods \& toolboxes are vulnerable to such data poisoning.
Multilingual language models have gained significant attention in recent years, enabling the development of applications that meet diverse linguistic contexts. In this paper, we present a comprehensive evaluation of three popular multilingual language models: mBERT, XLM-R, and GPT-3. We assess their performance across a diverse set of languages, with a focus on understanding the impact of resource availability (general and model-specific), language family, script type, and word order on model performance, under two distinct tasks - text classification and text generation. Our findings reveal that while the amount of language-specific pretraining data plays a crucial role in model performance, we also identify other factors such as general resource availability, language family, and script type, as important features. We hope that our study contributes to a deeper understanding of multilingual language models to enhance their performance across languages and linguistic contexts.
Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This survey provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to a number of applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.