I study the measurement of scientists' influence using bibliographic data. The main result is an axiomatic characterization of the family of citation-counting indices, a broad class of influence measures that includes the renowned h-index. The result highlights several limitations of these indices: they are not suitable for comparing scientists across different fields, and they cannot account for indirect influence. I explore how these limitations can be overcome by using richer bibliographic information.
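For concreteness, here is a minimal sketch of the h-index computation, the best-known member of this family; the function below is our own illustration, not code from the paper.

```python
def h_index(citations):
    """h-index: the largest h such that at least h papers
    have at least h citations each (a citation-counting index)."""
    cites = sorted(citations, reverse=True)
    h = 0
    while h < len(cites) and cites[h] >= h + 1:
        h += 1
    return h

# Example: four papers with at least 4 citations each, so h = 4.
assert h_index([10, 8, 5, 4, 3]) == 4
```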
This article addresses the lack of comprehensive studies on Web3 technologies, a gap due primarily to lawyers' reluctance to explore technical intricacies. Understanding the underlying technological foundations is crucial for enhancing the credibility of legal opinions. This article aims to illuminate these foundations, debunk myths, and concentrate on determining the legal status of crypto-assets in the context of property rights within the distributed economy. It also notes that the intangible nature of crypto-assets, which derive their value from distributed registries and resist deletion, makes them more akin to intellectual property, autonomous of any physical medium, than to physical media. The article presents illustrative examples from common law (United States, United Kingdom, New Zealand) and civil law (Germany, Austria, Poland) systems. Proposing a universal solution, it advocates a comprehensive framework safeguarding digital property - data ownership - that extends beyond the confines of Web3. The article then presents a comprehensive, multi-layered approach to the analysis of tokens as digital content and virtual goods. The approach, universally applicable to a wide range of such goods, scrutinizes property on three distinct layers: first, the rights to the virtual good itself; second, the rights to the assets linked to the virtual good; and third, the rights to the intellectual property intricately associated with the token. Additionally, the paper provides a concise analysis of the conflict-of-laws rules applicable to virtual goods. It also delves into issues concerning formal requirements for the transfer of intellectual property rights, licensing, the first-sale (exhaustion) doctrine, the concept of the lawful acquirer, and other crucial aspects of intellectual property in the realm of virtual goods, particularly within the emerging metaverse.
This paper is a compilation of well-known results about Zadoff-Chu sequences, presenting all proofs in a consistent mathematical notation for easy reference. Moreover, for a Zadoff-Chu sequence $x_u[n]$ of prime length $N_{\text{ZC}}$ and root index $u$, a formula is derived that allows computing the first term (frequency zero) of its discrete Fourier transform, $X_u[0]$, with constant complexity independent of the sequence length, as opposed to accumulating all $N_{\text{ZC}}$ terms. The formula stems from a famous result in analytic number theory and is an interesting complement to the fact that the discrete Fourier transform of a Zadoff-Chu sequence is itself a Zadoff-Chu sequence whose terms are scaled by $X_u[0]$. Finally, the paper concludes with a brief analysis of continuous-time signals derived from Zadoff-Chu sequences, especially those obtained by OFDM-modulating a Zadoff-Chu sequence.
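The constant-complexity evaluation can be made concrete. The sketch below is our own illustration, assuming the standard odd-length definition $x_u[n] = e^{-j\pi u n(n+1)/N_{\text{ZC}}}$ with zero cyclic shift: completing the square in the exponent reduces $X_u[0]$ to a quadratic Gauss sum, whose classical closed form requires only a Legendre symbol and a fourth root of unity.

```python
import numpy as np

def zc(u, N):
    """Zadoff-Chu sequence of odd prime length N, root u, cyclic shift 0."""
    n = np.arange(N)
    return np.exp(-1j * np.pi * u * n * (n + 1) / N)

def X0_naive(u, N):
    """X_u[0] by direct accumulation of all N terms: O(N)."""
    return zc(u, N).sum()

def X0_gauss(u, N):
    """X_u[0] via the closed-form quadratic Gauss sum."""
    eps = 1.0 if N % 4 == 1 else 1j               # Gauss-sum unit factor
    inv2, inv8 = pow(2, -1, N), pow(8, -1, N)     # modular inverses
    a = (-u * inv2) % N                           # quadratic coefficient mod N
    leg = 1 if pow(a, (N - 1) // 2, N) == 1 else -1   # Legendre symbol (Euler)
    phase = np.exp(2j * np.pi * ((u * inv8) % N) / N) # completed-square offset
    return phase * leg * eps * np.sqrt(N)

# e.g. N = 839, u = 25: X0_naive(25, 839) and X0_gauss(25, 839)
# agree to machine precision, and |X_u[0]| = sqrt(N).
```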
Semantic communication is an emerging research topic that has recently attracted wide attention. Despite this growing interest, a comprehensive and widely accepted framework for characterizing semantic communication is still lacking. This paper introduces a new conceptualization of semantic communication and formulates two fundamental problems, which we term language exploitation and language design. Our contention is that the challenge of language design can be effectively situated within the broader framework of joint source-channel coding theory, underpinned by a comprehensive end-to-end distortion metric. To tackle the language exploitation problem, we put forth three approaches: semantic encoding, semantic decoding, and a synergistic combination of both in the form of combined semantic encoding and decoding. Furthermore, we establish the semantic distortion-cost region as a critical framework for assessing the language exploitation problem. For each of the three proposed approaches, the achievable distortion-cost region is characterized. Overall, this paper aims to shed light on the intricate dynamics of semantic communication, paving the way for a deeper understanding of this evolving field.
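For orientation, the classical joint source-channel coding benchmark that this framing builds on can be summarized in one formula; the notation here is ours, and the paper's semantic distortion metric generalizes it. With source $S$, reconstruction $\hat{S}$, channel input $X$, distortion measure $d$, and channel-cost function $b$, the classical distortion-cost region collects all jointly achievable pairs:

$$\mathcal{R} = \left\{ (D, B) : \exists\, \text{encoder/decoder with } \mathbb{E}\big[d(S, \hat{S})\big] \le D \ \text{and}\ \mathbb{E}\big[b(X)\big] \le B \right\}.$$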
Researchers, practitioners, and policymakers with an interest in AI ethics need more integrative approaches for studying and intervening in AI systems across many contexts and scales of activity. This paper presents AI value chains as an integrative concept that satisfies that need. To more clearly theorize AI value chains and conceptually distinguish them from supply chains, we review theories of value chains and AI value chains from the strategic management, service science, economic geography, industry, government, and applied research literature. We then conduct an integrative review of a sample of 67 sources that cover the ethical concerns implicated in AI value chains. Building upon the findings of our integrative review, we recommend four future directions that researchers, practitioners, and policymakers can take to advance more ethical practices of AI development and use across AI value chains. Our review and recommendations contribute to the advancement of research agendas, industrial agendas, and policy agendas that seek to study and intervene in the ethics of AI value chains.
Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in various fields, including science, engineering, finance, and everyday life. The development of artificial intelligence (AI) systems capable of solving math problems and proving theorems has garnered significant interest in the fields of machine learning and natural language processing. On the one hand, mathematics serves as a testbed for aspects of reasoning that are challenging for powerful deep learning models, driving new algorithmic and modeling advances. On the other hand, recent advances in large-scale neural language models have opened up new benchmarks and opportunities for using deep learning in mathematical reasoning. In this survey paper, we review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade. We also evaluate existing benchmarks and methods and discuss future research directions in this domain.
This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.
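To make the criticality tuning concrete, here is a minimal NumPy sketch, our own illustration rather than code from the book, assuming a plain deep ReLU multilayer perceptron with weights of variance $C_W/\text{width}$: at the ReLU critical value $C_W = 2$ the signal traverses many layers with stable magnitude, while off-critical choices make it vanish or explode exponentially.

```python
import numpy as np

def preactivation_scale(depth, width, C_W, seed=0):
    """Propagate a random input through a deep ReLU MLP initialized
    with weight variance C_W / width; return the mean squared
    preactivation at each layer."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)
    scales = []
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * np.sqrt(C_W / width)
        z = W @ x                      # preactivations
        scales.append(np.mean(z ** 2))
        x = np.maximum(z, 0.0)         # ReLU activation
    return scales

# C_W = 2.0 (ReLU criticality): scales stay O(1) across, say, 50 layers.
# C_W = 1.8 or 2.2: scales decay or grow roughly exponentially in depth.
```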
Transformers have achieved great success in many artificial intelligence fields, such as natural language processing, computer vision, and audio processing, and have therefore attracted great interest from academic and industry researchers. Up to the present, a great variety of Transformer variants (a.k.a. X-formers) have been proposed; however, a systematic and comprehensive literature review of these variants is still missing. In this survey, we provide a comprehensive review of various X-formers. We first briefly introduce the vanilla Transformer and then propose a new taxonomy of X-formers. Next, we introduce the various X-formers from three perspectives: architectural modification, pre-training, and applications. Finally, we outline some potential directions for future research.
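As a reference point for the vanilla Transformer that the survey begins with, here is a minimal sketch (our own illustration) of its core scaled dot-product attention, $\mathrm{softmax}(QK^\top/\sqrt{d_k})\,V$:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Vanilla Transformer attention: softmax(Q K^T / sqrt(d_k)) V.
    Q: (n_q, d_k), K: (n_k, d_k), V: (n_k, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)    # similarity of queries to keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                 # weighted sum of values
```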
Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation, or neither of these. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
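The perceived link can be made explicit. Writing the residual update of a depth-$L$ network with an explicit depth scaling (our notation),

$$x_{k+1} = x_k + \delta_L\, f(x_k, \theta_k), \qquad k = 0, 1, \dots, L-1,$$

the neural ODE description assumes $\delta_L = 1/L$ and weights $\theta_k \approx \theta(k/L)$ for a smooth limit function $\theta(\cdot)$, so that the network becomes an Euler discretization of $\dot{x}_t = f(x_t, \theta_t)$ on $[0,1]$; the experiments in the paper probe precisely whether trained weights exhibit this scaling and smoothness.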
The attention model has become an important concept in neural networks and has been researched within diverse application domains. This survey provides a structured and comprehensive overview of developments in modeling attention. In particular, we propose a taxonomy that groups existing techniques into coherent categories. We review salient neural architectures in which attention has been incorporated and discuss applications in which modeling attention has had a significant impact. Finally, we describe how attention has been used to improve the interpretability of neural networks. We hope this survey will provide a succinct introduction to attention models and guide practitioners in developing approaches for their applications.
Mining graph data has become a popular research topic in computer science and has been widely studied in both academia and industry given the increasing amount of network data in recent years. However, the sheer volume of network data poses great challenges for efficient analysis. This motivates graph representation (network embedding), which maps a graph into a low-dimensional vector space while preserving the original graph structure and supporting graph inference. The investigation of efficient graph representations has profound theoretical significance and practical importance; we therefore introduce some basic ideas in graph representation/network embedding, as well as some representative models, in this chapter.
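As a concrete instance of such a representative model, here is a minimal sketch of a classical spectral embedding (Laplacian eigenmaps); this function is our own illustration, and the chapter's models may differ.

```python
import numpy as np

def spectral_embedding(A, dim):
    """Embed the nodes of an undirected graph (symmetric 0/1 adjacency
    matrix A) into R^dim using the bottom nontrivial eigenvectors of
    the symmetric normalized Laplacian (Laplacian eigenmaps)."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(len(A)) - (d_inv_sqrt[:, None] * A) * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    return vecs[:, 1:dim + 1]          # skip the trivial eigenvector

# Adjacent nodes receive nearby coordinates, preserving graph structure
# in a low-dimensional vector space.
```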