In this article, we develop an interdisciplinary analysis of MEV which desires to merge the gap that exists between technical and legal research supporting policymakers in their regulatory decisions concerning blockchains, DeFi and associated risks. Consequently, this article is intended for both technical and legal audiences, and while we abstain from a detailed legal analysis, we aim to open a policy discussion regarding decentralized governance design at the block building layer as the place where MEV occurs. Maximal Extractable Value or MEV has been one of the major concerns in blockchain designs as it creates a centralizing force which ultimately affects user transactions. In this article, we dive into the technicality behind MEV, where we explain the concept behind the novel Proposal Builder Separation design as an effort by Flashbots to increase decentralization through modularity. We underline potential vulnerability factors under the PBS design, which open space for MEV extracting adversarial strategies by inside participants. We discuss the shift of trust from validators to builders in PoS blockchains such as Ethereum, acknowledging the impact that the later ones may have on users' transactions (in terms of front running) and censorship resistance (in terms of transaction inclusion). We recognize that under PBS, centralized (dominant) entities such as builders could potentially harm users by extracting MEV via front running strategies. Finally, we suggest adequate design and policy measures which could potentially mitigate these negative effects while protecting blockchain users.
Warm-Start reinforcement learning (RL), aided by a prior policy obtained from offline training, is emerging as a promising RL approach for practical applications. Recent empirical studies have demonstrated that the performance of Warm-Start RL can be improved \textit{quickly} in some cases but become \textit{stagnant} in other cases, especially when the function approximation is used. To this end, the primary objective of this work is to build a fundamental understanding on ``\textit{whether and when online learning can be significantly accelerated by a warm-start policy from offline RL?}''. Specifically, we consider the widely used Actor-Critic (A-C) method with a prior policy. We first quantify the approximation errors in the Actor update and the Critic update, respectively. Next, we cast the Warm-Start A-C algorithm as Newton's method with perturbation, and study the impact of the approximation errors on the finite-time learning performance with inaccurate Actor/Critic updates. Under some general technical conditions, we derive the upper bounds, which shed light on achieving the desired finite-learning performance in the Warm-Start A-C algorithm. In particular, our findings reveal that it is essential to reduce the algorithm bias in online learning. We also obtain lower bounds on the sub-optimality gap of the Warm-Start A-C algorithm to quantify the impact of the bias and error propagation.
In September 2022, Ethereum transitioned from Proof-of-Work (PoW) to Proof-of-Stake (PoS) during 'the merge' - making it the largest PoS cryptocurrency in terms of market capitalization. With this work, we present a comprehensive measurement study of the current state of the Ethereum PoS consensus layer on the beacon chain. We perform a longitudinal study over the entire history of the beacon chain, which ranges from 1 December 2020 until 15 May 2023. Our work finds that all dips in network participation, unrelated to network upgrades, are caused by issues with major consensus clients or service operators controlling a large number of validators. Thus, we analyze the decentralization of staking power over time by clustering validators to entities. We find that the staking power is concentrated in the hands of a few large entities. Further, we also analyze the consensus client landscape, given that bugs in a consensus client pose a security risk to the consensus layer. While the consensus client landscape exhibits significant concentration, with a single client accounting for one-third of the market share throughout the entire history of the beacon chain, we observe an improving trend.
Congestion is a common failure mode of markets, where consumers compete inefficiently on the same subset of goods (e.g., chasing the same small set of properties on a vacation rental platform). The typical economic story is that prices solve this problem by balancing supply and demand in order to decongest the market. But in modern online marketplaces, prices are typically set in a decentralized way by sellers, with the power of a platform limited to controlling representations -- the information made available about products. This motivates the present study of decongestion by representation, where a platform uses this power to learn representations that improve social welfare by reducing congestion. The technical challenge is twofold: relying only on revealed preferences from users' past choices, rather than true valuations; and working with representations that determine which features to reveal and are inherently combinatorial. We tackle both by proposing a differentiable proxy of welfare that can be trained end-to-end on consumer choice data. We provide theory giving sufficient conditions for when decongestion promotes welfare, and present experiments on both synthetic and real data shedding light on our setting and approach.
Dependency cycles pose a significant challenge to software quality and maintainability. However, there is limited understanding of how practitioners resolve dependency cycles in real-world scenarios. This paper presents an empirical study investigating the recurring patterns employed by software developers to resolve dependency cycles between two classes in practice. We analyzed the data from 18 open-source projects across different domains and manually inspected hundreds of cycle untangling cases. Our findings reveal that developers tend to employ five recurring patterns to address dependency cycles. The chosen patterns are not only determined by dependency relations between cyclic classes, but also highly related to their design context, i.e., how cyclic classes depend on or are depended by their neighbor classes. Through this empirical study, we also discovered three common mistakes developers usually made during cycles' handling. These recurring patterns and common mistakes observed in dependency cycles' practice can serve as a taxonomy to improve developers' awareness and also be used as learning materials for students in software engineering and inexperienced developers. Our results also suggest that, in addition to considering the internal structure of dependency cycles, automatic tools need to consider the design context of cycles to provide better support for refactoring dependency cycles.
Blockchain protocols typically aspire to run in the permissionless setting, in which nodes are owned and operated by a large number of diverse and unknown entities, with each node free to start or stop running the protocol at any time. This setting is more challenging than the traditional permissioned setting, in which the set of nodes that will be running the protocol is fixed and known at the time of protocol deployment. The goal of this paper is to provide a framework for reasoning about the rich design space of blockchain protocols and their capabilities and limitations in the permissionless setting. This paper offers a hierarchy of settings with different "degrees of permissionlessness", specified by the amount of knowledge that a protocol has about the current participants: These are the fully permissionless, dynamically available and quasi-permissionless settings. The paper also proves several results illustrating the utility of our analysis framework for reasoning about blockchain protocols in these settings. For example: (1) In the fully permissionless setting, even with synchronous communication and with severe restrictions on the total size of the Byzantine players, every deterministic protocol for Byzantine agreement has an infinite execution. (2) In the dynamically available and partially synchronous setting, no protocol can solve the Byzantine agreement problem with high probability, even if there are no Byzantine players at all. (3) In the quasi-permissionless and partially synchronous setting, by contrast, assuming a bound on the total size of the Byzantine players, there is a deterministic protocol guaranteed to solve the Byzantine agreement problem in a finite amount of time. (4) In the quasi-permissionless and synchronous setting, every proof-of-stake protocol that does not use advanced cryptography is vulnerable to long-range attacks.
In recent years, online social networks have been the target of adversaries who seek to introduce discord into societies, to undermine democracies and to destabilize communities. Often the goal is not to favor a certain side of a conflict but to increase disagreement and polarization. To get a mathematical understanding of such attacks, researchers use opinion-formation models from sociology, such as the Friedkin--Johnsen model, and formally study how much discord the adversary can produce when altering the opinions for only a small set of users. In this line of work, it is commonly assumed that the adversary has full knowledge about the network topology and the opinions of all users. However, the latter assumption is often unrealistic in practice, where user opinions are not available or simply difficult to estimate accurately. To address this concern, we raise the following question: Can an attacker sow discord in a social network, even when only the network topology is known? We answer this question affirmatively. We present approximation algorithms for detecting a small set of users who are highly influential for the disagreement and polarization in the network. We show that when the adversary radicalizes these users and if the initial disagreement/polarization in the network is not very high, then our method gives a constant-factor approximation on the setting when the user opinions are known. To find the set of influential users, we provide a novel approximation algorithm for a variant of MaxCut in graphs with positive and negative edge weights. We experimentally evaluate our methods, which have access only to the network topology, and we find that they have similar performance as methods that have access to the network topology and all user opinions. We further present an NP-hardness proof, which was an open question by Chen and Racz [IEEE Trans. Netw. Sci. Eng., 2021].
Empirical evidence demonstrates that citations received by scholarly publications follow a pattern of preferential attachment, resulting in a power-law distribution. Such asymmetry has sparked significant debate regarding the use of citations for research evaluation. However, a consensus has yet to be established concerning the historical trends in citation concentration. Are citations becoming more concentrated in a small number of articles? Or have recent geopolitical and technical changes in science led to more decentralized distributions? This ongoing debate stems from a lack of technical clarity in measuring inequality. Given the variations in citation practices across disciplines and over time, it is crucial to account for multiple factors that can influence the findings. This article explores how reference-based and citation-based approaches, uncited articles, citation inflation, the expansion of bibliometric databases, disciplinary differences, and self-citations affect the evolution of citation concentration. Our results indicate a decreasing trend in citation concentration, primarily driven by a decline in uncited articles, which, in turn, can be attributed to the growing significance of Asia and Europe. On the whole, our findings clarify current debates on citation concentration and show that, contrary to a widely-held belief, citations are increasingly scattered.
This paper deals with the problem of finding the preferred extensions of an argumentation framework by means of a bijection with the naive sets of another framework. First, we consider the case where an argumentation framework is naive-bijective: its naive sets and preferred extensions are equal. Recognizing naive-bijective argumentation frameworks is hard, but we show that it is tractable for frameworks with bounded in-degree. Next, we give a bijection between the preferred extensions of an argumentation framework being admissible-closed (the intersection of two admissible sets is admissible) and the naive sets of another framework on the same set of arguments. On the other hand, we prove that identifying admissible-closed argumentation frameworks is coNP-complete. At last, we introduce the notion of irreducible self-defending sets as those that are not the union of others. It turns out there exists a bijection between the preferred extensions of an argumentation framework and the naive sets of a framework on its irreducible self-defending sets. Consequently, the preferred extensions of argumentation frameworks with some lattice properties can be listed with polynomial delay and polynomial space.
Game theory has by now found numerous applications in various fields, including economics, industry, jurisprudence, and artificial intelligence, where each player only cares about its own interest in a noncooperative or cooperative manner, but without obvious malice to other players. However, in many practical applications, such as poker, chess, evader pursuing, drug interdiction, coast guard, cyber-security, and national defense, players often have apparently adversarial stances, that is, selfish actions of each player inevitably or intentionally inflict loss or wreak havoc on other players. Along this line, this paper provides a systematic survey on three main game models widely employed in adversarial games, i.e., zero-sum normal-form and extensive-form games, Stackelberg (security) games, zero-sum differential games, from an array of perspectives, including basic knowledge of game models, (approximate) equilibrium concepts, problem classifications, research frontiers, (approximate) optimal strategy seeking techniques, prevailing algorithms, and practical applications. Finally, promising future research directions are also discussed for relevant adversarial games.
ASR (automatic speech recognition) systems like Siri, Alexa, Google Voice or Cortana has become quite popular recently. One of the key techniques enabling the practical use of such systems in people's daily life is deep learning. Though deep learning in computer vision is known to be vulnerable to adversarial perturbations, little is known whether such perturbations are still valid on the practical speech recognition. In this paper, we not only demonstrate such attacks can happen in reality, but also show that the attacks can be systematically conducted. To minimize users' attention, we choose to embed the voice commands into a song, called CommandSong. In this way, the song carrying the command can spread through radio, TV or even any media player installed in the portable devices like smartphones, potentially impacting millions of users in long distance. In particular, we overcome two major challenges: minimizing the revision of a song in the process of embedding commands, and letting the CommandSong spread through the air without losing the voice "command". Our evaluation demonstrates that we can craft random songs to "carry" any commands and the modify is extremely difficult to be noticed. Specially, the physical attack that we play the CommandSongs over the air and record them can success with 94 percentage.