Decentralising the Web is a desirable but challenging goal. One particular challenge is achieving decentralised content moderation in the face of various adversaries (e.g. trolls). To overcome this challenge, many Decentralised Web (DW) implementations rely on federation policies. Administrators use these policies to create rules that ban or modify content matching specific criteria. This, however, can have unintended consequences for many users. In this paper, we present the first study of federation policies on the DW, their in-the-wild usage, and their impact on users. We identify how these policies may negatively impact "innocent" users and outline possible solutions to avoid this problem in the future.
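A minimal sketch of how an instance-wide rule can catch "innocent" users. Everything here is hypothetical and chosen for illustration only: the `Post` type, the `user@instance` handle format, and the `apply_reject_policy` function are assumptions, not the API of any DW implementation or the method studied in the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str  # hypothetical handle of the form "user@instance"
    text: str


def instance_of(post):
    """Return the instance (domain) part of the author's handle."""
    return post.author.split("@", 1)[1]


def apply_reject_policy(posts, rejected_instances):
    """Drop every post whose origin instance is on the reject list."""
    return [p for p in posts if instance_of(p) not in rejected_instances]


posts = [
    Post("troll@bad.example", "abusive message"),
    Post("bystander@bad.example", "hello world"),
    Post("carol@good.example", "hi there"),
]
kept = apply_reject_policy(posts, {"bad.example"})
print([p.author for p in kept])
```

In this toy setting the rule is keyed on the instance rather than on the content, so the post by `bystander@bad.example` is dropped together with the troll's post even though it contains nothing objectionable.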

Related content

Multiagent reinforcement learning algorithms have not been widely adopted in large-scale environments with many agents, as they often scale poorly with the number of agents. Using mean field theory to aggregate agents has been proposed as a solution to this problem. However, almost all previous methods in this area make a strong assumption of a centralized system where all the agents in the environment learn the same policy and are effectively indistinguishable from each other. In this paper, we relax this assumption about indistinguishable agents and propose a new mean field system known as Decentralized Mean Field Games, where each agent can be quite different from others. All agents learn independent policies in a decentralized fashion, based on their local observations. We define a theoretical solution concept for this system and provide a fixed point guarantee for a Q-learning based algorithm in this system. A practical consequence of our approach is that we can address a 'chicken-and-egg' problem in empirical mean field reinforcement learning algorithms. Further, we provide Q-learning and actor-critic algorithms that use the decentralized mean field learning approach and achieve stronger performance than common baselines in this area. In our setting, agents do not need to be clones of each other and learn in a fully decentralized fashion. Hence, for the first time, we show the application of mean field learning methods in fully competitive environments, large-scale continuous action space environments, and other environments with heterogeneous agents. Importantly, we also apply the mean field method in a ride-sharing problem using a real-world dataset. We propose a decentralized solution to this problem, which is more practical than existing centralized training methods.

Outsourced computation for neural networks gives users access to state-of-the-art models without needing to invest in specialized hardware and know-how. The problem is that the users lose control over potentially privacy-sensitive data. With homomorphic encryption (HE), computation can be performed on encrypted data without revealing its content. In this systematization of knowledge, we take an in-depth look at approaches that combine neural networks with HE for privacy preservation. We categorize the changes made to neural network models and architectures to make them computable over HE, and examine how these changes impact performance. We find numerous challenges to HE-based privacy-preserving deep learning, such as computational overhead, usability, and limitations posed by the encryption schemes.

Multiparty computation (MPC) consists of several parties engaging in joint computation in such a way that each party's input and output remain private to that party. Whereas MPC protocols for specific computations have existed since the 1980s, only recently have general-purpose compilers been developed to allow MPC on arbitrary functions. Yet, using today's MPC compilers requires substantial programming effort and skill on the user's side, among other things because nearly all compilers translate the code of the computation into a Boolean or arithmetic circuit. In particular, the circuit representation requires unrolling loops and recursive calls, which forces programmers to define loop bounds (often manually) and largely avoid recursion. We present an approach allowing MPC on an arbitrary computation expressed as ordinary code with all functionalities, which does not need to be translated into a circuit. Our notion of input and output privacy is predicated on unlinkability. Our method leverages co-utile computation outsourcing using anonymous channels via decentralized reputation, makes minimal use of cryptography, and does not require participants to be honest-but-curious: it works as long as participants are rational (self-interested), which may include rationally malicious peers (who become attackers if this is advantageous to them). We present example applications, including e-voting. Our empirical work shows that reputation captures the behavior of peers well and ensures that parties with high reputation obtain correct results.

This paper studies optimal motion planning subject to motion and environment uncertainties. By modeling the system as a probabilistic labeled Markov decision process (PL-MDP), the control objective is to synthesize a finite-memory policy under which the agent satisfies complex high-level tasks expressed in linear temporal logic (LTL) with a desired satisfaction probability. In particular, the cost optimization of the trajectory that satisfies infinite-horizon tasks is considered, and the trade-off between reducing the expected mean cost and maximizing the probability of task satisfaction is analyzed. Instead of using traditional Rabin automata, the LTL formulas are converted to limit-deterministic Büchi automata (LDBA) with a reachability acceptance condition and a compact graph structure. The novelty of this work lies in considering the cases where LTL specifications can be potentially infeasible and developing a relaxed product MDP between the PL-MDP and the LDBA. The relaxed product MDP allows the agent to revise its motion plan whenever the task is not fully feasible and to quantify the revised plan's violation measurement. A multi-objective optimization problem is then formulated to jointly consider the probability of task satisfaction, the violation with respect to the original task constraints, and the implementation cost of the policy execution. The formulated problem can be solved via coupled linear programs. To the best of our knowledge, this work is the first to bridge the gap between probabilistic planning revision of potentially infeasible LTL specifications and optimal control synthesis of both the plan prefix and the plan suffix of the trajectory over the infinite horizon. Experimental results are provided to demonstrate the effectiveness of the proposed framework.

The reliability of a Boolean Conjunctive Query (CQ) over a tuple-independent probabilistic database is the probability that the CQ is satisfied when the tuples of the database are sampled one by one, independently, with their associated probability. For queries without self-joins (repeated relation symbols), the data complexity of this problem is fully characterized by a known dichotomy: reliability can be computed in polynomial time for hierarchical queries, and is #P-hard for non-hierarchical queries. Inspired by this dichotomy, we investigate a fundamental counting problem for CQs without self-joins: how many sets of facts from the input database satisfy the query? This is equivalent to the uniform case of the query reliability problem, where the probability of every tuple is required to be 1/2. Of course, for hierarchical queries, uniform reliability is solvable in polynomial time, like the reliability problem. We show that being hierarchical is also necessary for this tractability (under conventional complexity assumptions). In fact, we establish a generalization of the dichotomy that covers every restricted case of reliability in which the probabilities of tuples are determined by their relation.
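As a toy illustration of the counting problem described above, the following brute-force sketch enumerates every subset of facts and counts those satisfying the hierarchical query Q() :- R(x), S(x, y). The query, the relation names, and the four-fact database are assumptions chosen for illustration and do not come from the paper; exponential enumeration is also not the polynomial-time algorithm the abstract refers to.

```python
from itertools import combinations


def satisfies(r_facts, s_facts):
    """True if some x has R(x) and S(x, y) both present."""
    xs = set(r_facts)
    return any(x in xs for (x, _y) in s_facts)


def count_satisfying_subsets(R, S):
    """Count the subsets of the facts in R and S that satisfy Q() :- R(x), S(x, y)."""
    facts = [("R", f) for f in R] + [("S", f) for f in S]
    count = 0
    for k in range(len(facts) + 1):
        for subset in combinations(facts, k):
            r = [f for (rel, f) in subset if rel == "R"]
            s = [f for (rel, f) in subset if rel == "S"]
            if satisfies(r, s):
                count += 1
    return count


R = [1, 2]                  # facts R(1), R(2)
S = [(1, "a"), (2, "b")]    # facts S(1, a), S(2, b)
n_sat = count_satisfying_subsets(R, S)
uniform_reliability = n_sat / 2 ** (len(R) + len(S))
print(n_sat, uniform_reliability)  # 7 0.4375
```

Here 9 of the 16 subsets leave both pairs (R(1), S(1, a)) and (R(2), S(2, b)) incomplete, so 7 subsets satisfy the query. Dividing by 2^4 gives the uniform reliability 7/16, which matches the abstract's observation that counting satisfying subsets is the probability-1/2 case of reliability.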

Motivated by many interesting real-world applications in logistics and online advertising, we consider an online allocation problem subject to lower and upper resource constraints, where the requests arrive sequentially, sampled i.i.d. from an unknown distribution, and we need to promptly make a decision given limited resources and lower-bound requirements. First, with knowledge of the measure of feasibility, i.e., $\alpha$, we propose a new algorithm that obtains a $1-O(\frac{\epsilon}{\alpha-\epsilon})$ competitive ratio relative to the offline problem, in which all requests are known ahead of time. Inspired by previous studies, this algorithm adopts an innovative technique to dynamically update a threshold price vector for making decisions. Moreover, an optimization method to estimate the optimal measure of feasibility is proposed, with a theoretical guarantee, at the end of this paper. Based on this method, if we tolerate a slight violation of the lower-bound constraints with parameter $\eta$, the proposed algorithm extends naturally to settings without the strong feasibility assumption, which cover the largely unexplored infeasible scenarios.

Real-world applications in healthcare and supply chain domains produce, exchange, and share data in a multi-stakeholder environment. Data owners want to control their data and privacy in such settings. On the other hand, data consumers demand methods to understand when, how, and by whom the data was produced. These requirements necessitate data governance frameworks that guarantee data provenance, privacy protection, and consent management. We introduce a decentralized data governance framework based on blockchain technology and proxy re-encryption to let data owners control and track their data through privacy-enhancing and consent management mechanisms. In addition, our framework allows data consumers to understand data lineage through a blockchain-based provenance mechanism. We use digital e-prescription as the use case, since it has multiple stakeholders and sensitive data while enabling the medical fraternity to manage patients' prescription data, with patients as data owners and doctors and pharmacists as data consumers. Our proof-of-concept implementation and evaluation results based on CosmWasm, Ethereum, and pyUmbral PRE show that the proposed decentralized system guarantees transparency, privacy, and trust with minimal overhead.

Deep reinforcement learning (RL) has recently shown great promise in robotic continuous control tasks. Nevertheless, prior research in this vein centers on the centralized learning setting, which largely relies on the availability of communication among all the components of a robot. However, agents in the real world often operate in a decentralised fashion without communication due to latency requirements, limited power budgets and safety concerns. By formulating robotic components as a system of decentralised agents, this work presents a decentralised multiagent reinforcement learning framework for continuous control. To this end, we first develop a cooperative multiagent PPO framework that allows for centralized optimisation during training and decentralised operation during execution. However, the system only receives a global reward signal which is not attributed to each agent. To address this challenge, we further propose a generic game-theoretic credit assignment framework which computes agent-specific reward signals. Last but not least, we also incorporate a model-based RL module into our credit assignment framework, which leads to significant improvement in sample efficiency. We demonstrate the effectiveness of our framework through experiments on MuJoCo locomotion control tasks. For a demo video please visit: //youtu.be/gFyVPm4svEY.

In this monograph, I introduce the basic concepts of Online Learning through a modern view of Online Convex Optimization. Here, online learning refers to the framework of regret minimization under worst-case assumptions. I present first-order and second-order algorithms for online learning with convex losses, in Euclidean and non-Euclidean settings. All the algorithms are clearly presented as instantiations of Online Mirror Descent or Follow-The-Regularized-Leader and their variants. Particular attention is given to the issue of tuning the parameters of the algorithms and learning in unbounded domains, through adaptive and parameter-free online learning algorithms. Non-convex losses are dealt with through convex surrogate losses and through randomization. The bandit setting is also briefly discussed, touching on the problem of adversarial and stochastic multi-armed bandits. These notes do not require prior knowledge of convex analysis and all the required mathematical tools are rigorously explained. Moreover, all the proofs have been carefully chosen to be as simple and as short as possible.
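For readers unfamiliar with the setting, here is a minimal sketch of projected online (sub)gradient descent, which is the standard Euclidean special case of Online Mirror Descent. The function names, the fixed step size `eta`, the L2-ball domain, and the example gradient sequence are assumptions made for illustration; they are not taken from the monograph.

```python
import math


def project_l2_ball(x, radius):
    """Euclidean projection of x onto the L2 ball of the given radius."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= radius:
        return list(x)
    return [v * radius / norm for v in x]


def online_gradient_descent(gradients, dim, eta, radius):
    """Play x_t, observe gradient g_t, then update x_{t+1} = Proj(x_t - eta * g_t).

    Returns the list of played points and the final iterate.
    """
    x = [0.0] * dim
    played = []
    for g in gradients:
        played.append(list(x))  # the point committed to before seeing g
        step = [xi - eta * gi for xi, gi in zip(x, g)]
        x = project_l2_ball(step, radius)
    return played, x


# Linear losses <g, x> with a constant gradient push the iterate to the boundary.
grads = [[1.0, 0.0]] * 20
played, final = online_gradient_descent(grads, dim=2, eta=0.1, radius=1.0)
print(final)
```

With a constant gradient the iterate moves in the opposite direction until it reaches the boundary of the ball, where the projection holds it in place. In practice the step size would be tuned or adapted rather than fixed.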

We propose a new method for the event extraction (EE) task based on an imitation learning framework, specifically, inverse reinforcement learning (IRL) via a generative adversarial network (GAN). The GAN estimates proper rewards according to the difference between the actions committed by the expert (or ground truth) and the agent among complicated states in the environment. The EE task benefits from these dynamic rewards because instances and labels vary in difficulty and the gains are expected to be diverse (e.g., an ambiguous but correctly detected trigger or argument should receive high gains), while traditional RL models usually neglect such differences and pay equal attention to all instances. Moreover, our experiments demonstrate that the proposed framework outperforms state-of-the-art methods, without explicit feature engineering.
