亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Antisocial behavior can be contagious, spreading from individual to individual and rippling through social networks. Moreover, it can spread not only through third-party influence from observation, just like innovations or individual behavior do, but also through direct experience, via "pay-it-forward" retaliation. Here, we distinguish between the effects of observation and victimization for the contagion of antisocial behavior by analyzing large-scale digital-trace data. We study the spread of cheating in more than a million matches of an online multiplayer first-person shooter game, in which up to 100 players compete individually or in teams against strangers. We identify event sequences in which a player who observes or is killed by a certain number of cheaters starts cheating, and evaluate the extent to which these sequences would appear if we preserve the team and interaction structure but assume alternative gameplay scenarios. The results reveal that social contagion is only likely to exist for those who both observe and experience cheating, suggesting that third-party influence and "pay-it-forward" reciprocity interact positively. In addition, the effect is present only for those who both observe and experience more than once, suggesting that cheating is more likely to spread after repeated or multi-source exposure. Approaching online games as models of social systems, we use the findings to discuss strategies for targeted interventions to stem the spread of cheating and antisocial behavior more generally in online communities, schools, organizations, and sports.

相關內容

IFIP TC13 Conference on Human-Computer Interaction是人機交互領域的研究者和實踐者展示其工作的重要平臺。多年來,這些會議吸引了來自幾個國家和文化的研究人員。官網鏈接: · 相互獨立的 · 評論員 · 優化器 · 可辨認的 ·
2022 年 4 月 19 日

Data collection and research methodology represents a critical part of the research pipeline. On the one hand, it is important that we collect data in a way that maximises the validity of what we are measuring, which may involve the use of long scales with many items. On the other hand, collecting a large number of items across multiple scales results in participant fatigue, and expensive and time consuming data collection. It is therefore important that we use the available resources optimally. In this work, we consider how a consideration for theory and the associated causal/structural model can help us to streamline data collection procedures by not wasting time collecting data for variables which are not causally critical for subsequent analysis. This not only saves time and enables us to redirect resources to attend to other variables which are more important, but also increases research transparency and the reliability of theory testing. In order to achieve this streamlined data collection, we leverage structural models, and Markov conditional independency structures implicit in these models to identify the substructures which are critical for answering a particular research question. In this work, we review the relevant concepts and present a number of didactic examples with the hope that psychologists can use these techniques to streamline their data collection process without invalidating the subsequent analysis. We provide a number of simulation results to demonstrate the limited analytical impact of this streamlining.

Decision makers who receive many signals are subject to imperfect recall. This is especially important when learning from feeds that aggregate messages from many senders on social media platforms. In this paper, we study a stylized model of learning from feeds and highlight the inefficiencies that arise due to imperfect recall. In our model, failure to recall a specific message comes from the accumulation of messages which creates interference. We characterize the influence of each sender according to the rate at which she sends messages and to the strength of interference. Our analysis indicates that imperfect recall not only leads to double-counting and extreme opinions in finite populations, but also impedes the ability of the receiver to learn the true state as the population of the senders increases. We estimate the strength of interference in an online experiment where participants are exposed to (non-informative) repeated messages and they need to estimate the opinion of others. Results show that interference plays a significant role and is weaker among participants who disagree with each other. Our work has implication for the diffusion of information in networks, especially when it is false because it is shared and repeated more than true information.

Bayesian policy reuse (BPR) is a general policy transfer framework for selecting a source policy from an offline library by inferring the task belief based on some observation signals and a trained observation model. In this paper, we propose an improved BPR method to achieve more efficient policy transfer in deep reinforcement learning (DRL). First, most BPR algorithms use the episodic return as the observation signal that contains limited information and cannot be obtained until the end of an episode. Instead, we employ the state transition sample, which is informative and instantaneous, as the observation signal for faster and more accurate task inference. Second, BPR algorithms usually require numerous samples to estimate the probability distribution of the tabular-based observation model, which may be expensive and even infeasible to learn and maintain, especially when using the state transition sample as the signal. Hence, we propose a scalable observation model based on fitting state transition functions of source tasks from only a small number of samples, which can generalize to any signals observed in the target task. Moreover, we extend the offline-mode BPR to the continual learning setting by expanding the scalable observation model in a plug-and-play fashion, which can avoid negative transfer when faced with new unknown tasks. Experimental results show that our method can consistently facilitate faster and more efficient policy transfer.

We provide a decision theoretic analysis of bandit experiments. The setting corresponds to a dynamic programming problem, but solving this directly is typically infeasible. Working within the framework of diffusion asymptotics, we define suitable notions of asymptotic Bayes and minimax risk for bandit experiments. For normally distributed rewards, the minimal Bayes risk can be characterized as the solution to a nonlinear second-order partial differential equation (PDE). Using a limit of experiments approach, we show that this PDE characterization also holds asymptotically under both parametric and non-parametric distribution of the rewards. The approach further describes the state variables it is asymptotically sufficient to restrict attention to, and therefore suggests a practical strategy for dimension reduction. The upshot is that we can approximate the dynamic programming problem defining the bandit experiment with a PDE which can be efficiently solved using sparse matrix routines. We derive the optimal Bayes and minimax policies from the numerical solutions to these equations. The proposed policies substantially dominate existing methods such as Thompson sampling. The framework also allows for substantial generalizations to the bandit problem such as time discounting and pure exploration motives.

In this paper we present a novel method for a naive agent to detect novel objects it encounters in an interaction. We train a reinforcement learning policy on a stacking task given a known object type, and then observe the results of the agent attempting to stack various other objects based on the same trained policy. By extracting embedding vectors from a convolutional neural net trained over the results of the aforementioned stacking play, we can determine the similarity of a given object to known object types, and determine if the given object is likely dissimilar enough to the known types to be considered a novel class of object. We present the results of this method on two datasets gathered using two different policies and demonstrate what information the agent needs to extract from its environment to make these novelty judgments.

Automatic text summarization has experienced substantial progress in recent years. With this progress, the question has arisen whether the types of summaries that are typically generated by automatic summarization models align with users' needs. Ter Hoeve et al (2020) answer this question negatively. Amongst others, they recommend focusing on generating summaries with more graphical elements. This is in line with what we know from the psycholinguistics literature about how humans process text. Motivated from these two angles, we propose a new task: summarization with graphical elements, and we verify that these summaries are helpful for a critical mass of people. We collect a high quality human labeled dataset to support research into the task. We present a number of baseline methods that show that the task is interesting and challenging. Hence, with this work we hope to inspire a new line of research within the automatic summarization community.

When cast into the Deep Reinforcement Learning framework, many robotics tasks require solving a long horizon and sparse reward problem, where learning algorithms struggle. In such context, Imitation Learning (IL) can be a powerful approach to bootstrap the learning process. However, most IL methods require several expert demonstrations which can be prohibitively difficult to acquire. Only a handful of IL algorithms have shown efficiency in the context of an extreme low expert data regime where a single expert demonstration is available. In this paper, we present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory. Based on a sequential inductive bias, our method divides the complex task into smaller skills. The skills are learned into a goal-conditioned policy that is able to solve each skill individually and chain skills to solve the entire task. We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.

This paper proposes ResTv2, a simpler, faster, and stronger multi-scale vision Transformer for visual recognition. ResTv2 simplifies the EMSA structure in ResTv1 (i.e., eliminating the multi-head interaction part) and employs an upsample operation to reconstruct the lost medium- and high-frequency information caused by the downsampling operation. In addition, we explore different techniques for better apply ResTv2 backbones to downstream tasks. We found that although combining EMSAv2 and window attention can greatly reduce the theoretical matrix multiply FLOPs, it may significantly decrease the computation density, thus causing lower actual speed. We comprehensively validate ResTv2 on ImageNet classification, COCO detection, and ADE20K semantic segmentation. Experimental results show that the proposed ResTv2 can outperform the recently state-of-the-art backbones by a large margin, demonstrating the potential of ResTv2 as solid backbones. The code and models will be made publicly available at \url{//github.com/wofmanaf/ResT}

Reinforcement learning (RL) has shown great success in solving many challenging tasks via use of deep neural networks. Although using deep learning for RL brings immense representational power, it also causes a well-known sample-inefficiency problem. This means that the algorithms are data-hungry and require millions of training samples to converge to an adequate policy. One way to combat this issue is to use action advising in a teacher-student framework, where a knowledgeable teacher provides action advice to help the student. This work considers how to better leverage uncertainties about when a student should ask for advice and if the student can model the teacher to ask for less advice. The student could decide to ask for advice when it is uncertain or when both it and its model of the teacher are uncertain. In addition to this investigation, this paper introduces a new method to compute uncertainty for a deep RL agent using a secondary neural network. Our empirical results show that using dual uncertainties to drive advice collection and reuse may improve learning performance across several Atari games.

Task graphs provide a simple way to describe scientific workflows (sets of tasks with dependencies) that can be executed on both HPC clusters and in the cloud. An important aspect of executing such graphs is the used scheduling algorithm. Many scheduling heuristics have been proposed in existing works; nevertheless, they are often tested in oversimplified environments. We provide an extensible simulation environment designed for prototyping and benchmarking task schedulers, which contains implementations of various scheduling algorithms and is open-sourced, in order to be fully reproducible. We use this environment to perform a comprehensive analysis of workflow scheduling algorithms with a focus on quantifying the effect of scheduling challenges that have so far been mostly neglected, such as delays between scheduler invocations or partially unknown task durations. Our results indicate that network models used by many previous works might produce results that are off by an order of magnitude in comparison to a more realistic model. Additionally, we show that certain implementation details of scheduling algorithms which are often neglected can have a large effect on the scheduler's performance, and they should thus be described in great detail to enable proper evaluation.

北京阿比特科技有限公司