亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Recent developments in Markov chain Monte Carlo (MCMC) algorithms now allow us to run thousands of chains in parallel almost as quickly as a single chain, using hardware accelerators such as GPUs. We explore the benefits of running many chains, with an emphasis on achieving a target precision in as short a time as possible. One expected advantage is that, while each chain still needs to forget its initial point during a warmup phase, the subsequent sampling phase can be almost arbitrarily short. To determine if the resulting short chains are reliable, we need to assess how close the Markov chains are to convergence to their stationary distribution. The $\hat R$ statistic is a general purpose and popular convergence diagnostic but unfortunately can require a long sampling phase to work well. We present a nested design to overcome this challenge and a generalization called nested $\hat R$. This new diagnostic works under conditions similar to $\hat R$ and completes the MCMC workflow for many GPU-friendly samplers. In addition, the proposed nesting provides theoretical insights into the utility of $\hat R$, in both classical and short-chains regimes.

相關內容

A large number of modern applications ranging from listening songs online and browsing the Web to using a navigation app on a smartphone generate a plethora of user trails. Clustering such trails into groups with a common sequence pattern can reveal significant structure in human behavior that can lead to improving user experience through better recommendations, and even prevent suicides [LMCR14]. One approach to modeling this problem mathematically is as a mixture of Markov chains. Recently, Gupta, Kumar and Vassilvitski [GKV16] introduced an algorithm (GKV-SVD) based on the singular value decomposition (SVD) that under certain conditions can perfectly recover a mixture of L chains on n states, given only the distribution of trails of length 3 (3-trail). In this work we contribute to the problem of unmixing Markov chains by highlighting and addressing two important constraints of the GKV-SVD algorithm [GKV16]: some chains in the mixture may not even be weakly connected, and secondly in practice one does not know beforehand the true number of chains. We resolve these issues in the Gupta et al. paper [GKV16]. Specifically, we propose an algebraic criterion that enables us to choose a value of L efficiently that avoids overfitting. Furthermore, we design a reconstruction algorithm that outputs the true mixture in the presence of disconnected chains and is robust to noise. We complement our theoretical results with experiments on both synthetic and real data, where we observe that our method outperforms the GKV-SVD algorithm. Finally, we empirically observe that combining an EM-algorithm with our method performs best in practice, both in terms of reconstruction error with respect to the distribution of 3-trails and the mixture of Markov Chains.

Software is a great enabler for a number of projects that otherwise would be impossible to perform. Such projects include Space Exploration, Weather Modeling, Genome Projects, and many others. It is critical that software aiding these projects does what it is expected to do. In the terminology of software engineering, software that corresponds to requirements, that is does what it is expected to do is called correct. Checking the correctness of software has been the focus of a great deal of research in the area of software engineering. Practitioners in the field in which software is applied quite often do not assign much value to checking this correctness. Yet, as software systems become larger, potentially combined with distributed subsystems written by different authors, such verification becomes even more important. Concurrent, distributed systems are prone to dangerous errors due to different speeds of execution of their components such as deadlocks, race conditions, or violation of project-specific properties. This project describes an application of a static analysis method called model checking to verification of a distributed system for the Bioinformatics process. In it, we evaluate the efficiency of the model checking approach to the verification of combined processes with an increasing number of concurrently executed steps. We show that our experimental results correspond to analytically derived expectations. We also highlight the importance of static analysis to combined processes in the Bioinformatics field.

In many complex systems, whether biological or artificial, the thermodynamic costs of communication among their components are large. These systems also tend to split information transmitted between any two components across multiple channels. A common hypothesis is that such inverse multiplexing strategies reduce total thermodynamic costs. So far, however, there have been no physics-based results supporting this hypothesis. This gap existed partially because we have lacked a theoretical framework that addresses the interplay of thermodynamics and information in off-equilibrium systems at any spatiotemporal scale. Here we present the first study that rigorously combines such a framework, stochastic thermodynamics, with Shannon information theory. We develop a minimal model that captures the fundamental features common to a wide variety of communication systems. We find that the thermodynamic cost in this model is a convex function of the channel capacity, the canonical measure of the communication capability of a channel. We also find that this function is not always monotonic, in contrast to previous results not derived from first principles physics. These results clarify when and how to split a single communication stream across multiple channels. In particular, we present Pareto fronts that reveal the trade-off between thermodynamic costs and channel capacity when inverse multiplexing. Due to the generality of our model, our findings could help explain empirical observations of how thermodynamic costs of information transmission make inverse multiplexing energetically favorable in many real-world communication systems.

We initiate a principled study of algorithmic collective action on digital platforms that deploy machine learning algorithms. We propose a simple theoretical model of a collective interacting with a firm's learning algorithm. The collective pools the data of participating individuals and executes an algorithmic strategy by instructing participants how to modify their own data to achieve a collective goal. We investigate the consequences of this model in three fundamental learning-theoretic settings: the case of a nonparametric optimal learning algorithm, a parametric risk minimizer, and gradient-based optimization. In each setting, we come up with coordinated algorithmic strategies and characterize natural success criteria as a function of the collective's size. Complementing our theory, we conduct systematic experiments on a skill classification task involving tens of thousands of resumes from a gig platform for freelancers. Through more than two thousand model training runs of a BERT-like language model, we see a striking correspondence emerge between our empirical observations and the predictions made by our theory. Taken together, our theory and experiments broadly support the conclusion that algorithmic collectives of exceedingly small fractional size can exert significant control over a platform's learning algorithm.

Regulations and standards in the field of artificial intelligence (AI) are necessary to minimise risks and maximise benefits, yet some argue that they stifle innovation. This paper critically examines the idea that regulation stifles innovation in the field of AI. Current trends in AI regulation, particularly the proposed European AI Act and the standards supporting its implementation, are discussed. Arguments in support of the idea that regulation stifles innovation are analysed and criticised, and an alternative point of view is offered, showing how regulation and standards can foster innovation in the field of AI.

In this paper, we develop two ``Nesterov's accelerated'' variants of the well-known extragradient method to approximate a solution of a co-hypomonotone inclusion constituted by the sum of two operators, where one is Lipschitz continuous and the other is possibly multivalued. The first scheme can be viewed as an accelerated variant of Tseng's forward-backward-forward splitting method, while the second one is a variant of the reflected forward-backward splitting method, which requires only one evaluation of the Lipschitz operator, and one resolvent of the multivalued operator. Under a proper choice of the algorithmic parameters and appropriate conditions on the co-hypomonotone parameter, we theoretically prove that both algorithms achieve $\mathcal{O}(1/k)$ convergence rates on the norm of the residual, where $k$ is the iteration counter. Our results can be viewed as alternatives of a recent class of Halpern-type schemes for root-finding problems.

Class invariants -- consistency constraints preserved by every operation on objects of a given type -- are fundamental to building, understanding and verifying object-oriented programs. For verification, however, they raise difficulties, which have not yet received a generally accepted solution. The present work introduces a proof rule meant to address these issues and allow verification tools to benefit from invariants. It clarifies the notion of invariant and identifies the three associated problems: callbacks, furtive access and reference leak. As an example, the 2016 Ethereum DAO bug, in which $50 million were stolen, resulted from a callback invalidating an invariant. The discussion starts with a simplified model of computation and an associated proof rule, demonstrating its soundness. It then removes one by one the three simplifying assumptions, each removal raising one of the three issues, and leading to a corresponding adaptation to the proof rule. The final version of the rule can tackle tricky examples, including "challenge problems" listed in the literature.

To make accurate inferences in an interactive setting, an agent must not confuse passive observation of events with having participated in causing those events. The do operator formalises interventions so that we may reason about their effect. Yet there exist at least two pareto optimal mathematical formalisms of general intelligence in an interactive setting which, presupposing no explicit representation of intervention, make maximally accurate inferences. We examine one such formalism. We show that in the absence of an operator, an intervention can still be represented by a variable. Furthermore, the need to explicitly represent interventions in advance arises only because we presuppose abstractions. The aforementioned formalism avoids this and so, initial conditions permitting, representations of relevant causal interventions will emerge through induction. These emergent abstractions function as representations of one`s self and of any other object, inasmuch as the interventions of those objects impact the satisfaction of goals. We argue (with reference to theory of mind) that this explains how one might reason about one`s own identity and intent, those of others, of one's own as perceived by others and so on. In a narrow sense this describes what it is to be aware, and is a mechanistic explanation of aspects of consciousness.

We investigate solution methods for large-scale inverse problems governed by partial differential equations (PDEs) via Bayesian inference. The Bayesian framework provides a statistical setting to infer uncertain parameters from noisy measurements. To quantify posterior uncertainty, we adopt Markov Chain Monte Carlo (MCMC) approaches for generating samples. To increase the efficiency of these approaches in high-dimension, we make use of local information about gradient and Hessian of the target potential, also via Hamiltonian Monte Carlo (HMC). Our target application is inferring the field of soil permeability processing observations of pore pressure, using a nonlinear PDE poromechanics model for predicting pressure from permeability. We compare the performance of different sampling approaches in this and other settings. We also investigate the effect of dimensionality and non-gaussianity of distributions on the performance of different sampling methods.

Federated Averaging (FedAvg) and its variants are the most popular optimization algorithms in federated learning (FL). Previous convergence analyses of FedAvg either assume full client participation or partial client participation where the clients can be uniformly sampled. However, in practical cross-device FL systems, only a subset of clients that satisfy local criteria such as battery status, network connectivity, and maximum participation frequency requirements (to ensure privacy) are available for training at a given time. As a result, client availability follows a natural cyclic pattern. We provide (to our knowledge) the first theoretical framework to analyze the convergence of FedAvg with cyclic client participation with several different client optimizers such as GD, SGD, and shuffled SGD. Our analysis discovers that cyclic client participation can achieve a faster asymptotic convergence rate than vanilla FedAvg with uniform client participation under suitable conditions, providing valuable insights into the design of client sampling protocols.

北京阿比特科技有限公司