亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Although information theory has found success in disciplines, the literature on its applications to software evolution is limit. We are still missing artifacts that leverage the data and tooling available to measure how the information content of a project can be a proxy for its complexity. In this work, we explore two definitions of entropy, one structural and one textual, and apply it to the historical progression of the commit history of 25 open source projects. We produce evidence that they generally are highly correlated. We also observed that they display weak and unstable correlations with other complexity metrics. Our preliminary investigation of outliers shows an unexpected high frequency of events where there is considerable change in the information content of the project, suggesting that such outliers may inform a definition of surprisal.

相關內容

Solving the problem of cooperation is of fundamental importance to the creation and maintenance of functional societies, with examples of cooperative dilemmas ranging from navigating busy road junctions to negotiating carbon reduction treaties. As the use of AI becomes more pervasive throughout society, the need for socially intelligent agents that are able to navigate these complex cooperative dilemmas is becoming increasingly evident. In the natural world, direct punishment is an ubiquitous social mechanism that has been shown to benefit the emergence of cooperation within populations. However no prior work has investigated its impact on the development of cooperation within populations of artificial learning agents experiencing social dilemmas. Additionally, within natural populations the use of any form of punishment is strongly coupled with the related social mechanisms of partner selection and reputation. However, no previous work has considered the impact of combining multiple social mechanisms on the emergence of cooperation in multi-agent systems. Therefore, in this paper we present a comprehensive analysis of the behaviours and learning dynamics associated with direct punishment in multi-agent reinforcement learning systems and how it compares to third-party punishment, when both are combined with the related social mechanisms of partner selection and reputation. We provide an extensive and systematic evaluation of the impact of these key mechanisms on the dynamics of the strategies learned by agents. Finally, we discuss the implications of the use of these mechanisms on the design of cooperative AI systems.

Negation is a common everyday phenomena and has been a consistent area of weakness for language models (LMs). Although the Information Retrieval (IR) community has adopted LMs as the backbone of modern IR architectures, there has been little to no research in understanding how negation impacts neural IR. We therefore construct a straightforward benchmark on this theme: asking IR models to rank two documents that differ only by negation. We show that the results vary widely according to the type of IR architecture: cross-encoders perform best, followed by late-interaction models, and in last place are bi-encoder and sparse neural architectures. We find that most current information retrieval models do not consider negation, performing similarly or worse than randomly ranking. We show that although the obvious approach of continued fine-tuning on a dataset of contrastive documents containing negations increases performance (as does model size), there is still a large gap between machine and human performance.

The Shannon entropy of a random variable $X$ has much behaviour analogous to a signed measure. Previous work has concretized this connection by defining a signed measure $\mu$ on an abstract information space $\tilde{X}$, which is taken to represent the information that $X$ contains. This construction is sufficient to derive many measure-theoretical counterparts to information quantities such as the mutual information $I(X; Y) = \mu(\tilde{X} \cap \tilde{Y})$, the joint entropy $H(X,Y) = \mu(\tilde{X} \cup \tilde{Y})$, and the conditional entropy $H(X|Y) = \mu(\tilde{X}\, \setminus \, \tilde{Y})$. We demonstrate that there exists a much finer decomposition with intuitive properties which we call the logarithmic decomposition (LD). We show that this signed measure space has the useful property that its logarithmic atoms are easily characterised with negative or positive entropy, while also being coherent with Yeung's $I$-measure. We present the usability of our approach by re-examining the G\'acs-K\"orner common information from this new geometric perspective and characterising it in terms of our logarithmic atoms. We then highlight that our geometric refinement can account for an entire class of information quantities, which we call logarithmically decomposable quantities.

An experimental comparison of two or more optimization algorithms requires the same computational resources to be assigned to each algorithm. When a maximum runtime is set as the stopping criterion, all algorithms need to be executed in the same machine if they are to use the same resources. Unfortunately, the implementation code of the algorithms is not always available, which means that running the algorithms to be compared in the same machine is not always possible. And even if they are available, some optimization algorithms might be costly to run, such as training large neural-networks in the cloud. In this paper, we consider the following problem: how do we compare the performance of a new optimization algorithm B with a known algorithm A in the literature if we only have the results (the objective values) and the runtime in each instance of algorithm A? Particularly, we present a methodology that enables a statistical analysis of the performance of algorithms executed in different machines. The proposed methodology has two parts. Firstly, we propose a model that, given the runtime of an algorithm in a machine, estimates the runtime of the same algorithm in another machine. This model can be adjusted so that the probability of estimating a runtime longer than what it should be is arbitrarily low. Secondly, we introduce an adaptation of the one-sided sign test that uses a modified \textit{p}-value and takes into account that probability. Such adaptation avoids increasing the probability of type I error associated with executing algorithms A and B in different machines.

In order to advance academic research, it is important to assess and evaluate the academic influence of researchers and the findings they produce. Citation metrics are universally used methods to evaluate researchers. Amongst the several variations of citation metrics, the h-index proposed by Hirsch has become the leading measure. Recent work shows that h-index is not an effective measure to determine scientific impact - due to changing authorship patterns. This can be mitigated by using h-index of a paper to compute h- index of an author. We show that using fractional allocation of h-index gives better results. In this work, we reapply two indices based on the h-index of a single paper. The indices are referred to as: hp-index and hp-frac-index. We run large-scale experiments in three different fields with about a million publications and 3,000 authors. We also compare h-index of a paper with nine h-index like metrics. Our experiments show that hp-frac-index provides a unique ranking when compared to h-index. It also performs better than h-index in providing higher ranks to the awarded researcher.

We study various novel complexity measures for two-sided matching mechanisms, applied to the two canonical strategyproof matching mechanisms, Deferred Acceptance (DA) and Top Trading Cycles (TTC). Our metrics are designed to capture the complexity of various structural (rather than computational) concerns, in particular ones of recent interest from economics. We consider a canonical, flexible approach to formalizing our questions: define a protocol or data structure performing some task, and bound the number of bits that it requires. Our results apply this approach to four questions of general interest; for matching applicants to institutions, we ask: (1) How can one applicant affect the outcome matching? (2) How can one applicant affect another applicant's set of options? (3) How can the outcome matching be represented / communicated? (4) How can the outcome matching be verified? We prove that DA and TTC are comparable in complexity under questions (1) and (4), giving new tight lower-bound constructions and new verification protocols. Under questions (2) and (3), we prove that TTC is more complex than DA. For question (2), we prove this by giving a new characterization of which institutions are removed from each applicant's set of options when a new applicant is added in DA; this characterization may be of independent interest. For question (3), our result gives lower bounds proving the tightness of existing constructions for TTC. This shows that the relationship between the matching and the priorities is more complex in TTC than in DA, formalizing previous intuitions from the economics literature. Together, our results complement recent work that models the complexity of observing strategyproofness and shows that DA is more complex than TTC. This emphasizes that diverse considerations must factor into gauging the complexity of matching mechanisms.

Generation of automated clinical notes have been posited as a strategy to mitigate physician burnout. In particular, an automated narrative summary of a patient's hospital stay could supplement the hospital course section of the discharge summary that inpatient physicians document in electronic health record (EHR) systems. In the current study, we developed and evaluated an automated method for summarizing the hospital course section using encoder-decoder sequence-to-sequence transformer models. We fine tuned BERT and BART models and optimized for factuality through constraining beam search, which we trained and tested using EHR data from patients admitted to the neurology unit of an academic medical center. The approach demonstrated good ROUGE scores with an R-2 of 13.76. In a blind evaluation, two board-certified physicians rated 62% of the automated summaries as meeting the standard of care, which suggests the method may be useful clinically. To our knowledge, this study is among the first to demonstrate an automated method for generating a discharge summary hospital course that approaches a quality level of what a physician would write.

In relation to the traditional financial markets, the cryptocurrency market is a recent invention and the trading dynamics of all its components are readily recorded and stored. This fact opens up a unique opportunity to follow the multidimensional trajectory of its development since inception up to the present time. Several main characteristics commonly recognized as financial stylized facts of mature markets were quantitatively studied here. In particular, it is shown that the return distributions, volatility clustering effects, and even temporal multifractal correlations for a few highest-capitalization cryptocurrencies largely follow those of the well-established financial markets. The smaller cryptocurrencies are somewhat deficient in this regard, however. They are also not as highly cross-correlated among themselves and with other financial markets as the large cryptocurrencies. Quite generally, the volume V impact on price changes R appears to be much stronger on the cryptocurrency market than in the mature stock markets, and scales as $R(V) \sim V^{\alpha}$ with $\alpha \gtrsim 1$.

Reasoning is a fundamental aspect of human intelligence that plays a crucial role in activities such as problem solving, decision making, and critical thinking. In recent years, large language models (LLMs) have made significant progress in natural language processing, and there is observation that these models may exhibit reasoning abilities when they are sufficiently large. However, it is not yet clear to what extent LLMs are capable of reasoning. This paper provides a comprehensive overview of the current state of knowledge on reasoning in LLMs, including techniques for improving and eliciting reasoning in these models, methods and benchmarks for evaluating reasoning abilities, findings and implications of previous research in this field, and suggestions on future directions. Our aim is to provide a detailed and up-to-date review of this topic and stimulate meaningful discussion and future work.

The field of few-shot learning has recently seen substantial advancements. Most of these advancements came from casting few-shot learning as a meta-learning problem. Model Agnostic Meta Learning or MAML is currently one of the best approaches for few-shot learning via meta-learning. MAML is simple, elegant and very powerful, however, it has a variety of issues, such as being very sensitive to neural network architectures, often leading to instability during training, requiring arduous hyperparameter searches to stabilize training and achieve high generalization and being very computationally expensive at both training and inference times. In this paper, we propose various modifications to MAML that not only stabilize the system, but also substantially improve the generalization performance, convergence speed and computational overhead of MAML, which we call MAML++.

北京阿比特科技有限公司