亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Since the introduction of ChatGPT and GPT-4, these models have been tested across a large number of tasks. Their adeptness across domains is evident, but their aptitude in playing games and specifically their aptitude in the realm of poker has remained unexplored. Poker is a game that requires decision making under uncertainty and incomplete information. In this paper, we put ChatGPT and GPT-4 through the poker test and evaluate their poker skills. Our findings reveal that while both models display an advanced understanding of poker, encompassing concepts like the valuation of starting hands, playing positions and other intricacies of game theory optimal (GTO) poker, both ChatGPT and GPT-4 are NOT game theory optimal poker players. Through a series of experiments, we first discover the characteristics of optimal prompts and model parameters for playing poker with these models. Our observations then unveil the distinct playing personas of the two models. We first conclude that GPT-4 is a more advanced poker player than ChatGPT. This exploration then sheds light on the divergent poker tactics of the two models: ChatGPT's conservativeness juxtaposed against GPT-4's aggression. In poker vernacular, when tasked to play GTO poker, ChatGPT plays like a Nit, which means that it has a propensity to only engage with premium hands and folds a majority of hands. When subjected to the same directive, GPT-4 plays like a maniac, showcasing a loose and aggressive style of play. Both strategies, although relatively advanced, are not game theory optimal.

相關內容

北京時間2023年3月15日凌晨,ChatGPT開發商OpenAI 發布了發布了全新的多模態預訓練大模型 GPT-4,可以更可靠、更具創造力、能處理更細節的指令,根據圖片和文字提示都能生成相應內容。 具體來說來說,GPT-4 相比上一代的模型,實現了飛躍式提升:支持圖像和文本輸入,擁有強大的識圖能力;大幅提升了文字輸入限制,在ChatGPT模式下,GPT-4可以處理超過2.5萬字的文本,可以處理一些更加細節的指令;回答準確性也得到了顯著提高。

Deep generative models have been widely used for their ability to generate realistic data samples in various areas, such as images, molecules, text, and speech. One major goal of data generation is controllability, namely to generate new data with desired properties. Despite growing interest in the area of controllable generation, significant challenges still remain, including 1) disentangling desired properties with unrelated latent variables, 2) out-of-distribution property control, and 3) objective optimization for out-of-distribution property control. To address these challenges, in this paper, we propose a general framework to enhance VAE-based data generators with property controllability and ensure disentanglement. Our proposed objective can be optimized on both data seen and unseen in the training set. We propose a training procedure to train the objective in a semi-supervised manner by iteratively conducting mutual mappings between the data and properties. The proposed framework is implemented on four VAE-based controllable generators to evaluate its performance on property error, disentanglement, generation quality, and training time. The results indicate that our proposed framework enables more precise control over the properties of generated samples in a short training time, ensuring the disentanglement and keeping the validity of the generated samples.

Recently, Graph Transformer (GT) models have been widely used in the task of Molecular Property Prediction (MPP) due to their high reliability in characterizing the latent relationship among graph nodes (i.e., the atoms in a molecule). However, most existing GT-based methods usually explore the basic interactions between pairwise atoms, and thus they fail to consider the important interactions among critical motifs (e.g., functional groups consisted of several atoms) of molecules. As motifs in a molecule are significant patterns that are of great importance for determining molecular properties (e.g., toxicity and solubility), overlooking motif interactions inevitably hinders the effectiveness of MPP. To address this issue, we propose a novel Atom-Motif Contrastive Transformer (AMCT), which not only explores the atom-level interactions but also considers the motif-level interactions. Since the representations of atoms and motifs for a given molecule are actually two different views of the same instance, they are naturally aligned to generate the self-supervisory signals for model training. Meanwhile, the same motif can exist in different molecules, and hence we also employ the contrastive loss to maximize the representation agreement of identical motifs across different molecules. Finally, in order to clearly identify the motifs that are critical in deciding the properties of each molecule, we further construct a property-aware attention mechanism into our learning framework. Our proposed AMCT is extensively evaluated on seven popular benchmark datasets, and both quantitative and qualitative results firmly demonstrate its effectiveness when compared with the state-of-the-art methods.

Federated Learning (FL) has surged in prominence due to its capability of collaborative model training without direct data sharing. However, the vast disparity in local data distributions among clients, often termed the non-Independent Identically Distributed (non-IID) challenge, poses a significant hurdle to FL's generalization efficacy. The scenario becomes even more complex when not all clients participate in the training process, a common occurrence due to unstable network connections or limited computational capacities. This can greatly complicate the assessment of the trained models' generalization abilities. While a plethora of recent studies has centered on the generalization gap pertaining to unseen data from participating clients with diverse distributions, the divergence between the training distributions of participating clients and the testing distributions of non-participating ones has been largely overlooked. In response, our paper unveils an information-theoretic generalization framework for FL. Specifically, it quantifies generalization errors by evaluating the information entropy of local distributions and discerning discrepancies across these distributions. Inspired by our deduced generalization bounds, we introduce a weighted aggregation approach and a duo of client selection strategies. These innovations aim to bolster FL's generalization prowess by encompassing a more varied set of client data distributions. Our extensive empirical evaluations reaffirm the potency of our proposed methods, aligning seamlessly with our theoretical construct.

Computational fluid dynamics (CFD) simulations of viscous fluids described by the Navier-Stokes equations are considered. Depending on the Reynolds number of the flow, the Navier-Stokes equations may exhibit a highly nonlinear behavior. The system of nonlinear equations resulting from the discretization of the Navier-Stokes equations can be solved using nonlinear iteration methods, such as Newton's method. However, fast quadratic convergence is typically only obtained in a local neighborhood of the solution, and for many configurations, the classical Newton iteration does not converge at all. In such cases, so-called globalization techniques may help to improve convergence. In this paper, pseudo-transient continuation is employed in order to improve nonlinear convergence. The classical algorithm is enhanced by a neural network model that is trained to predict a local pseudo-time step. Generalization of the novel approach is facilitated by predicting the local pseudo-time step separately on each element using only local information on a patch of adjacent elements as input. Numerical results for standard benchmark problems, including flow through a backward facing step geometry and Couette flow, show the performance of the machine learning-enhanced globalization approach; as the software for the simulations, the CFD module of COMSOL Multiphysics is employed.

In variational inference, the benefits of Bayesian models rely on accurately capturing the true posterior distribution. We propose using neural samplers that specify implicit distributions, which are well-suited for approximating complex multimodal and correlated posteriors in high-dimensional spaces. Our approach advances inference using implicit distributions by introducing novel bounds that come about by locally linearising the neural sampler. This is distinct from existing methods that rely on additional discriminator networks and unstable adversarial objectives. Furthermore, we present a new sampler architecture that, for the first time, enables implicit distributions over millions of latent variables, addressing computational concerns by using differentiable numerical approximations. Our empirical analysis indicates our method is capable of recovering correlations across layers in large Bayesian neural networks, a property that is crucial for a network's performance but notoriously challenging to achieve. To the best of our knowledge, no other method has been shown to accomplish this task for such large models. Through experiments in downstream tasks, we demonstrate that our expressive posteriors outperform state-of-the-art uncertainty quantification methods, validating the effectiveness of our training algorithm and the quality of the learned implicit approximation.

We study probability density functions that are log-concave. Despite the space of all such densities being infinite-dimensional, the maximum likelihood estimate is the exponential of a piecewise linear function determined by finitely many quantities, namely the function values, or heights, at the data points. We explore in what sense exact solutions to this problem are possible. First, we show that the heights given by the maximum likelihood estimate are generically transcendental. For a cell in one dimension, the maximum likelihood estimator is expressed in closed form using the generalized W-Lambert function. Even more, we show that finding the log-concave maximum likelihood estimate is equivalent to solving a collection of polynomial-exponential systems of a special form. Even in the case of two equations, very little is known about solutions to these systems. As an alternative, we use Smale's alpha-theory to refine approximate numerical solutions and to certify solutions to log-concave density estimation.

Recent work on speech representation models jointly pre-trained with text has demonstrated the potential of improving speech representations by encoding speech and text in a shared space. In this paper, we leverage such shared representations to address the persistent challenge of limited data availability in spoken language understanding tasks. By employing a pre-trained speech-text model, we find that models fine-tuned on text can be effectively transferred to speech testing data. With as little as 1 hour of labeled speech data, our proposed approach achieves comparable performance on spoken language understanding tasks (specifically, sentiment analysis and named entity recognition) when compared to previous methods using speech-only pre-trained models fine-tuned on 10 times more data. Beyond the proof-of-concept study, we also analyze the latent representations. We find that the bottom layers of speech-text models are largely task-agnostic and align speech and text representations into a shared space, while the top layers are more task-specific.

Self-adjusting data structures are a classic approach to adapting the complexity of operations to the data access distribution. While several self-adjusting variants are known for both binary search trees and B-Trees, existing constructions come with limitations. For instance, existing works on self-adjusting B-Trees do not provide static-optimality and tend to be complex and inefficient to implement in practice. In this paper, we provide a new approach to build efficient self-adjusting search trees based on state-of-the-art non-adaptive structures. We illustrate our approach to obtain a new efficient self-adjusting Interpolation Search Tree (IST) and B-Tree, as well as a new self-adjusting tree called the Log Tree. Of note, our self-adjusting IST has expected complexity in $O(\log \frac{\log m}{\log ac(x)})$, where $m$ is the total number of requests and $ac(x)$ is the number of requests to key $x$. Our technique leads to simple constructions with a reduced number of pointer manipulations: this improves cache efficiency and even allows an efficient concurrent implementation.

Analytical dexterous grasping synthesis is often driven by grasp quality metrics. However, existing metrics possess many problems, such as being computationally expensive, physically inaccurate, and non-differentiable. Moreover, none of them can facilitate the synthesis of non-force-closure grasps, which account for a significant portion of task-oriented grasping such as lid screwing and button pushing. The main challenge behind all the above drawbacks is the difficulty in modeling the complex Grasp Wrench Space (GWS). In this work, we overcome this challenge by proposing a novel GWS estimator, thus enabling gradient-based task-oriented dexterous grasp synthesis for the first time. Our key contribution is a fast, accurate, and differentiable technique to estimate the GWS boundary with good physical interpretability by parallel sampling and mapping, which does not require iterative optimization. Second, based on our differentiable GWS estimator, we derive a task-oriented energy function to enable gradient-based grasp synthesis and a metric to evaluate non-force-closure grasps. Finally, we improve the previous dexterous grasp synthesis pipeline mainly by a novel technique to make nearest-point calculation differentiable, even on mesh edges and vertices. Extensive experiments are performed to verify the efficiency and effectiveness of our methods. Our GWS estimator can run in several milliseconds on GPUs with minimal memory cost, more than three orders of magnitude faster than the classic discretization-based method. Using this GWS estimator, we synthesize 0.1 million dexterous grasps to show that our pipeline can significantly outperform the SOTA method, even in task-unaware force-closure-grasp synthesis. For task-oriented grasp synthesis, we provide some qualitative results. Our project page is //pku-epic.github.io/TaskDexGrasp/.

Cost-guided bottom-up search (BUS) algorithms use a cost function to guide the search to solve program synthesis tasks. In this paper, we show that current state-of-the-art cost-guided BUS algorithms suffer from a common problem: they can lose useful information given by the model and fail to perform the search in a best-first order according to a cost function. We introduce a novel best-first bottom-up search algorithm, which we call Bee Search, that does not suffer information loss and is able to perform cost-guided bottom-up synthesis in a best-first manner. Importantly, Bee Search performs best-first search with respect to the generation of programs, i.e., it does not even create in memory programs that are more expensive than the solution program. It attains best-first ordering with respect to generation by performing a search in an abstract space of program costs. We also introduce a new cost function that better uses the information provided by an existing cost model. Empirical results on string manipulation and bit-vector tasks show that Bee Search can outperform existing cost-guided BUS approaches when employing more complex domain-specific languages (DSLs); Bee Search and previous approaches perform equally well with simpler DSLs. Furthermore, our new cost function with Bee Search outperforms previous cost functions on string manipulation tasks.

北京阿比特科技有限公司