
Language modeling · GPT-4 · MoDELS · INFORMS · Performer

October 26, 2023

Cultural Adaptation of Recipes

Yong Cao, Yova Kementchedjhieva, Ruixiang Cui, Antonia Karamolegkou, Li Zhou, Megan Dare, Lucia Donatelli, Daniel Hershcovich

from arxiv, Accepted to TACL

Building upon the considerable advances in Large Language Models (LLMs), we are now equipped to address more sophisticated tasks demanding a nuanced understanding of cross-cultural contexts. A key example is recipe adaptation, which goes beyond simple translation to include a grasp of ingredients, culinary techniques, and dietary preferences specific to a given culture. We introduce a new task involving the translation and cultural adaptation of recipes between Chinese and English-speaking cuisines. To support this investigation, we present CulturalRecipes, a unique dataset comprised of automatically paired recipes written in Mandarin Chinese and English. This dataset is further enriched with a human-written and curated test set. In this intricate task of cross-cultural recipe adaptation, we evaluate the performance of various methods, including GPT-4 and other LLMs, traditional machine translation, and information retrieval techniques. Our comprehensive analysis includes both automatic and human evaluation metrics. While GPT-4 exhibits impressive abilities in adapting Chinese recipes into English, it still lags behind human expertise when translating English recipes into Chinese. This underscores the multifaceted nature of cultural adaptations. We anticipate that these insights will significantly contribute to future research on culturally-aware language models and their practical application in culturally diverse contexts.

Related content


INFORMS · Correlation coefficient · Principle · Generalization theory · Free energy

December 13, 2023

Thermodynamics of Internal Correlations

from arxiv, 46 pages, 5 figures

Previous research has consistently affirmed that Maxwell's demon must adhere to the second law of thermodynamics. Yet, the unresolved question remains whether the profitability and indispensability of information, which we routinely take for granted, are based on constraints stemming from physical laws. This paper reports a novel generalization of the second law of thermodynamics, answering that when internal correlations, i.e., correlations between subsystems of resource, are intended to be exploited, information is indispensable to extract free energy. Furthermore, the internal correlations, which can grow linearly with the number of subsystems in the resource, allow for control with information that yields significant gains, dwarfing the negligible operational costs in the thermodynamic limit. Thus, the generalized second law presented herein can be interpreted as a fundamental physical principle that ensures the benefit and inevitability of information processing in thermodynamics.

INFORMS · Paper · Artificial intelligence

December 13, 2023

The Logic of Doxastic Strategies

Junli Jiang, Pavel Naumov

from arxiv, Proceedings of the 38th Annual AAAI Conference on Artificial Intelligence (AAAI-24)

In many real-world situations, there is often not enough information to know that a certain strategy will succeed in achieving the goal, but there is a good reason to believe that it will. The paper introduces the term "doxastic" for such strategies. The main technical contribution is a sound and complete logical system that describes the interplay between doxastic strategy and belief modalities.

Sample · MoDELS · CIFAR-10 · Sampling methods · Truncation error

December 12, 2023

A Unified Sampling Framework for Solver Searching of Diffusion Probabilistic Models

Enshu Liu, Xuefei Ning, Huazhong Yang, Yu Wang

Recent years have witnessed the rapid progress and broad application of diffusion probabilistic models (DPMs). Sampling from DPMs can be viewed as solving an ordinary differential equation (ODE). Despite the promising performance, the generation of DPMs usually consumes much time due to the large number of function evaluations (NFE). Though recent works have accelerated the sampling to around 20 steps with high-order solvers, the sample quality with less than 10 NFE can still be improved. In this paper, we propose a unified sampling framework (USF) to study the optional strategies for solver. Under this framework, we further reveal that taking different solving strategies at different timesteps may help further decrease the truncation error, and a carefully designed solver schedule has the potential to improve the sample quality by a large margin. Therefore, we propose a new sampling framework based on the exponential integral formulation that allows free choices of solver strategy at each step and design specific decisions for the framework. Moreover, we propose $S^3$, a predictor-based search method that automatically optimizes the solver schedule to get a better time-quality trade-off of sampling. We demonstrate that $S^3$ can find outstanding solver schedules which outperform the state-of-the-art sampling methods on CIFAR-10, CelebA, ImageNet, and LSUN-Bedroom datasets. Specifically, we achieve 2.69 FID with 10 NFE and 6.86 FID with 5 NFE on CIFAR-10 dataset, outperforming the SOTA method significantly. We further apply $S^3$ to Stable-Diffusion model and get an acceleration ratio of $2\times$, showing the feasibility of sampling in very few steps without retraining the neural network.

Sphering · Pair · Game theory

December 12, 2023

The Topology of Poker

Laurent Bartholdi, Roman Mikhailov

from arxiv, Added URL of data; corrected remark about contractibility

We examine the complexity of the "Texas Hold'em" variant of poker from a topological perspective. We show that there exists a natural simplicial complex governing the multi-way winning probabilities between various hands, and that this simplicial complex contains $4$-dimensional spheres as induced subcomplexes. We deduce that evaluating the strength of a pair of cards in Texas Hold'em is an intricate problem, and that even the notion of who is bluffing against whom is ill-defined in some situations.

Regularization term · Linear · Minimum point · Statistic · SimPLe

December 11, 2023

Automatic Regularization for Linear MMSE Filters

Daniel Gomes de Pinho Zanco, Leszek Szczecinski, Jacob Benesty

In this work, we consider the problem of regularization in minimum mean-squared error (MMSE) linear filters. Exploiting the relationship with statistical machine learning methods, the regularization parameter is found from the observed signals in a simple and automatic manner. The proposed approach is illustrated through system identification examples, where the automatic regularization yields near-optimal results.

Principle · Extensibility · state-of-the-art · Lecture notes · Design

December 9, 2023

First Principles of Big Memory Systems

In this paper, we comprehensively analyze the vertical and horizontal extensions of existing memory hierarchy. The difference between memory and big memory is well reported. We present the state-of-the-art studies upon the big memory systems, together with design methodology and implementations. Persistence is the first principle of big memory systems. We further show the full-stack and moving persistence.

Analysis · INFORMS · Machine Learning · Learning · INTERACT ·

December 6, 2023

Feature Analysis of Encrypted Malicious Traffic

Anish Singh Shekhawat, Fabio Di Troia, Mark Stamp

In recent years there has been a dramatic increase in the number of malware attacks that use encrypted HTTP traffic for self-propagation or communication. Antivirus software and firewalls typically will not have access to encryption keys, and therefore direct detection of malicious encrypted data is unlikely to succeed. However, previous work has shown that traffic analysis can provide indications of malicious intent, even in cases where the underlying data remains encrypted. In this paper, we apply three machine learning techniques to the problem of distinguishing malicious encrypted HTTP traffic from benign encrypted traffic and obtain results comparable to previous work. We then consider the problem of feature analysis in some detail. Previous work has often relied on human expertise to determine the most useful and informative features in this problem domain. We demonstrate that such feature-related information can be obtained directly from machine learning models themselves. We argue that such a machine learning based approach to feature analysis is preferable, as it is more reliable, and we can, for example, uncover relatively unintuitive interactions between features.

Vectorization · Similarity · Processing (programming language) · Storage · Optimizer

October 21, 2023

Survey of Vector Database Management Systems

James Jie Pan, Jianguo Wang, Guoliang Li

from arxiv, 25 pages

There are now over 20 commercial vector database management systems (VDBMSs), all produced within the past five years. But embedding-based retrieval has been studied for over ten years, and similarity search a staggering half century and more. Driving this shift from algorithms to systems are new data intensive applications, notably large language models, that demand vast stores of unstructured data coupled with reliable, secure, fast, and scalable query processing capability. A variety of new data management techniques now exist for addressing these needs, however there is no comprehensive survey to thoroughly review these techniques and systems. We start by identifying five main obstacles to vector data management, namely vagueness of semantic similarity, large size of vectors, high cost of similarity comparison, lack of natural partitioning that can be used for indexing, and difficulty of efficiently answering hybrid queries that require both attributes and vectors. Overcoming these obstacles has led to new approaches to query processing, storage and indexing, and query optimization and execution. For query processing, a variety of similarity scores and query types are now well understood; for storage and indexing, techniques include vector compression, namely quantization, and partitioning based on randomization, learning partitioning, and navigable partitioning; for query optimization and execution, we describe new operators for hybrid queries, as well as techniques for plan enumeration, plan selection, and hardware accelerated execution. These techniques lead to a variety of VDBMSs across a spectrum of design and runtime characteristics, including native systems specialized for vectors and extended systems that incorporate vector capabilities into existing systems. We then discuss benchmarks, and finally we outline research challenges and point the direction for future work.
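To make the similarity scores and hybrid queries named in the abstract above concrete, here is a minimal sketch that is not taken from the survey: a brute-force top-k search by cosine similarity with an optional attribute filter. The function names `cosine_similarity` and `top_k`, the record layout, and the toy data in `RECORDS` are all hypothetical, chosen only for illustration. A real VDBMS would presumably replace the linear scan with the quantization and partitioning indexes the abstract mentions.

```python
import math
from typing import Callable, Optional

# Hypothetical record layout: {"id": str, "vec": list[float], "attrs": dict}


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: list[float],
    records: list[dict],
    k: int,
    predicate: Optional[Callable[[dict], bool]] = None,
) -> list[tuple[float, str]]:
    """Linear scan: filter on attributes first, then rank survivors by similarity."""
    candidates = [r for r in records if predicate is None or predicate(r["attrs"])]
    scored = [(cosine_similarity(query, r["vec"]), r["id"]) for r in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy data, invented for this example.
RECORDS = [
    {"id": "a", "vec": [1.0, 0.0], "attrs": {"lang": "en"}},
    {"id": "b", "vec": [0.8, 0.6], "attrs": {"lang": "zh"}},
    {"id": "c", "vec": [0.0, 1.0], "attrs": {"lang": "en"}},
]
```

Calling `top_k([1.0, 0.0], RECORDS, 2)` ranks all three records by similarity alone, while passing `predicate=lambda attrs: attrs["lang"] == "en"` turns it into a hybrid query that combines an attribute condition with the vector ranking. Filtering before scoring is only one possible plan; choosing between pre- and post-filtering is the kind of decision the abstract groups under query optimization.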

Language modeling · MoDELS · Generalization theory · Identifiable · Continuity

July 12, 2023

A Comprehensive Overview of Large Language Models

Humza Naveed, Asad Ullah Khan, Shi Qiu, Muhammad Saqib, Saeed Anwar, Muhammad Usman, Nick Barnes, Ajmal Mian

Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models propose various new architectures, tweaking existing architectures with refined training strategies, increasing context length, using high-quality training data, and increasing training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey paper comprehensively analyses the LLMs architectures and their categorization, training strategies, training datasets, and performance evaluations and discusses future research directions. Moreover, the paper also discusses the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to regularly update this paper by incorporating new sections and featuring the latest LLM models.

Networking · Residual network · Scaling · Weight · Smoothing

May 25, 2021

Scaling Properties of Deep Residual Networks

Alain-Sam Cohen, Rama Cont, Alain Rossier, Renyuan Xu

from arxiv, Published at ICML 2021

Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation or neither of these. These findings cast doubts on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.

