云南虫谷在线观看免费观看电视剧,国产欧美日韩综合在线

Transport phenomena (e.g., fluid flows) are governed by time-dependent partial differential equations (PDEs) describing mass, momentum, and energy conservation, and are ubiquitous in many engineering applications. However, deep learning architectures are fundamentally incompatible with the simulation of these PDEs. This paper clearly articulates and then solves this incompatibility. The local-dependency of generic transport PDEs implies that it only involves local information to predict the physical properties at a location in the next time step. However, the deep learning architecture will inevitably increase the scope of information to make such predictions as the number of layers increases, which can cause sluggish convergence and compromise generalizability. This paper aims to solve this problem by proposing a distributed data scoping method with linear time complexity to strictly limit the scope of information to predict the local properties. The numerical experiments over multiple physics show that our data scoping method significantly accelerates training convergence and improves the generalizability of benchmark models on large-scale engineering simulations. Specifically, over the geometries not included in the training data for heat transferring simulation, it can increase the accuracy of Convolutional Neural Networks (CNNs) by 21.7 \% and that of Fourier Neural Operators (FNOs) by 38.5 \% on average.

相關內容

INFORMS

關注 10

《計算機信息》雜志發表高質量的論文，擴大了運籌學和計算的范圍，尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文，以及描述新的和有用的軟件工具的論文。官網鏈接： · INFORMS · CASES · 講稿 · 向量化 ·

2024 年 6 月 11 日

Spin-Wave Voices: Sonification of Nanoscale Spin Waves as an Engagement and Research Tool

Santa Pile,Oleg Lesota,Silvan David Peter,Christina Humer,Martin Gasser

from arxiv, Accepted to The 29th International Conference on Auditory Display (ICAD 2024) conference proceedings

Magnonics is an emerging research field that addresses the use of spin waves (magnons), purely magnetic waves, for information transport and processing. Spin waves are a potential replacement for electric current in modern computational devices that would make them more compact and energy efficient. The field is yet little known, even among physicists. Additionally, with the development of new measuring techniques and computational physics, the obtained magnetic data becomes more complex, in some cases including 3D vector fields and time-resolution. This work presents an approach to the audio-visual representation of the spin waves and discusses its use as a tool for science communication exhibits and possible data analysis tool. The work also details an instance of such an exhibit presented at the annual international digital art exhibition Ars Electronica Festival in 2022.

簇 · 圖 · 劃分 · SCC · state-of-the-art ·

2024 年 6 月 11 日

TeraHAC: Hierarchical Agglomerative Clustering of Trillion-Edge Graphs

Laxman Dhulipala,Jason Lee,Jakub ??cki,Vahab Mirrokni

from arxiv, SIGMOD 2024

We introduce TeraHAC, a $(1+\epsilon)$-approximate hierarchical agglomerative clustering (HAC) algorithm which scales to trillion-edge graphs. Our algorithm is based on a new approach to computing $(1+\epsilon)$-approximate HAC, which is a novel combination of the nearest-neighbor chain algorithm and the notion of $(1+\epsilon)$-approximate HAC. Our approach allows us to partition the graph among multiple machines and make significant progress in computing the clustering within each partition before any communication with other partitions is needed. We evaluate TeraHAC on a number of real-world and synthetic graphs of up to 8 trillion edges. We show that TeraHAC requires over 100x fewer rounds compared to previously known approaches for computing HAC. It is up to 8.3x faster than SCC, the state-of-the-art distributed algorithm for hierarchical clustering, while achieving 1.16x higher quality. In fact, TeraHAC essentially retains the quality of the celebrated HAC algorithm while significantly improving the running time.

均值 · 簇 · 參數化模型 · MoDELS · 泛函 ·

2024 年 6 月 11 日

Bayesian Parametric Methods for Deriving Distribution of Restricted Mean Survival Time

Keisuke Hanada,Masahiro Kojima

We propose a Bayesian method for deriving the distribution of restricted mean survival time (RMST) using posterior samples, which accounts for covariates and heterogeneity among clusters based on a parametric model for survival time. We derive an explicit RMST equation by devising an integral of the survival function, allowing for the calculation of not only the mean and credible interval but also the mode, median, and probability of exceeding a certain value. Additionally, We propose two methods: one using random effects to account for heterogeneity among clusters and another utilizing frailty. We developed custom Stan code for the exponential, Weibull, log-normal frailty, and log-logistic models, as they cannot be processed using the brm functions in R. We evaluate our proposed methods through computer simulations and analyze real data from the eight Empowered Action Group states in India to confirm consistent results across states after adjusting for cluster differences. In conclusion, we derived explicit RMST formulas for parametric models and their distributions, enabling the calculation of the mean, median, mode, and credible interval. Our simulations confirmed the robustness of the proposed methods, and using the shrinkage effect allowed for more accurate results for each cluster.

Learning · Continuity · 知識 (knowledge) · MoDELS · GROUP ·

2024 年 6 月 10 日

Towards Lifelong Learning of Large Language Models: A Survey

Junhao Zheng,Shengjie Qiu,Chengming Shi,Qianli Ma

from arxiv, 37 pages

As the applications of large language models (LLMs) expand across diverse fields, the ability of these models to adapt to ongoing changes in data, tasks, and user preferences becomes crucial. Traditional training methods, relying on static datasets, are increasingly inadequate for coping with the dynamic nature of real-world information. Lifelong learning, also known as continual or incremental learning, addresses this challenge by enabling LLMs to learn continuously and adaptively over their operational lifetime, integrating new knowledge while retaining previously learned information and preventing catastrophic forgetting. This survey delves into the sophisticated landscape of lifelong learning, categorizing strategies into two primary groups: Internal Knowledge and External Knowledge. Internal Knowledge includes continual pretraining and continual finetuning, each enhancing the adaptability of LLMs in various scenarios. External Knowledge encompasses retrieval-based and tool-based lifelong learning, leveraging external data sources and computational tools to extend the model's capabilities without modifying core parameters. The key contributions of our survey are: (1) Introducing a novel taxonomy categorizing the extensive literature of lifelong learning into 12 scenarios; (2) Identifying common techniques across all lifelong learning scenarios and classifying existing literature into various technique groups within each scenario; (3) Highlighting emerging techniques such as model expansion and data selection, which were less explored in the pre-LLM era. Through a detailed examination of these groups and their respective categories, this survey aims to enhance the adaptability, reliability, and overall performance of LLMs in real-world applications.

損失 · 通用動力公司 · 對率損失 · 分離的 · 優化器 ·

2024 年 6 月 10 日

Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency

Jingfeng Wu,Peter L. Bartlett,Matus Telgarsky,Bin Yu

from arxiv, COLT 2024 camera ready

We consider gradient descent (GD) with a constant stepsize applied to logistic regression with linearly separable data, where the constant stepsize $\eta$ is so large that the loss initially oscillates. We show that GD exits this initial oscillatory phase rapidly -- in $\mathcal{O}(\eta)$ steps -- and subsequently achieves an $\tilde{\mathcal{O}}(1 / (\eta t) )$ convergence rate after $t$ additional steps. Our results imply that, given a budget of $T$ steps, GD can achieve an accelerated loss of $\tilde{\mathcal{O}}(1/T^2)$ with an aggressive stepsize $\eta:= \Theta( T)$, without any use of momentum or variable stepsize schedulers. Our proof technique is versatile and also handles general classification loss functions (where exponential tails are needed for the $\tilde{\mathcal{O}}(1/T^2)$ acceleration), nonlinear predictors in the neural tangent kernel regime, and online stochastic gradient descent (SGD) with a large stepsize, under suitable separability conditions.

優化器 · 泛函 · Performer · Learning · 全局優化 ·

2024 年 6 月 7 日

FunBO: Discovering Acquisition Functions for Bayesian Optimization with FunSearch

Virginia Aglietti,Ira Ktena,Jessica Schrouff,Eleni Sgouritsa,Francisco J. R. Ruiz,Alexis Bellot,Silvia Chiappa

The sample efficiency of Bayesian optimization algorithms depends on carefully crafted acquisition functions (AFs) guiding the sequential collection of function evaluations. The best-performing AF can vary significantly across optimization problems, often requiring ad-hoc and problem-specific choices. This work tackles the challenge of designing novel AFs that perform well across a variety of experimental settings. Based on FunSearch, a recent work using Large Language Models (LLMs) for discovery in mathematical sciences, we propose FunBO, an LLM-based method that can be used to learn new AFs written in computer code by leveraging access to a limited number of evaluations for a set of objective functions. We provide the analytic expression of all discovered AFs and evaluate them on various global optimization benchmarks and hyperparameter optimization tasks. We show how FunBO identifies AFs that generalize well in and out of the training distribution of functions, thus outperforming established general-purpose AFs and achieving competitive performance against AFs that are customized to specific function types and are learned via transfer-learning algorithms.

可理解性 · 語言模型化 · MoDELS · 大語言模型 · INTERACT ·

2024 年 6 月 6 日

ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models

Yuanyi Ren,Haoran Ye,Hanjun Fang,Xin Zhang,Guojie Song

from arxiv, Accepted at ACL 2024

Large Language Models (LLMs) are transforming diverse fields and gaining increasing influence as human proxies. This development underscores the urgent need for evaluating value orientations and understanding of LLMs to ensure their responsible integration into public-facing applications. This work introduces ValueBench, the first comprehensive psychometric benchmark for evaluating value orientations and value understanding in LLMs. ValueBench collects data from 44 established psychometric inventories, encompassing 453 multifaceted value dimensions. We propose an evaluation pipeline grounded in realistic human-AI interactions to probe value orientations, along with novel tasks for evaluating value understanding in an open-ended value space. With extensive experiments conducted on six representative LLMs, we unveil their shared and distinctive value orientations and exhibit their ability to approximate expert conclusions in value-related extraction and generation tasks. ValueBench is openly accessible at //github.com/Value4AI/ValueBench.

點云 · 機器人 · Performer · Learning · RGB-D ·

2024 年 6 月 6 日

Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning

Haoyi Zhu,Yating Wang,Di Huang,Weicai Ye,Wanli Ouyang,Tong He

In robot learning, the observation space is crucial due to the distinct characteristics of different modalities, which can potentially become a bottleneck alongside policy design. In this study, we explore the influence of various observation spaces on robot learning, focusing on three predominant modalities: RGB, RGB-D, and point cloud. We introduce OBSBench, a benchmark comprising two simulators and 125 tasks, along with standardized pipelines for various encoders and policy baselines. Extensive experiments on diverse contact-rich manipulation tasks reveal a notable trend: point cloud-based methods, even those with the simplest designs, frequently outperform their RGB and RGB-D counterparts. This trend persists in both scenarios: training from scratch and utilizing pre-training. Furthermore, our findings demonstrate that point cloud observations often yield better policy performance and significantly stronger generalization capabilities across various geometric and visual conditions. These outcomes suggest that the 3D point cloud is a valuable observation modality for intricate robotic tasks. We also suggest that incorporating both appearance and coordinate information can enhance the performance of point cloud methods. We hope our work provides valuable insights and guidance for designing more generalizable and robust robotic models. Codes are available at //github.com/HaoyiZhu/PointCloudMatters.

控制器 · 操作 · INFORMS · Performer · 泛函 ·

2024 年 6 月 5 日

The Semantics of Effects: Centrality, Quantum Control and Reversible Recursion

Louis Lemonnier

from arxiv, PhD thesis from Universit\'e Paris-Saclay

This thesis revolves around an area of computer science called "semantics". We work with operational semantics, equational theories, and denotational semantics. The first contribution of this thesis is a study of the commutativity of effects through the prism of monads. Monads are the generalisation of algebraic structures such as monoids, which have a notion of centre: the centre of a monoid is made of elements which commute with all others. We provide the necessary and sufficient conditions for a monad to have a centre. We also detail the semantics of a language with effects that carry information on which effects are central. Moreover, we provide a strong link between its equational theories and its denotational semantics. The second focus of the thesis is quantum computing, seen as a reversible effect. Physically permissible quantum operations are all reversible, except measurement; however, measurement can be deferred at the end of the computation, allowing us to focus on the reversible part first. We define a simply-typed reversible programming language performing quantum operations called "unitaries". A denotational semantics and an equational theory adapted to the language are presented, and we prove that the former is complete. Furthermore, we study recursion in reversible programming, providing adequate operational and denotational semantics to a Turing-complete, reversible, functional programming language. The denotational semantics uses the dcpo enrichment of rig join inverse categories. This mathematical account of higher-order reasoning on reversible computing does not directly generalise to its quantum counterpart. In the conclusion, we detail the limitations and possible future for higher-order quantum control through guarded recursion.

INFORMS · 機器閱讀理解 · Better · Extensibility · 數據集 ·

2020 年 12 月 14 日

Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension

Xiuying Chen,Zhi Cui,Jiayi Zhang,Chen Wei,Jianwei Cui,Bin Wang,Dongyan Zhao,Rui Yan

from arxiv, 9 pages, 1 figure

In multi-turn dialog, utterances do not always take the full form of sentences \cite{Carbonell1983DiscoursePA}, which naturally makes understanding the dialog context more difficult. However, it is essential to fully grasp the dialog context to generate a reasonable response. Hence, in this paper, we propose to improve the response generation performance by examining the model's ability to answer a reading comprehension question, where the question is focused on the omitted information in the dialog. Enlightened by the multi-task learning scheme, we propose a joint framework that unifies these two tasks, sharing the same encoder to extract the common and task-invariant features with different decoders to learn task-specific features. To better fusing information from the question and the dialog history in the encoding part, we propose to augment the Transformer architecture with a memory updater, which is designed to selectively store and update the history dialog information so as to support downstream tasks. For the experiment, we employ human annotators to write and examine a large-scale dialog reading comprehension dataset. Extensive experiments are conducted on this dataset, and the results show that the proposed model brings substantial improvements over several strong baselines on both tasks. In this way, we demonstrate that reasoning can indeed help better response generation and vice versa. We release our large-scale dataset for further research.