In this paper, we investigate the impact of stochasticity and large stepsizes on the implicit regularisation of gradient descent (GD) and stochastic gradient descent (SGD) over diagonal linear networks. We prove the convergence of GD and SGD with macroscopic stepsizes in an overparametrised regression setting and characterise their solutions through an implicit regularisation problem. Our crisp characterisation leads to qualitative insights about the impact of stochasticity and stepsizes on the recovered solution. Specifically, we show that large stepsizes consistently benefit SGD for sparse regression problems, while they can hinder the recovery of sparse solutions for GD. These effects are magnified for stepsizes in a tight window just below the divergence threshold, in the "edge of stability" regime. Our findings are supported by experimental results.
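To make the contrast concrete, here is a toy sketch (our own construction, not the paper's experimental setup) of full-batch GD versus single-sample SGD on a diagonal linear network beta = u * v for sparse regression; the stepsize, initialisation scale, and problem sizes are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 20, 40, 3                          # samples, dimension, sparsity
X = rng.standard_normal((n, d))
beta_star = np.zeros(d); beta_star[:k] = 1.0
y = X @ beta_star                            # noiseless sparse regression

def train(stochastic, step=0.01, alpha=0.1, iters=50_000):
    # Diagonal linear network: beta = u * v, initialised at scale alpha.
    u = np.full(d, alpha); v = np.full(d, alpha)
    for _ in range(iters):
        if stochastic:                       # single-sample SGD
            i = rng.integers(n)
            g = (X[i] @ (u * v) - y[i]) * X[i]
        else:                                # full-batch GD
            g = X.T @ (X @ (u * v) - y) / n
        u, v = u - step * g * v, v - step * g * u
    return u * v

for name, sto in [("GD", False), ("SGD", True)]:
    print(name, "recovery error:", np.linalg.norm(train(sto) - beta_star))
```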
In this paper, we identify criteria for selecting the minimal and most efficient covariate adjustment sets for the regression calibration method developed by Carroll, Ruppert and Stefanski (CRS, 1992), which is used to correct for bias due to continuous exposure measurement error. We use directed acyclic graphs to illustrate how subject-matter knowledge can aid the selection of such adjustment sets. Valid measurement error correction requires data, in both the main study and the validation study, on (1) common causes of the true exposure and the outcome and (2) common causes of the measurement error and the outcome. For the CRS regression calibration method to be valid, researchers need to minimally adjust for covariate set (1) in both the measurement error model (MEM) and the outcome model, and to adjust for covariate set (2) at least in the MEM. In practice, we recommend including the minimal covariate adjustment set in both the MEM and the outcome model. In contrast with the regression calibration method developed by Rosner, Spiegelman and Willett, under the CRS method it is valid, and more efficient, to adjust in the MEM only for correlates of the true exposure or of the measurement error that are not risk factors for the outcome. We apply the proposed covariate selection approach to the Health Professionals Follow-up Study, examining the effect of fiber intake on cardiovascular incidence, and demonstrate potential issues with a data-driven approach to building the MEM that is agnostic to the structural assumptions. We extend the originally proposed estimators to settings where effect modification by a covariate is allowed. Finally, we caution against the use of the regression calibration method to calibrate true nutrition intake using biomarkers.
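For intuition, the following is a minimal sketch of the two-stage regression-calibration idea with an internal validation study, adjusting for a single common cause z (covariate set (1)) in both the MEM and the outcome model. The data-generating process and coefficients are illustrative choices of ours, not the CRS estimator or the HPFS data.

```python
import numpy as np

rng = np.random.default_rng(1)

def gen(n):
    z = rng.standard_normal(n)               # common cause of x and y (set 1)
    x = 0.8 * z + rng.standard_normal(n)     # true exposure
    w = x + 0.5 * rng.standard_normal(n)     # error-prone measured exposure
    y = 1.0 * x + 0.7 * z + rng.standard_normal(n)
    return z, x, w, y

z_v, x_v, w_v, _ = gen(500)                  # validation study: x observed
z_m, _, w_m, y_m = gen(5000)                 # main study: only w observed

# Stage 1 (MEM): regress the true exposure on (w, z) in the validation data.
gamma, *_ = np.linalg.lstsq(np.column_stack([np.ones(500), w_v, z_v]),
                            x_v, rcond=None)

# Stage 2: plug the calibrated exposure into the outcome model, again with z.
x_hat = np.column_stack([np.ones(5000), w_m, z_m]) @ gamma
beta, *_ = np.linalg.lstsq(np.column_stack([np.ones(5000), x_hat, z_m]),
                           y_m, rcond=None)
print("calibrated exposure effect:", round(beta[1], 3))   # close to 1.0
```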
This paper presents a comprehensive examination of how multimodal artificial intelligence (AI) approaches are paving the way towards the realization of Artificial General Intelligence (AGI) in educational contexts. It scrutinizes the evolution and integration of AI in educational systems, emphasizing the crucial role of multimodality, which encompasses auditory, visual, kinesthetic, and linguistic modes of learning. This research delves deeply into the key facets of AGI, including cognitive frameworks, advanced knowledge representation, adaptive learning mechanisms, strategic planning, sophisticated language processing, and the integration of diverse multimodal data sources. It critically assesses AGI's transformative potential in reshaping educational paradigms, focusing on enhancing teaching and learning effectiveness, filling gaps in existing methodologies, and addressing ethical considerations and responsible usage of AGI in educational settings. The paper also discusses the implications of multimodal AI's role in education, offering insights into future directions and challenges in AGI development. This exploration aims to provide a nuanced understanding of the intersection between AI, multimodality, and education, setting a foundation for future research and development in AGI.
In this paper, we present a simulation and control framework for generating biomechanically plausible motion for muscle-actuated characters. We incorporate a fatigue dynamics model, the 3CC-r model, into the widely adopted Hill-type muscle model to simulate the development and recovery of fatigue in muscles, which creates a natural evolution of motion style as fatigue accumulates during prolonged activities. To address the challenging problem of controlling a musculoskeletal system with high degrees of freedom, we propose a novel muscle-space control strategy based on PD control. Our simulation and control framework facilitates the training of a generative model for muscle-based motion control, which we refer to as MuscleVAE. By leveraging variational autoencoders (VAEs), MuscleVAE is capable of learning a rich and flexible latent representation of skills from a large unstructured motion dataset, encoding not only motion features but also muscle control and fatigue properties. We demonstrate that the MuscleVAE model can be efficiently trained using a model-based approach, producing high-fidelity motions and enabling a variety of downstream tasks.
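As a rough illustration of the fatigue component, here is a sketch of three-compartment (3CC-style) fatigue dynamics under a simple proportional controller. The rate constants and controller gains are illustrative, and the recovery modification specific to 3CC-r is not modelled here.

```python
# Toy three-compartment (3CC-style) fatigue dynamics: fractions of motor
# units at rest (m_r), active (m_a), and fatigued (m_f). F and R are
# fatigue/recovery rates; LD/LR are controller gains. Values illustrative.
F, R, LD, LR = 0.01, 0.002, 10.0, 10.0

def step(m_r, m_a, m_f, target, dt=0.01):
    if m_a < target:
        c = LD * min(m_r, target - m_a)     # recruit resting units
    else:
        c = LR * (target - m_a)             # release surplus active units
    m_r = max(m_r + (-c + R * m_f) * dt, 0.0)
    m_a += (c - F * m_a) * dt
    m_f += (F * m_a - R * m_f) * dt
    return m_r, m_a, m_f

m_r, m_a, m_f = 1.0, 0.0, 0.0
for _ in range(60_000):                     # 10 simulated minutes of effort
    m_r, m_a, m_f = step(m_r, m_a, m_f, target=0.5)
print(f"resting={m_r:.3f} active={m_a:.3f} fatigued={m_f:.3f}")
```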
In this paper, we explore the challenges inherent to Large Language Models (LLMs) like GPT-4, particularly their propensity for hallucinations, logic mistakes, and incorrect conclusions when tasked with answering complex questions. The capacity of LLMs to present erroneous answers in a coherent and semantically rigorous manner further complicates the detection of factual inaccuracies. This issue is especially pronounced in fields that require specialized expertise. Our work delves into these challenges, aiming to enhance the understanding and mitigation of such errors, thereby contributing to the improvement of LLM accuracy and reliability in scientific and other specialized domains. Our findings reveal a non-linear relationship between the relevance of the provided context and the measured quality of the answers. In addition, we demonstrate that, with the correct calibration, it is possible to automate the grading procedure -- a finding suggesting that, at least to some degree, LLMs can be used to self-examine the quality of their own output. Finally, we describe an experimental platform that serves as a proof of concept for the techniques described in this work.
In this paper, we introduce a novel analysis of neural networks based on geometric (Clifford) algebra and convex optimization. We show that optimal weights of deep ReLU neural networks are given by the wedge product of training samples when trained with standard regularized loss. Furthermore, the training problem reduces to convex optimization over wedge product features, which encode the geometric structure of the training dataset. This structure is given in terms of signed volumes of triangles and parallelotopes generated by data vectors. The convex problem finds a small subset of samples via $\ell_1$ regularization to discover only relevant wedge product features. Our analysis provides a novel perspective on the inner workings of deep neural networks and sheds light on the role of the hidden layers.
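The flavor of the result can be illustrated in 2D, where the wedge product of two vectors is a signed area: build one ReLU feature per pair of training samples from the signed area of the triangle they span with the input, then let $\ell_1$ regularization select a few pairs. This is a toy analogue of our own simplification, not the paper's exact convex program.

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n = 40
X = rng.standard_normal((n, 2))                  # training samples in R^2
y = np.sign(X[:, 0]) * np.abs(X[:, 1])           # a nonlinear target

def wedge_features(P, Xtr, pairs):
    # One ReLU feature per pair (i, j): the signed area (2D wedge product)
    # of the triangle spanned by x_i, x_j, and the query point.
    cols = []
    for i, j in pairs:
        a = Xtr[j] - Xtr[i]
        b = P - Xtr[i]
        cols.append(np.maximum(a[0] * b[:, 1] - a[1] * b[:, 0], 0.0))
    return np.column_stack(cols)

pairs = list(combinations(range(n), 2))
Phi = wedge_features(X, X, pairs)
model = Lasso(alpha=0.01).fit(Phi, y)            # l1 keeps few features
print(f"{np.count_nonzero(model.coef_)} of {len(pairs)} features selected")
```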
Perception of offensiveness is inherently subjective, shaped by the lived experiences and socio-cultural values of the perceivers. Recent years have seen substantial efforts to build AI-based tools that can detect offensive language at scale, as a means to moderate social media platforms and to ensure the safety of conversational AI technologies such as ChatGPT and Bard. However, existing approaches treat this task as a technical endeavor, built on top of data annotated for offensiveness by a global crowd workforce, without attention to the crowd workers' provenance or the values their perceptions reflect. We argue that cultural and psychological factors play a vital role in the cognitive processing of offensiveness, which is critical to consider in this context. We re-frame the task of determining offensiveness as essentially a matter of moral judgment -- deciding the boundaries of ethically wrong vs. right language within an implied set of socio-cultural norms. Through a large-scale cross-cultural study based on 4309 participants from 21 countries across 8 cultural regions, we demonstrate substantial cross-cultural differences in perceptions of offensiveness. More importantly, we find that individual moral values play a crucial role in shaping these variations: moral concerns about Care and Purity are significant mediating factors driving cross-cultural differences. These insights are of crucial importance as we build AI models for a pluralistic world; the values these models espouse should aim to respect and account for moral values across diverse geo-cultural contexts.
In this paper, we empirically study the optimization dynamics of multi-task learning, focusing in particular on the dynamics that govern a collection of tasks with significant data imbalance. We present a simple yet effective method: pre-training on high-resource tasks, followed by fine-tuning on a mixture of high- and low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits, showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze under what data regimes this method is applicable and demonstrate its improvements empirically in neural machine translation (NMT) and multilingual language modeling.
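A minimal sketch of the two-stage recipe, with hypothetical task names, sizes, and temperatures (the paper's actual sampling scheme and schedule may differ):

```python
import numpy as np

rng = np.random.default_rng(0)
tasks = ["high_a", "high_b", "low"]                  # hypothetical tasks
sizes = np.array([1_000_000, 500_000, 5_000], dtype=float)

def temperature_probs(sizes, T):
    # T = 1 samples proportionally to data size; larger T flattens the
    # distribution, upweighting low-resource tasks.
    p = sizes ** (1.0 / T)
    return p / p.sum()

def run_stage(steps, probs):
    for _ in range(steps):
        task = tasks[rng.choice(len(tasks), p=probs)]
        # batch = next(loaders[task]); loss = model(batch); optimizer step
        # (the model update itself is omitted in this sketch)

# Stage 1: pre-train on the high-resource tasks only.
run_stage(100_000, temperature_probs(np.where(sizes > 100_000, sizes, 0.0), T=1.0))

# Stage 2: fine-tune on a flatter high/low-resource mixture.
run_stage(10_000, temperature_probs(sizes, T=5.0))
```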
In this paper we present a variant of the McEliece cryptosystem that possesses several interesting properties, including a reduced public key for a given security level. In contrast to classical McEliece cryptosystems, which use block codes, we propose using a convolutional encoder as part of the public key. The permutation matrix is replaced by a polynomial matrix whose coefficient matrices have columns of weight zero or weight at least two. This allows the use of Generalized Reed-Solomon (GRS) codes, which translates into shorter keys for a given security level. Hence, the private key consists of a generator matrix of a GRS code and two polynomial matrices containing large parts generated completely at random. In this setting, the plaintext is a sequence of message blocks rather than a single block, and errors are added throughout the sequence. We discuss possible structural and information-set decoding (ISD) attacks on this scheme. We conclude by presenting the key sizes obtained for different parameters and estimating the computational cost of the encryption and decryption processes.
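For reference, a GRS generator matrix has a simple polynomial-evaluation structure; the toy construction below works over a prime field GF(p) with illustrative parameters (real instantiations typically use GF(2^m) and much larger code lengths):

```python
# Toy Generalized Reed-Solomon generator matrix over GF(p), p prime.
# G[i][j] = v_j * alpha_j^i (mod p) for rows i = 0..k-1; encoding a
# message evaluates its polynomial at the points alpha_j, scaled by v_j.
p, n, k = 97, 12, 6                       # illustrative parameters
alphas = list(range(1, n + 1))            # distinct evaluation points
v = [2] * n                               # nonzero column multipliers

G = [[(v[j] * pow(alphas[j], i, p)) % p for j in range(n)] for i in range(k)]

m = [3, 1, 4, 1, 5, 9]                    # one message block of length k
c = [sum(m[i] * G[i][j] for i in range(k)) % p for j in range(n)]
print(c)                                  # codeword of length n
```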
Community rating is a policy that mandates a uniform premium regardless of risk factors. In this paper, we focus on the single-contract interpretation, for which we establish a theoretical framework for community rating using Stiglitz's (1977) monopoly model with a continuum of agents. We exhibit profitability conditions and show that, under mild regularity conditions, the optimal premium is unique and satisfies the inverse elasticity rule. Our numerical analysis, using realistic parameter values, reveals that under regulation a 10% increase in indemnity is possible with minimal impact on other variables.
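For reference, the inverse elasticity rule referred to here is the familiar Lerner condition; in generic demand-and-cost notation (ours, not necessarily the paper's), with $D(P)$ the demand for the contract at premium $P$ and $c$ the insurer's marginal cost of coverage:

$$\frac{P^{\ast} - c}{P^{\ast}} = \frac{1}{\varepsilon(P^{\ast})}, \qquad \varepsilon(P) = -\frac{P}{D(P)}\,\frac{\mathrm{d}D(P)}{\mathrm{d}P}.$$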
In this paper, we introduce a novel fine-tuning technique for language models, which involves incorporating symmetric noise into the embedding process. This method aims to enhance the model's function by more stringently regulating its local curvature, and it demonstrates superior performance over the current state-of-the-art method, NEFTune. When fine-tuning the LLaMA-2-7B model using Alpaca, standard techniques yield a 29.79% score on AlpacaEval. Our approach, SymNoise, increases this score significantly to 69.04% using symmetric noisy embeddings -- a 6.7% relative improvement over NEFTune (64.69%). Furthermore, when tested on various models and stronger baseline instruction datasets, such as Evol-Instruct, ShareGPT, and OpenPlatypus, SymNoise consistently outperforms NEFTune. The current literature, including NEFTune, has underscored the importance of deeper research into noise-based strategies for fine-tuning language models. Our approach, SymNoise, is another significant step in this direction, showing notable improvement over the existing state-of-the-art method.
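A sketch of the embedding perturbation, assuming NEFTune-style scaling $\alpha/\sqrt{Ld}$ and swapping its uniform noise for symmetric $\pm 1$ (Bernoulli) noise -- our reading of SymNoise; the published method may differ in its details:

```python
import torch

def noisy_embeddings(embed, input_ids, alpha=5.0, symmetric=True):
    # embed: an nn.Embedding layer; input_ids: (batch, L) token ids.
    x = embed(input_ids)                          # (batch, L, d)
    L, d = x.shape[1], x.shape[2]
    scale = alpha / (L * d) ** 0.5                # NEFTune-style scaling
    if symmetric:
        # Symmetric Bernoulli noise: each entry is +1 or -1.
        noise = torch.randint(0, 2, x.shape, device=x.device).to(x.dtype) * 2 - 1
    else:
        # NEFTune baseline: Uniform(-1, 1) noise.
        noise = torch.empty_like(x).uniform_(-1.0, 1.0)
    return x + scale * noise
```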