
Generative Artificial Intelligence (GAI) has high potential to help address a diversity of educational challenges. In principle, GAI could facilitate the implementation of interactive and empowering pedagogical activities that complement standard teaching strategies and foster students' active engagement, understanding and control over their learning processes. These dimensions are indeed fundamental for a better learning experience and longer-lasting cognitive outcomes. However, several characteristics of interactions with GAI, such as continuous confidence in the generated answers and the lack of a pedagogical stance in its behavior, may lead students to poor states of control over learning (e.g. over-reliance on pre-generated content, over-estimation of one's own knowledge, loss of curiosity and critical thinking, etc.). The fine line between the two settings seems to lie in how this technology is used to carry out the pedagogical activities (e.g. types of interactions allowed, level of controllability by students, level of involvement of educators, etc.) as well as in the extent to which students have the relevant skills (cognitive, metacognitive and GAI literacy) that allow them to correctly evaluate, analyze and interpret the system's behavior. In this context, this article proposes to identify some of the opportunities and challenges that could arise with respect to students' control over their learning when using GAI during formal pedagogical activities. In a second step, we also discuss the types of training that could be offered to students to provide them with the appropriate set of skills for using GAI in informed ways when pursuing a given learning goal.

Related content

INTERACT


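The IFIP TC13 Conference on Human-Computer Interaction is an important venue for researchers and practitioners in the field of human-computer interaction to present their work. Over the years, these conferences have attracted researchers from several countries and cultures.

Learning · Precision/Accuracy · Learning rate · Weight ·

November 19, 2023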

Large Learning Rates Improve Generalization: But How Large Are We Talking About?

Ekaterina Lobacheva, Eduard Pockonechnyy, Maxim Kodryan, Dmitry Vetrov

From arXiv. Published in the Mathematics of Modern Machine Learning Workshop at NeurIPS 2023. The first two authors contributed equally.

Inspired by recent research that recommends starting neural network training with large learning rates (LRs) to achieve the best generalization, we explore this hypothesis in detail. Our study clarifies the initial LR ranges that provide optimal results for subsequent training with a small LR or weight averaging. We find that these ranges are in fact significantly narrower than generally assumed. We conduct our main experiments in a simplified setup that allows precise control of the learning rate hyperparameter and validate our key findings in a more practical setting.

Rescaling · Linear · Scaling · Estimation/Estimator · Generalized linear model ·

November 19, 2023

Generalized Linear Models via the Lasso: To Scale or Not to Scale?

Anant Mathur, Sarat Moka, Zdravko Botev

The Lasso regression is a popular regularization method for feature selection in statistics. Prior to computing the Lasso estimator in both linear and generalized linear models, it is common to conduct a preliminary rescaling of the feature matrix to ensure that all the features are standardized. Without this standardization, it is argued, the Lasso estimate will unfortunately depend on the units used to measure the features. We propose a new type of iterative rescaling of the features in the context of generalized linear models. Whilst existing Lasso algorithms perform a single scaling as a preprocessing step, the proposed rescaling is applied iteratively throughout the Lasso computation until convergence. We provide numerical examples, with both real and simulated data, illustrating that the proposed iterative rescaling can significantly improve the statistical performance of the Lasso estimator without incurring any significant additional computational cost.
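
To make the unit-dependence argument concrete, here is a minimal Python sketch for a single centered feature with no intercept, where the Lasso solution has a closed form via soft-thresholding. The toy data, the penalty `lam`, and all helper names are illustrative assumptions. The standardization shown is the conventional one-off preprocessing step, not the iterative rescaling the authors propose.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_1d(x, y, lam):
    """Minimize (1/(2n)) * sum((y - b*x)^2) + lam * |b| over b (no intercept)."""
    n = len(x)
    rho = sum(xi * yi for xi, yi in zip(x, y)) / n
    scale = sum(xi * xi for xi in x) / n
    return soft_threshold(rho, lam) / scale


def standardize(x):
    """Center x and divide by its (population) standard deviation."""
    n = len(x)
    mean = sum(x) / n
    sd = (sum((xi - mean) ** 2 for xi in x) / n) ** 0.5
    return [(xi - mean) / sd for xi in x]


# Hypothetical toy data: one feature (a distance) and a centered response.
x_km = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-2.1, -0.9, 0.1, 1.0, 1.9]
lam = 0.5

x_m = [1000.0 * v for v in x_km]    # same distances in metres
x_Mm = [v / 1000.0 for v in x_km]   # same distances in thousands of km

# Fitted effect of one kilometre under each choice of units.
effect_km = lasso_1d(x_km, y, lam) * 1.0
effect_m = lasso_1d(x_m, y, lam) * 1000.0
effect_Mm = lasso_1d(x_Mm, y, lam) * 0.001

# After standardization the coefficient no longer depends on the units.
b_std_km = lasso_1d(standardize(x_km), y, lam)
b_std_m = lasso_1d(standardize(x_m), y, lam)

print(effect_km, effect_m, effect_Mm)
print(b_std_km, b_std_m)
```

With these numbers the fitted effect of one kilometre is about 0.74 in kilometre units and about 0.99 in metre units, and the feature is dropped entirely (coefficient 0) when distances are recorded in thousands of kilometres. The two standardized fits agree up to floating-point rounding.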

Reducible · Extensibility · INFORMS · Performer · Square matrix ·

November 17, 2023

How Much Time is Required for Phase Shift Delivery in RIS-Aided Wireless Systems?

Hao Xie, Dong Li

Reconfigurable intelligent surfaces (RIS) have become a focal point of extensive research due to their remarkable "squared gain". However, achieving a substantial beamforming gain typically requires a large number of elements, which leads to a non-negligible overhead for forwarding the coherent phase shifts to the RIS. Unlike previous works, which primarily focus on the information transmission phase, we consider the overhead incurred during the phase-shift delivery phase to explore the trade-off between performance and overhead. To reduce the phase delivery overhead via the control link, we introduce a hybrid phase shift mechanism, encompassing both coherent and fixed phase shifts. Specifically, a beamforming problem is formulated for maximizing the throughput. In light of the intractability of the problem, we develop an alternating optimization-based iterative algorithm by combining quadratic transformation and successive convex approximation. To gain more insights, we derive a closed-form expression for the number of elements adopting the coherent phase shift in the large signal-to-noise ratio region. This expression serves as a valuable guide for the practical implementation of RIS technology. Our simulation results demonstrate the effectiveness of the proposed algorithm in achieving a favorable trade-off between throughput and overhead. Furthermore, the introduction of the hybrid phase shift approach significantly reduces phase delivery overhead while concurrently enhancing the system throughput.

Diversity · MoDELS · Prompt · Recall · Topic ·

November 16, 2023

How Far Can We Extract Diverse Perspectives from Large Language Models? Criteria-Based Diversity Prompting!

Shirley Anugrah Hayati, Minhwa Lee, Dheeraj Rajagopal, Dongyeop Kang

From arXiv, NLP

Collecting diverse human data on subjective NLP topics is costly and challenging. As Large Language Models (LLMs) have developed human-like capabilities, there is a recent trend of collaborative efforts between humans and LLMs for generating diverse data, offering potentially scalable and efficient solutions. However, the extent of LLMs' capability to generate diverse perspectives on subjective topics remains an unexplored question. In this study, we investigate LLMs' capacity for generating diverse perspectives and rationales on subjective topics, such as social norms and argumentative texts. We formulate this problem as diversity extraction in LLMs and propose a criteria-based prompting technique to ground diverse opinions and measure perspective diversity from the generated criteria words. Our results show that measuring semantic diversity through sentence embeddings and distance metrics is not enough to measure perspective diversity. To see how far we can extract diverse perspectives from LLMs, which we call diversity coverage, we employ step-by-step recall prompting to generate more outputs from the model in an iterative manner. When we apply our prompting method to other tasks (hate speech labeling and story continuation), we indeed find that LLMs are able to generate diverse opinions according to the degree of task subjectivity.
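
For readers unfamiliar with the embedding-based baseline mentioned in the abstract, the following Python sketch shows one common way to score semantic diversity: the mean pairwise cosine distance between sentence embeddings. The two-dimensional vectors are made-up stand-ins for real embeddings, and the function names are hypothetical. The abstract's point is that a score of this kind is not sufficient to capture perspective diversity, so this illustrates the baseline, not the authors' criteria-based method.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(embeddings):
    """Average cosine distance over all unordered pairs of embeddings."""
    pairs = list(combinations(embeddings, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Made-up 2-D "embeddings" (assumption): one tight cluster, one spread-out set.
similar = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]]
spread = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

score_similar = mean_pairwise_distance(similar)
score_spread = mean_pairwise_distance(spread)
print(score_similar, score_spread)
```

The spread-out set scores higher than the tight cluster, which is all this metric can see: it measures geometric spread of the embeddings, not whether the underlying texts argue from different criteria.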

CLUES · MoDELS · entity · Mask · Performer ·

November 16, 2023

Deceiving Semantic Shortcuts on Reasoning Chains: How Far Can Models Go without Hallucination?

Bangzheng Li, Ben Zhou, Fei Wang, Xingyu Fu, Dan Roth, Muhao Chen

From arXiv, work in progress

Despite the recent advancement in large language models (LLMs) and their high performance across numerous benchmarks, recent research has unveiled that LLMs suffer from hallucinations and unfaithful reasoning. This work studies a specific type of hallucination induced by semantic associations. Specifically, we investigate to what extent LLMs take shortcuts from certain keyword/entity biases in the prompt instead of following the correct reasoning path. To quantify this phenomenon, we propose a novel probing method and benchmark called EureQA. We start from questions that LLMs will answer correctly with utmost certainty, and recursively mask the important entity with evidence sentences, asking models to find masked entities according to a chain of evidence before answering the question. During the construction of the evidence, we purposefully replace semantic clues (entities) that may lead to the correct answer with distractor clues (evidence) that will not directly lead to the correct answer but require a chain-like reasoning process. We evaluate whether models can follow the correct reasoning chain instead of short-cutting through distractor clues. We find that existing LLMs lack the necessary capabilities to follow correct reasoning paths and resist the attempt of greedy shortcuts. We show that the distractor semantic associations often lead to model hallucination, which is strong evidence that questions the validity of current LLM reasoning.

Language modeling · MoDELS · Functional · Knowledge · Scenario ·

November 16, 2023

MacGyver: Are Large Language Models Creative Problem Solvers?

Yufei Tian, Abhilasha Ravichander, Lianhui Qin, Ronan Le Bras, Raja Marjieh, Nanyun Peng, Yejin Choi, Thomas L. Griffiths, Faeze Brahman

We explore the creative problem-solving capabilities of modern large language models (LLMs) in a constrained setting. The setting requires circumventing a cognitive bias known in psychology as "functional fixedness" to use familiar objects in innovative or unconventional ways. To this end, we create MacGyver, an automatically generated dataset consisting of 1,600 real-world problems that deliberately trigger functional fixedness and require thinking "out-of-the-box". We then present our collection of problems to both LLMs and humans to compare and contrast their problem-solving abilities. We show that MacGyver is challenging for both groups, but in unique and complementary ways. For example, humans typically excel in solving problems that they are familiar with but may struggle with tasks requiring domain-specific knowledge, leading to a higher variance. On the other hand, LLMs, being exposed to a variety of highly specialized knowledge, attempt broader problems but are prone to overconfidence and propose actions that are physically infeasible or inefficient. We also provide a detailed error analysis of LLMs, and demonstrate the potential of enhancing their problem-solving ability with novel prompting techniques such as iterative step-wise reflection and divergent-convergent thinking. This work provides insight into the creative problem-solving capabilities of humans and AI and illustrates how psychological paradigms can be extended into large-scale tasks for comparing humans and machines.

Graph · Processing (programming language) · NLP · Neural Networks · GPU ·

June 10, 2021

Graph Neural Networks for Natural Language Processing: A Survey

Lingfei Wu, Yu Chen, Kai Shen, Xiaojie Guo, Hanning Gao, Shucheng Li, Jian Pei, Bo Long

From arXiv, 127 pages

Deep learning has become the dominant approach in coping with various tasks in Natural Language Processing (NLP). Although text inputs are typically represented as a sequence of tokens, there is a rich variety of NLP problems that can be best expressed with a graph structure. As a result, there is a surge of interests in developing new deep learning techniques on graphs for a large number of NLP tasks. In this survey, we present a comprehensive overview on Graph Neural Networks (GNNs) for Natural Language Processing. We propose a new taxonomy of GNNs for NLP, which systematically organizes existing research of GNNs for NLP along three axes: graph construction, graph representation learning, and graph based encoder-decoder models. We further introduce a large number of NLP applications that are exploiting the power of GNNs and summarize the corresponding benchmark datasets, evaluation metrics, and open-source codes. Finally, we discuss various outstanding challenges for making the full use of GNNs for NLP as well as future research directions. To the best of our knowledge, this is the first comprehensive overview of Graph Neural Networks for Natural Language Processing.

Long short-term memory (LSTM) network · RNN · Networking · Weight · MoDELS ·

June 10, 2020

Do RNN and LSTM have Long Memory?

Jingyu Zhao, Feiqing Huang, Jia Lv, Yanjie Duan, Zhen Qin, Guodong Li, Guangjian Tian

From arXiv. Accepted by ICML 2020. Added references, experiments and acknowledgements.

The LSTM network was proposed to overcome the difficulty in learning long-term dependence, and has made significant advancements in applications. With its success and drawbacks in mind, this paper raises the question - do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a statistical perspective. A new definition for long memory networks is further introduced, and it requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority is illustrated in modeling long-term dependence of various datasets.
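
The contrast between polynomially and exponentially decaying weights can be illustrated numerically. The Python sketch below compares how much influence an input at lag k retains under each kind of decay. The decay rate 0.9, the exponent 0.8, and the function names are arbitrary assumptions chosen for illustration; the paper's formal definition of long memory is not given in the abstract and is not reproduced here.

```python
def exponential_weights(rate, n_lags):
    """Weights rate**k for lags k = 1..n_lags (geometric decay)."""
    return [rate ** k for k in range(1, n_lags + 1)]


def polynomial_weights(power, n_lags):
    """Weights k**(-power) for lags k = 1..n_lags (power-law decay)."""
    return [k ** (-power) for k in range(1, n_lags + 1)]


n_lags = 200
expo = exponential_weights(0.9, n_lags)   # assumed decay rate
poly = polynomial_weights(0.8, n_lags)    # assumed decay exponent

# Weight still attached to an input seen 200 steps ago.
print("exponential, lag 200:", expo[-1])
print("polynomial,  lag 200:", poly[-1])

# Share of the total weight (over the first 200 lags) carried by lags > 50.
expo_tail_share = sum(expo[50:]) / sum(expo)
poly_tail_share = sum(poly[50:]) / sum(poly)
print("tail share:", expo_tail_share, poly_tail_share)
```

Under geometric decay almost all of the weight sits in the most recent lags, whereas under power-law decay distant lags keep a non-negligible share, which is the intuition behind requiring a polynomial decay rate.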

Text classification · Language modeling · BERT · state-of-the-art · MoDELS ·

May 14, 2019

How to Fine-Tune BERT for Text Classification?

Chi Sun, Xipeng Qiu, Yige Xu, Xuanjing Huang

Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art pre-trained language model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on the text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely-studied text classification datasets.

FPGA · Convolutional neural network · Neural Networks · Convolution · Layer ·

September 30, 2016

Caffeinated FPGAs: FPGA Framework For Convolutional Neural Networks

Roberto DiCecco, Griffin Lacey, Jasmina Vasiljevic, Paul Chow, Graham Taylor, Shawki Areibi

Convolutional Neural Networks (CNNs) have gained significant traction in the field of machine learning, particularly due to their high accuracy in visual recognition. Recent works have pushed the performance of GPU implementations of CNNs to significantly improve their classification and training times. With these improvements, many frameworks have become available for implementing CNNs on both CPUs and GPUs, with no support for FPGA implementations. In this work we present a modified version of the popular CNN framework Caffe, with FPGA support. This allows for classification using CNN models and specialized FPGA implementations with the flexibility of reprogramming the device when necessary, seamless memory transactions between host and device, simple-to-use test benches, and the ability to create pipelined layer implementations. To validate the framework, we use the Xilinx SDAccel environment to implement an FPGA-based Winograd convolution engine and show that the FPGA layer can be used alongside other layers running on a host processor to run several popular CNNs (AlexNet, GoogleNet, VGG A, Overfeat). The results show that our framework achieves 50 GFLOPS across 3x3 convolutions in the benchmarks. This is achieved within a practical framework, which will aid in future development of FPGA-based CNNs.
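
As a rough illustration of the idea behind a Winograd convolution engine, the Python sketch below implements the one-dimensional minimal filtering case usually written F(2,3): two outputs of a 3-tap filter computed with four multiplications instead of six. This is only the textbook building block, with hypothetical function names; it is not the authors' FPGA implementation, and the 3x3 case used for CNN layers applies the same transform along both dimensions.

```python
def direct_conv(d, g):
    """Two outputs of a 3-tap filter g over a 4-sample tile d (6 multiplies)."""
    y0 = d[0] * g[0] + d[1] * g[1] + d[2] * g[2]
    y1 = d[1] * g[0] + d[2] * g[1] + d[3] * g[2]
    return [y0, y1]


def winograd_f23(d, g):
    """Same two outputs via Winograd F(2,3), using 4 data-by-filter multiplies."""
    # Filter transform: depends only on g, so it can be precomputed once
    # and reused for every input tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform followed by the four element-wise multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform: only additions and subtractions.
    return [m1 + m2 + m3, m2 - m3 - m4]


tile = [1.0, 2.0, 3.0, 4.0]   # arbitrary example input tile
taps = [2.0, 1.0, 3.0]        # arbitrary example filter

print(direct_conv(tile, taps))
print(winograd_f23(tile, taps))
```

Both calls return the same pair of outputs for this tile. The saving comes from trading multiplications for additions, which tends to suit hardware where multipliers are the scarcer resource.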