夏娃韩剧电视剧在剧免费韩剧TV_午夜男女爽爽爽免费大片_香港三级一区二区在线观看_青青青国产精美在线观看_国产色婷婷国产综合在线理论片A_中文字幕少妇一区二区三区_狠狠色狠狠色狠狠五月毛片

R and Python are among the most popular languages used in many critical data analytics tasks. However, we still do not fully understand the capabilities of these two languages w.r.t. bugs encountered in data analytics tasks. What type of bugs are common? What are the main root causes? What is the relation between bugs and root causes? How to mitigate these bugs? We present a comprehensive study of 5,068 Stack Overflow posts, 1,800 bug fix commits from GitHub repositories, and several GitHub issues of the most used libraries to understand bugs in R and Python. Our key findings include: while both R and Python have bugs due to inexperience with data analysis, Python see significantly larger data preprocessing bugs compared to R. Developers experience significantly more data flow bugs in R because intermediate results are often implicit. We also found changes and bugs in packages and libraries cause more bugs in R compared to Python while package or library misselection and conflicts cause more bugs in Python than R. While R has a slightly higher readability barrier for data analysts, the statistical power of R leads to a less number of bad performance bugs. In terms of data visualization, R packages have significantly more bugs than Python libraries. We also identified a strong correlation between comparable packages in R and Python despite their linguistic and methodological differences. Lastly, we contribute a large dataset of manually verified R and Python bugs.

相關內容

Python

關注 5351

是一種(zhong)面(mian)向(xiang)對象(xiang)的(de)(de)解釋型(xing)計算(suan)機程序(xu)設計語(yu)言，在設計中注重代碼的(de)(de)可讀性，同時也是一種(zhong)功能(neng)強(qiang)大的(de)(de)通用型(xing)語(yu)言。

INTERACT · 有偏 · MoDELS · Weight · 縮放 ·

2023 年 8 月 7 日

Relationship between Collider Bias and Interactions on the Log-Additive Scale

Apostolos Gkatzionis,Shaun R. Seaman,Rachael A. Hughes,Kate Tilling

from arxiv, Main Part: 19 pages, 5 figures, 3 tables. Supplement: 16 pages, 3 figures, 5 tables

Collider bias occurs when conditioning on a common effect (collider) of two variables $X, Y$. In this manuscript, we quantify the collider bias in the estimated association between exposure $X$ and outcome $Y$ induced by selecting on one value of a binary collider $S$ of the exposure and the outcome. In the case of logistic regression, it is known that the magnitude of the collider bias in the exposure-outcome regression coefficient is proportional to the strength of interaction $\delta_3$ between $X$ and $Y$ in a log-additive model for the collider: $\mathbb{P} (S = 1 | X, Y) = \exp \left\{ \delta_0 + \delta_1 X + \delta_2 Y + \delta_3 X Y \right\}$. We show that this result also holds under a linear or Poisson regression model for the exposure-outcome association. We then illustrate by simulation that even if a log-additive model with interactions is not the true model for the collider, the interaction term in such a model is still informative about the magnitude of collider bias. Finally, we discuss the implications of these findings for methods that attempt to adjust for collider bias, such as inverse probability weighting which is often implemented without including interactions between variables in the weighting model.

策略搜索 · 有向 · 早停 · Continuity · CASES ·

2023 年 8 月 7 日

Generalized Early Stopping in Evolutionary Direct Policy Search

Etor Arza,Leni K. Le Goff,Emma Hart

Lengthy evaluation times are common in many optimization problems such as direct policy search tasks, especially when they involve conducting evaluations in the physical world, e.g. in robotics applications. Often, when evaluating a solution over a fixed time period, it becomes clear that the objective value will not increase with additional computation time (for example, when a two-wheeled robot continuously spins on the spot). In such cases, it makes sense to stop the evaluation early to save computation time. However, most approaches to stop the evaluation are problem-specific and need to be specifically designed for the task at hand. Therefore, we propose an early stopping method for direct policy search. The proposed method only looks at the objective value at each time step and requires no problem-specific knowledge. We test the introduced stopping criterion in five direct policy search environments drawn from games, robotics, and classic control domains, and show that it can save up to 75% of the computation time. We also compare it with problem-specific stopping criteria and demonstrate that it performs comparably while being more generally applicable.

INFORMS · 代碼 · 控制器 · binary · Performer ·

2023 年 8 月 7 日

Thwarting Code-Reuse and Side-Channel Attacks in Embedded Systems

Rodothea Myrsini Tsoupidi,Elena Troubitsyna,Panagiotis Papadimitratos

Embedded devices are increasingly present in our everyday life. They often process critical information, and hence, rely on cryptographic protocols to achieve security. However, embedded devices remain vulnerable to attackers seeking to hijack their operation and extract sensitive information by exploiting side channels and code reuse. Code-Reuse Attacks (CRAs) can steer the execution of a program to malicious outcomes, altering existing on-board code without direct access to the device memory. Moreover, Side-Channel Attacks (SCAs) may reveal secret information to the attacker based on mere observation of the device. Thwarting CRAs and SCAs against embedded devices is challenging because embedded devices are often resource constrained. Fine-grained code diversification hinders CRAs by introducing uncertainty to the binary code; while software mechanisms can thwart timing or power SCAs. The resilience to either attack may come at the price of the overall efficiency. Moreover, a unified approach that preserves these mitigations against both CRAs and SCAs is not available. In this paper, we propose a novel Secure Diversity by Construction (SecDivCon) approach that tackles this challenge. SecDivCon is a combinatorial compiler-based approach that combines software diversification against CRAs with software mitigations against SCAs. SecDivCon restricts the performance overhead introduced by the generated code that thwarts the attacks and hence, offers a secure-by-design approach enabling control over the performance-security trade-off. Our experiments, using 16 benchmark programs, show that SCA-aware diversification is effective against CRAs, while preserving SCA mitigation properties at a low, controllable overhead. Given the combinatorial nature of our approach, SecDivCon is suitable for small, performance-critical functions that are sensitive to SCAs.

置信度 · 貝塔分布 · RecSys · 模型評估 · Learning ·

2023 年 8 月 6 日

A Lightweight Method for Modeling Confidence in Recommendations with Learned Beta Distributions

Norman Knyazev,Harrie Oosterhuis

from arxiv, In Proceedings of the 17th ACM Conference on Recommender Systems (RecSys '23), September 18-22, 2023, Singapore, Singapore. ACM, New York, NY, USA, 12 pages

Most Recommender Systems (RecSys) do not provide an indication of confidence in their decisions. Therefore, they do not distinguish between recommendations of which they are certain, and those where they are not. Existing confidence methods for RecSys are either inaccurate heuristics, conceptually complex or computationally very expensive. Consequently, real-world RecSys applications rarely adopt these methods, and thus, provide no confidence insights in their behavior. In this work, we propose learned beta distributions (LBD) as a simple and practical recommendation method with an explicit measure of confidence. Our main insight is that beta distributions predict user preferences as probability distributions that naturally model confidence on a closed interval, yet can be implemented with the minimal model-complexity. Our results show that LBD maintains competitive accuracy to existing methods while also having a significantly stronger correlation between its accuracy and confidence. Furthermore, LBD has higher performance when applied to a high-precision targeted recommendation task. Our work thus shows that confidence in RecSys is possible without sacrificing simplicity or accuracy, and without introducing heavy computational complexity. Thereby, we hope it enables better insight into real-world RecSys and opens the door for novel future applications.

Networking · Neural Networks · 泛化理論 · 圖形處理器 · Learning ·

2023 年 8 月 4 日

Stability and Generalization of Hypergraph Collaborative Networks

Michael Ng,Hanrui Wu,Andy Yip

Graph neural networks have been shown to be very effective in utilizing pairwise relationships across samples. Recently, there have been several successful proposals to generalize graph neural networks to hypergraph neural networks to exploit more complex relationships. In particular, the hypergraph collaborative networks yield superior results compared to other hypergraph neural networks for various semi-supervised learning tasks. The collaborative network can provide high quality vertex embeddings and hyperedge embeddings together by formulating them as a joint optimization problem and by using their consistency in reconstructing the given hypergraph. In this paper, we aim to establish the algorithmic stability of the core layer of the collaborative network and provide generalization guarantees. The analysis sheds light on the design of hypergraph filters in collaborative networks, for instance, how the data and hypergraph filters should be scaled to achieve uniform stability of the learning process. Some experimental results on real-world datasets are presented to illustrate the theory.

Networking · 層 · 神經元 · 相關系數 · Neural Networks ·

2023 年 8 月 3 日

Wider and Deeper LLM Networks are Fairer LLM Evaluators

Xinghua Zhang,Bowen Yu,Haiyang Yu,Yangyu Lv,Tingwen Liu,Fei Huang,Hongbo Xu,Yongbin Li

from arxiv, Work in Progress

Measuring the quality of responses generated by LLMs is a challenging task, particularly when it comes to evaluating whether the response is aligned with human preference. A novel approach involves using the LLM itself to make evaluation and stabilizing the results through multiple independent evaluations, similar to a single-layer narrow LLM network. This network consists of a fixed number of neurons, with each neuron being the same LLM. In this paper, we draw upon the extensive research on deep neural networks to explore whether deeper and wider networks can lead to fairer evaluations. Specifically, inspired by the observation that different neurons in a neural network are responsible for detecting different concepts, we first adaptively generate as many neuron roles as possible for each evaluation sample. Each perspective corresponds to the role of a specific LLM neuron in the first layer. In subsequent layers, we follow the idea that higher layers in deep networks are responsible for more comprehensive features, each layer receives representations from all neurons in the previous layer, integrating the locally learned evaluation information to obtain a more comprehensive evaluation result. Interestingly, this network design resembles the process of academic paper reviewing. To validate the effectiveness of our method, we construct the largest and most diverse English evaluation benchmark LLMEval$^2$ for LLM evaluators, comprising 15 tasks, 8 abilities, and 2,553 samples. Experimental results demonstrate that a wider network (involving many reviewers) with 2 layers (one round of discussion) performs the best, improving kappa correlation coefficient from 0.28 to 0.34. We also leverage WideDeep to aid in the assessment of Chinese LLMs, which has accelerated the evaluation time by 4.6 times, resulting in a 60% cost saving. WideDeep achieves a remarkable 93% agreement level among humans.

語言模型化 · 泛函 · Med-PaLM 2 · MoDELS · 得分 ·

2023 年 8 月 3 日

The Capability of Large Language Models to Measure Psychiatric Functioning

Isaac R. Galatzer-Levy,Daniel McDuff,Vivek Natarajan,Alan Karthikesalingam,Matteo Malgaroli

The current work investigates the capability of Large language models (LLMs) that are explicitly trained on large corpuses of medical knowledge (Med-PaLM 2) to predict psychiatric functioning from patient interviews and clinical descriptions without being trained to do so. To assess this, n = 145 depression and n =115 PTSD assessments and n = 46 clinical case studies across high prevalence/high comorbidity disorders (Depressive, Anxiety, Psychotic, trauma and stress, Addictive disorders) were analyzed using prompts to extract estimated clinical scores and diagnoses. Results demonstrate that Med-PaLM 2 is capable of assessing psychiatric functioning across a range of psychiatric conditions with the strongest performance being the prediction of depression scores based on standardized assessments (Accuracy range= 0.80 - 0.84) which were statistically indistinguishable from human clinical raters t(1,144) = 1.20; p = 0.23. Results show the potential for general clinical language models to flexibly predict psychiatric risk based on free descriptions of functioning from both patients and clinicians.

知識 (knowledge) · Learning · MoDELS · 圖 · entity ·

2022 年 11 月 29 日

Lifelong Embedding Learning and Transfer for Growing Knowledge Graphs

Yuanning Cui,Yuxin Wang,Zequn Sun,Wenqiang Liu,Yiqiao Jiang,Kexin Han,Wei Hu

from arxiv, Accepted in the 37th AAAI Conference on Artificial Intelligence (AAAI 2023)

Existing knowledge graph (KG) embedding models have primarily focused on static KGs. However, real-world KGs do not remain static, but rather evolve and grow in tandem with the development of KG applications. Consequently, new facts and previously unseen entities and relations continually emerge, necessitating an embedding model that can quickly learn and transfer new knowledge through growth. Motivated by this, we delve into an expanding field of KG embedding in this paper, i.e., lifelong KG embedding. We consider knowledge transfer and retention of the learning on growing snapshots of a KG without having to learn embeddings from scratch. The proposed model includes a masked KG autoencoder for embedding learning and update, with an embedding transfer strategy to inject the learned knowledge into the new entity and relation embeddings, and an embedding regularization method to avoid catastrophic forgetting. To investigate the impacts of different aspects of KG growth, we construct four datasets to evaluate the performance of lifelong KG embedding. Experimental results show that the proposed model outperforms the state-of-the-art inductive and lifelong embedding baselines.

Vision · 模型評估 · 可約的 · 計算機視覺 · DNN ·

2020 年 3 月 24 日

A Survey of Methods for Low-Power Deep Learning and Computer Vision

Abhinav Goel,Caleb Tung,Yung-Hsiang Lu,George K. Thiruvathukal

from arxiv, Accepted for publication at 2020 IEEE 6th World Forum on Internet of Things (WF-IoT), New Orleans, LA, USA 2020

Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy, computation and memory intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing the accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically in regards to inference, and discusses the methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.

語言模型化 · MoDELS · 詞表 · 優化器 · state-of-the-art ·

2019 年 9 月 25 日

Extreme Language Model Compression with Optimal Subwords and Shared Projections

Sanqiang Zhao,Raghav Gupta,Yang Song,Denny Zhou

Pre-trained deep neural network language models such as ELMo, GPT, BERT and XLNet have recently achieved state-of-the-art performance on a variety of language understanding tasks. However, their size makes them impractical for a number of scenarios, especially on mobile and edge devices. In particular, the input word embedding matrix accounts for a significant proportion of the model's memory footprint, due to the large input vocabulary and embedding dimensions. Knowledge distillation techniques have had success at compressing large neural network models, but they are ineffective at yielding student models with vocabularies different from the original teacher models. We introduce a novel knowledge distillation technique for training a student model with a significantly smaller vocabulary as well as lower embedding and hidden state dimensions. Specifically, we employ a dual-training mechanism that trains the teacher and student models simultaneously to obtain optimal word embeddings for the student vocabulary. We combine this approach with learning shared projection matrices that transfer layer-wise knowledge from the teacher model to the student model. Our method is able to compress the BERT_BASE model by more than 60x, with only a minor drop in downstream task metrics, resulting in a language model with a footprint of under 7MB. Experimental results also demonstrate higher compression efficiency and accuracy when compared with other state-of-the-art compression techniques.