Automatic machine translation metrics typically rely on human translations to determine the quality of system translations. Common wisdom in the field dictates that the human references should be of very high quality. However, there are no cost-benefit analyses that could be used to guide practitioners who plan to collect references for machine translation evaluation. We find that higher-quality references lead to better metric correlations with humans at the segment level. Having up to 7 references per segment and taking their average (or maximum) helps all metrics. Interestingly, the references from vendors of different qualities can be mixed together and improve metric success. Higher-quality references, however, cost more to create, and we frame this as an optimization problem: given a specific budget, what references should be collected to maximize metric success? These findings can be used by evaluators of shared tasks when references need to be created under a certain budget.
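To make the "average (or maximum)" over several references concrete, here is a minimal sketch. It assumes a toy unigram-overlap F1 as a stand-in for a real segment-level metric. The names `unigram_f1` and `multi_reference_score` are hypothetical and do not come from the paper.

```python
from collections import Counter
from statistics import mean


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy stand-in metric (an assumption): F1 over whitespace-token overlap."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def multi_reference_score(hypothesis: str, references: list, aggregate: str = "mean") -> float:
    """Score one segment against several references, then average or take the maximum."""
    if not references:
        raise ValueError("at least one reference is required")
    scores = [unigram_f1(hypothesis, ref) for ref in references]
    return max(scores) if aggregate == "max" else mean(scores)


references = ["the cat sat", "a dog ran"]
avg_score = multi_reference_score("the cat sat", references, aggregate="mean")
max_score = multi_reference_score("the cat sat", references, aggregate="max")
```

Taking the maximum rewards a system for matching any one reference, while the mean rewards agreement with all of them. The abstract reports that both forms of aggregation help.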

Related content

Machine Translation

Machine Translation covers all branches of computational linguistics and language engineering, including multilingual aspects. Featured papers cover theoretical, descriptive, or computational aspects of any of the following topics: the compilation and use of bilingual and multilingual corpora, computer-assisted language teaching, the computational implications of non-Roman character sets, connectionist approaches to translation, contrastive linguistics, and more. Official website:

Engineering · Vision · Better · Directed · Block

May 20, 2024

A Vision on Open Science for the Evolution of Software Engineering Research and Practice

Edson OliveiraJr, Fernanda Madeiral, Alcemir Rodrigues Santos, Christina von Flach, Sergio Soares

from arxiv, Proceedings of the FSE 2024 - Ideas, Visions and Reflections Track

Open Science aims to foster openness and collaboration in research, leading to more significant scientific and social impact. However, practicing Open Science comes with several challenges and is currently not properly rewarded. In this paper, we share our vision for addressing those challenges through a conceptual framework that connects essential building blocks for a change in the Software Engineering community, both culturally and technically. The idea behind this framework is that Open Science is treated as a first-class requirement for better Software Engineering research, practice, recognition, and relevant social impact. There is a long road for us, as a community, to truly embrace and gain from the benefits of Open Science. Nevertheless, we shed light on the directions for promoting the necessary culture shift and empowering the Software Engineering community.

MoDELS · Learning · Model evaluation · Score · Sample

May 17, 2024

Efficient Learning of Accurate Surrogates for Simulations of Complex Systems

A. Diaw, M. McKerns, I. Sagert, L. G. Stanton, M. S. Murillo

from arxiv, 13 pages, 6 figures, submitted to Nature Machine Intelligence

Machine learning methods are increasingly used to build computationally inexpensive surrogates for complex physical models. The predictive capability of these surrogates suffers when data are noisy, sparse, or time-dependent. As we are interested in finding a surrogate that provides valid predictions of any potential future model evaluations, we introduce an online learning method empowered by optimizer-driven sampling. The method has two advantages over current approaches. First, it ensures that all turning points on the model response surface are included in the training data. Second, after any new model evaluations, surrogates are tested and "retrained" (updated) if the "score" drops below a validity threshold. Tests on benchmark functions reveal that optimizer-directed sampling generally outperforms traditional sampling methods in terms of accuracy around local extrema, even when the scoring metric favors overall accuracy. We apply our method to simulations of nuclear matter to demonstrate that highly accurate surrogates for the nuclear equation of state can be reliably auto-generated from expensive calculations using a few model evaluations.
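The test-then-retrain step described above can be sketched as follows. This is only an illustration under stated assumptions: the surrogate is a least-squares line, the score is the negative absolute prediction error, and the optimizer-driven sampling itself is not modeled. The names `fit_line` and `OnlineSurrogate` are hypothetical and are not the authors' implementation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


class OnlineSurrogate:
    """Keeps a cheap surrogate and refits it only when a new evaluation invalidates it."""

    def __init__(self, threshold: float):
        self.threshold = threshold  # assumed validity threshold on the score
        self.xs, self.ys = [], []
        self.params = (0.0, 0.0)
        self.retrain_count = 0

    def predict(self, x: float) -> float:
        slope, intercept = self.params
        return slope * x + intercept

    def score(self, x: float, y: float) -> float:
        # Assumed scoring rule: higher is better, 0 means a perfect prediction.
        return -abs(self.predict(x) - y)

    def observe(self, x: float, y: float) -> bool:
        """Test the surrogate on a new model evaluation; retrain if the score is too low."""
        current_score = self.score(x, y)
        self.xs.append(x)
        self.ys.append(y)
        if current_score < self.threshold:
            self.params = fit_line(self.xs, self.ys)
            self.retrain_count += 1
            return True
        return False


surrogate = OnlineSurrogate(threshold=-0.1)
for x, y in [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]:  # evaluations of y = 2x + 1
    surrogate.observe(x, y)
```

In this toy run, the third evaluation is already predicted within the threshold, so the surrogate is kept rather than refitted.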

CNN · Comprehensibility · Networking · Neural Networks · Diversity

May 17, 2024

Characterization of Magnetic Labyrinthine Structures through Junctions and Terminals Detection Using Template Matching and CNN

Vinícius Yu Okubo, Kotaro Shimizu, B. S. Shivaram, Hae Yong Kim

from arxiv, 12 pages, 7 figures, submitted to IEEE Access

Defects influence diverse properties of materials, shaping their structural, mechanical, and electronic characteristics. Among a variety of materials exhibiting unique defects, magnets exhibit diverse nano- to micro-scale defects and have been intensively studied in materials science. Specifically, defects in magnetic labyrinthine patterns, called junctions and terminals, serve as the canonical targets of the research. While detecting and characterizing such defects is crucial for understanding magnets, systematically investigating large-scale images containing over a thousand closely packed junctions and terminals remains a formidable challenge. This study introduces a new technique called TM-CNN (Template Matching - Convolutional Neural Network) designed to detect a multitude of small objects in images, such as the defects in magnetic labyrinthine patterns. TM-CNN was used to identify 641,649 such structures in 444 experimental images, and the results were explored to deepen understanding of magnetic materials. It employs a two-stage detection approach combining template matching, used in initial detection, with a convolutional neural network, used to eliminate incorrect identifications. To train a CNN classifier, it is necessary to annotate a large number of training images. This difficulty prevents the use of CNN in many practical applications. TM-CNN significantly reduces the manual workload for creating training images by automatically making most of the annotations and leaving only a small number of corrections to human reviewers. In testing, TM-CNN achieved an impressive F1 score of 0.991, far outperforming traditional template matching and CNN-based object detection algorithms.

Lecture notes · Use Case · Guidance · Comprehensibility · SimPLe

May 16, 2024

A Transdisciplinary Approach to Cybersecurity: A Framework for Encouraging Transdisciplinary Thinking

Emily Kesler

Classical cybersecurity is often perceived as a rigid science discipline filled with computer scientists and mathematicians. However, due to the rapid pace of technology development and integration, new criminal enterprises, new defense tactics, and the understanding of the human element, cybersecurity is quickly beginning to encompass more than just computers. Cybersecurity experts must broaden their perspectives beyond traditional disciplinary boundaries to provide the best protection possible. They must start to practice transdisciplinary cybersecurity. Taking influence from the Stakeholder Theory in business ethics, this paper presents a framework to encourage transdisciplinary thinking and assist experts in tackling the new challenges of the modern day. The framework uses the simple Think, Plan, Do approach to enable experts to develop their transdisciplinary thinking. The framework is intended to be used as an evaluation tool for existing cybersecurity practices or postures, as a development tool to engage with other disciplines to foster learning and create new methods, and as a guidance tool to encourage new ways of thinking about, perceiving, and executing cybersecurity practices. For each of those intended uses, a use case is presented as an example to showcase how the framework might be used. The ultimate goal of this paper is not the framework but transdisciplinary thinking. By using the tool presented here and developing their own transdisciplinary thinking, cybersecurity experts can be better prepared to face cybersecurity's unique and complex challenges.

MoDELS · Networking · INFORMS · Reducible · Code

May 15, 2024

Scalable Image Coding for Humans and Machines Using Feature Fusion Network

Takahiro Shindo, Taiju Watanabe, Yui Tatsumi, Hiroshi Watanabe

As image recognition models become more prevalent, scalable coding methods for machines and humans gain more importance. Applications of image recognition models include traffic monitoring and farm management. In these use cases, the scalable coding method proves effective because the tasks require occasional image checking by humans. Existing image compression methods for humans and machines meet these requirements to some extent. However, these compression methods are effective solely for specific image recognition models. We propose a learning-based scalable image coding method for humans and machines that is compatible with numerous image recognition models. We combine an image compression model for machines with a compression model, providing additional information to facilitate image decoding for humans. The features in these compression models are fused using a feature fusion network to achieve efficient image compression. Our method's additional information compression model is adjusted to reduce the number of parameters by enabling combinations of features of different sizes in the feature fusion network. Our approach confirms that the feature fusion network efficiently combines image compression models while reducing the number of parameters. Furthermore, we demonstrate the effectiveness of the proposed scalable coding method by evaluating the image compression performance in terms of decoded image quality and bitrate.

RAID · Robustness · MoDELS · Dataset · Sample

May 13, 2024

RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors

Liam Dugan, Alyssa Hwang, Filip Trhlik, Josh Magnus Ludan, Andrew Zhu, Hainiu Xu, Daphne Ippolito, Chris Callison-Burch

from arxiv, To appear at ACL 2024

Many commercial and open-source models claim to detect machine-generated text with very high accuracy (99% or higher). However, very few of these detectors are evaluated on shared benchmark datasets and even when they are, the datasets used for evaluation are insufficiently challenging -- lacking variations in sampling strategy, adversarial attacks, and open-source generative models. In this work we present RAID: the largest and most challenging benchmark dataset for machine-generated text detection. RAID includes over 6 million generations spanning 11 models, 8 domains, 11 adversarial attacks and 4 decoding strategies. Using RAID, we evaluate the out-of-domain and adversarial robustness of 8 open- and 4 closed-source detectors and find that current detectors are easily fooled by adversarial attacks, variations in sampling strategies, repetition penalties, and unseen generative models. We release our dataset and tools to encourage further exploration into detector robustness.

Robustness · Machine Translation · Correlation coefficient · Dataset · Noise

May 13, 2024

An Empirical Study on the Robustness of Massively Multilingual Neural Machine Translation

Supryadi, Leiyu Pan, Deyi Xiong

from arxiv, 12 pages, 6 figures

Massively multilingual neural machine translation (MMNMT) has been proven to enhance the translation quality of low-resource languages. In this paper, we empirically investigate the translation robustness of Indonesian-Chinese translation in the face of various naturally occurring noise. To assess this, we create a robustness evaluation benchmark dataset for Indonesian-Chinese translation. This dataset is automatically translated into Chinese using four NLLB-200 models of different sizes. We conduct both automatic and human evaluations. Our in-depth analysis reveals the correlations between translation error types and the types of noise present, how these correlations change across different model sizes, and the relationships between automatic evaluation indicators and human evaluation indicators. The dataset is publicly available at //github.com/tjunlp-lab/ID-ZH-MTRobustEval.

Automator · Code · Extensibility · Performer · CASES

May 11, 2024

Exploring and Unleashing the Power of Large Language Models in Automated Code Translation

Zhen Yang, Fang Liu, Zhongxing Yu, Jacky Wai Keung, Jia Li, Shuo Liu, Yifan Hong, Xiaoxue Ma, Zhi Jin, Ge Li

from arxiv, 23 pages, 7 figures, accepted by FSE'24 (2024 ACM International Conference on the Foundations of Software Engineering)

Code translation tools (transpilers) are developed for automatic source-to-source translation. Although learning-based transpilers have shown impressive enhancement against rule-based counterparts, owing to their task-specific pre-training on extensive monolingual corpora, their current performance still remains unsatisfactory for practical deployment, and the associated training resources are also prohibitively expensive. LLMs pre-trained on huge amounts of human-written code/text have shown remarkable performance in many code intelligence tasks due to their powerful generality, even without task-specific training. Thus, LLMs can potentially circumvent the above limitations, but they have not been exhaustively explored yet. This paper investigates diverse LLMs and learning-based transpilers for automated code translation tasks, finding that although certain LLMs have outperformed current transpilers, they still have some accuracy issues, where most of the failures are induced by a lack of comprehension of source programs, missing clear instructions on I/O types in translation, and ignoring discrepancies between source and target programs. Enlightened by the above findings, we further propose UniTrans, a Unified code Translation framework, applicable to various LLMs, for unleashing their power in this field. Specifically, UniTrans first crafts a series of test cases for target programs with the assistance of source programs. Next, it harnesses the above auto-generated test cases to augment the code translation and then evaluates the correctness of the translated programs via execution. Afterward, UniTrans further (iteratively) repairs incorrectly translated programs prompted by test case execution results. Extensive experiments are conducted on six settings of translation datasets between Python, Java, and C++. Three recent LLMs of diverse sizes are tested with UniTrans, and all achieve substantial improvements.
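The test, translate, and repair loop described above might look roughly like the sketch below. It is not the UniTrans implementation. It assumes that test cases are derived by running the source program on sample inputs, that programs are plain Python callables, and that `translate` and `repair` are hypothetical stand-ins for LLM calls.

```python
def run_tests(func, tests):
    """Execute a candidate on each (args, expected) pair and collect failure messages."""
    failures = []
    for args, expected in tests:
        try:
            actual = func(*args)
        except Exception as exc:
            failures.append(f"{args}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"{args}: expected {expected!r}, got {actual!r}")
    return failures


def translate_with_repair(source_func, inputs, translate, repair, max_rounds=3):
    """Derive tests from the source program, translate, then repair until tests pass."""
    # Assumption: expected outputs come from executing the source program.
    tests = [(args, source_func(*args)) for args in inputs]
    candidate = translate(tests)  # hypothetical LLM call, given the tests as context
    for _ in range(max_rounds):
        failures = run_tests(candidate, tests)
        if not failures:
            return candidate, True
        candidate = repair(candidate, failures)  # hypothetical LLM repair call
    return candidate, not run_tests(candidate, tests)


# Stub "models" for illustration: the first translation is wrong, the repair is right.
def source(x):
    return 2 * x


def buggy_translate(tests):
    return lambda x: x + x + 1


def fixing_repair(candidate, failures):
    return lambda x: x * 2


translated, passed = translate_with_repair(source, [(1,), (4,)], buggy_translate, fixing_repair)
```

Feeding the failure messages back into `repair` mirrors the idea of prompting repairs with test execution results. A real system would pass source code and error logs rather than Python callables.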

Performer · Machine Learning · Model performance · MoDELS · Processing (programming language)

August 2, 2021

A Survey of Human-in-the-loop for Machine Learning

Xingjiao Wu, Luwei Xiao, Yixuan Sun, Junhang Zhang, Tianlong Ma, Liang He

Human-in-the-loop aims to train an accurate prediction model with minimum cost by integrating human knowledge and experience. Humans can provide training data for machine learning applications and directly accomplish some tasks that are hard for computers in the pipeline with the help of machine-based approaches. In this paper, we survey existing works on human-in-the-loop from a data perspective and classify them into three categories with a progressive relationship: (1) the work of improving model performance from data processing, (2) the work of improving model performance through interventional model training, and (3) the design of the system-independent human-in-the-loop. Using the above categorization, we summarize major approaches in the field, along with their technical strengths and weaknesses, and provide a simple classification and discussion for natural language processing, computer vision, and other areas. Besides, we provide some open challenges and opportunities. This survey intends to provide a high-level summarization of human-in-the-loop and to motivate interested readers to consider approaches for designing effective human-in-the-loop solutions.

INFORMS · Graph · Reducible · Knowledge graph · Identifiable

August 29, 2018

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction

Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi

We introduce a multi-task setup of identifying and classifying entities, relations, and coreference clusters in scientific articles. We create SciERC, a dataset that includes annotations for all three tasks, and develop a unified framework called Scientific Information Extractor (SciIE) with shared span representations. The multi-task setup reduces cascading errors between tasks and leverages cross-sentence relations through coreference links. Experiments show that our multi-task model outperforms previous models in scientific information extraction without using any domain-specific features. We further show that the framework supports construction of a scientific knowledge graph, which we use to analyze information in scientific literature.