The recent proliferation of computers and the internet has opened new opportunities for collecting and processing data. However, such data are often obtained without a well-planned probability survey design, and such non-probability samples cannot automatically be regarded as representative of the population of interest. Several classes of methods for estimation and inference from non-probability samples have been developed in recent years. Quasi-randomization methods assume that non-probability sample selection is governed by an underlying latent random mechanism. The basic idea is to use information collected from a probability ("reference") sample to uncover the latent non-probability survey participation probabilities (also known as "propensity scores") and to use them in estimating target finite population parameters. In this paper, we review and compare theoretical properties of recently developed methods for estimating survey participation probabilities and study their relative performance in simulations.
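As a rough illustration of the quasi-randomization idea (the covariates, the simple weighted logistic participation model, and the variable names below are our own illustrative choices, not the specific estimators compared in the paper), one can stack the non-probability sample with the reference sample, fit a participation model, and plug the fitted propensities into an inverse-probability-weighted estimator:

```python
# Illustrative sketch only: estimate pseudo participation propensities by
# stacking a non-probability sample with a reference probability sample,
# then form an inverse-propensity-weighted mean.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical data: x = auxiliary covariates, y = study variable (observed
# only in the non-probability sample), d = design weights of the reference sample.
n_np, n_ref = 2000, 1000
x_np = rng.normal(size=(n_np, 2))
y_np = 1.0 + x_np @ np.array([0.5, -0.3]) + rng.normal(size=n_np)
x_ref = rng.normal(size=(n_ref, 2))
d_ref = np.full(n_ref, 50.0)                 # reference-sample design weights

# Stack the two samples; z indicates membership in the non-probability sample.
x = np.vstack([x_np, x_ref])
z = np.concatenate([np.ones(n_np), np.zeros(n_ref)])
w = np.concatenate([np.ones(n_np), d_ref])   # weight the reference units

model = LogisticRegression().fit(x, z, sample_weight=w)
p_hat = model.predict_proba(x_np)[:, 1]      # estimated participation propensities

# Inverse-propensity-weighted (Hajek-type) estimate of the population mean of y.
ipw_mean = np.sum(y_np / p_hat) / np.sum(1.0 / p_hat)
print(f"IPW estimate of the population mean: {ipw_mean:.3f}")
```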
The rapid spread of misinformation through social media platforms has raised concerns about its impact on public opinion. Although misinformation is prevalent in many languages, the majority of research in this field has concentrated on English, so datasets for other languages, including Turkish, remain scarce. To address this gap, we introduce the FCTR dataset, consisting of 3238 real-world claims. The dataset spans multiple domains and incorporates evidence collected from three Turkish fact-checking organizations. We also assess the effectiveness of cross-lingual transfer learning for low-resource languages, with a particular focus on Turkish, and report the in-context learning (zero-shot and few-shot) performance of large language models in this setting. The experimental results indicate that the dataset has the potential to advance research on misinformation in Turkish.
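For context, a few-shot claim-verification prompt for a generic instruction-tuned LLM might be assembled as sketched below; the prompt wording, labels, and example claims are hypothetical and are not drawn from the FCTR dataset.

```python
# Hypothetical few-shot prompt construction for claim verification.
def build_prompt(claim: str, demonstrations: list[tuple[str, str]]) -> str:
    lines = ["Decide whether each claim is TRUE or FALSE."]
    for demo_claim, label in demonstrations:
        lines.append(f"Claim: {demo_claim}\nLabel: {label}")
    lines.append(f"Claim: {claim}\nLabel:")        # model completes the label
    return "\n\n".join(lines)

demos = [("Water boils at 100 degrees Celsius at sea level.", "TRUE"),
         ("The Great Wall of China is visible from the Moon.", "FALSE")]
print(build_prompt("Honey never spoils.", demos))
```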
Speech recognition and translation systems perform poorly on noisy inputs, which are frequent in realistic environments. Augmenting these systems with visual signals has the potential to improve robustness to noise. However, audio-visual (AV) data is only available in limited amounts and for fewer languages than audio-only resources. To address this gap, we present XLAVS-R, a cross-lingual audio-visual speech representation model for noise-robust speech recognition and translation in over 100 languages. It is designed to maximize the benefits of limited multilingual AV pre-training data by building on top of audio-only multilingual pre-training and simplifying existing pre-training schemes. Extensive evaluation on the MuAViC benchmark shows the strength of XLAVS-R on downstream audio-visual speech recognition and translation tasks, where it outperforms the previous state of the art by up to 18.5% WER and 4.7 BLEU given noisy AV inputs, and enables strong zero-shot audio-visual ability with audio-only fine-tuning.
In modern scientific studies, it is often imperative to determine whether a set of phenotypes is affected by a single factor. If such an influence is identified, it becomes essential to discern whether the effect depends on categories such as sex or age group and, importantly, whether this dependence has purely non-environmental roots. The exploration of such dependencies often involves studying pleiotropy, a phenomenon wherein a single genetic locus affects multiple traits. Interest in uncovering dependencies through pleiotropy is fueled by the growing availability of summary statistics from genome-wide association studies (GWAS) and the establishment of thoroughly phenotyped sample collections, which together enable a systematic and comprehensive exploration of the genetic connections among traits and diseases. Additive genetic correlation illuminates the genetic connection between two traits, providing valuable insights into the shared biological pathways and underlying causal relationships between them. In this paper, we present a novel method to analyze such dependencies by studying additive genetic correlations between pairs of traits under consideration. We then employ matrix comparison techniques to discern and elucidate sex-specific or age-group-specific associations, contributing to a deeper understanding of the nuanced dependencies within the studied traits. Our proposed method is computationally efficient and requires only GWAS summary statistics. We validate the method by applying it to UK Biobank data and present the results.
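The matrix-comparison step can be pictured with a toy example: given genetic-correlation matrices estimated separately for two groups (say, males and females), one summary of their disagreement is a simple matrix distance. The matrices and the Frobenius-norm statistic below are illustrative placeholders, not the paper's actual estimator or test.

```python
# Toy sketch of comparing group-specific genetic-correlation matrices
# (assumes the matrices were already estimated from GWAS summary statistics).
import numpy as np

def frobenius_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Frobenius norm of the difference between two correlation matrices."""
    return float(np.linalg.norm(a - b, ord="fro"))

# Hypothetical 3x3 genetic-correlation matrices for two groups.
rg_male = np.array([[1.0, 0.4, 0.2],
                    [0.4, 1.0, 0.1],
                    [0.2, 0.1, 1.0]])
rg_female = np.array([[1.0, 0.6, 0.3],
                      [0.6, 1.0, 0.2],
                      [0.3, 0.2, 1.0]])

print(f"Observed matrix distance: {frobenius_distance(rg_male, rg_female):.3f}")
```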
Large Language Models (LLMs) have demonstrated significant potential and effectiveness across multiple application domains. To assess the performance of mainstream LLMs on public security tasks, this study constructs a specialized evaluation benchmark tailored to the Chinese public security domain, CPSDbench. CPSDbench integrates public-security datasets collected from real-world scenarios, supporting a comprehensive assessment of LLMs across four key dimensions: text classification, information extraction, question answering, and text generation. Furthermore, this study introduces a set of innovative evaluation metrics designed to more precisely quantify the efficacy of LLMs in executing public security tasks. Through the in-depth analysis and evaluation conducted in this research, we not only improve our understanding of the strengths and limitations of existing models in addressing public security issues but also provide a reference for the future development of more accurate and customized LLMs targeted at applications in this field.
Despite the remarkable performance of video-based large language models (LLMs), their vulnerability to adversarial attacks remains unexplored. To fill this gap, we propose the first adversarial attack tailored to video-based LLMs, which crafts flow-based multi-modal adversarial perturbations on a small fraction of frames within a video; we dub it FMM-Attack. Extensive experiments show that our attack can effectively induce video-based LLMs to generate incorrect answers when imperceptible adversarial perturbations are added to videos. Intriguingly, FMM-Attack can also garble the model output, prompting video-based LLMs to hallucinate. Overall, our observations motivate a deeper understanding of multi-modal robustness and safety-related feature alignment across modalities, which is of great importance for a wide range of large multi-modal models. Our code is available at //github.com/THU-Kingmin/FMM-Attack.
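For intuition, the sketch below shows a generic PGD-style perturbation restricted to a small fraction of frames in a video tensor; it is not the FMM-Attack itself (which uses flow-based, multi-modal objectives against a real video LLM), and the toy surrogate model and loss are placeholders.

```python
# Generic PGD-style sketch: perturb only a few frames of a video tensor.
import torch

torch.manual_seed(0)

# Toy surrogate: maps a video (T, C, H, W) to a scalar score we push up.
surrogate = torch.nn.Sequential(
    torch.nn.Flatten(start_dim=0),
    torch.nn.Linear(8 * 3 * 16 * 16, 1),
)

video = torch.rand(8, 3, 16, 16)                 # 8 frames
attacked = torch.tensor([0, 3])                  # perturb only 2 of 8 frames
delta = torch.zeros_like(video[attacked], requires_grad=True)
eps, alpha = 8 / 255, 2 / 255

for _ in range(10):                              # PGD iterations
    adv = video.clone()
    adv[attacked] = (video[attacked] + delta).clamp(0, 1)
    loss = surrogate(adv).mean()                 # stand-in for the attack objective
    loss.backward()
    with torch.no_grad():
        delta += alpha * delta.grad.sign()       # gradient-sign ascent step
        delta.clamp_(-eps, eps)                  # keep perturbation imperceptible
    delta.grad.zero_()

print("Max per-pixel perturbation:", delta.abs().max().item())
```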
It has been shown that unclocked, recurrent networks of Boolean gates in FPGAs can be used for low-SWaP reservoir computing. In such systems, the topology and node functionality of the network are randomly initialized. To create a network that solves a task, weights are applied to output nodes, and learning is achieved by adjusting those weights with conventional machine learning methods. However, performance is often limited compared to networks in which all parameters are learned. Herein, we explore an alternative learning approach for unclocked, recurrent networks in FPGAs: we use evolutionary computation to evolve the Boolean functions of network nodes. In one type of implementation, the output nodes are used directly to perform a task, and all learning is via evolution of the network's node functions. In a second type of implementation, a back-end classifier is used as in traditional reservoir computing; in that case, both evolution of node functions and adjustment of output node weights contribute to learning. We demonstrate the practicality of node function evolution, obtaining an accuracy improvement of roughly 30% on an image classification task while processing at a rate of over three million samples per second. We additionally demonstrate evolvability of network memory and dynamic output signals.
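A software toy of the node-function-evolution idea is sketched below: a small recurrent Boolean network with fixed random wiring whose per-node lookup tables are mutated by a simple (1+1)-style hill climber. The task, network size, and acceptance rule are illustrative choices, not the paper's FPGA setup.

```python
# Toy sketch of evolving node functions in a small recurrent Boolean network.
import random

random.seed(0)
N_NODES, N_INPUTS, STEPS = 16, 2, 8          # each node reads 2 other nodes

# Random fixed topology: for each node, the indices of its two input nodes.
wiring = [[random.randrange(N_NODES) for _ in range(N_INPUTS)]
          for _ in range(N_NODES)]

def random_lut():
    """A node's Boolean function as a 2^N_INPUTS-entry lookup table."""
    return [random.randint(0, 1) for _ in range(2 ** N_INPUTS)]

def run(luts, seed_bits):
    """Iterate the network from an initial state and read node 0 as output."""
    state = list(seed_bits)
    for _ in range(STEPS):
        state = [luts[i][state[wiring[i][0]] * 2 + state[wiring[i][1]]]
                 for i in range(N_NODES)]
    return state[0]

def fitness(luts, cases):
    """Fraction of (seed pattern, target) cases the network gets right."""
    return sum(run(luts, x) == t for x, t in cases) / len(cases)

# Toy task: target is the parity of the first two seed bits.
cases = [([a, b] + [0] * (N_NODES - 2), a ^ b) for a in (0, 1) for b in (0, 1)]

luts = [random_lut() for _ in range(N_NODES)]
best = fitness(luts, cases)
for _ in range(500):                          # simple (1+1) evolutionary loop
    child = [lut[:] for lut in luts]
    node, entry = random.randrange(N_NODES), random.randrange(2 ** N_INPUTS)
    child[node][entry] ^= 1                   # mutate: flip one LUT bit
    f = fitness(child, cases)
    if f >= best:
        luts, best = child, f

print(f"Best fitness after evolution: {best:.2f}")
```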
Concerns about representation in computing within the U.S. have driven numerous activities to broaden participation. Assessment of the impact of these efforts, and indeed a clear assessment of the actual "problem" being addressed, are limited by the nature of the most common data analysis, which looks at the representation of each population as a percentage of the number of students graduating with a degree in computing. This use of a single metric cannot adequately assess the impact of broadening participation efforts. First, this approach fails to account for changing demographics of the undergraduate population in terms of overall numbers and the relative proportions of the Federally designated gender, race, and ethnicity groupings. Second, the majority of literature on broadening participation in computing (BPC) reports data on gender or on race/ethnicity, omitting data on students' intersectional identities. This leads to an incorrect understanding of both the data and the challenges we face as a field. In this paper we present several different approaches to tracking the impact of BPC efforts. We make three recommendations: 1) cohort-based analysis should be used to accurately show student engagement in computing; 2) the field as a whole needs to adopt the norm of always reporting intersectional data; 3) university demographic context matters when assessing how well a CS department is broadening participation in computing, including longitudinal analysis of university demographic shifts that affect the local demographics of computing.
Human-in-the-loop aims to train an accurate prediction model with minimum cost by integrating human knowledge and experience. Humans can provide training data for machine learning applications and, with the help of machine-based approaches, directly accomplish tasks in the pipeline that are hard for computers. In this paper, we survey existing work on human-in-the-loop from a data perspective and classify it into three categories with a progressive relationship: (1) improving model performance through data processing, (2) improving model performance through interventional model training, and (3) the design of independent human-in-the-loop systems. Using this categorization, we summarize the major approaches in the field along with their technical strengths and weaknesses, and we briefly classify and discuss applications in natural language processing, computer vision, and other domains. In addition, we discuss open challenges and opportunities. This survey intends to provide a high-level summary of human-in-the-loop research and to motivate interested readers to consider approaches for designing effective human-in-the-loop solutions.
Deep neural networks (DNNs) are successful in many computer vision tasks. However, the most accurate DNNs require millions of parameters and operations, making them energy-, computation-, and memory-intensive. This impedes the deployment of large DNNs in low-power devices with limited compute resources. Recent research improves DNN models by reducing the memory requirement, energy consumption, and number of operations without significantly decreasing accuracy. This paper surveys the progress of low-power deep learning and computer vision, specifically with regard to inference, and discusses methods for compacting and accelerating DNN models. The techniques can be divided into four major categories: (1) parameter quantization and pruning, (2) compressed convolutional filters and matrix factorization, (3) network architecture search, and (4) knowledge distillation. We analyze the accuracy, advantages, disadvantages, and potential solutions to the problems with the techniques in each category. We also discuss new evaluation metrics as a guideline for future research.
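As a concrete, if simplified, illustration of category (1), the snippet below applies uniform 8-bit weight quantization and global magnitude pruning to a toy weight matrix; real pipelines add per-channel scales, calibration data, and fine-tuning.

```python
# Minimal sketch of parameter quantization and pruning on a toy weight matrix.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(64, 64)).astype(np.float32)

# Uniform symmetric 8-bit quantization: store int8 weights plus one fp32 scale.
scale = np.abs(w).max() / 127.0
w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_dequant = w_int8.astype(np.float32) * scale
print("Quantization MSE:", float(np.mean((w - w_dequant) ** 2)))

# Global magnitude pruning: zero out the 50% of weights with smallest |w|.
threshold = np.quantile(np.abs(w), 0.5)
w_pruned = np.where(np.abs(w) >= threshold, w, 0.0)
print("Sparsity after pruning:", float(np.mean(w_pruned == 0)))
```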
Within the rapidly developing Internet of Things (IoT), numerous and diverse physical devices, edge devices, cloud infrastructure, and their quality-of-service (QoS) requirements need to be represented within a unified specification in order to enable rapid IoT application development, monitoring, and dynamic reconfiguration. However, heterogeneities among configuration knowledge representation models limit the acquisition, discovery, and curation of configuration knowledge for coordinated IoT applications. This paper proposes a unified data model to represent IoT resource configuration knowledge artifacts. It also proposes IoT-CANE (Context-Aware recommendatioN systEm) to facilitate incremental knowledge acquisition and declarative, context-driven knowledge recommendation.
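Purely as a hypothetical illustration of what a unified resource-configuration record could look like (the field names are ours; the paper's actual data model and the IoT-CANE recommendation logic are not reproduced here):

```python
# Hypothetical unified record for an IoT resource and its QoS requirements.
from dataclasses import dataclass, field

@dataclass
class IoTResourceConfig:
    resource_id: str
    layer: str                                   # e.g. "device", "edge", or "cloud"
    capabilities: dict[str, str] = field(default_factory=dict)
    qos_requirements: dict[str, float] = field(default_factory=dict)

sensor = IoTResourceConfig(
    resource_id="temp-sensor-01",
    layer="device",
    capabilities={"protocol": "MQTT"},
    qos_requirements={"max_latency_ms": 100.0},
)
print(sensor)
```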