
Performer · Scenario · Networking · Neural Networks · Automator ·

December 7, 2023

Unveiling Objects with SOLA: An Annotation-Free Image Search on the Object Level for Automotive Data Sets

Philipp Rigoll,Jacob Langner,Eric Sax

Huge image data sets are the foundation for developing the perception of automated driving systems. A large number of images is necessary to train robust neural networks that can cope with diverse situations. A sufficiently large data set contains challenging situations and objects. For testing the resulting functions, it is necessary that these situations and objects can be found and extracted from the data set. While it is relatively easy to record a large amount of unlabeled data, it is far more difficult to find demanding situations and objects. However, during the development of perception systems, it must be possible to access challenging data without having to perform lengthy and time-consuming annotations. A developer must therefore be able to search dynamically for specific situations and objects in a data set. Thus, we designed a method based on state-of-the-art neural networks to search for objects with certain properties within an image. For ease of use, the query of this search is described using natural language. To determine the time savings and performance gains, we evaluated our method qualitatively and quantitatively on automotive data sets.

Related content

Performer

LORA · Sparse · Performer · MoDELS · Tokenizer ·

January 30, 2024

LLaVA-MoLE: Sparse Mixture of LoRA Experts for Mitigating Data Conflicts in Instruction Finetuning MLLMs

Shaoxiang Chen,Zequn Jie,Lin Ma

Instruction finetuning on a variety of image-text instruction data is the key to obtaining a versatile Multimodal Large Language Model (MLLM), and different configurations of the instruction data can lead to finetuned models with different capabilities. However, we have discovered that data conflicts are inevitable when mixing instruction data from distinct domains, which can result in performance drops for tasks of a specific domain. To address this issue, we propose to apply an efficient Mixture of Experts (MoE) design, which is a sparse Mixture of LoRA Experts (MoLE) for instruction finetuning MLLMs. Within the Transformer layers, we extend the popular Low-Rank Adaptation (LoRA) method by creating a set of LoRA experts specifically for the MLP layer, and route each token to the top-1 expert based on a routing function, allowing adaptive choices for tokens from different domains. Since the LoRA experts are sparsely activated, the training and inference costs are kept roughly constant compared to the original LoRA method. By replacing the plain-LoRA of LLaVA-1.5 with our MoE design, our final model is named LLaVA-MoLE. Extensive experiments proved that LLaVA-MoLE effectively mitigates the data conflict issue when mixing multiple distinct instruction datasets with various configurations, and achieves consistent performance gains over the strong plain-LoRA baselines. Most importantly, on the mixed datasets, LLaVA-MoLE can even outperform the plain-LoRA baseline trained with twice the samples.
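
The top-1 routing over LoRA experts described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the linear router, the `scale` factor, and the names `mole_forward` and `matvec` are hypothetical, and plain Python lists stand in for tensors.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def mole_forward(x, W, router, experts, scale=1.0):
    """Apply a frozen linear layer plus one sparsely selected LoRA expert.

    x       : token vector of length d_in
    W       : frozen weight, shape d_out x d_in
    router  : assumed linear router, shape n_experts x d_in
    experts : list of (A, B) pairs, A is r x d_in and B is d_out x r
    Returns the output vector and the index of the chosen expert.
    """
    logits = matvec(router, x)
    # Top-1 routing: only the highest-scoring expert is evaluated.
    k = max(range(len(logits)), key=logits.__getitem__)
    A, B = experts[k]
    delta = matvec(B, matvec(A, x))  # low-rank update B(Ax)
    base = matvec(W, x)
    return [b + scale * d for b, d in zip(base, delta)], k


# Toy example: two rank-1 experts on a 2-dimensional token.
W = [[1, 0], [0, 1]]
router = [[1, 0], [0, 1]]
experts = [([[1, 0]], [[1], [0]]), ([[0, 1]], [[0], [2]])]
print(mole_forward([3, 1], W, router, experts))  # ([6.0, 1.0], 0)
```

Because only one expert's low-rank product is computed per token, the per-token cost stays close to that of a single LoRA adapter, which is the property the abstract highlights.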

Model evaluation · Controller · Federated learning · Learning · MoDELS ·

January 29, 2024

EchoPFL: Asynchronous Personalized Federated Learning on Mobile Devices with On-Demand Staleness Control

Xiaochen Li,Sicong Liu,Zimu Zhou,Bin Guo,Yuan Xu,Zhiwen Yu

from arxiv, accepted by Ubicomp2024

The rise of mobile devices with abundant sensory data and local computing capabilities has driven the trend of federated learning (FL) on these devices. Personalized FL (PFL) has emerged to train specific deep models for each mobile device to address data heterogeneity and varying performance preferences. However, mobile training times vary significantly, resulting in either delay (when waiting for slower devices for aggregation) or accuracy decline (when aggregation proceeds without waiting). In response, we propose a shift towards asynchronous PFL, where the server aggregates updates as soon as they are available. Nevertheless, existing asynchronous protocols are unfit for PFL because they are devised for federated training of a single global model. They suffer from slow convergence and decreased accuracy when confronted with the severe data heterogeneity prevalent in PFL. Furthermore, they often exclude slower devices for staleness control, which notably compromises accuracy when these devices possess critical personalized data. Therefore, we propose EchoPFL, a coordination mechanism for asynchronous PFL. Central to EchoPFL is to include updates from all mobile devices regardless of their latency. To cope with the inevitable staleness from slow devices, EchoPFL revisits model broadcasting. It intelligently converts the unscalable broadcast to on-demand broadcast, leveraging the asymmetrical bandwidth in wireless networks and the dynamic clustering-based PFL. Experiments show that compared to status quo approaches, EchoPFL achieves a reduction of up to 88.2% in convergence time, an improvement of up to 46% in accuracy, and a decrease of 37% in communication costs.

Networking · CC · Performer · Convolution · Layer ·

January 29, 2024

TFDMNet: A Novel Network Structure Combines the Time Domain and Frequency Domain Features

Hengyue Pan,Yixin Chen,Zhiliang Tian,Peng Qiao,Linbo Qiao,Dongsheng Li

from arxiv, This paper is the updated edition of our paper Learning Convolutional Neural Networks in the Frequency Domain (arXiv:2204.06718). Compared with the previous edition, we design a mixture model to balance computation complexity and memory usage

Convolutional neural networks (CNNs) have achieved impressive success in computer vision during the past few decades. The image convolution operation helps CNNs achieve good performance on image-related tasks. However, it also has high computation complexity and is hard to parallelize. This paper proposes a novel Element-wise Multiplication Layer (EML) to replace convolution layers, which can be trained in the frequency domain. Theoretical analyses show that EMLs lower the computation complexity and are easier to parallelize. Moreover, we introduce a Weight Fixation mechanism to alleviate the problem of over-fitting, and analyze the working behavior of Batch Normalization and Dropout in the frequency domain. To balance computation complexity and memory usage, we propose a new network structure, namely Time-Frequency Domain Mixture Network (TFDMNet), which combines the advantages of both convolution layers and EMLs. Experimental results imply that TFDMNet achieves good performance on the MNIST, CIFAR-10 and ImageNet databases with fewer operations than the corresponding CNNs.
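
The idea of replacing convolution with element-wise multiplication in the frequency domain rests on the convolution theorem, which the following sketch checks numerically on a 1-D signal. It is only an illustration of that theorem, not the paper's EML: the function names are hypothetical, a naive O(n^2) DFT is used for self-containment, and the convolution is assumed to be circular.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for k in range(n))
        for m in range(n)
    ]
    return [v / n for v in out] if inverse else out


def circular_conv(x, h):
    """Direct circular convolution of two equal-length sequences."""
    n = len(x)
    return [sum(x[k] * h[(m - k) % n] for k in range(n)) for m in range(n)]


def conv_via_frequency(x, h):
    """Same result via element-wise multiplication of the two spectra."""
    X, H = dft(x), dft(h)
    Y = [a * b for a, b in zip(X, H)]  # the element-wise multiplication step
    return [v.real for v in dft(Y, inverse=True)]


x = [1.0, 2.0, 3.0, 4.0]
h = [1.0, 0.0, -1.0, 0.0]
print(circular_conv(x, h))       # [-2.0, -2.0, 2.0, 2.0]
print(conv_via_frequency(x, h))  # matches the line above up to rounding error
```

Each output element of the frequency-domain product depends on a single pair of inputs, which is why such a layer is easier to parallelize than a sliding-window convolution.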

Microsoft Surface · Diversity · Nuance · Commentator · Language modeling ·

January 26, 2024

Under the Surface: Tracking the Artifactuality of LLM-Generated Data

Debarati Das,Karin De Langis,Anna Martin,Jaehyung Kim,Minhwa Lee,Zae Myung Kim,Shirley Hayati,Risako Owan,Bin Hu,Ritik Parkar,Ryan Koo,Jonginn Park,Aahan Tyagi,Libby Ferland,Sanjali Roy,Vincent Liu,Dongyeop Kang

from arxiv, Core Authors: Debarati Das, Karin De Langis, Anna Martin, Jaehyung Kim, Minhwa Lee and Zae Myung Kim | Project lead : Debarati Das | PI : Dongyeop Kang

This work delves into the expanding role of large language models (LLMs) in generating artificial data. LLMs are increasingly employed to create a variety of outputs, including annotations, preferences, instruction prompts, simulated dialogues, and free text. As these forms of LLM-generated data often intersect in their application, they exert mutual influence on each other and raise significant concerns about the quality and diversity of the artificial data incorporated into training cycles, leading to an artificial data ecosystem. To the best of our knowledge, this is the first study to aggregate various types of LLM-generated text data, from more tightly constrained data like "task labels" to more lightly constrained "free-form text". We then stress test the quality and implications of LLM-generated artificial data, comparing it with human data across various existing benchmarks. Despite artificial data's capability to match human performance, this paper reveals significant hidden disparities, especially in complex tasks where LLMs often miss the nuanced understanding of intrinsic human-generated content. This study critically examines diverse LLM-generated data and emphasizes the need for ethical practices in data creation and when using LLMs. It highlights the LLMs' shortcomings in replicating human traits and behaviors, underscoring the importance of addressing biases and artifacts produced in LLM-generated content for future research and development. All data and code are available on our project page.

MoDELS · Rank · Sample · Statistical methods · Game theory ·

January 25, 2024

Properties of the Mallows Model Depending on the Number of Alternatives: A Warning for an Experimentalist

Niclas Boehmer,Piotr Faliszewski,Sonja Kraiczy

from arxiv, 33 pages, 9 figures

The Mallows model is a popular distribution for ranked data. We empirically and theoretically analyze how the properties of rankings sampled from the Mallows model change when increasing the number of alternatives. We find that real-world data behaves differently than the Mallows model, yet is in line with its recent variant proposed by Boehmer et al. [2021]. As part of our study, we issue several warnings about using the model.
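
For readers unfamiliar with the Mallows model, the sketch below samples rankings from it with the standard repeated insertion procedure, where a ranking has probability proportional to `phi` raised to its Kendall tau distance from a central ranking. The function names and the parameter name `phi` (dispersion, between 0 and 1) are assumptions for illustration and are not taken from the paper.

```python
import random


def sample_mallows(center, phi, rng=random):
    """Draw one ranking from a Mallows model via repeated insertion.

    center : the central ranking (list of distinct items)
    phi    : dispersion in [0, 1]; 0 always returns `center`, 1 is uniform
    rng    : object with a `choices` method (the random module by default)
    """
    ranking = []
    for i, item in enumerate(center, start=1):
        # Inserting at 0-based index j places `item` ahead of i-1-j earlier
        # items, creating i-1-j inversions relative to `center`.
        weights = [phi ** (i - 1 - j) for j in range(i)]
        j = rng.choices(range(i), weights=weights)[0]
        ranking.insert(j, item)
    return ranking


def kendall_tau(a, b):
    """Number of item pairs that rankings a and b order differently."""
    pos = {x: k for k, x in enumerate(b)}
    seq = [pos[x] for x in a]
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )


center = list("abcd")
sample = sample_mallows(center, 0.5, random.Random(0))
print(sample, kendall_tau(sample, center))
```

Sampling many rankings for growing numbers of alternatives and recording `kendall_tau` against the center is one simple way to observe how the model's behavior changes with the number of alternatives, the question the abstract raises.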

Specialization · prototype · Confidence · Similarity · Extensibility ·

January 25, 2024

PRISM: Leveraging Prototype Patient Representations with Feature-Missing-Aware Calibration for EHR Data Sparsity Mitigation

Yinghao Zhu,Zixiang Wang,Long He,Shiyun Xie,Liantao Ma,Chengwei Pan

Electronic Health Record (EHR) data, while rich in information, often suffers from sparsity, posing significant challenges in predictive modeling. Traditional imputation methods inadequately distinguish between real and imputed data, leading to potential inaccuracies in models. Addressing this, we introduce PRISM, a novel approach that indirectly imputes data through prototype representations of similar patients, thus ensuring denser and more accurate embeddings. PRISM innovates further with a feature confidence learner module, which evaluates the reliability of each feature in light of missing data. Additionally, it incorporates a novel patient similarity metric that accounts for feature confidence, avoiding overreliance on imprecise imputed values. Our extensive experiments on the MIMIC-III and MIMIC-IV datasets demonstrate PRISM's superior performance in predicting in-hospital mortality and 30-day readmission tasks, showcasing its effectiveness in handling EHR data sparsity. For the sake of reproducibility and further research, we have made the code publicly available at //github.com/yhzhu99/PRISM.

Neural Networks · Directed · Networks · Performer · Pattern Recognition ·

August 30, 2021

Efficient Visual Recognition with Deep Neural Networks: A Survey on Recent Advances and New Directions

Yang Wu,Dingheng Wang,Xiaotong Lu,Fan Yang,Guoqi Li,Weisheng Dong,Jianbo Shi

Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence. It has great fundamental importance and strong industrial needs. Deep neural networks (DNNs) have largely boosted performance on many concrete tasks, with the help of large amounts of training data and new powerful computation resources. Though recognition accuracy is usually the first concern for new advances, efficiency is actually rather important and sometimes critical for both academic research and industrial applications. Moreover, insightful views on the opportunities and challenges of efficiency are also highly required for the entire community. While general surveys on the efficiency issue of DNNs have been done from various perspectives, as far as we are aware, scarcely any of them focused on visual recognition systematically, and thus it is unclear which advances are applicable to it and what else should be considered. In this paper, we present a review of the recent advances with our suggestions on possible new directions towards improving the efficiency of DNN-related visual recognition approaches. We investigate not only from the model but also the data point of view (which is not the case in existing surveys), and focus on the three most studied data types (images, videos and points). This paper attempts to provide a systematic summary via a comprehensive survey which can serve as a valuable reference and inspire both researchers and practitioners who work on visual recognition problems.

GPU · Graph · Better · Neural Networks · Visual question answering ·

March 31, 2020

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

Difei Gao,Ke Li,Ruiping Wang,Shiguang Shan,Xilin Chen

from arxiv, Published as a CVPR2020 paper

Answering questions that require reading texts in an image is challenging for current models. One key difficulty of this task is that rare, polysemous, and ambiguous words frequently appear in images, e.g., names of places, products, and sports teams. To overcome this difficulty, resorting only to pre-trained word embedding models is far from enough. A desired model should utilize the rich information in multiple modalities of the image to help understand the meaning of scene texts, e.g., the prominent text on a bottle is most likely to be the brand. Following this idea, we propose a novel VQA approach, Multi-Modal Graph Neural Network (MM-GNN). It first represents an image as a graph consisting of three sub-graphs, depicting visual, semantic, and numeric modalities respectively. Then, we introduce three aggregators which guide the message passing from one graph to another to utilize the contexts in various modalities, so as to refine the features of nodes. The updated nodes have better features for the downstream question answering module. Experimental evaluations show that our MM-GNN represents the scene texts better and clearly improves performance on two VQA tasks that require reading scene texts.

Faster R-CNN · domain shift · R-CNN · Object detection · Reducible ·

March 8, 2018

Domain Adaptive Faster R-CNN for Object Detection in the Wild

Yuhua Chen,Wen Li,Christos Sakaridis,Dengxin Dai,Luc Van Gool

from arxiv, Accepted to CVPR 2018

Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in an adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.

Image captioning · MoDELS · Performer · state-of-the-art · Vision ·

January 30, 2018

Image Captioning at Will: A Versatile Scheme for Effectively Injecting Sentiments into Image Descriptions

Quanzeng You,Hailin Jin,Jiebo Luo

from arxiv, 8 pages, 5 figures and 4 tables

Automatic image captioning has recently approached human-level performance due to the latest advances in computer vision and natural language understanding. However, most of the current models can only generate plain factual descriptions about the content of a given image. For human beings, by contrast, image caption writing is quite flexible and diverse, where additional language dimensions, such as emotion, humor and language styles, are often incorporated to produce diverse, emotional, or appealing captions. In particular, we are interested in generating sentiment-conveying image descriptions, which has received little attention. The main challenge is how to effectively inject sentiments into the generated captions without altering the semantic matching between the visual content and the generated descriptions. In this work, we propose two different models, which employ different schemes for injecting sentiments into image captions. Compared with the few existing approaches, the proposed models are much simpler and yet more effective. The experimental results show that our models outperform the state-of-the-art models in generating sentimental (i.e., sentiment-bearing) image captions. In addition, we can also easily manipulate the model by assigning different sentiments to the testing image to generate captions with the corresponding sentiments.


