Adopting a two-stage paradigm of pretraining followed by fine-tuning, Pretrained Language Models (PLMs) have achieved substantial advancements in the field of natural language processing. However, in real-world scenarios, data labels are often noisy due to the complex annotation process, making it essential to develop strategies for fine-tuning PLMs with such noisy labels. To this end, we introduce an innovative approach for fine-tuning PLMs using noisy labels, which incorporates the guidance of Large Language Models (LLMs) like ChatGPT. This guidance helps distinguish clean samples from noisy ones and provides supplementary information beyond the noisy labels, thereby improving learning during PLM fine-tuning. Extensive experiments on synthetic and real-world noisy datasets demonstrate the advantages of our framework over state-of-the-art baselines.

Related content

Estimation/estimators · Marginalization · Copulas · Maximum likelihood estimation · MoDELS

December 21, 2023

Two-Stage Pseudo Maximum Likelihood Estimation of Semiparametric Copula-based Regression Models for Semi-Competing Risks Data

Sakie J. Arachchige, Xinyuan Chen, Qian M. Zhou

from arXiv, 24 pages, 1 figure

We propose a two-stage estimation procedure for a copula-based model with semi-competing risks data, where the non-terminal event is subject to dependent censoring by the terminal event, and both events are subject to independent censoring. Under a copula-based model, the marginal survival functions of individual event times are specified by semiparametric transformation models, and the dependence between the bivariate event times is specified by a parametric copula function. For the estimation procedure, in the first stage, the parameters associated with the marginal of the terminal event are estimated using only the corresponding observed outcomes, and in the second stage, the marginal parameters for the non-terminal event time and the copula parameter are estimated by maximizing a pseudo-likelihood function based on the joint distribution of the bivariate event times. We derived the asymptotic properties of the proposed estimator and provided an analytic variance estimator for inference. Through simulation studies, we showed that our approach leads to consistent estimates with less computational cost and more robustness compared to the one-stage procedure developed in Chen (2012), where all parameters were estimated simultaneously. In addition, our approach demonstrates better finite-sample performance than another existing two-stage estimation method proposed in Zhu et al. (2021).

Regular · MoDELS · 3D · Performer · state-of-the-art

December 21, 2023

3D Points Splatting for Real-Time Dynamic Hand Reconstruction

Zheheng Jiang, Hossein Rahmani, Sue Black, Bryan M. Williams

We present 3D Points Splatting Hand Reconstruction (3D-PSHR), a real-time and photo-realistic hand reconstruction approach. We propose a self-adaptive canonical points upsampling strategy to achieve high-resolution hand geometry representation. This is followed by a self-adaptive deformation that deforms the hand from the canonical space to the target pose, adapting to the dynamically changing canonical points; in contrast to the common practice of subdividing the MANO model, this offers greater flexibility and results in improved geometry fitting. To model texture, we disentangle the appearance color into the intrinsic albedo and pose-aware shading, which are learned through a Context-Attention module. Moreover, our approach allows the geometric and the appearance models to be trained simultaneously in an end-to-end manner. We demonstrate that our method is capable of producing animatable, photorealistic and relightable hand reconstructions using multiple datasets, including monocular videos captured with handheld smartphones and large-scale multi-view videos featuring various hand poses. We also demonstrate that our approach achieves real-time rendering speeds while simultaneously maintaining superior performance compared to existing state-of-the-art methods.

Performer · UDA · MoDELS · Scenario · Identifiable

December 21, 2023

Multi-Modal Domain Adaptation Across Video Scenes for Temporal Video Grounding

Haifeng Huang, Yang Zhao, Zehan Wang, Yan Xia, Zhou Zhao

Temporal Video Grounding (TVG) aims to localize the temporal boundary of a specific segment in an untrimmed video based on a given language query. Since datasets in this domain are often gathered from limited video scenes, models tend to overfit to scene-specific factors, which leads to suboptimal performance when encountering new scenes in real-world applications. In a new scene, the fine-grained annotations are often insufficient due to the expensive labor cost, while the coarse-grained video-query pairs are easier to obtain. Thus, to address this issue and enhance model performance on new scenes, we explore the TVG task in an unsupervised domain adaptation (UDA) setting across scenes for the first time, where the video-query pairs in the source scene (domain) are labeled with temporal boundaries, while those in the target scene are not. Under the UDA setting, we introduce a novel Adversarial Multi-modal Domain Adaptation (AMDA) method to adaptively adjust the model's scene-related knowledge by incorporating insights from the target data. Specifically, we tackle the domain gap by utilizing domain discriminators, which help identify valuable scene-related features effective across both domains. Concurrently, we mitigate the semantic gap between different modalities by aligning video-query pairs with related semantics. Furthermore, we employ a mask-reconstruction approach to enhance the understanding of temporal semantics within a scene. Extensive experiments on Charades-STA, ActivityNet Captions, and YouCook2 demonstrate the effectiveness of our proposed method.

Monte Carlo · Unbiased · Sample · Estimation/estimators · Sampling methods

December 21, 2023

Unbiased and Consistent Nested Sampling via Sequential Monte Carlo

Robert Salomone, Leah F. South, Adam M. Johansen, Christopher Drovandi, Dirk P. Kroese

from arXiv, 21 pages main text, 6 pages supplementary material. Includes proof of consistency for an adaptive nested sampling sequential Monte Carlo algorithm.

We introduce a new class of sequential Monte Carlo methods called nested sampling via sequential Monte Carlo (NS-SMC), which reformulates the essence of the nested sampling method of Skilling (2006) in terms of sequential Monte Carlo techniques. This new framework allows convergence results to be obtained in the setting when Markov chain Monte Carlo (MCMC) is used to produce new samples. An additional benefit is that marginal likelihood (normalizing constant) estimates are unbiased. In contrast to NS, the analysis of NS-SMC does not require the (unrealistic) assumption that the simulated samples be independent. We show that a minor adjustment to our adaptive NS-SMC algorithm recovers the original NS algorithm, which provides insights as to why NS seems to produce accurate estimates despite a typical violation of its assumptions. A numerical study is conducted where the performance of NS-SMC and temperature-annealed SMC is compared on challenging problems. Code for the experiments is made available online at //github.com/LeahPrice/SMC-NS .
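For orientation, the classical nested sampling (NS) estimator that the abstract takes as its starting point can be sketched on a toy problem. This is a minimal illustration of plain NS, not the NS-SMC method itself. It assumes a standard normal likelihood and a uniform prior on [-5, 5], chosen so that the constrained prior can be sampled exactly and independently, which is precisely the assumption the abstract calls unrealistic in practice. The function names and default parameters are hypothetical.

```python
import math
import random


def log_likelihood(x):
    """Standard normal log-density, used here as a toy likelihood (an assumption)."""
    return -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)


def nested_sampling(n_live=200, n_iter=2000, half_width=5.0, seed=0):
    """Estimate the log marginal likelihood under a Uniform(-half_width, half_width) prior."""
    rng = random.Random(seed)
    live = [rng.uniform(-half_width, half_width) for _ in range(n_live)]
    z = 0.0       # running evidence estimate
    x_prev = 1.0  # prior mass enclosed by the current likelihood contour
    for i in range(1, n_iter + 1):
        # Find the live point with the lowest likelihood.
        worst = min(range(n_live), key=lambda j: log_likelihood(live[j]))
        x_worst = live[worst]
        # Deterministic shrinkage approximation: X_i = exp(-i / n_live).
        x_curr = math.exp(-i / n_live)
        z += math.exp(log_likelihood(x_worst)) * (x_prev - x_curr)
        x_prev = x_curr
        # Replace it with an independent prior draw subject to L(x) > L(x_worst).
        # For this symmetric, unimodal likelihood that region is |x| < |x_worst|.
        bound = abs(x_worst)
        live[worst] = rng.uniform(-bound, bound)
    # Add the contribution of the remaining live points.
    z += x_prev * sum(math.exp(log_likelihood(x)) for x in live) / n_live
    return math.log(z)
```

On this toy problem the true marginal likelihood is very close to 0.1, so `nested_sampling()` should return a value near `math.log(0.1)`. When the constrained draw is replaced by MCMC moves, the independence assumption no longer holds, which is the setting the abstract addresses.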

Large language models · Language modeling · MoDELS · Knowledge · Controller

December 18, 2023

Opportunities and Challenges of Applying Large Language Models in Building Energy Efficiency and Decarbonization Studies: An Exploratory Overview

Liang Zhang, Zhelun Chen

In recent years, the rapid advancement and impressive capabilities of Large Language Models (LLMs) have been evident across various domains. This paper explores the application, implications, and potential of LLMs in building energy efficiency and decarbonization studies. The wide-ranging capabilities of LLMs are examined in the context of the building energy field, including intelligent control systems, code generation, data infrastructure, knowledge extraction, and education. Despite the promising potential of LLMs, challenges including complex and expensive computation, data privacy, security and copyright, complexity in fine-tuned LLMs, and self-consistency are discussed. The paper concludes with a call for future research focused on the enhancement of LLMs for domain-specific tasks, multi-modal LLMs, and collaborative research between AI and energy experts.

CSS · Computer science · Performer · Analysis · Model evaluation

December 18, 2023

Visualizing High-Dimensional Configuration Spaces For Robots: A Comprehensive Approach for Quantitative and Qualitative Analysis

Jorge Ocampo Jimenez, Wael Suleiman

from arXiv, 8 pages, 12 figures

The reconstruction of Configuration Space (CS) from a limited number of samples plays a vital role in expediting motion planning for random tree algorithms. Traditionally, the evaluation of CS reconstruction is performed through collision checking. However, employing the collision checker as an evaluation measure can be misleading. In particular, a collision checker may exhibit high accuracy even when only a subset of the original CS is reconstructed, limiting the motion planner's ability to find paths comparable to those in the original CS. Additionally, a significant challenge arises when dealing with high-dimensional CSs, as it becomes increasingly difficult, if not impossible, to perform qualitative evaluations when working in dimensions higher than three. In this paper, we introduce a novel approach for representing high-dimensional CSs of manipulator robots in a 2D format. Specifically, we leverage the kinematic chain of manipulator robots and the human ability to perceive colors based on hue. This allows us to construct a visualization comprising a series of pairs of 2D projections. We showcase the efficacy of our method in representing a 7-degree-of-freedom CS of a manipulator robot in a 2D projection. This representation provides qualitative insights into the joint boundaries of the robot and the collision state combinations. From a quantitative perspective, we show that the proposed representation not only captures accuracy but also furnishes additional information, enhancing our ability to compare two different high-dimensional CSs during the deployment phase, beyond what is usually offered by the collision checker. The source code is publicly available on our repository.

Rank · Robustness · MoDELS · INFORMS · Regularization term

December 16, 2023

Perturbation-Invariant Adversarial Training for Neural Ranking Models: Improving the Effectiveness-Robustness Trade-Off

Yu-An Liu, Ruqing Zhang, Mingkun Zhang, Wei Chen, Maarten de Rijke, Jiafeng Guo, Xueqi Cheng

from arXiv, Accepted by AAAI 24

Neural ranking models (NRMs) have shown great success in information retrieval (IR). But their predictions can easily be manipulated using adversarial examples, which are crafted by adding imperceptible perturbations to legitimate documents. This vulnerability raises significant concerns about their reliability and hinders the widespread deployment of NRMs. By incorporating adversarial examples into training data, adversarial training has become the de facto defense approach to adversarial attacks against NRMs. However, this defense mechanism is subject to a trade-off between effectiveness and adversarial robustness. In this study, we establish theoretical guarantees regarding the effectiveness-robustness trade-off in NRMs. We decompose the robust ranking error into two components, i.e., a natural ranking error for effectiveness evaluation and a boundary ranking error for assessing adversarial robustness. Then, we define the perturbation invariance of a ranking model and prove it to be a differentiable upper bound on the boundary ranking error for attainable computation. Informed by our theoretical analysis, we design a novel perturbation-invariant adversarial training (PIAT) method for ranking models to achieve a better effectiveness-robustness trade-off. We design a regularized surrogate loss, in which one term encourages the effectiveness to be maximized while the regularization term encourages the output to be smooth, so as to improve adversarial robustness. Experimental results on several ranking models demonstrate the superiority of PIAT compared to existing adversarial defenses.
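The two-term structure of the regularized surrogate loss described above can be written schematically as follows. This is only a sketch of the general shape. Every symbol is an assumption introduced for illustration and none is taken from the paper: f is the ranking model, q a query, D the candidate documents, D' their adversarially perturbed counterparts, and λ a trade-off weight.

```latex
\mathcal{L}(f) \;=\;
\underbrace{\mathcal{L}_{\mathrm{rank}}\bigl(f;\, q, D\bigr)}_{\text{effectiveness term}}
\;+\; \lambda \,
\underbrace{\mathcal{R}_{\mathrm{inv}}\bigl(f;\, q, D, D'\bigr)}_{\text{smoothness (perturbation-invariance) term}}
```

Under this reading, a larger λ would favor robustness to perturbations and a smaller λ would favor ranking effectiveness on clean data.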

Pruning · Better · CAP · contrastive · MoDELS

December 14, 2021

From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression

Runxin Xu, Fuli Luo, Chengyu Wang, Baobao Chang, Jun Huang, Songfang Huang, Fei Huang

from arXiv, Accepted to AAAI 2022

Pre-trained Language Models (PLMs) have achieved great success in various Natural Language Processing (NLP) tasks under the pre-training and fine-tuning paradigm. With large quantities of parameters, PLMs are computation-intensive and resource-hungry. Hence, model pruning has been introduced to compress large-scale PLMs. However, most prior approaches only consider task-specific knowledge for downstream tasks and ignore the essential task-agnostic knowledge during pruning, which may cause catastrophic forgetting and lead to poor generalization ability. To maintain both task-agnostic and task-specific knowledge in our pruned model, we propose ContrAstive Pruning (CAP) under the paradigm of pre-training and fine-tuning. It is designed as a general framework, compatible with both structured and unstructured pruning. Unified in contrastive learning, CAP enables the pruned model to learn from the pre-trained model for task-agnostic knowledge, and from the fine-tuned model for task-specific knowledge. Besides, to better retain the performance of the pruned model, the snapshots (i.e., the intermediate models at each pruning iteration) also serve as effective supervision for pruning. Our extensive experiments show that adopting CAP consistently yields significant improvements, especially in extremely high sparsity scenarios. With only 3% of model parameters retained (i.e., 97% sparsity), CAP successfully achieves 99.2% and 96.3% of the original BERT performance in QQP and MNLI tasks. In addition, our probing experiments demonstrate that the model pruned by CAP tends to achieve better generalization ability.
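To make the sparsity figures concrete, the sketch below shows what "3% of parameters retained (97% sparsity)" means for plain unstructured magnitude pruning on a flat list of weights. It illustrates only the pruning step that a framework like CAP is described as being compatible with. It does not implement CAP's contrastive objectives, and the function name `magnitude_prune` is hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights so that roughly `sparsity` of them become zero."""
    n_keep = max(1, round(len(weights) * (1.0 - sparsity)))
    # Indices of the n_keep largest weights by absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:n_keep])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


# Usage: with 100 weights and 97% sparsity, only the 3 largest survive.
weights = [float(i) for i in range(1, 101)]
pruned = magnitude_prune(weights, sparsity=0.97)
```

In iterative pruning, a step like this would be repeated with gradually increasing sparsity, and the intermediate models correspond to the "snapshots" mentioned in the abstract.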

entity · Few-shot learning · Attention mechanism · Graph · Networking

October 19, 2020

Adaptive Attentional Network for Few-Shot Knowledge Graph Completion

Jiawei Sheng, Shu Guo, Zhenyu Chen, Juwei Yue, Lihong Wang, Tingwen Liu, Hongbo Xu

from arXiv, 11 pages, 3 figures

Few-shot Knowledge Graph (KG) completion is a focus of current research, where each task aims at querying unseen facts of a relation given its few-shot reference entity pairs. Recent attempts solve this problem by learning static representations of entities and references, ignoring their dynamic properties, i.e., entities may exhibit diverse roles within task relations, and references may make different contributions to queries. This work proposes an adaptive attentional network for few-shot KG completion by learning adaptive entity and reference representations. Specifically, entities are modeled by an adaptive neighbor encoder to discern their task-oriented roles, while references are modeled by an adaptive query-aware aggregator to differentiate their contributions. Through the attention mechanism, both entities and references can capture their fine-grained semantic meanings, and thus render more expressive representations. This will be more predictive for knowledge acquisition in the few-shot scenario. Evaluation in link prediction on two public datasets shows that our approach achieves new state-of-the-art results with different few-shot sizes.

Convolutional neural networks · Neural Networks · Performer · Seven · Processing (programming language)

January 17, 2019

A Survey of the Recent Architectures of Deep Convolutional Neural Networks

Asifullah Khan, Anabia Sohail, Umme Zahoora, Aqsa Saeed Qureshi

from arXiv, 60 pages, 11 figures, 1 table

Deep Convolutional Neural Networks (CNNs) are a special type of neural network that has shown state-of-the-art results on various competitive benchmarks. The powerful learning ability of deep CNNs is largely achieved through multiple non-linear feature extraction stages that can automatically learn hierarchical representations from the data. The availability of large amounts of data and improvements in hardware processing units have accelerated research in CNNs, and very interesting deep CNN architectures have recently been reported. The recent race in deep CNN architectures for achieving high performance on challenging benchmarks has shown that innovative architectural ideas, as well as parameter optimization, can improve CNN performance on various vision-related tasks. In this regard, different ideas in CNN design have been explored, such as the use of different activation and loss functions, parameter optimization, regularization, and restructuring of processing units. However, the major improvement in representational capacity is achieved by the restructuring of the processing units. In particular, the idea of using a block as a structural unit instead of a layer is gaining substantial appreciation. This survey thus focuses on the intrinsic taxonomy present in the recently reported CNN architectures and, consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature map exploitation, channel boosting and attention. Additionally, it covers the elementary understanding of the CNN components and sheds light on the current challenges and applications of CNNs.