
As the most fundamental tasks of computer vision, object detection and segmentation have made tremendous progress in the deep learning era. Because manual labeling is expensive, the annotated categories in existing datasets are often small-scale and pre-defined, i.e., state-of-the-art detectors and segmenters fail to generalize beyond a closed vocabulary. To address this limitation, the last few years have witnessed increasing attention toward Open-Vocabulary Detection (OVD) and Segmentation (OVS). In this survey, we provide a comprehensive review of the past and recent development of OVD and OVS. To this end, we develop a taxonomy according to the type of task and methodology. We find that how weak supervision signals are permitted and used discriminates well between methodologies, including: visual-semantic space mapping, novel visual feature synthesis, region-aware training, pseudo-labeling, knowledge distillation, and transfer learning. The proposed taxonomy is universal across different tasks, covering object detection; semantic, instance, and panoptic segmentation; and 3D scene and video understanding. For each category, we thoroughly discuss its main principles, key challenges, development routes, strengths, and weaknesses. In addition, we benchmark each task along with the vital components of each method. Finally, several promising directions are provided to stimulate future research.
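A core mechanism shared by several of the methodologies above is visual-semantic space mapping: region features are classified by matching them against text embeddings of arbitrary class names, so a novel category needs only a name. Below is a minimal sketch with toy vectors; real systems would use CLIP-style encoders, and every embedding here is a hypothetical stand-in:

```python
import numpy as np

def classify_regions(region_feats, text_embeds, class_names):
    """Assign each region the class whose text embedding is most similar."""
    # L2-normalize both sets of vectors, then take cosine similarity.
    r = region_feats / np.linalg.norm(region_feats, axis=1, keepdims=True)
    t = text_embeds / np.linalg.norm(text_embeds, axis=1, keepdims=True)
    sims = r @ t.T
    return [class_names[i] for i in sims.argmax(axis=1)]

# Toy 4-d embeddings: two "region" features and three class-name embeddings.
regions = np.array([[1.0, 0.1, 0.0, 0.0],
                    [0.0, 0.0, 1.0, 0.2]])
texts = np.array([[1.0, 0.0, 0.0, 0.0],   # "cat"
                  [0.0, 1.0, 0.0, 0.0],   # "dog"
                  [0.0, 0.0, 1.0, 0.0]])  # "zebra" (a novel class)
print(classify_regions(regions, texts, ["cat", "dog", "zebra"]))
```

Because the class set is just a list of name embeddings, extending the vocabulary at test time amounts to appending rows to `texts`.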

Related content

Taxonomy is the practice and science of classification. Wikipedia categories illustrate one taxonomy, and the full taxonomy of Wikipedia categories can be extracted automatically. As of 2009, it had been shown that manually constructed taxonomies (for example, those of computational lexicons such as WordNet) can be used to improve and restructure the Wikipedia category taxonomy. In a broader sense, taxonomy also applies to relationship schemes other than parent-child hierarchies, such as network structures. A taxonomy may then include a single child with multiple parents; for example, "car" might appear under both "vehicle" and "steel structures". For some, however, this merely means that "car" is part of several different taxonomies. A taxonomy may also simply organize things into groups, or take the form of an alphabetical list; in that case, though, the term "vocabulary" is more appropriate. In current knowledge-management usage, taxonomies are considered narrower than ontologies, since ontologies apply a wider variety of relation types.
Mathematically, a hierarchical taxonomy is a tree structure of classifications over a given set of objects. At the top of this structure is a single classification that applies to all objects: the root node. Nodes below the root are more specific classifications that apply to subsets of the total set of classified objects. Reasoning proceeds from the general to the more specific.
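The multi-parent "car" example from the text makes the taxonomy a directed acyclic graph rather than a strict tree. A minimal sketch (node names are illustrative only):

```python
# Taxonomy as a child -> parents mapping; "car" has two parents,
# so this is a DAG, not a tree.
parents = {
    "car": {"vehicle", "steel structure"},
    "vehicle": {"thing"},
    "steel structure": {"thing"},
    "thing": set(),
}

def ancestors(node):
    """All classifications above `node`, from the specific toward the general."""
    seen = set()
    stack = [node]
    while stack:
        for p in parents[stack.pop()]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

print(ancestors("car"))
```

A strict tree would allow only one parent per node; the `set` of parents is what accommodates membership in several taxonomies at once.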


Temporal characteristics are prominently evident in a substantial volume of knowledge, which underscores the pivotal role of Temporal Knowledge Graphs (TKGs) in both academia and industry. However, TKGs often suffer from incompleteness for three main reasons: the continuous emergence of new knowledge, the weakness of algorithms for extracting structured information from unstructured data, and the lack of information in source datasets. Thus, the task of Temporal Knowledge Graph Completion (TKGC) has attracted increasing attention, aiming to predict missing items based on the available information. In this paper, we provide a comprehensive review of TKGC methods and their details. Specifically, this paper consists of three components: 1) Background, which covers the preliminaries of TKGC methods, the loss functions required for training, and the datasets and evaluation protocols; 2) Interpolation, which estimates and predicts missing elements, or sets of elements, from the relevant available information, and further categorizes related TKGC methods by how they process temporal information; 3) Extrapolation, which typically focuses on continuous TKGs and predicts future events, and classifies all extrapolation methods by the algorithms they utilize. We further pinpoint the challenges and discuss future research directions of TKGC.
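As a toy illustration of how an interpolation method can score a missing element, here is a translation-style score in the spirit of TTransE, where a timestamp embedding is added to the usual head-plus-relation translation. All names and embeddings below are hypothetical stand-ins (random vectors rather than learned ones):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8
# Toy embeddings for entities, relations, and timestamps (illustrative only).
entity = {e: rng.normal(size=dim) for e in ["Obama", "USA", "Harvard"]}
relation = {"president_of": rng.normal(size=dim)}
time_emb = {2010: rng.normal(size=dim), 2020: rng.normal(size=dim)}

def score(h, r, t, tau):
    """Translation-style score: smaller distance = more plausible quadruple."""
    return -np.linalg.norm(entity[h] + relation[r] + time_emb[tau] - entity[t])

# Rank candidate tails for the incomplete quadruple (Obama, president_of, ?, 2010).
candidates = ["USA", "Harvard"]
ranked = sorted(candidates,
                key=lambda t: score("Obama", "president_of", t, 2010),
                reverse=True)
print(ranked)
```

With trained embeddings, the same ranking step is what produces the completion; the evaluation protocols mentioned above (e.g., ranking metrics) score how highly the true tail is placed.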

Foundation models pretrained on diverse data at scale have demonstrated extraordinary capabilities in a wide range of vision and language tasks. When such models are deployed in real world environments, they inevitably interface with other entities and agents. For example, language models are often used to interact with human beings through dialogue, and visual perception models are used to autonomously navigate neighborhood streets. In response to these developments, new paradigms are emerging for training foundation models to interact with other agents and perform long-term reasoning. These paradigms leverage the existence of ever-larger datasets curated for multimodal, multitask, and generalist interaction. Research at the intersection of foundation models and decision making holds tremendous promise for creating powerful new systems that can interact effectively across a diverse range of applications such as dialogue, autonomous driving, healthcare, education, and robotics. In this manuscript, we examine the scope of foundation models for decision making, and provide conceptual tools and technical background for understanding the problem space and exploring new research directions. We review recent approaches that ground foundation models in practical decision making applications through a variety of methods such as prompting, conditional generative modeling, planning, optimal control, and reinforcement learning, and discuss common challenges and open problems in the field.

Recommendation systems have become popular and effective tools that help users discover items of interest by modeling user preferences and item properties from implicit interactions (e.g., purchasing and clicking). Humans perceive the world by processing modality signals (e.g., audio, text, and images), which has inspired researchers to build recommender systems that can understand and interpret data from different modalities. Such models can capture the hidden relations between modalities and possibly recover complementary information that cannot be captured by a uni-modal approach and implicit interactions alone. The goal of this survey is to provide a comprehensive review of recent research efforts on multimodal recommendation. Specifically, it shows a clear pipeline with commonly used techniques at each step and classifies the models by the methods they use. Additionally, a code framework has been designed that helps researchers new to this area understand the principles and techniques and easily run state-of-the-art models. Our framework is located at: //github.com/enoche/MMRec
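One common pattern behind multimodal recommenders is late fusion: each item's modality embeddings are combined into a single vector, and items are then ranked by a user-item score. A minimal sketch with toy 3-d embeddings (the weighting scheme and all vectors are illustrative, not drawn from any surveyed model):

```python
import numpy as np

def fuse(text_emb, image_emb, w_text=0.5):
    """Late fusion: a weighted sum of modality embeddings (one common choice)."""
    return w_text * text_emb + (1.0 - w_text) * image_emb

def recommend(user_emb, item_embs, top_k=2):
    """Rank items by dot-product preference score, highest first."""
    scores = item_embs @ user_emb
    return [int(i) for i in np.argsort(-scores)[:top_k]]

# Toy embeddings for two items, each with a text view and an image view.
items = np.stack([
    fuse(np.array([1.0, 0.0, 0.0]), np.array([0.8, 0.2, 0.0])),
    fuse(np.array([0.0, 1.0, 0.0]), np.array([0.0, 0.9, 0.1])),
])
user = np.array([0.1, 1.0, 0.0])
print(recommend(user, items))
```

Richer models replace the weighted sum with learned fusion (e.g., attention), but the ranking step stays structurally the same.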

By interacting, synchronizing, and cooperating with its physical counterpart in real time, the digital twin promises to enable an intelligent, predictive, and optimized modern city. By interconnecting massive physical entities and their virtual twins through inter-twin and intra-twin communications, the Internet of Digital Twins (IoDT) enables free data exchange, dynamic mission cooperation, and efficient information aggregation for composite insights across vast numbers of physical/virtual entities. However, as the IoDT incorporates various cutting-edge technologies to spawn this new ecology, severe known and unknown security flaws and privacy invasions hinder its wide deployment. Besides, the intrinsic characteristics of the IoDT, such as its decentralized structure, information-centric routing, and semantic communications, entail critical challenges for security service provisioning. To this end, this paper presents an in-depth review of the IoDT with respect to system architecture, enabling technologies, and security/privacy issues. Specifically, we first explore a novel distributed IoDT architecture with cyber-physical interactions and discuss its key characteristics and communication modes. Afterward, we investigate the taxonomy of security and privacy threats in the IoDT, discuss the key research challenges, and review the state-of-the-art defense approaches. Finally, we point out new trends and open research directions related to the IoDT.

Over the past few years, the rapid development of deep learning technologies for computer vision has greatly promoted the performance of medical image segmentation (MedISeg). However, recent MedISeg publications usually focus on presenting their major contributions (e.g., network architectures, training strategies, and loss functions) while unwittingly ignoring some marginal implementation details (also known as "tricks"), leading to a potential problem of unfair experimental comparisons. In this paper, we collect a series of MedISeg tricks for different model implementation phases (i.e., pre-training, data pre-processing, data augmentation, model implementation, model inference, and result post-processing), and experimentally explore the effectiveness of these tricks on consistent baseline models. Compared to paper-driven surveys that only blandly focus on the advantages and limitations of segmentation models, our work provides a large number of solid experiments and is more technically operable. With extensive experimental results on representative 2D and 3D medical image datasets, we explicitly clarify the effect of these tricks. Moreover, based on the surveyed tricks, we have also open-sourced a strong MedISeg repository, where each component has the advantage of being plug-and-play. We believe that this milestone work not only completes a comprehensive and complementary survey of state-of-the-art MedISeg approaches, but also offers a practical guide for addressing future medical image processing challenges, including but not limited to small-dataset learning, class-imbalance learning, multi-modality learning, and domain adaptation. The code has been released at: //github.com/hust-linyi/MedISeg

The incredible development of federated learning (FL) has benefited various tasks in the domains of computer vision and natural language processing, and existing frameworks such as TFF and FATE have made deployment easy in real-world applications. However, federated graph learning (FGL), even though graph data are prevalent, has not been well supported due to its unique characteristics and requirements. The lack of a dedicated FGL framework increases the effort required for reproducible research and real-world deployment. Motivated by this strong demand, in this paper we first discuss the challenges in creating an easy-to-use FGL package and accordingly present our implemented package FederatedScope-GNN (FS-G), which provides (1) a unified view for modularizing and expressing FGL algorithms; (2) a comprehensive DataZoo and ModelZoo for out-of-the-box FGL capability; (3) an efficient model auto-tuning component; and (4) off-the-shelf privacy attack and defense abilities. We validate the effectiveness of FS-G through extensive experiments, which simultaneously yield many valuable insights about FGL for the community. Moreover, we employ FS-G to serve FGL applications in real-world e-commerce scenarios, where the attained improvements indicate great potential business benefits. We publicly release FS-G, as submodules of FederatedScope, at //github.com/alibaba/FederatedScope to promote FGL research and enable broad applications that would otherwise be infeasible due to the lack of a dedicated package.

A fundamental goal of scientific research is to learn about causal relationships. However, despite its critical role in the life and social sciences, causality has not had the same importance in Natural Language Processing (NLP), which has traditionally placed more emphasis on predictive tasks. This distinction is beginning to fade, with an emerging area of interdisciplinary research at the convergence of causal inference and language processing. Still, research on causality in NLP remains scattered across domains without unified definitions, benchmark datasets and clear articulations of the remaining challenges. In this survey, we consolidate research across academic areas and situate it in the broader NLP landscape. We introduce the statistical challenge of estimating causal effects, encompassing settings where text is used as an outcome, treatment, or as a means to address confounding. In addition, we explore potential uses of causal inference to improve the performance, robustness, fairness, and interpretability of NLP models. We thus provide a unified overview of causal inference for the computational linguistics community.
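The statistical challenge of confounding mentioned above can be illustrated with backdoor adjustment over a single binary confounder: the treatment-control difference is computed within each stratum of the confounder and then averaged by the stratum's probability. The data below are fabricated purely for illustration:

```python
# records: (confounder z, treated flag, outcome)
records = [
    (0, 1, 1.0), (0, 1, 0.9), (0, 0, 0.5), (0, 0, 0.4),
    (1, 1, 0.8), (1, 1, 0.7), (1, 0, 0.2), (1, 0, 0.3),
]

def mean(xs):
    return sum(xs) / len(xs)

def ate_adjusted(records):
    """Average treatment effect: per-stratum differences weighted by P(Z=z)."""
    effect = 0.0
    for z in {z for z, _, _ in records}:
        stratum = [r for r in records if r[0] == z]
        treated = [y for _, t, y in stratum if t == 1]
        control = [y for _, t, y in stratum if t == 0]
        p_z = len(stratum) / len(records)
        effect += p_z * (mean(treated) - mean(control))
    return effect

print(round(ate_adjusted(records), 3))
```

In text settings the confounder is rarely a single binary variable; it is often latent in the text itself, which is precisely what makes the estimation problem hard.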

With the advances of data-driven machine learning research, a wide variety of prediction problems have been tackled. It has become critical to explore how machine learning, and specifically deep learning, can be exploited to analyse healthcare data. A major limitation of existing methods has been their focus on grid-like data; however, the structure of physiological recordings is often irregular and unordered, which makes it difficult to conceptualise them as a matrix. As such, graph neural networks have attracted significant attention by exploiting implicit information that resides in a biological system, with interacting nodes connected by edges whose weights can be either temporal associations or anatomical junctions. In this survey, we thoroughly review the different types of graph architectures and their applications in healthcare. We provide an overview of these methods in a systematic manner, organized by their domain of application, including functional connectivity, anatomical structure, and electrical-based analysis. We also outline the limitations of existing techniques and discuss potential directions for future research.
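The node-and-edge computation these architectures share can be sketched as a single mean-aggregation graph-convolution layer: each node averages its neighbors' (and its own) features, then applies a linear transform and a nonlinearity. The toy graph and weights below are illustrative only:

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One graph-convolution step: mean over self + neighbors,
    then a linear transform followed by ReLU."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    deg = a_hat.sum(axis=1, keepdims=True)    # node degrees (with self-loop)
    h = (a_hat / deg) @ feats @ weight        # mean aggregation + transform
    return np.maximum(h, 0.0)                 # ReLU

# Toy graph of 3 nodes (e.g., brain regions) with a chain of edges
# standing in for functional-connectivity links.
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
feats = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
weight = np.eye(2)  # identity transform, for readability
out = gcn_layer(adj, feats, weight)
print(out)
```

Stacking such layers lets information propagate over multi-hop paths, which is how these models exploit the relational structure of physiological data.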

Influenced by the stunning success of deep learning in computer vision and language understanding, research in recommendation has shifted to inventing new recommender models based on neural networks. In recent years, we have witnessed significant progress in developing neural recommender models, which generalize and surpass traditional recommender models owing to the strong representation power of neural networks. In this survey paper, we conduct a systematic review on neural recommender models, aiming to summarize the field to facilitate future progress. Distinct from existing surveys that categorize existing methods based on the taxonomy of deep learning techniques, we instead summarize the field from the perspective of recommendation modeling, which could be more instructive to researchers and practitioners working on recommender systems. Specifically, we divide the work into three types based on the data they used for recommendation modeling: 1) collaborative filtering models, which leverage the key source of user-item interaction data; 2) content enriched models, which additionally utilize the side information associated with users and items, like user profile and item knowledge graph; and 3) context enriched models, which account for the contextual information associated with an interaction, such as time, location, and the past interactions. After reviewing representative works for each type, we finally discuss some promising directions in this field, including benchmarking recommender systems, graph reasoning based recommendation models, and explainable and fair recommendations for social good.
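The first type above, collaborative filtering from user-item interaction data alone, is often instantiated as matrix factorization. A minimal SGD sketch on fabricated ratings (the hyperparameters and data are toy choices, not taken from any surveyed model):

```python
import numpy as np

rng = np.random.default_rng(42)
n_users, n_items, k = 3, 4, 2
# Observed interactions: (user, item, rating); all values are toy data.
obs = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 2, 2.0), (2, 3, 5.0)]

# Latent factor matrices for users and items, small random init.
U = rng.normal(scale=0.1, size=(n_users, k))
V = rng.normal(scale=0.1, size=(n_items, k))

lr = 0.05
for _ in range(500):
    for u, i, r in obs:
        err = r - U[u] @ V[i]      # prediction error on this interaction
        U[u] += lr * err * V[i]    # gradient step on the user factors
        V[i] += lr * err * U[u]    # gradient step on the item factors

# Predicted score for user 0 on item 0 (observed rating: 5.0).
print(round(float(U[0] @ V[0]), 2))
```

Neural recommender models generalize this inner product to learned interaction functions, but the factorize-then-score structure is the common starting point.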

Many tasks in natural language processing can be viewed as multi-label classification problems. However, most existing models are trained with the standard cross-entropy loss function and use a fixed prediction policy (e.g., a threshold of 0.5) for all labels, which completely ignores the complexity of, and dependencies among, different labels. In this paper, we propose a meta-learning method to capture these complex label dependencies. More specifically, our method utilizes a meta-learner to jointly learn the training policies and prediction policies for different labels. The training policies are used to train the classifier with the cross-entropy loss function, and the prediction policies are then applied at inference time. Experimental results on fine-grained entity typing and text classification demonstrate that our proposed method obtains more accurate multi-label classification results.
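The fixed-threshold policy criticized above, versus a per-label policy, can be contrasted in a few lines. The probabilities and thresholds below are hypothetical; the paper's actual method learns its policies with a meta-learner, which this sketch does not implement:

```python
import numpy as np

def predict_fixed(probs, threshold=0.5):
    """Standard policy: one global threshold for every label."""
    return (probs >= threshold).astype(int)

def predict_per_label(probs, thresholds):
    """Per-label policy: each label gets its own threshold."""
    return (probs >= thresholds).astype(int)

# Toy probabilities for 4 labels on one example.
probs = np.array([0.6, 0.4, 0.55, 0.1])
# Hypothetical per-label thresholds (e.g., lower for a rare but important label).
thresholds = np.array([0.5, 0.3, 0.7, 0.5])

print(predict_fixed(probs))
print(predict_per_label(probs, thresholds))
```

The two policies disagree on labels 1 and 2, which is exactly the degree of freedom a learned prediction policy exploits.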
