Video Anomaly Detection (VAD) serves as a pivotal technology in intelligent surveillance systems, enabling the temporal or spatial identification of anomalous events within videos. While existing reviews predominantly concentrate on conventional unsupervised methods, they often overlook the emergence of weakly-supervised and fully-unsupervised approaches. To address this gap, this survey extends the conventional scope of VAD beyond unsupervised methods, encompassing a broader spectrum termed Generalized Video Anomaly Event Detection (GVAED). By incorporating recent advancements rooted in diverse assumptions and learning frameworks, this survey introduces an intuitive taxonomy that covers unsupervised, weakly-supervised, supervised and fully-unsupervised VAD methodologies, elucidating the distinctions and interconnections among these research trajectories. In addition, this survey supports prospective researchers by compiling research resources, including public datasets, available codebases, programming tools, and pertinent literature. Furthermore, this survey quantitatively assesses model performance, examines research challenges, and outlines potential avenues for future exploration.

Related content

Taxonomy is the practice and science of classification. Wikipedia categories illustrate a taxonomy, and a full taxonomy of Wikipedia categories can be extracted by automatic means. As of 2009, it has been shown that a manually constructed taxonomy (for example, that of a computational lexicon such as WordNet) can be used to improve and restructure the Wikipedia category taxonomy. In a broader sense, taxonomy also applies to relationship schemes other than parent-child hierarchies, such as network structures. A taxonomy may then include a single child with multiple parents; for example, "car" might appear under both "vehicle" and "steel structure". To some, however, this merely means that "car" is part of several different taxonomies. A taxonomy might also simply organize things into groups, or be an alphabetical list; in that case, however, the term vocabulary is more appropriate. In current usage within knowledge management, taxonomies are considered narrower than ontologies, because ontologies apply a wider variety of relation types.
Mathematically, a hierarchical taxonomy is a tree structure of classifications for a given set of objects. At the top of this structure is a single classification that applies to all objects, the root node. Nodes below this root are more specific classifications that apply to subsets of the total set of classified objects. Reasoning progresses from the general to the more specific.
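The tree structure described above can be illustrated with a minimal Python sketch. The class name `TaxonNode`, the helper `path_to`, and the example categories are all hypothetical and chosen only for illustration; they are not taken from any particular taxonomy or library.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TaxonNode:
    """One classification in a hierarchical taxonomy (hypothetical helper)."""
    name: str
    children: List["TaxonNode"] = field(default_factory=list)

    def add(self, name: str) -> "TaxonNode":
        """Attach a more specific classification below this node."""
        child = TaxonNode(name)
        self.children.append(child)
        return child


def path_to(node: TaxonNode, target: str,
            trail: Tuple[str, ...] = ()) -> Optional[List[str]]:
    """Return the general-to-specific path from `node` down to `target`,
    or None if `target` is not in this subtree (depth-first search)."""
    trail = trail + (node.name,)
    if node.name == target:
        return list(trail)
    for child in node.children:
        found = path_to(child, target, trail)
        if found is not None:
            return found
    return None


# The root applies to every object; nodes below it cover subsets.
root = TaxonNode("object")
vehicle = root.add("vehicle")
vehicle.add("car")
vehicle.add("bicycle")
root.add("building")

print(path_to(root, "car"))
```

Because each node here has exactly one parent, this sketch covers only the strict tree case; the multi-parent case mentioned above ("car" under two parents) would need a graph rather than a tree.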


Mechanism design is essentially reverse engineering of games: it involves inducing a game among strategic agents such that the induced game satisfies a set of desired properties in an equilibrium. Desirable properties for a mechanism include incentive compatibility, individual rationality, welfare maximisation, revenue maximisation (or cost minimisation), fairness of allocation, etc. It is known from mechanism design theory that only certain strict subsets of these properties can be simultaneously satisfied exactly by any given mechanism. Often, real-world applications require a subset of these properties that is theoretically impossible to satisfy simultaneously. In such cases, a prominent recent approach is to use deep learning to learn a mechanism that approximately satisfies the required properties by minimizing a suitably defined loss function. In this paper, we present, from relevant literature, technical details of using a deep learning approach for mechanism design and provide an overview of key results in this topic. We demonstrate the power of this approach for three illustrative case studies: (a) efficient energy management in a vehicular network, (b) resource allocation in a mobile network, and (c) designing a volume discount procurement auction for agricultural inputs. Section 6 concludes the paper.

Since the advent of personal computing devices, intelligent personal assistants (IPAs) have been one of the key technologies that researchers and engineers have focused on, aiming to help users efficiently obtain information and execute tasks, and to provide users with more intelligent, convenient, and rich interaction experiences. With the development of smartphones and IoT, computing and sensing devices have become ubiquitous, greatly expanding the boundaries of IPAs. However, due to the lack of capabilities such as user intent understanding, task planning, tool use, and personal data management, existing IPAs still have limited practicality and scalability. Recently, the emergence of foundation models, represented by large language models (LLMs), has brought new opportunities for the development of IPAs. With their powerful semantic understanding and reasoning capabilities, LLMs can enable intelligent agents to solve complex problems autonomously. In this paper, we focus on Personal LLM Agents, which are LLM-based agents that are deeply integrated with personal data and personal devices and used for personal assistance. We envision that Personal LLM Agents will become a major software paradigm for end-users in the upcoming era. To realize this vision, we take the first step to discuss several important questions about Personal LLM Agents, including their architecture, capability, efficiency and security. We start by summarizing the key components and design choices in the architecture of Personal LLM Agents, followed by an in-depth analysis of the opinions collected from domain experts. Next, we discuss several key challenges to achieving intelligent, efficient and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.

Chain-of-thought reasoning, a cognitive process fundamental to human intelligence, has garnered significant attention in the realm of artificial intelligence and natural language processing. However, a comprehensive survey of this area is still lacking. To this end, we take the first step and present a thorough and broad survey of this research field. We use X-of-Thought (XoT) to refer to Chain-of-Thought in a broad sense. In detail, we systematically organize the current research according to the taxonomies of methods, including XoT construction, XoT structure variants, and enhanced XoT. Additionally, we describe XoT with frontier applications, covering planning, tool use, and distillation. Furthermore, we address challenges and discuss some future directions, including faithfulness, multi-modal, and theory. We hope this survey serves as a valuable resource for researchers seeking to innovate within the domain of chain-of-thought reasoning.

Multimodality Representation Learning, as a technique of learning to embed information from different modalities and their correlations, has achieved remarkable success on a variety of applications, such as Visual Question Answering (VQA), Natural Language for Visual Reasoning (NLVR), and Vision Language Retrieval (VLR). Among these applications, cross-modal interaction and complementary information from different modalities are crucial for advanced models to perform any multimodal task optimally, e.g., to understand, recognize, retrieve, or generate. Researchers have proposed diverse methods to address these tasks. Different variants of transformer-based architectures have performed extraordinarily well on multiple modalities. This survey presents the comprehensive literature on the evolution and enhancement of deep learning multimodal architectures to deal with textual, visual and audio features for diverse cross-modal and modern multimodal tasks. This study summarizes (i) recent task-specific deep learning methodologies, (ii) the pretraining types and multimodal pretraining objectives, (iii) the progression from state-of-the-art pretrained multimodal approaches to unifying architectures, and (iv) multimodal task categories and possible future improvements that can be devised for better multimodal learning. Moreover, we prepare a dataset section for new researchers that covers most of the benchmarks for pretraining and finetuning. Finally, major challenges, gaps, and potential research topics are explored. A constantly-updated paperlist related to our survey is maintained at //github.com/marslanm/multimodality-representation-learning.

Knowledge graph embedding (KGE) is an increasingly popular technique that aims to represent entities and relations of knowledge graphs in low-dimensional semantic spaces for a wide spectrum of applications such as link prediction, knowledge reasoning and knowledge completion. In this paper, we provide a systematic review of existing KGE techniques based on representation spaces. In particular, we build a fine-grained classification to categorise the models based on three mathematical perspectives of the representation spaces: (1) Algebraic perspective, (2) Geometric perspective, and (3) Analytical perspective. We introduce the rigorous definitions of fundamental mathematical spaces before diving into KGE models and their mathematical properties. We further discuss different KGE methods over the three categories, as well as summarise how spatial advantages work over different embedding needs. By collating the experimental results from downstream tasks, we also explore the advantages of mathematical space in different scenarios and the reasons behind them. We further state some promising research directions from a representation space perspective, with which we hope to inspire researchers to design their KGE models as well as their related applications with more consideration of their mathematical space properties.

Face recognition technology has advanced significantly in recent years due largely to the availability of large and increasingly complex training datasets for use in deep learning models. These datasets, however, typically comprise images scraped from news sites or social media platforms and, therefore, have limited utility in more advanced security, forensics, and military applications. These applications require lower resolution, longer ranges, and elevated viewpoints. To meet these critical needs, we collected and curated the first and second subsets of a large multi-modal biometric dataset designed for use in the research and development (R&D) of biometric recognition technologies under extremely challenging conditions. Thus far, the dataset includes more than 350,000 still images and over 1,300 hours of video footage of approximately 1,000 subjects. To collect this data, we used Nikon DSLR cameras, a variety of commercial surveillance cameras, specialized long-range R&D cameras, and Group 1 and Group 2 UAV platforms. The goal is to support the development of algorithms capable of accurately recognizing people at ranges up to 1,000 m and from high angles of elevation. These advances will include improvements to the state of the art in face recognition and will support new research in the area of whole-body recognition using methods based on gait and anthropometry. This paper describes methods used to collect and curate the dataset, and the dataset's characteristics at the current stage.

In pace with developments in the research field of artificial intelligence, knowledge graphs (KGs) have attracted a surge of interest from both academia and industry. As a representation of semantic relations between entities, KGs have proven to be particularly relevant for natural language processing (NLP), experiencing a rapid spread and wide adoption within recent years. Given the increasing amount of research work in this area, several KG-related approaches have been surveyed in the NLP research community. However, a comprehensive study that categorizes established topics and reviews the maturity of individual research streams remains absent to this day. Contributing to closing this gap, we systematically analyzed 507 papers from the literature on KGs in NLP. Our survey encompasses a multifaceted review of tasks, research types, and contributions. As a result, we present a structured overview of the research landscape, provide a taxonomy of tasks, summarize our findings, and highlight directions for future work.

Over the past few years, the rapid development of deep learning technologies for computer vision has greatly improved the performance of medical image segmentation (MedISeg). However, recent MedISeg publications usually focus on presenting the major contributions (e.g., network architectures, training strategies, and loss functions) while unwittingly ignoring some marginal implementation details (also known as "tricks"), leading to a potential problem of unfair comparisons between experimental results. In this paper, we collect a series of MedISeg tricks for different model implementation phases (i.e., pre-training model, data pre-processing, data augmentation, model implementation, model inference, and result post-processing), and experimentally explore the effectiveness of these tricks on consistent baseline models. Compared to paper-driven surveys that focus only on analyzing the advantages and limitations of segmentation models, our work provides a large number of solid experiments and is more technically operable. With extensive experimental results on representative 2D and 3D medical image datasets, we explicitly clarify the effect of these tricks. Moreover, based on the surveyed tricks, we also open-sourced a strong MedISeg repository, in which each component is plug-and-play. We believe that this milestone work not only completes a comprehensive and complementary survey of the state-of-the-art MedISeg approaches, but also offers a practical guide for addressing future medical image processing challenges including but not limited to small dataset learning, class imbalance learning, multi-modality learning, and domain adaptation. The code has been released at: //github.com/hust-linyi/MedISeg

Graph Neural Networks (GNNs) have been studied through the lens of expressive power and generalization. However, their optimization properties are less well understood. We take the first step towards analyzing GNN training by studying the gradient dynamics of GNNs. First, we analyze linearized GNNs and prove that despite the non-convexity of training, convergence to a global minimum at a linear rate is guaranteed under mild assumptions that we validate on real-world graphs. Second, we study what may affect the training speed of GNNs. Our results show that the training of GNNs is implicitly accelerated by skip connections, more depth, and/or a good label distribution. Empirical results confirm that our theoretical results for linearized GNNs align with the training behavior of nonlinear GNNs. Our results provide the first theoretical support for the success of GNNs with skip connections in terms of optimization, and suggest that deep GNNs with skip connections would be promising in practice.

Commonsense knowledge and commonsense reasoning are some of the main bottlenecks in machine intelligence. In the NLP community, many benchmark datasets and tasks have been created to address commonsense reasoning for language understanding. These tasks are designed to assess machines' ability to acquire and learn commonsense knowledge in order to reason and understand natural language text. As these tasks become instrumental and a driving force for commonsense research, this paper aims to provide an overview of existing tasks and benchmarks, knowledge resources, and learning and inference approaches toward commonsense reasoning for natural language understanding. Through this, our goal is to support a better understanding of the state of the art, its limitations, and future challenges.
