
Real-time traffic light recognition is essential for autonomous driving. Yet, a cohesive overview of the underlying model architectures for this task is currently missing. In this work, we conduct a comprehensive survey and analysis of traffic light recognition methods that use convolutional neural networks (CNNs). We focus on two essential aspects: datasets and CNN architectures. Based on their underlying architecture, we cluster methods into three major groups: (1) modifications of generic object detectors that compensate for specific task characteristics, (2) multi-stage approaches involving both rule-based and CNN components, and (3) task-specific single-stage methods. We describe the most important works in each cluster, discuss the usage of the datasets, and identify research gaps.

Related content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural network research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analyses, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological research and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the Neural Networks editorial board represents expertise in fields including psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: cognitive science, neuroscience, learning systems, mathematical and computational analysis, and engineering and applications. Official website:

Instruction-tuning can be substantially optimized through enhanced diversity, resulting in models capable of handling a broader spectrum of tasks. However, existing data employed for such tuning often exhibit inadequate coverage of individual domains, limiting the scope for nuanced comprehension and interactions within these areas. To address this deficiency, we propose Explore-Instruct, a novel approach to enhance the data coverage used in domain-specific instruction-tuning through active exploration via Large Language Models (LLMs). Built upon representative domain use cases, Explore-Instruct explores a multitude of variations or possibilities by implementing a search algorithm to obtain diversified and domain-focused instruction-tuning data. Our data-centric analysis validates the effectiveness of this proposed approach in improving domain-specific instruction coverage. Moreover, our model's performance demonstrates considerable advancements over multiple baselines, including those utilizing domain-specific data enhancement. Our findings offer a promising opportunity to improve instruction coverage, especially in domain-specific contexts, thereby advancing the development of adaptable language models. Our code, model weights, and data are publicly available at \url{//github.com/fanqiwan/Explore-Instruct}.
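
As a hedged illustration of the active-exploration idea, the sketch below grows an instruction set by breadth-first search from seed domain use cases, asking an LLM for variations at each node. `ask_llm`, the depth limit, and the budget are hypothetical stand-ins, not Explore-Instruct's actual prompts or search algorithm.

```python
# A hedged sketch of LLM-driven domain exploration: breadth-first search
# over task variations. `ask_llm` is a hypothetical stand-in for a real
# LLM API call that would return k proposed variations of a task.
from collections import deque

def ask_llm(prompt: str) -> list[str]:
    # Placeholder: in practice this would query an LLM for variations.
    return [f"{prompt} (variation {i})" for i in range(2)]

def explore(seeds: list[str], max_depth: int, budget: int) -> list[str]:
    frontier = deque((s, 0) for s in seeds)
    collected = []
    while frontier and len(collected) < budget:
        task, depth = frontier.popleft()
        collected.append(task)
        if depth < max_depth:
            for variant in ask_llm(task):
                frontier.append((variant, depth + 1))
    return collected

data = explore(["rewrite an email"], max_depth=2, budget=20)
```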

Knowledge graph completion (KGC) aims to discover missing relations of query entities. Current text-based models utilize the entity name and description to infer the tail entity given the head entity and a certain relation. Existing approaches also consider the neighborhood of the head entity. However, these methods tend to model the neighborhood using a flat structure and are restricted to 1-hop neighbors. In this work, we propose a node neighborhood-enhanced framework for knowledge graph completion. It models the head entity's neighborhood across multiple hops using graph neural networks to enrich the head node information. Moreover, we introduce an additional edge link prediction task to improve KGC. Evaluation on two public datasets shows that this framework is simple yet effective. The case study also shows that the model is able to produce explainable predictions.
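
A minimal sketch of multi-hop neighborhood modeling with a graph neural network follows; the mean-pooling message passing, dimensions, and toy subgraph are illustrative assumptions, not the paper's architecture.

```python
# A minimal sketch (not the paper's code) of multi-hop neighborhood
# aggregation around a head entity via simple mean-pooling message passing.
import torch
import torch.nn as nn

class NeighborhoodEncoder(nn.Module):
    def __init__(self, dim: int, hops: int):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(2 * dim, dim) for _ in range(hops)])

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (num_nodes, dim) entity embeddings; adj: (num_nodes, num_nodes)
        # row-normalized adjacency of the head entity's k-hop subgraph.
        h = x
        for layer in self.layers:
            msg = adj @ h                      # mean over neighbor states
            h = torch.relu(layer(torch.cat([h, msg], dim=-1)))
        return h                               # enriched node states

# Toy usage: a 4-node subgraph around a head entity (node 0).
adj = torch.tensor([[0, 1, 0, 0],
                    [1, 0, 1, 1],
                    [0, 1, 0, 0],
                    [0, 1, 0, 0]], dtype=torch.float)
adj = adj / adj.sum(dim=1, keepdim=True)       # row-normalize
enc = NeighborhoodEncoder(dim=16, hops=2)
head_repr = enc(torch.randn(4, 16), adj)[0]    # multi-hop head representation
```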

Video instance segmentation, also known as multi-object tracking and segmentation, is an emerging computer vision research area introduced in 2019, aiming at detecting, segmenting, and tracking instances in videos simultaneously. By tackling video instance segmentation tasks through effective analysis and utilization of visual information in videos, a range of computer vision-enabled applications (e.g., human action recognition, medical image processing, autonomous vehicle navigation, and surveillance) can be implemented. As deep-learning techniques take a dominant role in various computer vision areas, a plethora of deep-learning-based video instance segmentation schemes have been proposed. This survey offers a multifaceted view of deep-learning schemes for video instance segmentation, covering various architectural paradigms, along with comparisons of functional performance, model complexity, and computational overheads. In addition to the common architectural designs, auxiliary techniques for improving the performance of deep-learning models for video instance segmentation are compiled and discussed. Finally, we discuss a range of major challenges and directions for further investigations to help advance this promising research field.

We study the design of black-box model extraction attacks that send a minimal number of queries from a publicly available dataset to a target ML model through a predictive API, with the aim of creating an informative and distributionally equivalent replica of the target. First, we define distributionally equivalent and Max-Information model extraction attacks, and reduce them to a variational optimisation problem. The attacker sequentially solves this optimisation problem to select the most informative queries that simultaneously maximise the entropy and reduce the mismatch between the target and the stolen models. This leads to an active sampling-based query selection algorithm, Marich, which is model-oblivious. Then, we evaluate Marich on different text and image datasets, and different models, including CNNs and BERT. Marich extracts models that achieve $\sim 60-95\%$ of the true model's accuracy and uses $\sim 1,000 - 8,500$ queries from publicly available datasets, which are different from the private training datasets. Models extracted by Marich yield prediction distributions that are $\sim 2-4\times$ closer to the target's distribution than those of existing active sampling-based attacks. The extracted models also lead to $84-96\%$ accuracy under membership inference attacks. Experimental results validate that Marich is query-efficient, and capable of performing task-accurate, high-fidelity, and informative model extraction.
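
The sketch below illustrates the entropy side of such active query selection: pick the public-pool points on which the current replica is most uncertain and send them to the target API. It is a deliberate simplification; Marich's actual objective also reduces the mismatch between the target and stolen models.

```python
# A hedged sketch of entropy-maximizing query selection in the spirit of
# active-sampling model extraction; not Marich's actual criterion.
import numpy as np

def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    # probs: (num_samples, num_classes) soft predictions of the replica model.
    return -(probs * np.log(probs + 1e-12)).sum(axis=1)

def select_queries(probs: np.ndarray, budget: int) -> np.ndarray:
    # Pick the `budget` public-pool points the replica is most uncertain
    # about; these would then be sent to the target API and the replica
    # retrained on the returned labels.
    return np.argsort(-predictive_entropy(probs))[:budget]

rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
batch = select_queries(probs, budget=64)   # indices of the next API queries
```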

Transformer-based models have achieved state-of-the-art performance in many areas. However, the quadratic complexity of self-attention with respect to the input length hinders the applicability of Transformer-based models to long sequences. To address this, we present Fast Multipole Attention, a new attention mechanism that uses a divide-and-conquer strategy to reduce the time and memory complexity of attention for sequences of length $n$ from $\mathcal{O}(n^2)$ to $\mathcal{O}(n \log n)$ or $\mathcal{O}(n)$, while retaining a global receptive field. The hierarchical approach groups queries, keys, and values into $\mathcal{O}(\log n)$ levels of resolution, where groups at greater distances are increasingly larger in size and the weights to compute group quantities are learned. As such, the interaction between tokens far from each other is considered at lower resolution in an efficient hierarchical manner. The overall complexity of Fast Multipole Attention is $\mathcal{O}(n)$ or $\mathcal{O}(n \log n)$, depending on whether the queries are down-sampled or not. This multi-level divide-and-conquer strategy is inspired by fast summation methods from $n$-body physics and the Fast Multipole Method. We evaluate on autoregressive and bidirectional language modeling tasks and compare our Fast Multipole Attention model with other efficient attention variants on medium-size datasets. We find empirically that the Fast Multipole Transformer performs much better than other efficient transformers in terms of memory size and accuracy. The Fast Multipole Attention mechanism has the potential to empower large language models with much greater sequence lengths, taking the full context into account in an efficient, naturally hierarchical manner during training and when generating long sequences.
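
To make the divide-and-conquer structure concrete, here is a simplified two-level sketch: each query attends to its own block of keys at full resolution and to average-pooled summaries of all other blocks. The actual mechanism uses $\mathcal{O}(\log n)$ levels with learned group weights; the block size and mean pooling below are illustrative assumptions.

```python
# A simplified two-level sketch of the divide-and-conquer idea: fine
# attention within a query's own block, coarse attention to pooled
# summaries of distant blocks. Illustrative only.
import torch

def two_level_attention(q, k, v, block: int):
    # q, k, v: (n, d); n must be divisible by `block`.
    n, d = q.shape
    nb = n // block
    # Coarse level: one pooled key/value per block.
    k_c = k.view(nb, block, d).mean(dim=1)          # (nb, d)
    v_c = v.view(nb, block, d).mean(dim=1)
    out = torch.empty_like(q)
    for b in range(nb):
        sl = slice(b * block, (b + 1) * block)
        # Fine keys/values: the query's own block; coarse: all other blocks.
        far = torch.cat([torch.arange(nb)[:b], torch.arange(nb)[b + 1:]])
        keys = torch.cat([k[sl], k_c[far]], dim=0)
        vals = torch.cat([v[sl], v_c[far]], dim=0)
        attn = torch.softmax(q[sl] @ keys.T / d ** 0.5, dim=-1)
        out[sl] = attn @ vals
    return out

x = torch.randn(64, 32)
y = two_level_attention(x, x, x, block=8)   # (64, 32), near-linear cost
```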

Motion prediction is crucial for autonomous vehicles to operate safely in complex traffic environments. Extracting effective spatiotemporal relationships among traffic elements is key to accurate forecasting. Inspired by the successful practice of pretrained large language models, this paper presents SEPT, a modeling framework that leverages self-supervised learning to develop powerful spatiotemporal understanding for complex traffic scenes. Specifically, our approach involves three masking-reconstruction modeling tasks on scene inputs including agents' trajectories and the road network, pretraining the scene encoder to capture the kinematics within trajectories, the spatial structure of the road network, and the interactions among roads and agents. The pretrained encoder is then finetuned on the downstream forecasting task. Extensive experiments demonstrate that SEPT, without elaborate architectural design or manual feature engineering, achieves state-of-the-art performance on the Argoverse 1 and Argoverse 2 motion forecasting benchmarks, outperforming previous methods on all main metrics by a large margin.
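
A hedged sketch of one masking-reconstruction pretext task of this kind appears below: random waypoints of agent trajectories are masked and a Transformer encoder is trained to reconstruct them. The encoder, mask ratio, and loss are stand-ins, not SEPT's actual design.

```python
# A minimal sketch of masked-trajectory pretraining: mask random waypoints
# and train an encoder to reconstruct them. Architecture is illustrative.
import torch
import torch.nn as nn

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2,
)
proj_in = nn.Linear(2, 32)      # (x, y) waypoint -> token
proj_out = nn.Linear(32, 2)     # token -> reconstructed waypoint
mask_token = nn.Parameter(torch.zeros(32))

traj = torch.randn(8, 50, 2)                    # batch of 50-step trajectories
mask = torch.rand(8, 50) < 0.4                  # mask ~40% of waypoints
tokens = proj_in(traj)
tokens = torch.where(mask.unsqueeze(-1), mask_token, tokens)
recon = proj_out(encoder(tokens))
loss = ((recon - traj) ** 2)[mask].mean()       # reconstruct masked steps only
loss.backward()
```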

We study the binomial, trinomial, and Black-Scholes-Merton models of option pricing. We present fast parallel discrete-time finite-difference algorithms for American call option pricing under the binomial and trinomial models and American put option pricing under the Black-Scholes-Merton model. For $T$-step finite differences, each algorithm runs in $O(\left(T\log^2{T}\right)/p + T)$ time under a greedy scheduler on $p$ processing cores, which is a significant improvement over the $\Theta({T^2}/{p}) + \Omega(T\log{T})$ time taken by the corresponding state-of-the-art parallel algorithm. Even when run on a single core, the $O(T\log^2{T})$ time taken by our algorithms is asymptotically much smaller than the $\Theta(T^2)$ running time of the fastest known serial algorithms. Implementations of our algorithms significantly outperform the fastest implementations of existing algorithms in practice, e.g., when run for $T \approx 1000$ steps on a 48-core machine, our algorithm for the binomial model runs at least $15\times$ faster than the fastest existing parallel program for the same model with the speed-up factor gradually reaching beyond $500\times$ for $T \approx 0.5 \times 10^6$. It saves more than 80\% energy when $T \approx 4000$, and more than 99\% energy for $T > 60,000$. Our option pricing algorithms can be viewed as solving a class of nonlinear 1D stencil (i.e., finite-difference) computation problems efficiently using the Fast Fourier Transform (FFT). To our knowledge, ours are the first algorithms to handle such stencils in $o(T^2)$ time. These contributions are of independent interest as stencil computations have a wide range of applications beyond quantitative finance.
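
For context, the sketch below shows the classical $\Theta(T^2)$ backward-induction baseline for American option pricing under the binomial (CRR) model, i.e., the 1D nonlinear stencil that the FFT-based algorithms accelerate; parameter values are illustrative.

```python
# The classical Theta(T^2) backward-induction baseline for an American
# call under the binomial (CRR) model -- the nonlinear 2-point stencil
# (discounted expectation followed by max with intrinsic value) that the
# paper's FFT-based algorithms speed up. Parameters are illustrative.
import math

def american_call_binomial(S0, K, r, sigma, T_years, steps):
    dt = T_years / steps
    u = math.exp(sigma * math.sqrt(dt))      # up factor
    d = 1.0 / u                              # down factor
    p = (math.exp(r * dt) - d) / (u - d)     # risk-neutral up probability
    disc = math.exp(-r * dt)
    # Option values at maturity for all terminal prices.
    vals = [max(S0 * u**j * d**(steps - j) - K, 0.0) for j in range(steps + 1)]
    # Backward induction: each step applies the stencil to all nodes.
    for i in range(steps - 1, -1, -1):
        vals = [
            max(disc * (p * vals[j + 1] + (1 - p) * vals[j]),   # continuation
                S0 * u**j * d**(i - j) - K)                      # early exercise
            for j in range(i + 1)
        ]
    return vals[0]

print(american_call_binomial(S0=100, K=100, r=0.05, sigma=0.2,
                             T_years=1.0, steps=1000))
```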

Deep models, e.g., CNNs and Vision Transformers, have achieved impressive results in many vision tasks in the closed world. However, novel classes emerge from time to time in our ever-changing world, requiring a learning system to acquire new knowledge continually. For example, a robot needs to understand new instructions, and an opinion monitoring system should analyze emerging topics every day. Class-Incremental Learning (CIL) enables the learner to incorporate the knowledge of new classes incrementally and build a universal classifier among all seen classes. Correspondingly, when directly training the model with new class instances, a fatal problem occurs -- the model tends to catastrophically forget the characteristics of former classes, and its performance drastically degrades. There have been numerous efforts to tackle catastrophic forgetting in the machine learning community. In this paper, we comprehensively survey recent advances in deep class-incremental learning and summarize these methods from three aspects, i.e., data-centric, model-centric, and algorithm-centric. We also provide a rigorous and unified evaluation of 16 methods on benchmark image classification tasks to empirically identify the characteristics of different algorithms. Furthermore, we notice that the current comparison protocol ignores the influence of memory budget in model storage, which may result in unfair comparisons and biased results. Hence, we advocate fair comparison by aligning the memory budget in evaluation, as well as several memory-agnostic performance measures. The source code to reproduce these evaluations is available at //github.com/zhoudw-zdw/CIL_Survey/

Images can convey rich semantics and induce various emotions in viewers. Recently, with the rapid advancement of emotional intelligence and the explosive growth of visual data, extensive research efforts have been dedicated to affective image content analysis (AICA). In this survey, we comprehensively review the development of AICA over the past two decades, especially focusing on state-of-the-art methods with respect to three main challenges -- the affective gap, perception subjectivity, and label noise and absence. We begin with an introduction to the key emotion representation models that have been widely employed in AICA and a description of available datasets for evaluation, with a quantitative comparison of label noise and dataset bias. We then summarize and compare the representative approaches to (1) emotion feature extraction, including both handcrafted and deep features, (2) learning methods for dominant emotion recognition, personalized emotion prediction, emotion distribution learning, and learning from noisy data or few labels, and (3) AICA-based applications. Finally, we discuss some challenges and promising research directions for the future, such as image content and context understanding, group emotion clustering, and viewer-image interaction.

Automatically describing an image in a natural language such as English is a very challenging task. It requires expertise in both image processing and natural language processing. This paper discusses the different available models for the image captioning task. We also discuss how advances in object recognition and machine translation have greatly improved the performance of image captioning models in recent years. In addition, we describe how such a model can be implemented. Finally, we evaluate the performance of the model using standard evaluation metrics.
