In this paper, we propose LAFD (Light and Accurate Face Detection), a lightweight and accurate face detection algorithm based on RetinaFace. The backbone network of the algorithm is a modified MobileNetV3 that adjusts the convolution kernel sizes, the channel expansion multiplier of the inverted residual blocks, and the use of the SE attention mechanism. A deformable convolution network (DCN) is introduced in the context module, and the algorithm uses the focal loss instead of the cross-entropy loss as the classification loss of the model. Test results on the WIDERFACE dataset indicate that the average accuracy of LAFD is 94.1%, 92.2% and 82.1% on the "easy", "medium" and "hard" validation subsets respectively, an improvement of 3.4%, 4.0% and 8.3% over RetinaFace, and 3.1%, 4.1% and 4.1% higher than the well-performing lightweight model LFFD. If the input image is pre-processed and scaled to 1560px in length or 1200px in width, the model achieves an average accuracy of 86.2% on the "hard" validation subset. The model is lightweight, with a size of only 10.2MB.
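To make the loss choice concrete, here is a minimal PyTorch sketch of a binary focal loss of the kind used for the classification branch; the alpha and gamma defaults are common illustrative values, not necessarily the settings used by LAFD.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss: down-weights easy examples so training
    focuses on hard faces (e.g., small or occluded ones)."""
    probs = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = probs * targets + (1.0 - probs) * (1.0 - targets)      # prob. of the true class
    alpha_t = alpha * targets + (1.0 - alpha) * (1.0 - targets)  # class-balancing factor
    return (alpha_t * (1.0 - p_t) ** gamma * ce).mean()
```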
In this paper, we propose a new method called Clustering Topological PRM (CTopPRM) for finding multiple homotopically distinct paths in 3D cluttered environments. Finding such distinct paths, e.g., going around an obstacle from different sides, is useful in many applications. Among others, using multiple distinct paths is necessary for optimization-based trajectory planners, where found trajectories are restricted to a single homotopy class of a given path. Distinct paths can also be used to guide sampling-based motion planning and thus increase the effectiveness of planning in environments with narrow passages. A graph-based representation called a roadmap is commonly used for path planning and also for finding multiple distinct paths. However, challenging environments with multiple narrow passages require a densely sampled roadmap to capture the connectivity of the environment, and searching such a dense roadmap for multiple paths is computationally too expensive. Therefore, the majority of existing methods construct only a sparse roadmap, which, however, struggles to find all distinct paths in challenging environments. To this end, we propose CTopPRM, which creates a sparse graph by clustering an initially sampled dense roadmap. Such a reduced roadmap allows fast identification of the homotopically distinct paths captured in the dense roadmap. We show that, compared to existing methods, CTopPRM improves the probability of finding all distinct paths by almost 20% in the tested environments within the same run-time. The source code of our method is released as an open-source package.
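As a rough illustration of the general idea, clustering a densely sampled roadmap into a small cluster graph that can then be searched for distinct routes, a minimal Python sketch (using k-means and networkx as stand-ins, not the actual CTopPRM clustering procedure) could look like:

```python
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

def sparse_roadmap_from_dense(points, dense_graph, n_clusters=20):
    """Collapse a dense PRM roadmap into a sparse cluster graph.

    points:      (N, 3) array of sampled collision-free configurations
    dense_graph: networkx.Graph over node indices 0..N-1
    """
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(points)
    sparse = nx.Graph()
    sparse.add_nodes_from(range(n_clusters))
    # Connect two clusters whenever the dense roadmap has an edge between them.
    for u, v in dense_graph.edges:
        cu, cv = labels[u], labels[v]
        if cu != cv:
            sparse.add_edge(cu, cv)
    return sparse, labels

# Candidate distinct routes between two clusters can then be enumerated on the
# small graph, e.g. with nx.shortest_simple_paths(sparse, src_cluster, dst_cluster).
```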
Many libraries, such as OpenCV, FFmpeg, XNNPACK, and Eigen, use Arm or x86 SIMD intrinsics to optimize programs for performance. With the emergence of the RISC-V Vector Extension (RVV), there is a need to migrate this performance-critical legacy code to RVV. Currently, migrating NEON code to RVV code requires manual rewriting, which is a time-consuming and error-prone process. In this work, we use the open-source tool "SIMD Everywhere" (SIMDe) to automate the migration. Our primary task is to enhance SIMDe so that it can convert Arm NEON intrinsic types and functions to their corresponding RVV intrinsic types and functions. For type conversion, we devise strategies to convert NEON intrinsic types to RVV intrinsics while accounting for the vector-length-agnostic (VLA) architecture. For function conversion, we analyze the conversion methods commonly used in SIMDe and develop customized conversions for each function based on the generated RVV code. In our experiments with the Google XNNPACK library, our enhanced SIMDe achieves speedups ranging from 1.51x to 5.13x over the original SIMDe, which does not use customized RVV implementations for the conversions.
Nested simulation concerns estimating functionals of a conditional expectation via simulation. In this paper, we propose a new method based on kernel ridge regression to exploit the smoothness of the conditional expectation as a function of the multidimensional conditioning variable. Asymptotic analysis shows that the proposed method can effectively alleviate the curse of dimensionality on the convergence rate as the simulation budget increases, provided that the conditional expectation is sufficiently smooth. The smoothness bridges the gap between the cubic root convergence rate (that is, the optimal rate for the standard nested simulation) and the square root convergence rate (that is, the canonical rate for the standard Monte Carlo simulation). We demonstrate the performance of the proposed method via numerical examples from portfolio risk management and input uncertainty quantification.
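A minimal sketch of the smoothing idea, regressing noisy inner-simulation averages on the conditioning variable with kernel ridge regression and then evaluating a functional of the fitted surface, is given below; the toy model, budget split, and risk functional are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

# Outer stage: sample the (here 3-dimensional) conditioning variable X.
n_outer, n_inner = 200, 50
X = rng.normal(size=(n_outer, 3))

# Inner stage: noisy simulation of Y given X; its conditional mean is smooth in X.
def inner_simulate(x, m):
    return np.sin(x.sum()) + rng.normal(scale=1.0, size=m)

Y_bar = np.array([inner_simulate(x, n_inner).mean() for x in X])

# Smooth the noisy inner averages with kernel ridge regression.
krr = KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.5).fit(X, Y_bar)
mu_hat = krr.predict(X)

# Example functional: probability that the conditional expectation exceeds a threshold.
print("estimate of P(E[Y|X] > 0.5):", np.mean(mu_hat > 0.5))
```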
The goal of this paper is to detect objects by exploiting their interrelationships. Contrary to existing methods, which learn objects and relations separately, our key idea is to learn the object-relation distribution jointly. We first propose a novel way of creating a graphical representation of an image from inter-object relation priors and initial class predictions, which we call a context-likelihood graph. We then learn the joint distribution with an energy-based modeling technique, which allows us to sample and refine the context-likelihood graph iteratively for a given image. Jointly learning the distribution enables us to generate a more accurate graph representation of an image, which leads to better object detection performance. We demonstrate the benefits of our context-likelihood graph formulation and the energy-based graph refinement via experiments on the Visual Genome and MS-COCO datasets, where we achieve a consistent improvement over object detectors such as DETR and Faster-RCNN, as well as over alternative methods that model object interrelationships separately. Our method is detector agnostic, end-to-end trainable, and especially beneficial for rare object classes.
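As a rough, hypothetical illustration of energy-based refinement of per-object class beliefs under a pairwise relation prior (not the paper's actual architecture), one could write:

```python
import torch

def refine_context_likelihood(class_logits, relation_prior, steps=10, lr=0.1):
    """Refine per-object class logits by gradient descent on a simple energy.

    class_logits:   (N, C) initial detector logits for N objects
    relation_prior: (C, C) co-occurrence prior; higher = classes more compatible
    """
    z = class_logits.clone().requires_grad_(True)
    opt = torch.optim.SGD([z], lr=lr)
    for _ in range(steps):
        p = z.softmax(dim=-1)                              # (N, C) class beliefs
        # Pairwise term: reward label pairs that the prior deems compatible.
        pair_score = torch.einsum("ic,cd,jd->", p, relation_prior, p)
        # Unary term: stay close to the detector's initial predictions.
        unary = ((z - class_logits) ** 2).sum()
        energy = unary - pair_score
        opt.zero_grad()
        energy.backward()
        opt.step()
    return z.detach()
```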
This paper addresses the problem of Multi-robot Exploration (MRE) under communication constraints. We propose a novel intermittent rendezvous method that allows robots to explore an unknown environment while sharing maps at rendezvous locations through agreements. In our method, robots update the agreements to spread the rendezvous locations during exploration and prioritize exploring unknown areas near them. To generate the agreements automatically, we reduce MRE to instances of the Job Shop Scheduling Problem (JSSP) and ensure intermittent communication through a temporal connectivity graph. We evaluate our method in various virtual urban environments and in a Gazebo simulation using the Robot Operating System (ROS). Our results suggest that our method can outperform using relays or maintaining intermittent communication with a base station, since the robots can explore faster without additional hardware to create a relay network.
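For intuition on the scheduling side, the sketch below runs a naive greedy list scheduler on a toy job-shop instance in which, purely as a hypothetical mapping, jobs stand for rendezvous agreements and machines for robots; this is not the paper's actual reduction.

```python
from collections import defaultdict

def greedy_jssp(jobs):
    """Greedy list scheduler for a toy Job Shop instance.

    jobs: {job_id: [(machine, duration), ...]}  -- operations in fixed order.
    """
    machine_free = defaultdict(float)   # earliest time each machine (robot) is free
    job_ready = defaultdict(float)      # earliest time each job's next operation can start
    schedule = []
    for job_id, ops in jobs.items():
        for machine, duration in ops:
            start = max(machine_free[machine], job_ready[job_id])
            end = start + duration
            schedule.append((job_id, machine, start, end))
            machine_free[machine] = end
            job_ready[job_id] = end
    return schedule

# Two agreements ("meetings") involving three robots r1-r3; durations are illustrative.
jobs = {"meet_A": [("r1", 5), ("r2", 5)], "meet_B": [("r2", 4), ("r3", 4)]}
for row in greedy_jssp(jobs):
    print(row)
```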
Inspired by Federated Learning, in this paper we propose personal large models that are distilled from traditional large language models but are more adaptive to local users' personal information, such as education background and hobbies. We classify large language models into three levels: the personal level, the expert level, and the traditional level. The personal-level models are adaptive to users' personal information; they encrypt the users' input and protect their privacy. The expert-level models focus on merging specific knowledge such as finance, IT, and art. The traditional-level models focus on universal knowledge discovery and on upgrading the expert models. In this classification, the personal models interact directly with the user. For the whole system, the personal models hold the users' (encrypted) personal information. Moreover, such models must be small enough to run on personal computers or mobile devices. Finally, they also have to respond in real time for a better user experience and produce high-quality results. The proposed personal large models can be applied to a wide range of applications, such as language and vision tasks.
In this paper, we consider the optimization problem Submodular Cover (SCP), which is to find a minimum-cardinality subset of a finite universe $U$ such that the value of a submodular function $f$ is above an input threshold $\tau$. In particular, we consider several variants of SCP, including the general case, the case where $f$ is additionally assumed to be monotone, and finally the case where $f$ is a regularized monotone submodular function. Our most significant contributions are that: (i) we propose a scalable algorithm for monotone SCP that achieves nearly the same approximation guarantees as the standard greedy algorithm in significantly less time; (ii) we are the first to develop an algorithm for general SCP that achieves a solution arbitrarily close to being feasible; and finally (iii) we are the first to develop algorithms for regularized SCP. Our algorithms are then demonstrated to be effective in an extensive experimental section on data summarization and graph cut, two applications of SCP.
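For reference, the standard greedy baseline for monotone SCP, whose approximation guarantee the proposed scalable algorithm nearly matches, can be sketched as follows; the coverage function in the example is only an illustration.

```python
def greedy_submodular_cover(universe, f, tau):
    """Standard greedy for monotone Submodular Cover: repeatedly add the
    element with the largest marginal gain until f(S) reaches tau."""
    S = set()
    while f(S) < tau:
        best, best_gain = None, 0.0
        for e in universe - S:
            gain = f(S | {e}) - f(S)
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:          # no element improves f; tau is unreachable
            break
        S.add(best)
    return S

# Example: set cover as a monotone submodular function.
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
coverage = lambda S: len(set().union(*(sets[i] for i in S)) if S else set())
print(greedy_submodular_cover(set(sets), coverage, tau=5))   # e.g. {1, 3}
```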
In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. We view the expression information as the combination of the shared information (expression similarities) across different expressions and the unique information (expression-specific variations) of each expression. More specifically, FDRL mainly consists of two crucial networks: a Feature Decomposition Network (FDN) and a Feature Reconstruction Network (FRN). In particular, FDN first decomposes the basic features extracted from a backbone network into a set of facial action-aware latent features to model expression similarities. Then, FRN captures the intra-feature and inter-feature relationships of the latent features to characterize expression-specific variations, and reconstructs the expression feature. To this end, two modules, an intra-feature relation modeling module and an inter-feature relation modeling module, are developed in FRN. Experimental results on both in-the-lab databases (including CK+, MMI, and Oulu-CASIA) and in-the-wild databases (including RAF-DB and SFEW) show that the proposed FDRL method consistently achieves higher recognition accuracy than several state-of-the-art methods. This clearly highlights the benefit of feature decomposition and reconstruction for classifying expressions.
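A compact, hypothetical PyTorch sketch of the decompose-then-reconstruct idea, with illustrative layer sizes rather than the paper's actual FDN/FRN design, is:

```python
import torch
import torch.nn as nn

class DecomposeReconstruct(nn.Module):
    """Toy decomposition/reconstruction head: split a backbone feature into
    K latent features, weight them by learned importance, and recombine."""
    def __init__(self, feat_dim=512, n_latent=8, latent_dim=64, n_classes=7):
        super().__init__()
        self.decompose = nn.ModuleList(
            [nn.Linear(feat_dim, latent_dim) for _ in range(n_latent)])
        self.weighting = nn.Linear(latent_dim, 1)           # per-latent importance
        self.reconstruct = nn.Linear(n_latent * latent_dim, feat_dim)
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, x):                                   # x: (B, feat_dim)
        latents = [layer(x) for layer in self.decompose]    # K x (B, latent_dim)
        stacked = torch.stack(latents, dim=1)               # (B, K, latent_dim)
        w = torch.softmax(self.weighting(stacked), dim=1)   # (B, K, 1)
        weighted = (w * stacked).flatten(1)                 # (B, K*latent_dim)
        expr_feat = self.reconstruct(weighted)              # (B, feat_dim)
        return self.classifier(expr_feat)

logits = DecomposeReconstruct()(torch.randn(4, 512))
print(logits.shape)  # torch.Size([4, 7])
```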
In this paper, we propose to apply a meta-learning approach to low-resource automatic speech recognition (ASR). We formulate ASR for different languages as different tasks and meta-learn the initialization parameters from many pretraining languages to achieve fast adaptation to an unseen target language via the recently proposed model-agnostic meta-learning (MAML) algorithm. We evaluate the proposed approach using six languages as pretraining tasks and four languages as target tasks. Preliminary results show that the proposed method, MetaASR, significantly outperforms the state-of-the-art multitask pretraining approach on all target languages with different combinations of pretraining languages. In addition, owing to MAML's model-agnostic property, this paper also opens a new research direction of applying meta-learning to more speech-related applications.
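A minimal first-order sketch of a MAML-style meta-update, with generic regression tasks standing in for per-language ASR tasks, might look like the following; the first-order approximation and the loss are simplifying assumptions, not necessarily what MetaASR uses.

```python
import copy
import torch
import torch.nn as nn

def fomaml_step(meta_model, meta_opt, tasks, inner_lr=0.01, inner_steps=1):
    """One first-order MAML meta-update.

    tasks: iterable of (support_x, support_y, query_x, query_y) tensors,
           each tuple standing in for one pretraining language."""
    loss_fn = nn.MSELoss()
    meta_opt.zero_grad()
    for support_x, support_y, query_x, query_y in tasks:
        learner = copy.deepcopy(meta_model)                  # task-specific copy
        inner_opt = torch.optim.SGD(learner.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                         # inner-loop adaptation
            inner_opt.zero_grad()
            loss_fn(learner(support_x), support_y).backward()
            inner_opt.step()
        learner.zero_grad()
        loss_fn(learner(query_x), query_y).backward()        # query loss on adapted copy
        # First-order approximation: accumulate the adapted copy's gradients
        # into the shared initialization.
        for p_meta, p_task in zip(meta_model.parameters(), learner.parameters()):
            p_meta.grad = (p_task.grad.clone() if p_meta.grad is None
                           else p_meta.grad + p_task.grad)
    meta_opt.step()
```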
In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the model to predict a more acceptable answer so as to address the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.
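As a rough sketch only, not the paper's exact formulation, refining a current attention map with a memorized past attention map could look like:

```python
import torch

def reattention(query, key, past_attn, gamma=0.5):
    """Blend current attention scores with a past attention map.

    query:     (B, Lq, D)    key: (B, Lk, D)
    past_attn: (B, Lq, Lk)   attention weights from the previous alignment round
    """
    d = query.size(-1)
    scores = torch.bmm(query, key.transpose(1, 2)) / d ** 0.5   # (B, Lq, Lk)
    # Mixing in the past map lets the model reuse (or move away from)
    # alignments found in earlier rounds, instead of recomputing them blindly.
    return torch.softmax(scores + gamma * past_attn, dim=-1)

B, Lq, Lk, D = 2, 5, 7, 16
q, k = torch.randn(B, Lq, D), torch.randn(B, Lk, D)
past = torch.softmax(torch.randn(B, Lq, Lk), dim=-1)
print(reattention(q, k, past).shape)   # torch.Size([2, 5, 7])
```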