亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

In this paper, we propose a novel swashplateless-elevon actuation (SEA) for dual-rotor tail-sitter vertical takeoff and landing (VTOL) unmanned aerial vehicles (UAVs). In contrast to the conventional elevon actuation (CEA) which controls both pitch and yaw using elevons, the SEA adopts swashplateless mechanisms to generate an extra moment through motor speed modulation to control pitch and uses elevons solely for controlling yaw, without requiring additional actuators. This decoupled control strategy mitigates the saturation of elevons' deflection needed for large pitch and yaw control actions, thus improving the UAV's control performance on trajectory tracking and disturbance rejection performance in the presence of large external disturbances. Furthermore, the SEA overcomes the actuation degradation issues experienced by the CEA when the UAV is in close proximity to the ground, leading to a smoother and more stable take-off process. We validate and compare the performances of the SEA and the CEA in various real-world flight conditions, including take-off, trajectory tracking, and hover flight and position steps under external disturbance. Experimental results demonstrate that the SEA has better performances than the CEA. Moreover, we verify the SEA's feasibility in the attitude transition process and fixed-wing-mode flight of the VTOL UAV. The results indicate that the SEA can accurately control pitch in the presence of high-speed incoming airflow and maintain a stable attitude during fixed-wing mode flight. Video of all experiments can be found in youtube.com/watch?v=Sx9Rk4Zf7sQ

相關內容

In this paper, we propose a novel text promptable surgical instrument segmentation approach to overcome challenges associated with diversity and differentiation of surgical instruments in minimally invasive surgeries. We redefine the task as text promptable, thereby enabling a more nuanced comprehension of surgical instruments and adaptability to new instrument types. Inspired by recent advancements in vision-language models, we leverage pretrained image and text encoders as our model backbone and design a text promptable mask decoder consisting of attention- and convolution-based prompting schemes for surgical instrument segmentation prediction. Our model leverages multiple text prompts for each surgical instrument through a new mixture of prompts mechanism, resulting in enhanced segmentation performance. Additionally, we introduce a hard instrument area reinforcement module to improve image feature comprehension and segmentation precision. Extensive experiments on several surgical instrument segmentation datasets demonstrate our model's superior performance and promising generalization capability. To our knowledge, this is the first implementation of a promptable approach to surgical instrument segmentation, offering significant potential for practical application in the field of robotic-assisted surgery. Code is available at //github.com/franciszzj/TP-SIS.

In this paper, we propose to develop a new Cram\'er-Rao Bound (CRB) when the parameter to estimate lies in a manifold and follows a prior distribution. This derivation leads to a natural inequality between an error criteria based on geometrical properties and this new bound. This main contribution is illustrated in the problem of covariance estimation when the data follow a Gaussian distribution and the prior distribution is an inverse Wishart. Numerical simulation shows new results where the proposed CRB allows to exhibit interesting properties of the MAP estimator which are not observed with the classical Bayesian CRB.

In the last years, several variants of transformers have emerged. In this paper, we compare different transformer-based models for solving the reverse dictionary task and explore their use in the context of a serious game called The Dictionary Game.

In this paper we present a non-local numerical scheme based on the Local Discontinuous Galerkin method for a non-local diffusive partial differential equation with application to traffic flow. In this model, the velocity is determined by both the average of the traffic density as well as the changes in the traffic density at a neighborhood of each point. We discuss nonphysical behaviors that can arise when including diffusion, and our measures to prevent them in our model. The numerical results suggest that this is an accurate method for solving this type of equation and that the model can capture desired traffic flow behavior. We show that computation of the non-local convolution results in $\mathcal{O}(n^2)$ complexity, but the increased computation time can be mitigated with high-order schemes like the one proposed.

In this paper, we propose a novel Energy-Calibrated Generative Model that utilizes a Conditional EBM for enhancing Variational Autoencoders (VAEs). VAEs are sampling efficient but often suffer from blurry generation results due to the lack of training in the generative direction. On the other hand, Energy-Based Models (EBMs) can generate high-quality samples but require expensive Markov Chain Monte Carlo (MCMC) sampling. To address these issues, we introduce a Conditional EBM for calibrating the generative direction during training, without requiring it for test time sampling. Our approach enables the generative model to be trained upon data and calibrated samples with adaptive weight, thereby enhancing efficiency and effectiveness without necessitating MCMC sampling in the inference phase. We also show that the proposed approach can be extended to calibrate normalizing flows and variational posterior. Moreover, we propose to apply the proposed method to zero-shot image restoration via neural transport prior and range-null theory. We demonstrate the effectiveness of the proposed method through extensive experiments in various applications, including image generation and zero-shot image restoration. Our method shows state-of-the-art performance over single-step non-adversarial generation.

In this paper, we propose leveraging the active reconfigurable intelligence surface (RIS) to support reliable gradient aggregation for over-the-air computation (AirComp) enabled federated learning (FL) systems. An analysis of the FL convergence property reveals that minimizing gradient aggregation errors in each training round is crucial for narrowing the convergence gap. As such, we formulate an optimization problem, aiming to minimize these errors by jointly optimizing the transceiver design and RIS configuration. To handle the formulated highly non-convex problem, we devise a two-layer alternative optimization framework to decompose it into several convex subproblems, each solvable optimally. Simulation results demonstrate the superiority of the active RIS in reducing gradient aggregation errors compared to its passive counterpart.

In this paper we propose two new subclasses of Petri nets with resets, for which the reachability and coverability problems become tractable. Namely, we add an acyclicity condition that only applies to the consumptions and productions, not the resets. The first class is acyclic Petri nets with resets, and we show that coverability is PSPACE-complete for them. This contrasts the known Ackermann-hardness for coverability in (not necessarily acyclic) Petri nets with resets. We prove that the reachability problem remains undecidable for acyclic Petri nets with resets. The second class concerns workflow nets, a practically motivated and natural subclass of Petri nets. Here, we show that both coverability and reachability in acyclic workflow nets with resets are PSPACE-complete. Without the acyclicity condition, reachability and coverability in workflow nets with resets are known to be equally hard as for Petri nets with resets, that being Ackermann-hard and undecidable, respectively.

In this paper, we derive a novel optimal image transport algorithm over sparse dictionaries by taking advantage of Sparse Representation (SR) and Optimal Transport (OT). Concisely, we design a unified optimization framework in which the individual image features (color, textures, styles, etc.) are encoded using sparse representation compactly, and an optimal transport plan is then inferred between two learned dictionaries in accordance with the encoding process. This paradigm gives rise to a simple but effective way for simultaneous image representation and transformation, which is also empirically solvable because of the moderate size of sparse coding and optimal transport sub-problems. We demonstrate its versatility and many benefits to different image-to-image translation tasks, in particular image color transform and artistic style transfer, and show the plausible results for photo-realistic transferred effects.

In this paper, we propose a novel method for speaker adaptation in lip reading, motivated by two observations. Firstly, a speaker's own characteristics can always be portrayed well by his/her few facial images or even a single image with shallow networks, while the fine-grained dynamic features associated with speech content expressed by the talking face always need deep sequential networks to represent accurately. Therefore, we treat the shallow and deep layers differently for speaker adaptive lip reading. Secondly, we observe that a speaker's unique characteristics ( e.g. prominent oral cavity and mandible) have varied effects on lip reading performance for different words and pronunciations, necessitating adaptive enhancement or suppression of the features for robust lip reading. Based on these two observations, we propose to take advantage of the speaker's own characteristics to automatically learn separable hidden unit contributions with different targets for shallow layers and deep layers respectively. For shallow layers where features related to the speaker's characteristics are stronger than the speech content related features, we introduce speaker-adaptive features to learn for enhancing the speech content features. For deep layers where both the speaker's features and the speech content features are all expressed well, we introduce the speaker-adaptive features to learn for suppressing the speech content irrelevant noise for robust lip reading. Our approach consistently outperforms existing methods, as confirmed by comprehensive analysis and comparison across different settings. Besides the evaluation on the popular LRW-ID and GRID datasets, we also release a new dataset for evaluation, CAS-VSR-S68h, to further assess the performance in an extreme setting where just a few speakers are available but the speech content covers a large and diversified range.

In this paper, we propose a novel Feature Decomposition and Reconstruction Learning (FDRL) method for effective facial expression recognition. We view the expression information as the combination of the shared information (expression similarities) across different expressions and the unique information (expression-specific variations) for each expression. More specifically, FDRL mainly consists of two crucial networks: a Feature Decomposition Network (FDN) and a Feature Reconstruction Network (FRN). In particular, FDN first decomposes the basic features extracted from a backbone network into a set of facial action-aware latent features to model expression similarities. Then, FRN captures the intra-feature and inter-feature relationships for latent features to characterize expression-specific variations, and reconstructs the expression feature. To this end, two modules including an intra-feature relation modeling module and an inter-feature relation modeling module are developed in FRN. Experimental results on both the in-the-lab databases (including CK+, MMI, and Oulu-CASIA) and the in-the-wild databases (including RAF-DB and SFEW) show that the proposed FDRL method consistently achieves higher recognition accuracy than several state-of-the-art methods. This clearly highlights the benefit of feature decomposition and reconstruction for classifying expressions.

北京阿比特科技有限公司