In this paper, we propose a novel fully unsupervised framework that learns action representations suitable for the action segmentation task from the single input video itself, without requiring any training data. Our method is a deep metric learning approach rooted in a shallow network with a triplet loss operating on similarity distributions and a novel triplet selection strategy that effectively models temporal and semantic priors to discover actions in the new representational space. Under these circumstances, we successfully recover temporal boundaries in the learned action representations with higher quality compared with existing unsupervised approaches. The proposed method is evaluated on two widely used benchmark datasets for the action segmentation task and it achieves competitive performance by applying a generic clustering algorithm on the learned representations.
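
As a rough illustration of the triplet idea mentioned above, the sketch below implements the generic triplet margin loss on plain feature vectors. It is not the authors' loss, which operates on similarity distributions; the function names, the Euclidean distance, and the margin value are all assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    Assumption: plain Euclidean distance on raw vectors; the paper's loss
    works on similarity distributions instead.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Hypothetical frame features: the positive is close to the anchor
# (for example, a temporally adjacent frame), the negative is far away.
anchor, positive, negative = [0.0, 0.0], [0.0, 1.0], [3.0, 4.0]
print(triplet_loss(anchor, positive, negative))  # satisfied triplet, zero loss
print(triplet_loss(anchor, negative, positive))  # violated triplet, positive loss
```

The loss is zero once the negative is farther from the anchor than the positive by at least the margin, which is what pushes frames of the same action together in the learned space.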

Related content

Next Point-of-Interest (POI) recommendation is a critical task in location-based services that aims to provide personalized suggestions for the user's next destination. Previous works on POI recommendation have focused on modeling the user's spatial preference. However, existing works that leverage spatial information are based only on the aggregation of users' previously visited positions, which discourages the model from recommending POIs in novel areas. This trait of position-based methods harms the model's performance in many situations. Additionally, incorporating sequential information into the user's spatial preference remains a challenge. In this paper, we propose Diff-POI: a diffusion-based model that samples the user's spatial preference for next POI recommendation. Inspired by the wide application of diffusion algorithms in sampling from distributions, Diff-POI encodes the user's visiting sequence and spatial character with two tailor-designed graph encoding modules, followed by a diffusion-based sampling strategy to explore the user's spatial visiting trends. We leverage the diffusion process and its reversed form to sample from the posterior distribution and optimize the corresponding score function. We design a joint training and inference framework to optimize and evaluate the proposed Diff-POI. Extensive experiments on four real-world POI recommendation datasets demonstrate the superiority of our Diff-POI over state-of-the-art baseline methods. Further ablation and parameter studies on Diff-POI reveal the functionality and effectiveness of the proposed diffusion-based sampling strategy for addressing the limitations of existing methods.

This paper presents a novel approach to Zero-Shot Action Recognition. Recent works have explored the detection and classification of objects to obtain semantic information from videos with remarkable performance. Inspired by them, we propose using video captioning methods to extract semantic information about objects, scenes, humans, and their relationships. To the best of our knowledge, this is the first work to represent both videos and labels with descriptive sentences. More specifically, we represent videos using sentences generated via video captioning methods and classes using sentences extracted from documents acquired through search engines on the Internet. Using these representations, we build a shared semantic space employing BERT-based embedders pre-trained in the paraphrasing task on multiple text datasets. The projection of both visual and semantic information onto this space is straightforward, as they are sentences, enabling classification using the nearest neighbor rule. We demonstrate that representing videos and labels with sentences alleviates the domain adaptation problem. Additionally, we show that word vectors are unsuitable for building the semantic embedding space of our descriptions. Our method outperforms the state of the art on the UCF101 dataset by 3.3 p.p. in accuracy under the TruZe protocol and achieves competitive results on both the UCF101 and HMDB51 datasets under the conventional protocol (0/50\% - training/testing split). Our code is available at //github.com/valterlej/zsarcap.
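
The nearest neighbor rule in a shared sentence-embedding space can be sketched as follows. This is a minimal illustration, not the authors' code: the three-dimensional toy vectors stand in for BERT-based sentence embeddings, and the use of cosine similarity, the label names, and the function names are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_label(video_vec, label_vecs):
    """Return the label whose embedding is most similar to the video embedding."""
    return max(label_vecs, key=lambda name: cosine(video_vec, label_vecs[name]))


# Toy embeddings (assumed values): in practice both sides would come from
# the same sentence embedder applied to captions and class descriptions.
label_vecs = {
    "archery": [1.0, 0.0, 0.2],
    "surfing": [0.0, 1.0, 0.1],
}
video_vec = [0.9, 0.1, 0.3]
print(nearest_label(video_vec, label_vecs))
```

Because videos and labels are both represented as sentences, the same embedder serves both sides and no separate visual-to-semantic projection has to be learned.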

This paper introduces a novel approach for human-to-robot motion retargeting, enabling robots to mimic human motion with precision while preserving the semantics of the motion. For that, we propose a deep learning method for direct translation from human to robot motion. Our method does not require annotated paired human-to-robot motion data, which reduces the effort when adopting new robots. To this end, we first propose a cross-domain similarity metric to compare the poses from different domains (i.e., human and robot). Then, our method achieves the construction of a shared latent space via contrastive learning and decodes latent representations to robot motion control commands. The learned latent space exhibits expressiveness as it captures the motions precisely and allows direct motion control in the latent space. We showcase how to generate in-between motion through simple linear interpolation in the latent space between two projected human poses. Additionally, we conducted a comprehensive evaluation of robot control using diverse modality inputs, such as texts, RGB videos, and key-poses, which enhances the ease of robot control for users of all backgrounds. Finally, we compare our model with existing works and quantitatively and qualitatively demonstrate the effectiveness of our approach, enhancing natural human-robot communication and fostering trust in integrating robots into daily life.
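
The in-between motion generation described above amounts to linear interpolation between two latent codes. A minimal sketch, assuming latent codes are plain lists of floats; the function name and the toy two-dimensional codes are hypothetical, and in the paper's setting each interpolated code would then be passed through the learned decoder to obtain robot commands.

```python
def interpolate(z_start, z_end, steps):
    """Return `steps` latent codes evenly spaced from z_start to z_end (inclusive)."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both endpoints")
    codes = []
    for i in range(steps):
        t = i / (steps - 1)  # interpolation weight in [0, 1]
        codes.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return codes


# Toy latent codes for two projected human poses (assumed values).
for z in interpolate([0.0, 2.0], [1.0, 0.0], steps=3):
    print(z)
```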

In this paper, we propose Riemannian Acceleration with Preconditioning (RAP) for symmetric eigenvalue problems, which are among the most important geodesically convex optimization problems on Riemannian manifolds, and obtain the acceleration. Firstly, the preconditioning for symmetric eigenvalue problems from the Riemannian manifold viewpoint is discussed. In order to obtain the local geodesic convexity, we develop the leading angle to measure the quality of the preconditioner for symmetric eigenvalue problems. A new Riemannian acceleration, called the Locally Optimal Riemannian Accelerated Gradient (LORAG) method, is proposed to overcome the local geodesic convexity for symmetric eigenvalue problems. With similar techniques for RAGD and analysis of local convex optimization in Euclidean space, we analyze the convergence of LORAG. Incorporating the local geodesic convexity of symmetric eigenvalue problems under preconditioning with the LORAG, we propose the Riemannian Acceleration with Preconditioning (RAP) and prove its acceleration. Additionally, when the Schwarz preconditioner, especially the overlapping or non-overlapping domain decomposition method, is applied for elliptic eigenvalue problems, we also obtain the rate of convergence as $1-C\kappa^{-1/2}$, where $C$ is a constant independent of the mesh sizes and the eigenvalue gap, $\kappa=\kappa_{\nu}\lambda_{2}/(\lambda_{2}-\lambda_{1})$, $\kappa_{\nu}$ is the parameter from the stable decomposition, and $\lambda_{1}$ and $\lambda_{2}$ are the smallest two eigenvalues of the elliptic operator. Numerical results show the power of Riemannian acceleration and preconditioning.

In this paper, we apply information theory to provide an approximate expression of the steady-state probability distribution for blockchain systems. We achieve this goal by maximizing an entropy function subject to specific constraints. These constraints are based on some prior information, including the average numbers of transactions in the block and the transaction pool, respectively. Furthermore, we use some numerical experiments to analyze how the key factors in this approximate expression depend on the crucial parameters of the blockchain system. As a result, this approximate expression has important theoretical significance in promoting practical applications of blockchain technology. At the same time, not only do the method and results given in this paper provide a new line of research in the study of blockchain queueing systems, but they also provide the theoretical basis and technical support for how to apply information theory to the investigation of blockchain queueing networks and stochastic models more broadly.
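
To make the maximum-entropy step concrete, the following is a one-dimensional illustration with a single mean constraint. It is a simplified sketch, not the paper's model: the symbols $p_n$, $m$, $\alpha$, $\beta$, and $\rho$ are introduced here for illustration, and the paper constrains two averages (block and transaction pool) rather than one.

```latex
% Maximize entropy over distributions (p_n), n = 0, 1, 2, ..., with known mean m > 0.
\begin{align}
  \max_{p}\; & -\sum_{n\ge 0} p_n \ln p_n
  \quad\text{subject to}\quad
  \sum_{n\ge 0} p_n = 1, \qquad \sum_{n\ge 0} n\,p_n = m. \\
% Stationarity of the Lagrangian with multipliers \alpha and \beta:
  & -\ln p_n - 1 - \alpha - \beta n = 0
  \;\Longrightarrow\;
  p_n = e^{-1-\alpha}\, e^{-\beta n}. \\
% Writing \rho = e^{-\beta} and imposing normalization and the mean constraint:
  & p_n = (1-\rho)\,\rho^{n}, \qquad
  \frac{\rho}{1-\rho} = m
  \;\Longrightarrow\;
  \rho = \frac{m}{1+m}.
\end{align}
```

With only a mean constraint the maximizer is geometric; adding a second averaged quantity in the same way yields a product of two such factors, one per constraint.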

This paper is written for a Festschrift in honour of Professor Marc Hallin and it proposes some developments on quantile regression. We connect our investigation to Marc's scientific production and we present some theoretical and methodological advances for quantile estimation in nonstandard settings. We split our contributions into two parts. The first part is about conditional quantile estimation for nonstationary time series. The second part is about conditional quantile estimation for the analysis of multivariate independent data in the presence of possibly large dimensional covariates. Monte Carlo studies illustrate numerically the performance of our methods and compare them to some extant techniques.

In this research work, we propose a high-order time-adapted scheme for pricing a coupled system of fixed-free boundary constant elasticity of variance (CEV) models on both equidistant and locally refined space grids. The performance of our method is substantially enhanced to handle irregularities in the model, both inherent and induced. Furthermore, the system of coupled PDEs is strongly nonlinear and involves several time-dependent coefficients that include the first-order derivative of the early exercise boundary. These coefficients are approximated from a fourth-order analytical approximation which is derived using a regularized square-root function. The semi-discrete equation for the option value and delta sensitivity is obtained from a non-uniform fourth-order compact finite difference scheme. The fifth-order 5(4) Dormand-Prince time integration method is used to solve the coupled system of discrete equations. Enhancing the performance of our proposed method with local mesh refinement and adaptive strategies enables us to obtain highly accurate solutions with very coarse space grids, hence reducing computational runtime substantially. We further verify the performance of our methodology as compared with some of the well-known and better-performing existing methods.

In this paper we establish limit theorems for power variations of stochastic processes controlled by fractional Brownian motions with Hurst parameter $H\leq 1/2$. We show that the power variations of such processes can be decomposed into a mix of several weighted random sums plus some remainder terms, and the convergences of power variations are dominated by different combinations of those weighted sums depending on whether $H<1/4$, $H=1/4$, or $H>1/4$. We show that when $H\geq 1/4$ the centered power variation converges stably at the rate $n^{-1/2}$, and when $H<1/4$ it converges in probability at the rate $n^{-2H}$. We determine the limit of the mixed weighted sum based on a rough path approach developed in \cite{LT20}.
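
For readers unfamiliar with the quantity being studied, the sketch below computes the plain $p$-th power variation of a discretely sampled path, that is, the sum of $|X_{t_{i+1}} - X_{t_i}|^p$ over consecutive samples. This is the standard unnormalized definition; the paper's statistic may include a normalization or centering that is not reproduced here, and the function name and sample path are illustrative assumptions.

```python
def power_variation(path, p):
    """Sum of |increment|**p over consecutive samples of a discretized path."""
    return sum(abs(b - a) ** p for a, b in zip(path, path[1:]))


# A short deterministic sample path (assumed values, not simulated fBm).
path = [0.0, 1.0, 3.0, 2.0]
print(power_variation(path, 1))  # total variation of the sampled path
print(power_variation(path, 2))  # quadratic variation of the sampled path
```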

In this paper, two novel classes of implicit exponential Runge-Kutta (ERK) methods are studied for solving highly oscillatory systems. Firstly, we analyze the symplectic conditions for two kinds of exponential integrators and obtain the symplectic method. In order to effectively solve highly oscillatory problems, we aim to design highly accurate implicit ERK integrators. By comparing the Taylor series expansion of the numerical solution with that of the exact solution, it can be verified that the order conditions of the two new kinds of exponential methods are identical to those of classical Runge-Kutta (RK) methods, which implies that, using the coefficients of RK methods, some highly accurate numerical methods can be formulated directly. Furthermore, we also investigate the linear stability properties of these exponential methods. Finally, numerical results not only display the long-time energy preservation of the symplectic method, but also demonstrate the accuracy and efficiency of these formulated methods in comparison with standard ERK methods.

In this paper, we focus on the self-supervised learning of visual correspondence using unlabeled videos in the wild. Our method simultaneously considers intra- and inter-video representation associations for reliable correspondence estimation. The intra-video learning transforms the image contents across frames within a single video via the frame pair-wise affinity. To obtain the discriminative representation for instance-level separation, we go beyond the intra-video analysis and construct the inter-video affinity to facilitate the contrastive transformation across different videos. By forcing the transformation consistency between intra- and inter-video levels, the fine-grained correspondence associations are well preserved and the instance-level feature discrimination is effectively reinforced. Our simple framework outperforms the recent self-supervised correspondence methods on a range of visual tasks including video object tracking (VOT), video object segmentation (VOS), pose keypoint tracking, etc. It is worth mentioning that our method also surpasses the fully-supervised affinity representation (e.g., ResNet) and performs competitively against the recent fully-supervised algorithms designed for the specific tasks (e.g., VOT and VOS).
