在线点播亚洲日韩国产欧美,欧美日韩在线精品视频一区二区三,波多野结衣电影国产一区二区三区,在线观看免费视频黄片

Self-supervised landmark estimation is a challenging task that demands the formation of locally distinct feature representations to identify sparse facial landmarks in the absence of annotated data. To tackle this task, existing state-of-the-art (SOTA) methods (1) extract coarse features from backbones that are trained with instance-level self-supervised learning (SSL) paradigms, which neglect the dense prediction nature of the task, (2) aggregate them into memory-intensive hypercolumn formations, and (3) supervise lightweight projector networks to naively establish full local correspondences among all pairs of spatial features. In this paper, we introduce SCE-MAE, a framework that (1) leverages the MAE, a region-level SSL method that naturally better suits the landmark prediction task, (2) operates on the vanilla feature map instead of on expensive hypercolumns, and (3) employs a Correspondence Approximation and Refinement Block (CARB) that utilizes a simple density peak clustering algorithm and our proposed Locality-Constrained Repellence Loss to directly hone only select local correspondences. We demonstrate through extensive experiments that SCE-MAE is highly effective and robust, outperforming existing SOTA methods by large margins of approximately 20%-44% on the landmark matching and approximately 9%-15% on the landmark detection tasks.

相關內容

估計/估計量

關注 3

Agent · INTERACT · 大語言模型 · Notability · INFORMS ·

2024 年 7 月 9 日

FinCon: A Synthesized LLM Multi-Agent System with Conceptual Verbal Reinforcement for Enhanced Financial Decision Making

Yangyang Yu,Zhiyuan Yao,Haohang Li,Zhiyang Deng,Yupeng Cao,Zhi Chen,Jordan W. Suchow,Rong Liu,Zhenyu Cui,Denghui Zhang,Zhaozhuo Xu,Koduvayur Subbalakshmi,Guojun Xiong,Yueru He,Jimin Huang,Dong Li,Qianqian Xie

from arxiv, LLM Applications, LLM Agents, Financial Technology, Quantitative Finance, Algorithmic Trading, Cognitive Science

Large language models (LLMs) have demonstrated notable potential in conducting complex tasks and are increasingly utilized in various financial applications. However, high-quality sequential financial investment decision-making remains challenging. These tasks require multiple interactions with a volatile environment for every decision, demanding sufficient intelligence to maximize returns and manage risks. Although LLMs have been used to develop agent systems that surpass human teams and yield impressive investment returns, opportunities to enhance multi-sourced information synthesis and optimize decision-making outcomes through timely experience refinement remain unexplored. Here, we introduce the FinCon, an LLM-based multi-agent framework with CONceptual verbal reinforcement tailored for diverse FINancial tasks. Inspired by effective real-world investment firm organizational structures, FinCon utilizes a manager-analyst communication hierarchy. This structure allows for synchronized cross-functional agent collaboration towards unified goals through natural language interactions and equips each agent with greater memory capacity than humans. Additionally, a risk-control component in FinCon enhances decision quality by episodically initiating a self-critiquing mechanism to update systematic investment beliefs. The conceptualized beliefs serve as verbal reinforcement for the future agent's behavior and can be selectively propagated to the appropriate node that requires knowledge updates. This feature significantly improves performance while reducing unnecessary peer-to-peer communication costs. Moreover, FinCon demonstrates strong generalization capabilities in various financial tasks, including single stock trading and portfolio management.

Processing（編程語言） · 操作 · 代價 · MoDELS · 講稿 ·

2024 年 7 月 7 日

Towards Improving Unit Commitment Economics: An Add-On Tailor for Renewable Energy and Reserve Predictions

Xianbang Chen,Yikui Liu,Lei Wu

from arxiv, This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Generally, day-ahead unit commitment (UC) is conducted in a predict-then-optimize process: it starts by predicting the renewable energy source (RES) availability and system reserve requirements; given the predictions, the UC model is then optimized to determine the economic operation plans. In fact, predictions within the process are raw. In other words, if the predictions are further tailored to assist UC in making the economic operation plans against realizations of the RES and reserve requirements, UC economics will benefit significantly. To this end, this paper presents a cost-oriented tailor of RES-and-reserve predictions for UC, deployed as an add-on to the predict-then-optimize process. The RES-and-reserve tailor is trained by solving a bi-level mixed-integer programming model: the upper level trains the tailor based on its induced operating cost; the lower level, given tailored predictions, mimics the system operation process and feeds the induced operating cost back to the upper level; finally, the upper level evaluates the training quality according to the fed-back cost. Through this training, the tailor learns to customize the raw predictions into cost-oriented predictions. Moreover, the tailor can be embedded into the existing predict-then-optimize process as an add-on, improving the UC economics. Lastly, the presented method is compared to traditional, binary-relaxation, neural network-based, stochastic, and robust methods.

集成 · MoDELS · 貪心 · Performer · Boosting（一種模型訓練加速方式） ·

2024 年 7 月 7 日

Ensemble Boost: Greedy Selection for Superior Recommender Systems

Zainil Mehta,Tobias Vente

Ensemble techniques have demonstrated remarkable success in improving predictive performance across various domains by aggregating predictions from multiple models [1]. In the realm of recommender systems, this research explores the application of ensemble technique to enhance recommendation quality. Specifically, we propose a novel approach to combine top-k recommendations from ten diverse recommendation models resulting in superior top-n recommendations using this novel ensemble technique. Our method leverages a Greedy Ensemble Selection(GES) strategy, effectively harnessing the collective intelligence of multiple models. We conduct experiments on five distinct datasets to evaluate the effectiveness of our approach. Evaluation across five folds using the NDCG metric reveals significant improvements in recommendation accuracy across all datasets compared to single best performing model. Furthermore, comprehensive comparisons against existing models underscore the efficacy of our ensemble approach in enhancing recommendation quality. Our ensemble approach yielded an average improvement of 21.67% across different NDCG@N metrics and the five datasets, compared to single best model. The popularity recommendation model serves as the baseline for comparison. This research contributes to the advancement of ensemble-based recommender systems, offering insights into the potential of combining diverse recommendation strategies to enhance user experience and satisfaction. By presenting a novel approach and demonstrating its superiority over existing methods, we aim to inspire further exploration and innovation in this domain.

可穿戴設備 · 機器人 · INTERACT · Pattern Recognition · 模型評估 ·

2024 年 7 月 5 日

MoveTouch: Robotic Motion Capturing System with Wearable Tactile Display to Achieve Safe HRI

Ali Alabbas,Miguel Altamirano Cabrera,Mohamed Sayed,Oussama Alyounes,Qian Liu,Dzmitry Tsetserukou

from arxiv, 14 pages, Eurohaptics 2024

The collaborative robot market is flourishing as there is a trend towards simplification, modularity, and increased flexibility on the production line. But when humans and robots are collaborating in a shared environment, the safety of humans should be a priority. We introduce a novel wearable robotic system to enhance safety during Human-Robot Interaction (HRI). The proposed wearable robot is designed to hold a fiducial marker and maintain its visibility to a motion capture system, which, in turn, localizes the user's hand with good accuracy and low latency and provides vibrotactile feedback to the user's wrist. The vibrotactile feedback guides the user's hand movement during collaborative tasks in order to increase safety and enhance collaboration efficiency. A user study was conducted to assess the recognition and discriminability of ten designed vibration patterns applied to the upper (dorsal) and the down (volar) parts of the user's wrist. The results show that the pattern recognition rate on the volar side was higher, with an average of 75.64% among all users. Four patterns with a high recognition rate were chosen to be incorporated into our system. A second experiment was carried out to evaluate users' response to the chosen patterns in real-world collaborative tasks. Results show that all participants responded to the patterns correctly, and the average response time for the patterns was between 0.24 and 2.41 seconds.

閾值 · AIM · INTERACT · 評論員 · 設計 ·

2024 年 7 月 3 日

SD-BLS: Privacy Preserving Selective Disclosure of Verifiable Credentials with Unlinkable Threshold Revocation

Denis Roio,Rebecca Selvaggini,Andrea D'Intino

from arxiv, 7 pages, 3 figures, accepted for publication by the IEEE International Conference on Blockchain (2024) to be held in Copenhagen

It is of critical importance to design digital identity systems that ensure the privacy of citizens as well as protecting them from issuer corruption. We aim to solve this issue and propose a method for selective disclosure and privacy preserving revocation of digital credentials, using the unique homomorphic characteristics of second order Elliptic Curves and Boneh-Lynn-Shacham (BLS) signatures. Our approach ensures that users can selectively reveal credentials signed by a certain issuer, which can be interactively revoked by a quorum of other agreeing issuers without revealing the identity of users. Our goal is to protect users from issuer corruption by requiring collective agreement among multiple revocation issuers.

MoDELS · 穩健性 · 表示 · Learning · Elevate ·

2024 年 7 月 3 日

Lift, Splat, Map: Lifting Foundation Masks for Label-Free Semantic Scene Completion

Arthur Zhang,Rainier Heijne,Joydeep Biswas

from arxiv, 17 pages, 6 figures, 2 Tables

Autonomous mobile robots deployed in urban environments must be context-aware, i.e., able to distinguish between different semantic entities, and robust to occlusions. Current approaches like semantic scene completion (SSC) require pre-enumerating the set of classes and costly human annotations, while representation learning methods relax these assumptions but are not robust to occlusions and learn representations tailored towards auxiliary tasks. To address these limitations, we propose LSMap, a method that lifts masks from visual foundation models to predict a continuous, open-set semantic and elevation-aware representation in bird's eye view (BEV) for the entire scene, including regions underneath dynamic entities and in occluded areas. Our model only requires a single RGBD image, does not require human labels, and operates in real time. We quantitatively demonstrate our approach outperforms existing models trained from scratch on semantic and elevation scene completion tasks with finetuning. Furthermore, we show that our pre-trained representation outperforms existing visual foundation models at unsupervised semantic scene completion. We evaluate our approach using CODa, a large-scale, real-world urban robot dataset. Supplementary visualizations, code, data, and pre-trained models, will be publicly available soon.

控制器 · MoDELS · 去噪 · AIM · HTTPS ·

2024 年 3 月 7 日

Controllable Generation with Text-to-Image Diffusion Models: A Survey

Pu Cao,Feng Zhou,Qing Song,Lu Yang

from arxiv, A collection of resources on controllable generation with text-to-image diffusion models: //github.com/PRIV-Creation/Awesome-Controllable-T2I-Diffusion-Models

In the rapidly advancing realm of visual generation, diffusion models have revolutionized the landscape, marking a significant shift in capabilities with their impressive text-guided generative functions. However, relying solely on text for conditioning these models does not fully cater to the varied and complex requirements of different applications and scenarios. Acknowledging this shortfall, a variety of studies aim to control pre-trained text-to-image (T2I) models to support novel conditions. In this survey, we undertake a thorough review of the literature on controllable generation with T2I diffusion models, covering both the theoretical foundations and practical advancements in this domain. Our review begins with a brief introduction to the basics of denoising diffusion probabilistic models (DDPMs) and widely used T2I diffusion models. We then reveal the controlling mechanisms of diffusion models, theoretically analyzing how novel conditions are introduced into the denoising process for conditional generation. Additionally, we offer a detailed overview of research in this area, organizing it into distinct categories from the condition perspective: generation with specific conditions, generation with multiple conditions, and universal controllable generation. For an exhaustive list of the controllable generation literature surveyed, please refer to our curated repository at \url{//github.com/PRIV-Creation/Awesome-Controllable-T2I-Diffusion-Models}.

圖 · Networking · INTERACT · INFORMS · 圖形處理器 ·

2020 年 11 月 25 日

Time-Series Event Prediction with Evolutionary State Graph

Wenjie Hu,Yang Yang,Ziqiang Cheng,Carl Yang,Xiang Ren

from arxiv, A long version of EvoNet (WSDM 2021)

The accurate and interpretable prediction of future events in time-series data often requires the capturing of representative patterns (or referred to as states) underpinning the observed data. To this end, most existing studies focus on the representation and recognition of states, but ignore the changing transitional relations among them. In this paper, we present evolutionary state graph, a dynamic graph structure designed to systematically represent the evolving relations (edges) among states (nodes) along time. We conduct analysis on the dynamic graphs constructed from the time-series data and show that changes on the graph structures (e.g., edges connecting certain state nodes) can inform the occurrences of events (i.e., time-series fluctuation). Inspired by this, we propose a novel graph neural network model, Evolutionary State Graph Network (EvoNet), to encode the evolutionary state graph for accurate and interpretable time-series event prediction. Specifically, Evolutionary State Graph Network models both the node-level (state-to-state) and graph-level (segment-to-segment) propagation, and captures the node-graph (state-to-segment) interactions over time. Experimental results based on five real-world datasets show that our approach not only achieves clear improvements compared with 11 baselines, but also provides more insights towards explaining the results of event predictions.

語言模型化 · MoDELS · 位置嵌入 · 自編碼器 · 掩碼 ·

2020 年 2 月 28 日

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Hangbo Bao,Li Dong,Furu Wei,Wenhui Wang,Nan Yang,Xiaodong Liu,Yu Wang,Songhao Piao,Jianfeng Gao,Ming Zhou,Hsiao-Wuen Hon

from arxiv, 11 pages

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.

語言表示 · 知識神經元 · MoDELS · 圖 · 知識圖譜 ·

2019 年 9 月 17 日

K-BERT: Enabling Language Representation with Knowledge Graph

Weijie Liu,Peng Zhou,Zhe Zhao,Zhiruo Wang,Qi Ju,Haotang Deng,Ping Wang

from arxiv, 8 pages, 20190917

Pre-trained language representation models, such as BERT, capture a general language representation from large-scale corpora, but lack domain-specific knowledge. When reading a domain text, experts make inferences with relevant knowledge. For machines to achieve this capability, we propose a knowledge-enabled language representation model (K-BERT) with knowledge graphs (KGs), in which triples are injected into the sentences as domain knowledge. However, too much knowledge incorporation may divert the sentence from its correct meaning, which is called knowledge noise (KN) issue. To overcome KN, K-BERT introduces soft-position and visible matrix to limit the impact of knowledge. K-BERT can easily inject domain knowledge into the models by equipped with a KG without pre-training by-self because it is capable of loading model parameters from the pre-trained BERT. Our investigation reveals promising results in twelve NLP tasks. Especially in domain-specific tasks (including finance, law, and medicine), K-BERT significantly outperforms BERT, which demonstrates that K-BERT is an excellent choice for solving the knowledge-driven problems that require experts.