又黄又爽又色的视频免费_午夜男女爽爽爽免费大片_日本YY午夜电影日本久久久_无遮挡十八禁在线真人视频_六月婷婷国产精品综合_岛国视频在线观看_一级做A爱片久久毛片A高清

This paper considers a two-player game where each player chooses a resource from a finite collection of options. Each resource brings a random reward. Both players have statistical information regarding the rewards of each resource. Additionally, there exists an information asymmetry where each player has knowledge of the reward realizations of different subsets of the resources. If both players choose the same resource, the reward is divided equally between them, whereas if they choose different resources, each player gains the full reward of the resource. We first implement the iterative best response algorithm to find an $\epsilon$-approximate Nash equilibrium for this game. This method of finding a Nash equilibrium may not be desirable when players do not trust each other and place no assumptions on the incentives of the opponent. To handle this case, we solve the problem of maximizing the worst-case expected utility of the first player. The solution leads to counter-intuitive insights in certain special cases. To solve the general version of the problem, we develop an efficient algorithmic solution that combines online-convex optimization and the drift-plus penalty technique.

相關內容

INFORMS

關注 10

《計算機信息》雜志發表高質量的論文，擴大了運籌學和計算的范圍，尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文，以及描述新的和有用的軟件工具的論文。官網鏈接： · 優化器 · 線性的 · Learning · Projection ·

2023 年 9 月 28 日

Method and Validation for Optimal Lineup Creation for Daily Fantasy Football Using Machine Learning and Linear Programming

Joseph M. Mahoney,Tomasz B. Paniak

from arxiv, 32 pages, 12 figures

Daily fantasy sports (DFS) are weekly or daily online contests where real-game performances of individual players are converted to fantasy points (FPTS). Users select players for their lineup to maximize their FPTS within a set player salary cap. This paper focuses on (1) the development of a method to forecast NFL player performance under uncertainty and (2) determining an optimal lineup to maximize FPTS under a set salary limit. A supervised learning neural network was created and used to project FPTS based on past player performance (2018 NFL regular season for this work) prior to the upcoming week. These projected FPTS were used in a mixed integer linear program to find the optimal lineup. The performance of resultant lineups was compared to randomly-created lineups. On average, the optimal lineups outperformed the random lineups. The generated lineups were then compared to real-world lineups from users on DraftKings. The generated lineups generally fell in approximately the 31st percentile (median). The FPTS methods and predictions presented here can be further improved using this study as a baseline comparison.

Continuity · MoDELS · Performer · 縮放 · 上下文窗口 ·

2023 年 9 月 27 日

Effective Long-Context Scaling of Foundation Models

Wenhan Xiong,Jingyu Liu,Igor Molybog,Hejia Zhang,Prajjwal Bhargava,Rui Hou,Louis Martin,Rashi Rungta,Karthik Abinav Sankararaman,Barlas Oguz,Madian Khabsa,Han Fang,Yashar Mehdad,Sharan Narang,Kshitiz Malik,Angela Fan,Shruti Bhosale,Sergey Edunov,Mike Lewis,Sinong Wang,Hao Ma

We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our model series are built through continual pretraining from Llama 2 with longer training sequences and on a dataset where long texts are upsampled. We perform extensive evaluation on language modeling, synthetic context probing tasks, and a wide range of research benchmarks. On research benchmarks, our models achieve consistent improvements on most regular tasks and significant improvements on long-context tasks over Llama 2. Notably, with a cost-effective instruction tuning procedure that does not require human-annotated long instruction data, the 70B variant can already surpass gpt-3.5-turbo-16k's overall performance on a suite of long-context tasks. Alongside these results, we provide an in-depth analysis on the individual components of our method. We delve into Llama's position encodings and discuss its limitation in modeling long dependencies. We also examine the impact of various design choices in the pretraining process, including the data mix and the training curriculum of sequence lengths -- our ablation experiments suggest that having abundant long texts in the pretrain dataset is not the key to achieving strong performance, and we empirically verify that long context continual pretraining is more efficient and similarly effective compared to pretraining from scratch with long sequences.

回合 · 機器人 · 可約的 · Job Shop · 機器人操作平臺 ·

2023 年 9 月 27 日

Communication-Constrained Multi-Robot Exploration with Intermittent Rendezvous

Alysson Ribeiro da Silva,Luiz Chaimowicz,Vijay Kumar,Thales Costa Silva,Ani Hsieh

from arxiv, 7 pages, 12 figures, 1 table, video: //youtu.be/EuVbCoyjuIY

This paper deals with the Multi-robot Exploration (MRE) under communication constraints problem. We propose a novel intermittent rendezvous method that allows robots to explore an unknown environment while sharing maps at rendezvous locations through agreements. In our method, robots update the agreements to spread the rendezvous locations during the exploration and prioritize exploring unknown areas near them. To generate the agreements automatically, we reduced the MRE to instances of the Job Shop Scheduling Problem (JSSP) and ensured intermittent communication through a temporal connectivity graph. We evaluate our method in simulation in various virtual urban environments and a Gazebo simulation using the Robot Operating System (ROS). Our results suggest that our method can be better than using relays or maintaining intermittent communication with a base station since we can explore faster without additional hardware to create a relay network.

損失 · 3D · 監督 · 控制器 · 可交換的 ·

2023 年 9 月 26 日

Emotional Speech-Driven Animation with Content-Emotion Disentanglement

Radek Daně?ek,Kiran Chhatre,Shashank Tripathi,Yandong Wen,Michael J. Black,Timo Bolkart

from arxiv, SIGGRAPH Asia 2023 Conference Paper

To be widely adopted, 3D facial avatars must be animated easily, realistically, and directly from speech signals. While the best recent methods generate 3D animations that are synchronized with the input audio, they largely ignore the impact of emotions on facial expressions. Realistic facial animation requires lip-sync together with the natural expression of emotion. To that end, we propose EMOTE (Expressive Model Optimized for Talking with Emotion), which generates 3D talking-head avatars that maintain lip-sync from speech while enabling explicit control over the expression of emotion. To achieve this, we supervise EMOTE with decoupled losses for speech (i.e., lip-sync) and emotion. These losses are based on two key observations: (1) deformations of the face due to speech are spatially localized around the mouth and have high temporal frequency, whereas (2) facial expressions may deform the whole face and occur over longer intervals. Thus, we train EMOTE with a per-frame lip-reading loss to preserve the speech-dependent content, while supervising emotion at the sequence level. Furthermore, we employ a content-emotion exchange mechanism in order to supervise different emotions on the same audio, while maintaining the lip motion synchronized with the speech. To employ deep perceptual losses without getting undesirable artifacts, we devise a motion prior in the form of a temporal VAE. Due to the absence of high-quality aligned emotional 3D face datasets with speech, EMOTE is trained with 3D pseudo-ground-truth extracted from an emotional video dataset (i.e., MEAD). Extensive qualitative and perceptual evaluations demonstrate that EMOTE produces speech-driven facial animations with better lip-sync than state-of-the-art methods trained on the same data, while offering additional, high-quality emotional control.

contrastive · 對比學習 · 學成 · Extensibility · 稀疏 ·

2022 年 3 月 25 日

Versatile Multi-Modal Pre-Training for Human-Centric Perception

Fangzhou Hong,Liang Pan,Zhongang Cai,Ziwei Liu

from arxiv, CVPR 2022; Project Page //hongfz16.github.io/projects/HCMoCo.html; Codes available at //github.com/hongfz16/HCMoCo

Human-centric perception plays a vital role in vision and graphics. But their data annotations are prohibitively expensive. Therefore, it is desirable to have a versatile pre-train model that serves as a foundation for data-efficient downstream tasks transfer. To this end, we propose the Human-Centric Multi-Modal Contrastive Learning framework HCMoCo that leverages the multi-modal nature of human data (e.g. RGB, depth, 2D keypoints) for effective representation learning. The objective comes with two main challenges: dense pre-train for multi-modality data, efficient usage of sparse human priors. To tackle the challenges, we design the novel Dense Intra-sample Contrastive Learning and Sparse Structure-aware Contrastive Learning targets by hierarchically learning a modal-invariant latent space featured with continuous and ordinal feature distribution and structure-aware semantic consistency. HCMoCo provides pre-train for different modalities by combining heterogeneous datasets, which allows efficient usage of existing task-specific human data. Extensive experiments on four downstream tasks of different modalities demonstrate the effectiveness of HCMoCo, especially under data-efficient settings (7.16% and 12% improvement on DensePose Estimation and Human Parsing). Moreover, we demonstrate the versatility of HCMoCo by exploring cross-modality supervision and missing-modality inference, validating its strong ability in cross-modal association and reasoning.

學成 · Networking · INFORMS · Performer · Neural Networks ·

2020 年 2 月 27 日

Meta-Transfer Learning for Zero-Shot Super-Resolution

Jae Woong Soh,Sunwoo Cho,Nam Ik Cho

from arxiv, Will be presented in CVPR 2020

Convolutional neural networks (CNNs) have shown dramatic improvements in single image super-resolution (SISR) by using large-scale external samples. Despite their remarkable performance based on the external dataset, they cannot exploit internal information within a specific image. Another problem is that they are applicable only to the specific condition of data that they are supervised. For instance, the low-resolution (LR) image should be a "bicubic" downsampled noise-free image from a high-resolution (HR) one. To address both issues, zero-shot super-resolution (ZSSR) has been proposed for flexible internal learning. However, they require thousands of gradient updates, i.e., long inference time. In this paper, we present Meta-Transfer Learning for Zero-Shot Super-Resolution (MZSR), which leverages ZSSR. Precisely, it is based on finding a generic initial parameter that is suitable for internal learning. Thus, we can exploit both external and internal information, where one single gradient update can yield quite considerable results. (See Figure 1). With our method, the network can quickly adapt to a given image condition. In this respect, our method can be applied to a large spectrum of image conditions within a fast adaptation process.

估計/估計量 · 3D · 全 · 塑造 · 真實值 ·

2019 年 3 月 3 日

3D Hand Shape and Pose Estimation from a Single RGB Image

Liuhao Ge,Zhou Ren,Yuncheng Li,Zehao Xue,Yingying Wang,Jianfei Cai,Junsong Yuan

from arxiv, CVPR 2019 (Oral), //sites.google.com/site/geliuhaontu/home/cvpr2019

This work addresses a novel and challenging problem of estimating the full 3D hand shape and pose from a single RGB image. Most current methods in 3D hand analysis from monocular RGB images only focus on estimating the 3D locations of hand keypoints, which cannot fully express the 3D shape of hand. In contrast, we propose a Graph Convolutional Neural Network (Graph CNN) based method to reconstruct a full 3D mesh of hand surface that contains richer information of both 3D hand shape and pose. To train networks with full supervision, we create a large-scale synthetic dataset containing both ground truth 3D meshes and 3D poses. When fine-tuning the networks on real-world datasets without 3D ground truth, we propose a weakly-supervised approach by leveraging the depth map as a weak supervision in training. Through extensive evaluations on our proposed new datasets and two public datasets, we show that our proposed method can produce accurate and reasonable 3D hand mesh, and can achieve superior 3D hand pose estimation accuracy when compared with state-of-the-art methods.

小樣本學習 · 學成 · 示例 · 泛函 · 訓練實例 ·

2018 年 12 月 10 日

Learning Embedding Adaptation for Few-Shot Learning

Han-Jia Ye,Hexiang Hu,De-Chuan Zhan,Fei Sha

Learning with limited data is a key challenge for visual recognition. Few-shot learning methods address this challenge by learning an instance embedding function from seen classes and apply the function to instances from unseen classes with limited labels. This style of transfer learning is task-agnostic: the embedding function is not learned optimally discriminative with respect to the unseen classes, where discerning among them is the target task. In this paper, we propose a novel approach to adapt the embedding model to the target classification task, yielding embeddings that are task-specific and are discriminative. To this end, we employ a type of self-attention mechanism called Transformer to transform the embeddings from task-agnostic to task-specific by focusing on relating instances from the test instances to the training instances in both seen and unseen classes. Our approach also extends to both transductive and generalized few-shot classification, two important settings that have essential use cases. We verify the effectiveness of our model on two standard benchmark few-shot classification datasets --- MiniImageNet and CUB, where our approach demonstrates state-of-the-art empirical performance.

Single-Shot · Branch · 目標檢測 · 推斷 · MS ·

2018 年 4 月 8 日

Single-Shot Object Detection with Enriched Semantics

Zhishuai Zhang,Siyuan Qiao,Cihang Xie,Wei Shen,Bo Wang,Alan L. Yuille

We propose a novel single shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground-truth, i.e., no extra annotation is required. In conjunction with that, we employ a global activation module which learns relationship between channels and object classes in a self-supervised manner. Comprehensive experimental results on both PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16 based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.

Performance · 相似度度量 · Performer · state-of-the-art · 圖像檢索 ·

2018 年 4 月 6 日

Cross-Domain Image Matching with Deep Feature Maps

Bailey Kong,James Supancic,Deva Ramanan,Charless C. Fowlkes

We investigate the problem of automatically determining what type of shoe left an impression found at a crime scene. This recognition problem is made difficult by the variability in types of crime scene evidence (ranging from traces of dust or oil on hard surfaces to impressions made in soil) and the lack of comprehensive databases of shoe outsole tread patterns. We find that mid-level features extracted by pre-trained convolutional neural nets are surprisingly effective descriptors for this specialized domains. However, the choice of similarity measure for matching exemplars to a query image is essential to good performance. For matching multi-channel deep features, we propose the use of multi-channel normalized cross-correlation and analyze its effectiveness. Our proposed metric significantly improves performance in matching crime scene shoeprints to laboratory test impressions. We also show its effectiveness in other cross-domain image retrieval problems: matching facade images to segmentation labels and aerial photos to map images. Finally, we introduce a discriminatively trained variant and fine-tune our system through our proposed metric, obtaining state-of-the-art performance.