This paper introduces a new real and synthetic dataset called NeRFBK specifically designed for testing and comparing NeRF-based 3D reconstruction algorithms. High-quality 3D reconstruction has significant potential in various fields, and advancements in image-based algorithms make it essential to evaluate new advanced techniques. However, gathering diverse data with precise ground truth is challenging and may not encompass all relevant applications. The NeRFBK dataset addresses this issue by providing multi-scale, indoor and outdoor datasets with high-resolution images and videos and camera parameters for testing and comparing NeRF-based algorithms. This paper presents the design and creation of the NeRFBK benchmark, various examples and application scenarios, and highlights its potential for advancing the field of 3D reconstruction.

Related content

3D Reconstruction

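In computer vision, 3D reconstruction is the process of recovering three-dimensional information from single-view or multi-view images. Because a single view carries incomplete information, reconstructing from it requires prior knowledge. Multi-view 3D reconstruction (similar to human binocular localization) is comparatively easier: the cameras are first calibrated, that is, the relationship between each camera's image coordinate system and the world coordinate system is computed, and the information in multiple 2D images is then used to reconstruct 3D information. Object 3D reconstruction is a shared scientific problem and core technology in computer-aided geometric design (CAGD), computer graphics (CG), computer animation, computer vision, medical image processing, scientific computing, virtual reality, and digital media creation. There are two main ways to produce a 3D representation of an object in a computer. One uses geometric modeling software to build a 3D geometric model under human control through human-computer interaction; the other acquires the geometry of a real object by some means. The former is a mature technology supported by a number of existing software packages, such as 3DMAX, Maya, AutoCAD, and UG, which generally represent geometric shapes with curves and surfaces that have mathematical expressions. The latter is generally called 3D reconstruction: the mathematical process and computer technology of recovering an object's 3D information (shape and so on) from 2D projections, comprising steps such as data acquisition, preprocessing, point cloud registration, and feature analysis.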
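The calibrate-then-reconstruct idea can be illustrated with the simplest multi-view case, a rectified stereo pair. The following is a minimal sketch, assuming two identical pinhole cameras that differ only by a horizontal baseline. The function names `project` and `triangulate_rectified` and all parameter values are hypothetical and chosen for illustration; they are not taken from any of the works listed here.

```python
def project(point, focal, cx, cy, cam_x=0.0):
    """Project a 3D point (X, Y, Z) into a pinhole camera.

    The camera sits at (cam_x, 0, 0) and looks down the +Z axis.
    focal, cx, cy are the calibrated intrinsics, in pixels.
    """
    X, Y, Z = point
    u = focal * (X - cam_x) / Z + cx
    v = focal * Y / Z + cy
    return u, v


def triangulate_rectified(left_px, right_px, focal, cx, cy, baseline):
    """Recover (X, Y, Z) from one pixel match in a rectified stereo pair.

    The left camera is at the origin and the right camera at
    (baseline, 0, 0), so depth follows from Z = focal * baseline / disparity.
    """
    (ul, vl), (ur, _) = left_px, right_px
    disparity = ul - ur
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    Z = focal * baseline / disparity
    X = (ul - cx) * Z / focal
    Y = (vl - cy) * Z / focal
    return X, Y, Z


# Usage: project a known point into both views, then recover it.
left = project((0.5, -0.2, 4.0), focal=800.0, cx=320.0, cy=240.0)
right = project((0.5, -0.2, 4.0), focal=800.0, cx=320.0, cy=240.0, cam_x=0.1)
recovered = triangulate_rectified(left, right, 800.0, 320.0, 240.0, 0.1)
print(recovered)
```

General multi-view pipelines handle arbitrary camera poses and noisy matches, which this sketch deliberately leaves out.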

Robustness · Networking · Neural Networks · CNN · Triangulation ·

August 4, 2023

AirVO: An Illumination-Robust Point-Line Visual Odometry

Kuan Xu, Yuefan Hao, Shenghai Yuan, Chen Wang, Lihua Xie

This paper proposes an illumination-robust visual odometry (VO) system that incorporates both accelerated learning-based corner point algorithms and an extended line feature algorithm. To be robust to dynamic illumination, the proposed system employs the convolutional neural network (CNN) and graph neural network (GNN) to detect and match reliable and informative corner points. Then point feature matching results and the distribution of point and line features are utilized to match and triangulate lines. By accelerating CNN and GNN parts and optimizing the pipeline, the proposed system is able to run in real-time on low-power embedded platforms. The proposed VO was evaluated on several datasets with varying illumination conditions, and the results show that it outperforms other state-of-the-art VO systems in terms of accuracy and robustness. The open-source nature of the proposed system allows for easy implementation and customization by the research community, enabling further development and improvement of VO for various applications.

Estimation/Estimator · Orthogonal · Channel · LDPC · Reducible ·

August 4, 2023

A Graph-Based Collision Resolution Scheme for Asynchronous Unsourced Random Access

Tianya Li, Yongpeng Wu, Wenjun Zhang, Xiang-Gen Xia, Chengshan Xiao

from arxiv, 6 pages, 6 figures, accepted by IEEE GLOBECOM 2023

This paper investigates the multiple-input-multiple-output (MIMO) massive unsourced random access in an asynchronous orthogonal frequency division multiplexing (OFDM) system, with both timing and frequency offsets (TFO) and non-negligible user collisions. The proposed coding framework splits the data into two parts encoded by sparse regression code (SPARC) and low-density parity check (LDPC) code. Multistage orthogonal pilots are transmitted in the first part to reduce collision density. Unlike existing schemes requiring a quantization codebook with a large size for estimating TFO, we establish a graph-based channel reconstruction and collision resolution (GB-CR$^2$) algorithm to iteratively reconstruct channels, resolve collisions, and compensate for TFO rotations on the formulated graph jointly among multiple stages. We further propose to leverage the geometric characteristics of signal constellations to correct TFO estimations. Exhaustive simulations demonstrate remarkable performance superiority in channel estimation and data recovery with substantial complexity reduction compared to state-of-the-art schemes.

Analysis · Twitter · INFORMS · Topic · Word Embeddings ·

August 4, 2023

Tweet Insights: A Visualization Platform to Extract Temporal Insights from Twitter

Daniel Loureiro, Kiamehr Rezaee, Talayeh Riahi, Francesco Barbieri, Leonardo Neves, Luis Espinosa Anke, Jose Camacho-Collados

from arxiv, Demo paper. Visualization platform available at //tweetnlp.org/insights

This paper introduces a large collection of time series data derived from Twitter, postprocessed using word embedding techniques, as well as specialized fine-tuned language models. This data comprises the past five years and captures changes in n-gram frequency, similarity, sentiment and topic distribution. The interface built on top of this data enables temporal analysis for detecting and characterizing shifts in meaning, including complementary information to trending metrics, such as sentiment and topic association over time. We release an online demo for easy experimentation, and we share code and the underlying aggregated data for future work. In this paper, we also discuss three case studies unlocked thanks to our platform, showcasing its potential for temporal linguistic analysis.

Stream · MoDELS · Block · Reviewer · Random Variable ·

August 3, 2023

Density-Based Semantics for Reactive Probabilistic Programming

Guillaume Baudart, Louis Mandel, Christine Tasson

Synchronous languages are now a standard industry tool for critical embedded systems. Designers write high-level specifications by composing streams of values using block diagrams. These languages have been extended with Bayesian reasoning to program state-space models which compute a stream of distributions given a stream of observations. However, the semantics of probabilistic models is only defined for scheduled equations -- a significant limitation compared to dataflow synchronous languages and block diagrams which do not require any ordering. In this paper we propose two schedule agnostic semantics for a probabilistic synchronous language. The key idea is to interpret probabilistic expressions as a stream of un-normalized density functions which maps random variable values to a result and positive score. The co-iterative semantics interprets programs as state machines and equations are computed using a fixpoint operator. The relational semantics directly manipulates streams and is thus a better fit to reason about program equivalence. We use the relational semantics to prove the correctness of a program transformation required to run an optimized inference algorithm for state-space models with constant parameters.

Performer · MoDELS · SOTA · E2E · state-of-the-art ·

August 3, 2023

ReIDTrack: Multi-Object Track and Segmentation Without Motion

Kaer Huang, Bingchuan Sun, Feng Chen, Tao Zhang, Jun Xie, Jian Li, Christopher Walter Twombly, Zhepeng Wang

In recent years, dominant Multi-object tracking (MOT) and segmentation (MOTS) methods mainly follow the tracking-by-detection paradigm. Transformer-based end-to-end (E2E) solutions bring some ideas to MOT and MOTS, but they cannot achieve a new state-of-the-art (SOTA) performance in major MOT and MOTS benchmarks. Detection and association are two main modules of the tracking-by-detection paradigm. Association techniques mainly depend on the combination of motion and appearance information. As deep learning has been recently developed, the performance of the detection and appearance model is rapidly improved. These trends made us consider whether we can achieve SOTA based on only high-performance detection and appearance model. Our paper mainly focuses on exploring this direction based on CBNetV2 with Swin-B as a detection model and MoCo-v2 as a self-supervised appearance model. Motion information and IoU mapping were removed during the association. Our method wins 1st place on the MOTS track and wins 2nd on the MOT track in the CVPR2023 WAD workshop. We hope our simple and effective method can give some insights to the MOT and MOTS research community. Source code will be released under this git repository

Mixup · MoDELS · Training Data · Latent · state-of-the-art ·

August 3, 2023

MusicLDM: Enhancing Novelty in Text-to-Music Generation Using Beat-Synchronous Mixup Strategies

Ke Chen, Yusong Wu, Haohe Liu, Marianna Nezhurina, Taylor Berg-Kirkpatrick, Shlomo Dubnov

from arxiv, 16 pages, 3 figures, 2 tables, demo page: //musicldm.github.io/

Diffusion models have shown promising results in cross-modal generation tasks, including text-to-image and text-to-audio generation. However, generating music, as a special type of audio, presents unique challenges due to limited availability of music data and sensitive issues related to copyright and plagiarism. In this paper, to tackle these challenges, we first construct a state-of-the-art text-to-music model, MusicLDM, that adapts Stable Diffusion and AudioLDM architectures to the music domain. We achieve this by retraining the contrastive language-audio pretraining model (CLAP) and the Hifi-GAN vocoder, as components of MusicLDM, on a collection of music data samples. Then, to address the limitations of training data and to avoid plagiarism, we leverage a beat tracking model and propose two different mixup strategies for data augmentation: beat-synchronous audio mixup and beat-synchronous latent mixup, which recombine training audio directly or via a latent embeddings space, respectively. Such mixup strategies encourage the model to interpolate between musical training samples and generate new music within the convex hull of the training data, making the generated music more diverse while still staying faithful to the corresponding style. In addition to popular evaluation metrics, we design several new evaluation metrics based on CLAP score to demonstrate that our proposed MusicLDM and beat-synchronous mixup strategies improve both the quality and novelty of generated music, as well as the correspondence between input text and generated music.
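The interpolation behind mixup can be sketched in a few lines. This is a minimal illustration of generic mixup on two equal-length sample vectors, not the MusicLDM implementation. It assumes the two clips have already been tempo- and beat-aligned by some beat tracking step, which is omitted here, and the function name `mixup` and its parameters are hypothetical.

```python
def mixup(a, b, lam):
    """Return the convex combination lam * a + (1 - lam) * b, element by element.

    a and b are equal-length sequences of samples (raw audio or latent
    values), assumed to be beat-aligned beforehand. lam must lie in [0, 1],
    so every output value stays between the corresponding input values.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


# Usage: blend two toy "clips" with a 25% / 75% weighting.
mixed = mixup([1.0, 0.0], [0.0, 1.0], 0.25)
print(mixed)
```

Because `lam` is restricted to [0, 1], the result is always a point on the segment between the two inputs, which is the sense in which mixup keeps new samples within the convex hull of the training data.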

TOOLS · Learning · Identifiable · Robot · Scenario ·

August 2, 2023

LEMMA: Learning Language-Conditioned Multi-Robot Manipulation

Ran Gong, Xiaofeng Gao, Qiaozi Gao, Suhaila Shakiah, Govind Thattai, Gaurav S. Sukhatme

from arxiv, 8 pages, 3 figures

Complex manipulation tasks often require robots with complementary capabilities to collaborate. We introduce a benchmark for LanguagE-Conditioned Multi-robot MAnipulation (LEMMA) focused on task allocation and long-horizon object manipulation based on human language instructions in a tabletop setting. LEMMA features 8 types of procedurally generated tasks with varying degree of complexity, some of which require the robots to use tools and pass tools to each other. For each task, we provide 800 expert demonstrations and human instructions for training and evaluations. LEMMA poses greater challenges compared to existing benchmarks, as it requires the system to identify each manipulator's limitations and assign sub-tasks accordingly while also handling strong temporal dependencies in each task. To address these challenges, we propose a modular hierarchical planning approach as a baseline. Our results highlight the potential of LEMMA for developing future language-conditioned multi-robot systems.

Cluster · Machine Learning · Learning · Networking · ML ·

August 1, 2023

CASSINI: Network-Aware Job Scheduling in Machine Learning Clusters

Sudarsanan Rajasekaran, Manya Ghobadi, Aditya Akella

We present CASSINI, a network-aware job scheduler for machine learning (ML) clusters. CASSINI introduces a novel geometric abstraction to consider the communication pattern of different jobs while placing them on network links. To do so, CASSINI uses an affinity graph that finds a series of time-shift values to adjust the communication phases of a subset of jobs, such that the communication patterns of jobs sharing the same network link are interleaved with each other. Experiments with 13 common ML models on a 24-server testbed demonstrate that compared to the state-of-the-art ML schedulers, CASSINI improves the average and tail completion time of jobs by up to 1.6x and 2.5x, respectively. Moreover, we show that CASSINI reduces the number of ECN marked packets in the cluster by up to 33x.
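The effect of a time shift on two jobs sharing a link can be shown with a toy example. The sketch below is not CASSINI's geometric abstraction or its affinity graph. It assumes each job's communication pattern is a periodic 0/1 list over the same number of time slots, and it brute-forces the circular shift of the second job that minimizes the slots in which both jobs transmit. The names `overlap` and `best_shift` are hypothetical.

```python
def overlap(a, b, shift):
    """Count slots where job a and job b (delayed by `shift` slots) both communicate.

    a and b are equal-length lists of 0/1 flags covering one shared period;
    the delay wraps around because the patterns repeat.
    """
    n = len(a)
    return sum(a[t] * b[(t - shift) % n] for t in range(n))


def best_shift(a, b):
    """Return the smallest delay for job b that minimizes overlap with job a."""
    if len(a) != len(b):
        raise ValueError("patterns must cover the same number of slots")
    return min(range(len(a)), key=lambda s: overlap(a, b, s))


# Usage: two jobs that both communicate in the first half of the period.
job_a = [1, 1, 0, 0]
job_b = [1, 1, 0, 0]
shift = best_shift(job_a, job_b)
print(shift, overlap(job_a, job_b, shift))
```

In this toy case, delaying the second job by half a period interleaves the two communication phases so they never collide on the link.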

Graph · Knowledge Graph · Link Prediction · Extensibility · entity ·

October 6, 2020

CoDEx: A Comprehensive Knowledge Graph Completion Benchmark

Tara Safavi, Danai Koutra

from arxiv, EMNLP 2020

We present CoDEx, a set of knowledge graph completion datasets extracted from Wikidata and Wikipedia that improve upon existing knowledge graph completion benchmarks in scope and level of difficulty. In terms of scope, CoDEx comprises three knowledge graphs varying in size and structure, multilingual descriptions of entities and relations, and tens of thousands of hard negative triples that are plausible but verified to be false. To characterize CoDEx, we contribute thorough empirical analyses and benchmarking experiments. First, we analyze each CoDEx dataset in terms of logical relation patterns. Next, we report baseline link prediction and triple classification results on CoDEx for five extensively tuned embedding models. Finally, we differentiate CoDEx from the popular FB15K-237 knowledge graph completion dataset by showing that CoDEx covers more diverse and interpretable content, and is a more difficult link prediction benchmark. Data, code, and pretrained models are available at //bit.ly/2EPbrJs.

Language Modeling · MoDELS · Position Embedding · Autoencoder · Mask ·

February 28, 2020

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training

Hangbo Bao, Li Dong, Furu Wei, Wenhui Wang, Nan Yang, Xiaodong Liu, Yu Wang, Songhao Piao, Jianfeng Gao, Ming Zhou, Hsiao-Wuen Hon

from arxiv, 11 pages

We propose to pre-train a unified language model for both autoencoding and partially autoregressive language modeling tasks using a novel training procedure, referred to as a pseudo-masked language model (PMLM). Given an input text with masked tokens, we rely on conventional masks to learn inter-relations between corrupted tokens and context via autoencoding, and pseudo masks to learn intra-relations between masked spans via partially autoregressive modeling. With well-designed position embeddings and self-attention masks, the context encodings are reused to avoid redundant computation. Moreover, conventional masks used for autoencoding provide global masking information, so that all the position embeddings are accessible in partially autoregressive language modeling. In addition, the two tasks pre-train a unified language model as a bidirectional encoder and a sequence-to-sequence decoder, respectively. Our experiments show that the unified language models pre-trained using PMLM achieve new state-of-the-art results on a wide range of natural language understanding and generation tasks across several widely used benchmarks.