Reflectance bounds the frequency spectrum of illumination in the object appearance. In this paper, we introduce the first stochastic inverse rendering method, which recovers the full frequency spectrum of an illumination jointly with the object reflectance from a single image. Our key idea is to solve this blind inverse problem in the reflectance map, an appearance representation invariant to the underlying geometry, by learning to reverse the image formation with a novel diffusion model which we refer to as the Diffusion Reflectance Map Network (DRMNet). Given an observed reflectance map converted and completed from the single input image, DRMNet generates a reflectance map corresponding to a perfect mirror sphere while jointly estimating the reflectance. The forward process can be understood as gradually filtering a natural illumination with lower and lower frequency reflectance and additive Gaussian noise. DRMNet learns to invert this process with two subnetworks, IllNet and RefNet, which work in concert towards this joint estimation. The network is trained on an extensive synthetic dataset and is demonstrated to generalize to real images, showing state-of-the-art accuracy on established datasets.

Related content

Mobile-Agent · Agent · Operations · Identifiable · MoDELS

January 29, 2024

Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception

Junyang Wang,Haiyang Xu,Jiabo Ye,Ming Yan,Weizhou Shen,Ji Zhang,Fei Huang,Jitao Sang

from arxiv, 13 pages, 14 figures

Mobile device agents based on Multimodal Large Language Models (MLLMs) are becoming a popular application. In this paper, we introduce Mobile-Agent, an autonomous multi-modal mobile device agent. Mobile-Agent first leverages visual perception tools to accurately identify and locate both the visual and textual elements within the app's front-end interface. Based on the perceived vision context, it then autonomously plans and decomposes the complex operation task, and navigates the mobile apps through operations step by step. Unlike previous solutions that rely on XML files of apps or mobile system metadata, Mobile-Agent allows for greater adaptability across diverse mobile operating environments in a vision-centric way, thereby eliminating the need for system-specific customizations. To assess the performance of Mobile-Agent, we introduce Mobile-Eval, a benchmark for evaluating mobile device operations. Based on Mobile-Eval, we conduct a comprehensive evaluation of Mobile-Agent. The experimental results indicate that Mobile-Agent achieves remarkable accuracy and completion rates. Even with challenging instructions, such as multi-app operations, Mobile-Agent can still complete the requirements. Code and model will be open-sourced at https://github.com/X-PLUG/MobileAgent.

Operations · Lecture notes · Trace · Stream · Functional

January 29, 2024

A Fully Compositional Theory of Sequential Digital Circuits: Denotational, Operational and Algebraic Semantics

Dan R. Ghica,George Kaye,David Sprunger

from arxiv, Improved content and presentation, 31 pages

Digital circuits, despite having been studied for nearly a century and used at scale for about half that time, have until recently evaded a fully compositional theoretical understanding, in which arbitrary circuits may be freely composed together without consulting their internals. Recent work remedied this theoretical shortcoming by showing how digital circuits can be presented compositionally as morphisms in a freely generated symmetric traced category. However, this was done informally; in this paper we refine and expand the previous work in several ways, culminating in the presentation of three sound and complete semantics for digital circuits: denotational, operational and algebraic. For the denotational semantics, we establish a correspondence between stream functions with certain properties and circuits constructed syntactically. For the operational semantics, we present the reductions required to model how a circuit processes a value, including the addition of a new reduction for eliminating non-delay-guarded feedback; this leads to an adequate notion of observational equivalence for digital circuits. Finally, we define a new family of equations for translating circuits into bisimilar circuits of a 'normal form', leading to a complete algebraic semantics for sequential circuits.

Markov · Sample complexity · Estimation/Estimator · Controller · Minimax

January 26, 2024

Offline Estimation of Controlled Markov Chains: Minimaxity and Sample Complexity

Imon Banerjee,Harsha Honnappa,Vinayak Rao

from arxiv, 71 pages, 23 main

In this work, we study a natural nonparametric estimator of the transition probability matrices of a finite controlled Markov chain. We consider an offline setting with a fixed dataset, collected using a so-called logging policy. We develop sample complexity bounds for the estimator and establish conditions for minimaxity. Our statistical bounds depend on the logging policy through its mixing properties. We show that achieving a particular statistical risk bound involves a subtle and interesting trade-off between the strength of the mixing properties and the number of samples. We demonstrate the validity of our results under various examples, such as ergodic Markov chains, weakly ergodic inhomogeneous Markov chains, and controlled Markov chains with non-stationary Markov, episodic, and greedy controls. Lastly, we use these sample complexity bounds to establish concomitant ones for offline evaluation of stationary Markov control policies.
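As a rough illustration of the kind of estimator discussed above, the sketch below computes empirical transition frequencies from a fixed logged dataset. The abstract does not spell out the estimator, so the count-based form, the function name `estimate_transitions`, and the toy data are all assumptions made for this example.

```python
from collections import defaultdict


def estimate_transitions(logged, n_states):
    """Count-based estimate of P(next_state | state, action).

    `logged` is a list of (state, action, next_state) triples collected
    offline. State-action pairs that never occur in the data get no
    estimate (they are simply absent from the result).
    """
    counts = defaultdict(lambda: [0] * n_states)
    for state, action, next_state in logged:
        counts[(state, action)][next_state] += 1

    estimates = {}
    for pair, row in counts.items():
        total = sum(row)
        estimates[pair] = [c / total for c in row]
    return estimates


# Hypothetical logged data over 2 states and 2 actions.
logged = [(0, 0, 1), (1, 1, 0), (0, 0, 0), (0, 0, 1), (1, 1, 1), (1, 0, 0)]
p_hat = estimate_transitions(logged, n_states=2)
```

In this toy dataset the pair (state 0, action 1) is never visited, so no row is estimated for it. This is one simple way to see why the coverage and mixing behavior of the logging policy matters for the quality of such an estimate.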

Transformation · Decoding · International Digital Economy Academy (IDEA) · Subspace · Code

January 26, 2024

Explicit Subcodes of Reed-Solomon Codes that Efficiently Achieve List Decoding Capacity

Amit Berman,Yaron Shany,Itzhak Tamo

from arxiv, 20 pages

In this paper, we introduce a novel explicit family of subcodes of Reed-Solomon (RS) codes that efficiently achieve list decoding capacity with a constant output list size. Our approach builds upon the idea of large linear subcodes of RS codes evaluated on a subfield, similar to the method employed by Guruswami and Xing (STOC 2013). However, our approach diverges by leveraging the idea of permuted product codes, thereby simplifying the construction by avoiding the need for subspace designs. Specifically, the codes are constructed by initially forming the tensor product of two RS codes with carefully selected evaluation sets, followed by specific cyclic shifts to the codeword rows. This process results in each codeword column being treated as an individual coordinate, reminiscent of prior capacity-achieving codes, such as folded RS codes and univariate multiplicity codes. This construction is easily shown to be a subcode of an interleaved RS code, equivalently, an RS code evaluated on a subfield. Alternatively, the codes can be constructed by the evaluation of bivariate polynomials over orbits generated by two affine transformations with coprime orders, extending the earlier use of a single affine transformation in folded RS codes and the recent affine folded RS codes introduced by Bhandari et al. (IEEE T-IT, Feb. 2024). While our codes require a large, yet constant, characteristic, the two affine transformations facilitate achieving code length equal to the field size, without the restriction of the field being prime, contrasting with univariate multiplicity codes.

Boltzmann machine · MoDELS · Transformation · Partition function · Paper

January 26, 2024

Topology-Aware Exploration of Energy-Based Models Equilibrium: Toric QC-LDPC Codes and Hyperbolic MET QC-LDPC Codes

Vasiliy Usatyuk,Denis Sapozhnikov,Sergey Egorov

from arxiv, 16 pages, 29 figures. arXiv admin note: text overlap with arXiv:2307.15778

This paper presents a method for achieving equilibrium in the ISING Hamiltonian when confronted with unevenly distributed charges on an irregular grid. Employing (Multi-Edge) QC-LDPC codes and the Boltzmann machine, our approach involves dimensionally expanding the system, substituting charges with circulants, and representing distances through circulant shifts. This results in a systematic mapping of the charge system onto a space, transforming the irregular grid into a uniform configuration, applicable to Torical and Circular Hyperboloid Topologies. The paper covers fundamental definitions and notations related to QC-LDPC Codes, Multi-Edge QC-LDPC codes, and the Boltzmann machine. It explores the marginalization problem in code on the graph probabilistic models for evaluating the partition function, encompassing exact and approximate estimation techniques. Rigorous proof is provided for the attainability of equilibrium states for the Boltzmann machine under Torical and Circular Hyperboloid, paving the way for the application of our methodology. Practical applications of our approach are investigated in Finite Geometry QC-LDPC Codes, specifically in Material Science. The paper further explores its effectiveness in the realm of Natural Language Processing Transformer Deep Neural Networks, examining Generalized Repeat Accumulate Codes, Spatially-Coupled and Cage-Graph QC-LDPC Codes. The versatile and impactful nature of our topology-aware hardware-efficient quasi-cycle codes equilibrium method is showcased across diverse scientific domains without the use of specific section delineations.

INFORMS · Information theory · Code · Extensibility · Performer

January 25, 2024

A Mathematical Theory of Semantic Communication: Overview

Kai Niu,Ping Zhang

from arxiv, 6 pages, 2 figures. This paper is submitted to the 2024 IEEE International Symposium on Information Theory (ISIT 2024). arXiv admin note: substantial text overlap with arXiv:2401.13387

Semantic communication initiates a new direction for future communication. In this paper, we aim to establish a systematic framework of semantic information theory (SIT). First, we propose a semantic communication model and define the synonymous mapping to indicate the critical relationship between semantic information and syntactic information. Based on this core concept, we introduce the measures of semantic information, such as semantic entropy $H_s(\tilde{U})$, up/down semantic mutual information $I^s(\tilde{X};\tilde{Y})$ $(I_s(\tilde{X};\tilde{Y}))$, semantic capacity $C_s=\max_{p(x)}I^s(\tilde{X};\tilde{Y})$, and semantic rate-distortion function $R_s(D)=\min_{p(\hat{x}|x):\mathbb{E}d_s(\tilde{x},\hat{\tilde{x}})\leq D}I_s(\tilde{X};\hat{\tilde{X}})$. Furthermore, we prove three coding theorems of SIT, that is, the semantic source coding theorem, semantic channel coding theorem, and semantic rate-distortion coding theorem. We find that the limits of information theory are extended by using synonymous mapping, that is, $H_s(\tilde{U})\leq H(U)$, $C_s\geq C$ and $R_s(D)\leq R(D)$. Together, these results form the basis of semantic information theory. In summary, the theoretical framework proposed in this paper is a natural extension of classic information theory and may reveal great performance potential for future communication.
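The inequality $H_s(\tilde{U})\leq H(U)$ can be illustrated with a small numerical sketch. The abstract does not give the definition of semantic entropy, so this example assumes it is the Shannon entropy of the distribution obtained by summing the probabilities of syntactic symbols that fall into the same synonymous set. The function names and the toy distribution are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def semantic_entropy(probs, synonymous_sets):
    """Entropy after merging symbols that share a meaning.

    Assumption for this sketch: each synonymous set is a list of symbol
    indices, the sets partition the alphabet, and the probability of a
    set is the sum of its members' probabilities.
    """
    merged = [sum(probs[i] for i in group) for group in synonymous_sets]
    return entropy(merged)


# Four equiprobable syntactic symbols; symbols 0 and 1 are synonyms.
p_u = [0.25, 0.25, 0.25, 0.25]
groups = [[0, 1], [2], [3]]

h = entropy(p_u)                       # 2.0 bits
h_s = semantic_entropy(p_u, groups)    # 1.5 bits
```

Merging symbols can only keep entropy the same or lower it, which is consistent with the direction of the inequality stated in the abstract.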

Variance · Online · MoDELS · Performer · Pair

January 24, 2024

Tight Competitive and Variance Analyses of Matching Policies in Gig Platforms

Pan Xu

from arxiv, This paper was accepted to the 2024 ACM Web Conference

In this paper, we propose an online-matching-based model to tackle the two fundamental issues, matching and pricing, existing in a wide range of real-world gig platforms, including ride-hailing (matching riders and drivers), crowdsourcing markets (pairing workers and tasks), and online recommendations (offering items to customers). Our model assumes the arriving distributions of dynamic agents (e.g., riders, workers, and buyers) are accessible in advance, and they can change over time, which is referred to as Known Heterogeneous Distributions (KHD). In this paper, we initiate variance analysis for online matching algorithms under KHD. Unlike the popular competitive-ratio (CR) metric, the variance of online algorithms' performance is rarely studied due to inherent technical challenges, though it is well linked to robustness. We focus on two natural parameterized sampling policies, denoted by $\mathsf{ATT}(\gamma)$ and $\mathsf{SAMP}(\gamma)$, which appear as foundational bedrock in online algorithm design. We offer rigorous competitive ratio (CR) and variance analyses for both policies. Specifically, we show that $\mathsf{ATT}(\gamma)$ with $\gamma \in [0,1/2]$ achieves a CR of $\gamma$ and a variance of $\gamma \cdot (1-\gamma) \cdot B$ on the total number of matches with $B$ being the total matching capacity. In contrast, $\mathsf{SAMP}(\gamma)$ with $\gamma \in [0,1]$ accomplishes a CR of $\gamma (1-\gamma)$ and a variance of $\bar{\gamma} (1-\bar{\gamma})\cdot B$ with $\bar{\gamma}=\min(\gamma,1/2)$. All CR and variance analyses are tight and unconditional of any benchmark. As a byproduct, we prove that $\mathsf{ATT}(\gamma=1/2)$ achieves an optimal CR of $1/2$.
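To make the stated trade-off concrete, the sketch below evaluates the competitive-ratio and variance expressions quoted in the abstract for a given $\gamma$ and capacity $B$. It only evaluates those closed-form expressions and does not implement the matching policies themselves. The function names are hypothetical.

```python
def att_guarantees(gamma, capacity):
    """(CR, variance) expressions quoted for ATT(gamma), gamma in [0, 1/2]."""
    if not 0 <= gamma <= 0.5:
        raise ValueError("ATT(gamma) is stated for gamma in [0, 1/2]")
    return gamma, gamma * (1 - gamma) * capacity


def samp_guarantees(gamma, capacity):
    """(CR, variance) expressions quoted for SAMP(gamma), gamma in [0, 1]."""
    if not 0 <= gamma <= 1:
        raise ValueError("SAMP(gamma) is stated for gamma in [0, 1]")
    gamma_bar = min(gamma, 0.5)
    return gamma * (1 - gamma), gamma_bar * (1 - gamma_bar) * capacity


# Example with a total matching capacity of B = 100.
att_cr, att_var = att_guarantees(0.5, 100)      # CR 0.5, variance 25.0
samp_cr, samp_var = samp_guarantees(0.5, 100)   # CR 0.25, variance 25.0
```

At $\gamma = 1/2$ both expressions give the same variance, while the CR expression for $\mathsf{ATT}$ is twice that of $\mathsf{SAMP}$.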

Engineering · Prompt · Design · BASIC · Paper

January 24, 2024

Prompt Design and Engineering: Introduction and Advanced Methods

Xavier Amatriain

Prompt design and engineering has become an important discipline in just the past few months. In this paper, we provide an introduction to the main concepts as well as review basic and more advanced approaches to prompt design and engineering.

Question answering · Attention mechanism · Reducible · MoDELS · Pooling

May 10, 2021

Poolingformer: Long Document Modeling with Pooling Attention

Hang Zhang,Yeyun Gong,Yelong Shen,Weisheng Li,Jiancheng Lv,Nan Duan,Weizhu Chen

from arxiv, Accepted by ICML 2021

In this paper, we introduce a two-level attention schema, Poolingformer, for long document modeling. Its first level uses a smaller sliding window pattern to aggregate information from neighbors. Its second level employs a larger window to increase receptive fields with pooling attention to reduce both computational cost and memory consumption. We first evaluate Poolingformer on two long sequence QA tasks: the monolingual NQ and the multilingual TyDi QA. Experimental results show that Poolingformer sits atop three official leaderboards measured by F1, outperforming previous state-of-the-art models by 1.9 points (79.8 vs. 77.9) on NQ long answer, 1.9 points (79.5 vs. 77.6) on TyDi QA passage answer, and 1.6 points (67.6 vs. 66.0) on TyDi QA minimal answer. We further evaluate Poolingformer on a long sequence summarization task. Experimental results on the arXiv benchmark continue to demonstrate its superior performance.
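The two-level pattern described above can be sketched at the level of token indices: which positions a query attends to directly, and how a larger window is compressed into fewer pooled entries. The abstract does not specify window sizes or how pooling is carried out, so the window radii, the non-overlapping chunking, and the function names below are assumptions for illustration only.

```python
def first_level_keys(i, seq_len, radius):
    """Positions inside a small sliding window centered on query i."""
    return list(range(max(0, i - radius), min(seq_len, i + radius + 1)))


def second_level_pooled_keys(i, seq_len, radius, stride):
    """Group a larger window around query i into chunks of `stride` tokens.

    Each chunk stands for one pooled key/value, so the query covers the
    whole window while attending to far fewer entries.
    """
    window = list(range(max(0, i - radius), min(seq_len, i + radius + 1)))
    return [window[j:j + stride] for j in range(0, len(window), stride)]


# Hypothetical sizes: sequence of 100 tokens, query at position 50.
local = first_level_keys(50, 100, radius=2)
pooled = second_level_pooled_keys(50, 100, radius=8, stride=4)

tokens_covered = sum(len(chunk) for chunk in pooled)  # 17 tokens
entries_attended = len(pooled)                        # 5 pooled entries
```

With these toy sizes the second level spans 17 tokens using only 5 pooled entries, which shows how pooling can widen the receptive field while keeping the number of attended entries small.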

Rank · Object detection · Performer · Ranking · DATE

March 14, 2018

Revisiting Salient Object Detection: Simultaneous Detection, Ranking, and Subitizing of Multiple Salient Objects

Md Amirul Islam,Mahmoud Kalash,Neil D. B. Bruce

from arxiv, To appear in CVPR 2018

Salient object detection is a problem that has been considered in detail, and many solutions have been proposed. In this paper, we argue that work to date has addressed a problem that is relatively ill-posed. Specifically, there is no universal agreement about what constitutes a salient object when multiple observers are queried. This implies that some objects are more likely to be judged salient than others, and that a relative rank exists among salient objects. The solution presented in this paper solves this more general problem that considers relative rank, and we propose data and metrics suitable for measuring success in a relative object saliency landscape. A novel deep learning solution is proposed based on a hierarchical representation of relative saliency and stage-wise refinement. We also show that the problem of salient object subitizing can be addressed with the same network, and our approach exceeds the performance of any prior work across all metrics considered (both traditional and newly proposed).