
The "short cycle removal" technique was recently introduced by Abboud, Bringmann, Khoury and Zamir (STOC '22) to prove fine-grained hardness of approximation. Its main technical result is that listing all triangles in an $n^{1/2}$-regular graph is $n^{2-o(1)}$-hard under the 3-SUM conjecture even when the number of short cycles is small; namely, when the number of $k$-cycles is $O(n^{k/2+\gamma})$ for $\gamma<1/2$. Abboud et al. achieve $\gamma\geq 1/4$ by applying structure vs. randomness arguments on graphs. In this paper, we take a step back and apply conceptually similar arguments on the numbers of the 3-SUM problem. Consequently, we achieve the best possible $\gamma=0$ and the following lower bounds under the 3-SUM conjecture:

* Approximate distance oracles: The seminal Thorup-Zwick distance oracles achieve stretch $2k\pm O(1)$ after preprocessing a graph in $O(m n^{1/k})$ time. For the same stretch, and assuming the query time is $n^{o(1)}$, Abboud et al. proved an $\Omega(m^{1+\frac{1}{12.7552 \cdot k}})$ lower bound on the preprocessing time; we improve it to $\Omega(m^{1+\frac1{2k}})$, which is only a factor 2 away from the upper bound. We also obtain tight bounds for stretch $2+o(1)$ and $3-\epsilon$ and higher lower bounds for dynamic shortest paths.
* Listing 4-cycles: Abboud et al. proved the first super-linear lower bound for listing all 4-cycles in a graph, ruling out $(m^{1.1927}+t)^{1+o(1)}$ time algorithms where $t$ is the number of 4-cycles. We settle the complexity of this basic problem by showing that the $\widetilde{O}(\min(m^{4/3},n^2) +t)$ upper bound is tight up to $n^{o(1)}$ factors.

Our results exploit a rich tool set from additive combinatorics, most notably the Balog-Szemerédi-Gowers theorem and Ruzsa's covering lemma. A key ingredient that may be of independent interest is a subquadratic algorithm for 3-SUM if one of the sets has small doubling.
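To make the two notions in the last sentence concrete, here is a minimal Python sketch of the textbook quadratic baseline for 3-SUM (in its three-set form) and of the doubling constant $|A+A|/|A|$ of a set. This is not the subquadratic algorithm of the paper, which the abstract does not describe; the function names `three_sum_pairs` and `doubling_constant` are illustrative assumptions.

```python
def three_sum_pairs(A, B, C):
    """Quadratic baseline: all (a, b) with a in A, b in B and a + b in C.

    Illustrative only; this is the standard hashing baseline, not the
    subquadratic small-doubling algorithm mentioned in the abstract.
    """
    c_set = set(C)  # hash set gives expected O(1) membership tests
    return [(a, b) for a in A for b in B if a + b in c_set]


def doubling_constant(A):
    """Return |A + A| / |A| for a non-empty collection of integers."""
    a_set = set(A)
    sumset = {x + y for x in a_set for y in a_set}
    return len(sumset) / len(a_set)


if __name__ == "__main__":
    print(three_sum_pairs([1, 2], [3, 4], [5]))
    # An arithmetic progression is highly structured: small doubling.
    print(doubling_constant(range(10)))
    # Powers of two have no additive structure: large doubling.
    print(doubling_constant([1, 2, 4, 8]))
```

An arithmetic progression of length $n$ has doubling constant $(2n-1)/n < 2$, whereas a set with no additive structure has doubling constant close to $(|A|+1)/2$; "small doubling" refers to the former regime.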

Related content

Predictor/decision function · Networking · Neural Networks · Loss · Scenario

December 8, 2023

Loss Minimization Yields Multicalibration for Large Neural Networks

Jarosław Błasiok,Parikshit Gopalan,Lunjia Hu,Adam Tauman Kalai,Preetum Nakkiran

from arxiv, In ITCS 2024

Multicalibration is a notion of fairness for predictors that requires them to provide calibrated predictions across a large set of protected groups. Multicalibration is known to be a goal distinct from loss minimization, even for simple predictors such as linear functions. In this work, we consider the setting where the protected groups can be represented by neural networks of size $k$, and the predictors are neural networks of size $n > k$. We show that minimizing the squared loss over all neural nets of size $n$ implies multicalibration for all but a bounded number of unlucky values of $n$. We also give evidence that our bound on the number of unlucky values is tight, given our proof technique. Previously, results of the flavor that loss minimization yields multicalibration were known only for predictors that were near the ground truth, and hence were rather limited in applicability. Unlike these, our results rely on the expressivity of neural nets and utilize the representation of the predictor.
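As a rough illustration of what "calibrated predictions across a set of groups" means, the following Python sketch computes a simplified, binned proxy for multicalibration error: the largest absolute mean residual over all (group, prediction-bin) cells. The function name, the equal-width binning, and the representation of groups as boolean masks are assumptions made for this example; the paper's formal definition may differ.

```python
def multicalibration_error(preds, labels, groups, n_bins=10):
    """Max over (group, bin) cells of |mean(label - pred)|.

    preds  : list of predictions in [0, 1]
    labels : list of outcomes (0/1 or real-valued)
    groups : dict mapping a group name to a list of booleans (membership)
    This is a simplified proxy, not the paper's formal definition.
    """
    worst = 0.0
    for mask in groups.values():
        sums = [0.0] * n_bins
        counts = [0] * n_bins
        for p, y, in_group in zip(preds, labels, mask):
            if not in_group:
                continue
            # Equal-width bins; a prediction of exactly 1.0 goes in the last bin.
            b = min(int(p * n_bins), n_bins - 1)
            sums[b] += y - p
            counts[b] += 1
        for s, c in zip(sums, counts):
            if c > 0:
                worst = max(worst, abs(s / c))
    return worst


if __name__ == "__main__":
    preds = [0.25, 0.25, 0.75, 0.75]
    labels = [0, 1, 1, 1]
    groups = {"all": [True] * 4, "g": [True, False, True, False]}
    print(multicalibration_error(preds, labels, groups, n_bins=2))
```

A predictor can have zero error on the "all" group and still be miscalibrated on a subgroup, which is why the maximum ranges over every group in the collection.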

Estimation/estimator · Normalized · contrastive · Inference · Design

December 7, 2023

Parameter Inference for Hypo-Elliptic Diffusions under a Weak Design Condition

Yuga Iguchi,Alexandros Beskos

We address the problem of parameter estimation for degenerate diffusion processes defined via the solution of Stochastic Differential Equations (SDEs) with a diffusion matrix that is not full-rank. For this class of hypo-elliptic diffusions, recent works have proposed contrast estimators that are asymptotically normal, provided that the step-size between observations $\Delta=\Delta_n$ and their total number $n$ satisfy $n \to \infty$, $n \Delta_n \to \infty$, $\Delta_n \to 0$, and additionally $\Delta_n = o (n^{-1/2})$. This latter restriction places a requirement for a so-called "rapidly increasing experimental design". In this paper, we overcome this limitation and develop a general contrast estimator satisfying asymptotic normality under the weaker design condition $\Delta_n = o(n^{-1/p})$ for general $p \ge 2$. Such a result has been obtained for elliptic SDEs in the literature, but its derivation in a hypo-elliptic setting is highly non-trivial. We provide numerical results to illustrate the advantages of the developed theory.

state-of-the-art · HTTPS · Model complexity · ENJOY · Better

December 7, 2023

4D Gaussian Splatting for Real-Time Dynamic Scene Rendering

Guanjun Wu,Taoran Yi,Jiemin Fang,Lingxi Xie,Xiaopeng Zhang,Wei Wei,Wenyu Liu,Qi Tian,Xinggang Wang

from arxiv, Project page: //guanjunwu.github.io/4dgs/

Representing and rendering dynamic scenes has been an important but challenging task. In particular, it is usually hard to guarantee high efficiency while accurately modeling complex motions. To achieve real-time dynamic scene rendering while also enjoying high training and storage efficiency, we propose 4D Gaussian Splatting (4D-GS) as a holistic representation for dynamic scenes rather than applying 3D-GS for each individual frame. In 4D-GS, a novel explicit representation containing both 3D Gaussians and 4D neural voxels is proposed. A decomposed neural voxel encoding algorithm inspired by HexPlane is proposed to efficiently build Gaussian features from 4D neural voxels, and then a lightweight MLP is applied to predict Gaussian deformations at novel timestamps. Our 4D-GS method achieves real-time rendering under high resolutions, 82 FPS at an 800$\times$800 resolution on an RTX 3090 GPU, while maintaining comparable or better quality than previous state-of-the-art methods. More demos and code are available at //guanjunwu.github.io/4dgs/.

MoDELS · Few-shot learning · Language modeling · Performer · Image captioning

December 6, 2023

Self-Supervised Open-Ended Classification with Small Visual Language Models

Mohammad Mahdi Derakhshani,Ivona Najdenkoska,Cees G. M. Snoek,Marcel Worring,Yuki M. Asano

We present Self-Context Adaptation (SeCAt), a self-supervised approach that unlocks few-shot abilities for open-ended classification with small visual language models. Our approach imitates image captions in a self-supervised way based on clustering a large pool of images followed by assigning semantically-unrelated names to clusters. By doing so, we construct a training signal consisting of interleaved sequences of image and pseudocaption pairs and a query image, which we denote as the 'self-context' sequence. Based on this signal the model is trained to produce the right pseudo-caption. We demonstrate the performance and flexibility of SeCAt on several multimodal few-shot datasets, spanning various granularities. By using models with approximately 1B parameters we outperform the few-shot abilities of much larger models, such as Frozen and FROMAGe. SeCAt opens new possibilities for research and applications in open-ended few-shot learning that otherwise requires access to large or proprietary models.

CASES · Performer · Image captioning · Transformation · INFORMS

December 6, 2023

Metamorphic Testing of Image Captioning Systems via Image-Level Reduction

Xiaoyuan Xie,Xingpeng Li,Songqiang Chen

from arxiv, Due to the limitation "The abstract field cannot be longer than 1,920 characters", the abstract here is shorter than that in the PDF file

The Image Captioning (IC) technique is widely used to describe images in natural language. Recently, some IC system testing methods have been proposed. However, these methods still rely on pre-annotated information and hence cannot really alleviate the oracle problem in testing. Second, these methods artificially manipulate objects, which may generate unreal images as test cases and thus lead to less meaningful testing results. Third, existing methods have various requirements on the eligibility of source test cases, and hence cannot fully utilize the given images to perform testing. To tackle these issues, in this paper, we propose REIC to perform metamorphic testing for IC systems with some image-level reduction transformations like image cropping and stretching. Instead of relying on the pre-annotated information, REIC uses a localization method to align objects in the caption with corresponding objects in the image, and checks whether each object is correctly described or deleted in the caption after transformation. With the image-level reduction transformations, REIC does not artificially manipulate any objects and hence can avoid generating unreal follow-up images. Besides, it eliminates the requirement on the eligibility of source test cases in the metamorphic transformation process, decreases the ambiguity, and boosts the diversity among the follow-up test cases, which consequently enables testing to be performed on any test image and reveals more distinct valid violations. We employ REIC to test five popular IC systems. The results demonstrate that REIC can sufficiently leverage the provided test images to generate realistic follow-up cases, and effectively detect a great number of distinct violations, without the need for any pre-annotated information.
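The check described above (an object should still be described if it survives the transformation, and should disappear from the caption if it was cropped away) can be sketched in Python as follows. This is a hypothetical illustration, not REIC's implementation: the box format `(x0, y0, x1, y1)`, the function names, and the assumption that a localization step has already mapped caption objects to boxes are all made up for the example. Objects that only partially overlap the crop are treated as ambiguous and skipped.

```python
def box_inside(box, crop):
    """True if box lies entirely within crop; boxes are (x0, y0, x1, y1)."""
    return (box[0] >= crop[0] and box[1] >= crop[1]
            and box[2] <= crop[2] and box[3] <= crop[3])


def box_disjoint(box, crop):
    """True if box and crop do not overlap at all."""
    return (box[2] <= crop[0] or box[0] >= crop[2]
            or box[3] <= crop[1] or box[1] >= crop[3])


def check_crop_relation(source_objects, crop, followup_objects):
    """Return a list of (object, reason) violations of the cropping relation.

    source_objects   : dict of object name -> box found in the source image
                       (assumed output of some localization step)
    crop             : the region kept by the cropping transformation
    followup_objects : set of object names in the follow-up caption
    """
    violations = []
    for name, box in source_objects.items():
        if box_inside(box, crop) and name not in followup_objects:
            violations.append((name, "missing"))      # kept but not described
        elif box_disjoint(box, crop) and name in followup_objects:
            violations.append((name, "not deleted"))  # removed but still described
    return violations


if __name__ == "__main__":
    objects = {"dog": (10, 10, 50, 50), "ball": (200, 200, 220, 220)}
    print(check_crop_relation(objects, (0, 0, 100, 100), {"dog", "ball"}))
```

Because the expected behavior is derived from the transformation itself, no pre-annotated ground-truth caption is needed, which is the point of a metamorphic relation.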

Subspace · Pyramid · Networking · Shuffle · Attention

December 6, 2023

Technical Report on Subspace Pyramid Fusion Network for Semantic Segmentation

Mohammed A. M. Elhassan,Chenhui Yang,Chenxi Huang,Tewodros Legesse Munea

The following is a technical report to test the validity of the proposed Subspace Pyramid Fusion Module (SPFM) for capturing multi-scale feature representations, which are more useful for semantic segmentation. In this investigation, we have proposed the Efficient Shuffle Attention Module (ESAM) to reconstruct the skip-connection paths by fusing multi-level global context features. Experimental results on two well-known semantic segmentation datasets, CamVid and Cityscapes, show the effectiveness of our proposed method.

Large language models · Multimodal · Language modeling · Performer · MoDELS

December 6, 2023

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

Chaoyou Fu,Peixian Chen,Yunhang Shen,Yulei Qin,Mengdan Zhang,Xu Lin,Jinrui Yang,Xiawu Zheng,Ke Li,Xing Sun,Yunsheng Wu,Rongrong Ji

from arxiv, Project page: //github.com/BradyFU/Awesome-Multimodal-Large-Language-Models

Multimodal Large Language Models (MLLMs) rely on a powerful LLM to perform multimodal tasks, showing amazing emergent abilities in recent studies, such as writing poems based on an image. However, it is difficult for these case studies to fully reflect the performance of MLLMs, as a comprehensive evaluation is lacking. In this paper, we fill this gap, presenting the first comprehensive MLLM Evaluation benchmark MME. It measures both perception and cognition abilities on a total of 14 subtasks. In order to avoid data leakage that may arise from direct use of public datasets for evaluation, the annotations of instruction-answer pairs are all manually designed. The concise instruction design allows us to fairly compare MLLMs, instead of struggling with prompt engineering. Besides, with such an instruction, we can also easily carry out quantitative statistics. A total of 30 advanced MLLMs are comprehensively evaluated on our MME, which not only suggests that existing MLLMs still have substantial room for improvement, but also reveals potential directions for subsequent model optimization.

Video captioning (Video Caption) · INFORMS · Performer · Distillation · Extensibility

March 31, 2020

Spatio-Temporal Graph for Video Captioning with Knowledge Distillation

Boxiao Pan,Haoye Cai,De-An Huang,Kuan-Hui Lee,Adrien Gaidon,Ehsan Adeli,Juan Carlos Niebles

from arxiv, CVPR 2020

Video captioning is a challenging task that requires a deep understanding of visual scenes. State-of-the-art methods generate captions using either scene-level or object-level information but without explicitly modeling object interactions. Thus, they often fail to make visually grounded predictions, and are sensitive to spurious correlations. In this paper, we propose a novel spatio-temporal graph model for video captioning that exploits object interactions in space and time. Our model builds interpretable links and is able to provide explicit visual grounding. To avoid unstable performance caused by the variable number of objects, we further propose an object-aware knowledge distillation mechanism, in which local object information is used to regularize global scene features. We demonstrate the efficacy of our approach through extensive experiments on two benchmarks, showing our approach yields competitive performance with interpretable predictions.

Single-Shot · Branch · Object detection · Inference · MS

April 8, 2018

Single-Shot Object Detection with Enriched Semantics

Zhishuai Zhang,Siyuan Qiao,Cihang Xie,Wei Shen,Bo Wang,Alan L. Yuille

We propose a novel single-shot object detection network named Detection with Enriched Semantics (DES). Our motivation is to enrich the semantics of object detection features within a typical deep detector, by a semantic segmentation branch and a global activation module. The segmentation branch is supervised by weak segmentation ground-truth, i.e., no extra annotation is required. In conjunction with that, we employ a global activation module which learns the relationship between channels and object classes in a self-supervised manner. Comprehensive experimental results on both PASCAL VOC and MS COCO detection datasets demonstrate the effectiveness of the proposed method. In particular, with a VGG16-based DES, we achieve an mAP of 81.7 on VOC2007 test and an mAP of 32.8 on COCO test-dev with an inference speed of 31.5 milliseconds per image on a Titan Xp GPU. With a lower resolution version, we achieve an mAP of 79.7 on VOC2007 with an inference speed of 13.0 milliseconds per image.

Question answering · MoDELS · Networking · Processing (programming language) · state-of-the-art

January 15, 2018

An Interpretable Reasoning Network for Multi-Relation Question Answering

Mantong Zhou,Minlie Huang,Xiaoyan Zhu

Multi-relation Question Answering is a challenging task, because it requires elaborate analysis of questions and reasoning over multiple fact triples in a knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis.