免费在线黄色电影_中文字幕一二三区乱码不卡_免费永久黄色片在线观看_国产亚洲欧美日韩综合另类_波多野结衣强奷系列在线一区_日本免费一级一区二区三区_禁止18点击进入在线看片尤物

Constructing small-sized coresets for various clustering problems in different metric spaces has attracted significant attention for the past decade. A central problem in the coreset literature is to understand what is the best possible coreset size for $(k,z)$-clustering in Euclidean space. While there has been significant progress in the problem, there is still a gap between the state-of-the-art upper and lower bounds. For instance, the best known upper bound for $k$-means ($z=2$) is $\min \{O(k^{3/2} \varepsilon^{-2}),O(k \varepsilon^{-4})\}$ [1,2], while the best known lower bound is $\Omega(k\varepsilon^{-2})$ [1]. In this paper, we make significant progress on both upper and lower bounds. For a large range of parameters (i.e., $\varepsilon, k$), we have a complete understanding of the optimal coreset size. In particular, we obtain the following results: (1) We present a new coreset lower bound $\Omega(k \varepsilon^{-z-2})$ for Euclidean $(k,z)$-clustering when $\varepsilon \geq \Omega(k^{-1/(z+2)})$. In view of the prior upper bound $\tilde{O}_z(k \varepsilon^{-z-2})$ [1], the bound is optimal. The new lower bound also implies improved lower bounds for $(k,z)$-clustering in doubling metrics. (2) For the upper bound, we provide efficient coreset construction algorithms for $(k,z)$-clustering with improved or optimal coreset sizes in several metric spaces. In particular, we provide an $\tilde{O}_z(k^{\frac{2z+2}{z+2}} \varepsilon^{-2})$-sized coreset, with a unfied analysis, for $(k,z)$-clustering for all $z\geq 1$ in Euclidean space. [1] Cohen-Addad, Larsen, Saulpic, Schwiegelshohn. STOC'22. [2] Cohen-Addad, Larsen, Saulpic, Schwiegelshohn, Sheikh-Omar, NeurIPS'22.

相關內容

優(you)化器

關注 4

哈希學習 · 可辨認的 · 知識 (knowledge) · Machine Learning · Learning ·

2024 年 1 月 5 日

Hashing Modulo Context-Sensitive $α$-Equivalence

Lasse Blaauwbroek,Miroslav Ol?ák,Herman Geuvers

from arxiv, 33 pages

The notion of $\alpha$-equivalence between $\lambda$-terms is commonly used to identify terms that are considered equal. However, due to the primitive treatment of free variables, this notion falls short when comparing subterms occurring within a larger context. Depending on the usage of the Barendregt convention (choosing different variable names for all involved binders), it will equate either too few or too many subterms. We introduce a formal notion of context-sensitive $\alpha$-equivalence, where two open terms can be compared within a context that resolves their free variables. We show that this equivalence coincides exactly with the notion of bisimulation equivalence. Furthermore, we present an efficient $O(n\log n)$ runtime algorithm that identifies $\lambda$-terms modulo context-sensitive $\alpha$-equivalence, improving upon a previously established $O(n\log^2 n)$ bound for a hashing modulo ordinary $\alpha$-equivalence by Maziarz et al. Hashing $\lambda$-terms is useful in many applications that require common subterm elimination and structure sharing. We employ the algorithm to obtain a large-scale, densely packed, interconnected graph of mathematical knowledge from the Coq proof assistant for machine learning purposes.

可理解性 · MoDELS · 大語言模型 · 語言模型化 · Integration ·

2024 年 1 月 5 日

M$^{2}$UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models

Atin Sakkeer Hussain,Shansong Liu,Chenshuo Sun,Ying Shan

The current landscape of research leveraging large language models (LLMs) is experiencing a surge. Many works harness the powerful reasoning capabilities of these models to comprehend various modalities, such as text, speech, images, videos, etc. They also utilize LLMs to understand human intention and generate desired outputs like images, videos, and music. However, research that combines both understanding and generation using LLMs is still limited and in its nascent stage. To address this gap, we introduce a Multi-modal Music Understanding and Generation (M$^{2}$UGen) framework that integrates LLM's abilities to comprehend and generate music for different modalities. The M$^{2}$UGen framework is purpose-built to unlock creative potential from diverse sources of inspiration, encompassing music, image, and video through the use of pretrained MERT, ViT, and ViViT models, respectively. To enable music generation, we explore the use of AudioLDM 2 and MusicGen. Bridging multi-modal understanding and music generation is accomplished through the integration of the LLaMA 2 model. Furthermore, we make use of the MU-LLaMA model to generate extensive datasets that support text/image/video-to-music generation, facilitating the training of our M$^{2}$UGen framework. We conduct a thorough evaluation of our proposed framework. The experimental results demonstrate that our model achieves or surpasses the performance of the current state-of-the-art models.

Copilot · Chatbot · 圖片分類 · 代碼 · Performer ·

2024 年 1 月 4 日

Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks

Hartwig H. Hochmair,Levente Juhasz,Takoda Kemp

from arxiv, Submitted for review in Transactions in GIS

Generative AI including large language models (LLMs) have recently gained significant interest in the geo-science community through its versatile task-solving capabilities including coding, spatial computations, generation of sample data, time-series forecasting, toponym recognition, or image classification. So far, the assessment of LLMs for spatial tasks has primarily focused on ChatGPT, arguably the most prominent AI chatbot, whereas other chatbots received less attention. To narrow this research gap, this study evaluates the correctness of responses for a set of 54 spatial tasks assigned to four prominent chatbots, i.e., ChatGPT-4, Bard, Claude-2, and Copilot. Overall, the chatbots performed well on spatial literacy, GIS theory, and interpretation of programming code and given functions, but revealed weaknesses in mapping, code generation, and code translation. ChatGPT-4 outperformed other chatbots across most task categories.

黑盒 · 有向 · Pair · Analysis · contrastive ·

2024 年 1 月 4 日

Quantum 2-SAT on low dimensional systems is $\mathsf{QMA}_1$-complete: Direct embeddings and black-box simulation

Dorian Rudolph,Sevag Gharibian,Daniel Nagaj

from arxiv, 37 pages, 8 figures

Despite the fundamental role the Quantum Satisfiability (QSAT) problem has played in quantum complexity theory, a central question remains open: At which local dimension does the complexity of QSAT transition from "easy" to "hard"? Here, we study QSAT with each constraint acting on a $k$-dimensional and $l$-dimensional qudit pair, denoted $(k,l)$-QSAT. Our first main result shows that, surprisingly, QSAT on qubits can remain $\mathsf{QMA}_1$-hard, in that $(2,5)$-QSAT is $\mathsf{QMA}_1$-complete. In contrast, $2$-SAT on qubits is well-known to be poly-time solvable [Bravyi, 2006]. Our second main result proves that $(3,d)$-QSAT on the 1D line with $d\in O(1)$ is also $\mathsf{QMA}_1$-hard. Finally, we initiate the study of 1D $(2,d)$-QSAT by giving a frustration-free 1D Hamiltonian with a unique, entangled ground state. Our first result uses a direct embedding, combining a novel clock construction with the 2D circuit-to-Hamiltonian construction of [Gosset, Nagaj, 2013]. Of note is a new simplified and analytic proof for the latter (as opposed to a partially numeric proof in [GN13]). This exploits Unitary Labelled Graphs [Bausch, Cubitt, Ozols, 2017] together with a new "Nullspace Connection Lemma", allowing us to break low energy analyses into small patches of projectors, and to improve the soundness analysis of [GN13] from $\Omega(1/T^6)$ to $\Omega(1/T^2)$, for $T$ the number of gates. Our second result goes via black-box reduction: Given an arbitrary 1D Hamiltonian $H$ on $d'$-dimensional qudits, we show how to embed it into an effective null-space of a 1D $(3,d)$-QSAT instance, for $d\in O(1)$. Our approach may be viewed as a weaker notion of "simulation" (\`a la [Bravyi, Hastings 2017], [Cubitt, Montanaro, Piddock 2018]). As far as we are aware, this gives the first "black-box simulation"-based $\mathsf{QMA}_1$-hardness result, i.e. for frustration-free Hamiltonians.

語言模型化 · MoDELS · Performer · 任務對話系統 · HTTPS ·

2024 年 1 月 4 日

LLaVA-$φ$: Efficient Multi-Modal Assistant with Small Language Model

Yichen Zhu,Minjie Zhu,Ning Liu,Zhicai Ou,Xiaofeng Mou,Jian Tang

from arxiv, technique report

In this paper, we introduce LLaVA-$\phi$ (LLaVA-Phi), an efficient multi-modal assistant that harnesses the power of the recently advanced small language model, Phi-2, to facilitate multi-modal dialogues. LLaVA-Phi marks a notable advancement in the realm of compact multi-modal models. It demonstrates that even smaller language models, with as few as 2.7B parameters, can effectively engage in intricate dialogues that integrate both textual and visual elements, provided they are trained with high-quality corpora. Our model delivers commendable performance on publicly available benchmarks that encompass visual comprehension, reasoning, and knowledge-based perception. Beyond its remarkable performance in multi-modal dialogue tasks, our model opens new avenues for applications in time-sensitive environments and systems that require real-time interaction, such as embodied agents. It highlights the potential of smaller language models to achieve sophisticated levels of understanding and interaction, while maintaining greater resource efficiency.The project is available at {//github.com/zhuyiche/llava-phi}.

因子分析 · 分解的 · Analysis · 穩健性 · 向量化 ·

2024 年 1 月 4 日

Robust bilinear factor analysis based on the matrix-variate $t$ distribution

Xuan Ma,Jianhua Zhao,Changchun Shang,Fen Jiang,Philip L. H. Yu

Factor Analysis based on multivariate $t$ distribution ($t$fa) is a useful robust tool for extracting common factors on heavy-tailed or contaminated data. However, $t$fa is only applicable to vector data. When $t$fa is applied to matrix data, it is common to first vectorize the matrix observations. This introduces two challenges for $t$fa: (i) the inherent matrix structure of the data is broken, and (ii) robustness may be lost, as vectorized matrix data typically results in a high data dimension, which could easily lead to the breakdown of $t$fa. To address these issues, starting from the intrinsic matrix structure of matrix data, a novel robust factor analysis model, namely bilinear factor analysis built on the matrix-variate $t$ distribution ($t$bfa), is proposed in this paper. The novelty is that it is capable to simultaneously extract common factors for both row and column variables of interest on heavy-tailed or contaminated matrix data. Two efficient algorithms for maximum likelihood estimation of $t$bfa are developed. Closed-form expression for the Fisher information matrix to calculate the accuracy of parameter estimates are derived. Empirical studies are conducted to understand the proposed $t$bfa model and compare with related competitors. The results demonstrate the superiority and practicality of $t$bfa. Importantly, $t$bfa exhibits a significantly higher breakdown point than $t$fa, making it more suitable for matrix data.

圖 · CASES · CASE · 情景 · 邊 ·

2024 年 1 月 4 日

Recognition of Unit Segment and Polyline Graphs is $\exists\mathbb{R}$-Complete

Michael Hoffmann,Tillmann Miltzow,Simon Weber,Lasse Wulf

from arxiv, 18 pages, 15 figures

Given a set of objects O in the plane, the corresponding intersection graph is defined as follows. A vertex is created for each object and an edge joins two vertices whenever the corresponding objects intersect. We study here the case of unit segments and polylines with exactly k bends. In the recognition problem, we are given a graph and want to decide whether the graph can be represented as the intersection graph of certain geometric objects. In previous work it was shown that various recognition problems are $\exists\mathbb{R}$-complete, leaving unit segments and polylines as few remaining natural cases. We show that recognition for both families of objects is $\exists\mathbb{R}$-complete.

樣本 · Subspace · 近似 · 得分 · 可約的 ·

2024 年 1 月 3 日

Sharper Bounds for $\ell_p$ Sensitivity Sampling

David P. Woodruff,Taisuke Yasuda

from arxiv, To appear in ICML 2023; added discussion of prior work

In large scale machine learning, random sampling is a popular way to approximate datasets by a small representative subset of examples. In particular, sensitivity sampling is an intensely studied technique which provides provable guarantees on the quality of approximation, while reducing the number of examples to the product of the VC dimension $d$ and the total sensitivity $\mathfrak S$ in remarkably general settings. However, guarantees going beyond this general bound of $\mathfrak S d$ are known in perhaps only one setting, for $\ell_2$ subspace embeddings, despite intense study of sensitivity sampling in prior work. In this work, we show the first bounds for sensitivity sampling for $\ell_p$ subspace embeddings for $p > 2$ that improve over the general $\mathfrak S d$ bound, achieving a bound of roughly $\mathfrak S^{2-2/p}$ for $2<p<\infty$. Furthermore, our techniques yield further new results in the study of sampling algorithms, showing that the root leverage score sampling algorithm achieves a bound of roughly $d$ for $1\leq p<2$, and that a combination of leverage score and sensitivity sampling achieves an improved bound of roughly $d^{2/p}\mathfrak S^{2-4/p}$ for $2<p<\infty$. Our sensitivity sampling results yield the best known sample complexity for a wide class of structured matrices that have small $\ell_p$ sensitivity.

樣本 · MoDELS · INFORMS · Notability · GANs ·

2024 年 1 月 3 日

S$^{2}$-DMs:Skip-Step Diffusion Models

Yixuan Wang,Shuangyin Li

from arxiv, 11 pages

Diffusion models have emerged as powerful generative tools, rivaling GANs in sample quality and mirroring the likelihood scores of autoregressive models. A subset of these models, exemplified by DDIMs, exhibit an inherent asymmetry: they are trained over $T$ steps but only sample from a subset of $T$ during generation. This selective sampling approach, though optimized for speed, inadvertently misses out on vital information from the unsampled steps, leading to potential compromises in sample quality. To address this issue, we present the S$^{2}$-DMs, which is a new training method by using an innovative $L_{skip}$, meticulously designed to reintegrate the information omitted during the selective sampling phase. The benefits of this approach are manifold: it notably enhances sample quality, is exceptionally simple to implement, requires minimal code modifications, and is flexible enough to be compatible with various sampling algorithms. On the CIFAR10 dataset, models trained using our algorithm showed an improvement of 3.27% to 14.06% over models trained with traditional methods across various sampling algorithms (DDIMs, PNDMs, DEIS) and different numbers of sampling steps (10, 20, ..., 1000). On the CELEBA dataset, the improvement ranged from 8.97% to 27.08%. Access to the code and additional resources is provided in the github.

學成 · 未標記 · 小樣本學習 · 判別器 · 直推學習 ·

2019 年 3 月 25 日

f-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning

Yongqin Xian,Saurabh Sharma,Bernt Schiele,Zeynep Akata

from arxiv, Accepted at CVPR 2019

When labeled training data is scarce, a promising data augmentation approach is to generate visual features of unknown classes using their attributes. To learn the class conditional distribution of CNN features, these models rely on pairs of image features and class attributes. Hence, they can not make use of the abundance of unlabeled data samples. In this paper, we tackle any-shot learning problems i.e. zero-shot and few-shot, in a unified feature generating framework that operates in both inductive and transductive learning settings. We develop a conditional generative model that combines the strength of VAE and GANs and in addition, via an unconditional discriminator, learns the marginal feature distribution of unlabeled images. We empirically show that our model learns highly discriminative CNN features for five datasets, i.e. CUB, SUN, AWA and ImageNet, and establish a new state-of-the-art in any-shot learning, i.e. inductive and transductive (generalized) zero- and few-shot learning settings. We also demonstrate that our learned features are interpretable: we visualize them by inverting them back to the pixel space and we explain them by generating textual arguments of why they are associated with a certain label.