The Rashomon set is the set of models that perform approximately equally well on a given dataset, and the Rashomon ratio is the fraction of all models in a given hypothesis space that are in the Rashomon set. Rashomon ratios are often large for tabular datasets in criminal justice, healthcare, lending, education, and in other areas, which has practical implications about whether simpler models can attain the same level of accuracy as more complex models. An open question is why Rashomon ratios often tend to be large. In this work, we propose and study a mechanism of the data generation process, coupled with choices usually made by the analyst during the learning process, that determines the size of the Rashomon ratio. Specifically, we demonstrate that noisier datasets lead to larger Rashomon ratios through the way that practitioners train models. Additionally, we introduce a measure called pattern diversity, which captures the average difference in predictions between distinct classification patterns in the Rashomon set, and motivate why it tends to increase with label noise. Our results explain a key aspect of why simpler models often tend to perform as well as black box models on complex, noisier datasets.
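The two definitions above can be made concrete with a small sketch. It assumes a finite hypothesis space (here, hypothetical one-dimensional threshold classifiers), 0-1 loss, and a tolerance `epsilon` that decides what counts as "approximately equally well". All names and the toy data are illustrative and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[int, int]      # (feature, binary label)
Model = Callable[[int], int]   # maps a feature to a predicted label


def zero_one_loss(model: Model, data: Sequence[Example]) -> float:
    """Fraction of examples the model misclassifies."""
    return sum(1 for x, y in data if model(x) != y) / len(data)


def rashomon_ratio(models: Sequence[Model], data: Sequence[Example],
                   epsilon: float) -> float:
    """Share of models whose loss is within epsilon of the best loss."""
    losses = [zero_one_loss(m, data) for m in models]
    best = min(losses)
    in_set = sum(1 for loss in losses if loss <= best + epsilon)
    return in_set / len(models)


def threshold_model(t: int) -> Model:
    """Hypothetical classifier: predict 1 when the feature is at least t."""
    return lambda x: 1 if x >= t else 0


# Toy data (assumed for illustration): labels switch from 0 to 1 between 4 and 6.
data: List[Example] = [(1, 0), (2, 0), (4, 0), (6, 1), (8, 1), (9, 1)]
models = [threshold_model(t) for t in range(11)]  # thresholds 0, 1, ..., 10

tight = rashomon_ratio(models, data, epsilon=0.0)
loose = rashomon_ratio(models, data, epsilon=0.2)
print(tight, loose)
```

With these toy inputs, 2 of the 11 thresholds reach the minimum loss, and 6 of the 11 fall within a tolerance of 0.2, so widening the tolerance enlarges the Rashomon set as expected.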

Related content

MoDELS

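The ACM/IEEE 23rd International Conference on Model Driven Engineering Languages and Systems (MODELS) is the premier conference series for model-driven software and systems engineering, organized with the support of ACM SIGSOFT and IEEE TCSE. Since 1998, MODELS has covered all aspects of modeling, from languages and methods to tools and applications. Attendees of MODELS come from diverse backgrounds, including researchers, academics, engineers, and industry professionals. MODELS 2019 is a forum where participants can exchange cutting-edge research results and innovative practical experience around modeling and model-driven software and systems. This year's edition will give the modeling community an opportunity to further advance the foundations of modeling and to present innovative applications of modeling in emerging areas such as cyber-physical systems, embedded systems, socio-technical systems, cloud computing, big data, machine learning, security, open source, and sustainability. Official website:

Quantum computing · Compilers · Reducible · dynamic programming ·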

December 18, 2023

Minimizing Photonic Cluster State Depth in Measurement-Based Quantum Computing

Yingheng Li, Aditya Pawar, Zewei Mo, Youtao Zhang, Jun Yang, Xulong Tang

Measurement-based quantum computing (MBQC) is a promising quantum computing paradigm that performs computation through "one-way" measurements on entangled qubits. It is widely used in photonic quantum computing (PQC), where the computation is carried out on photonic cluster states (i.e., a 2-D mesh of entangled photons). In MBQC-based PQC, the cluster state depth (i.e., the length of one-way measurements) to execute a quantum circuit plays an important role in the overall execution time and error. Thus, it is important to reduce the cluster state depth. In this paper, we propose FMCC, a compilation framework that employs dynamic programming to efficiently minimize the cluster state depth. Experimental results on five representative quantum algorithms show that FMCC achieves 53.6%, 60.6%, and 60.0% average depth reductions in small, medium, and large qubit counts compared to the state-of-the-art MBQC compilation frameworks.

INTERACT · Learning · Constraints · Learning to learn · MoDELS ·

December 17, 2023

Learning to Learn in Interactive Constraint Acquisition

Dimos Tsouros, Senne Berden, Tias Guns

from arxiv, Accepted in AAAI

Constraint Programming (CP) has been successfully used to model and solve complex combinatorial problems. However, modeling is often not trivial and requires expertise, which is a bottleneck to wider adoption. In Constraint Acquisition (CA), the goal is to assist the user by automatically learning the model. In (inter)active CA, this is done by interactively posting queries to the user, e.g., asking whether a partial solution satisfies their (unspecified) constraints or not. While interactive CA methods learn the constraints, the learning is related to symbolic concept learning, as the goal is to learn an exact representation. However, a large number of queries is still required to learn the model, which is a major limitation. In this paper, we aim to alleviate this limitation by tightening the connection of CA and Machine Learning (ML), by, for the first time in interactive CA, exploiting statistical ML methods. We propose to use probabilistic classification models to guide interactive CA to generate more promising queries. We discuss how to train classifiers to predict whether a candidate expression from the bias is a constraint of the problem or not, using both relation-based and scope-based features. We then show how the predictions can be used in all layers of interactive CA: the query generation, the scope finding, and the lowest-level constraint finding. We experimentally evaluate our proposed methods using different classifiers and show that our methods greatly outperform the state of the art, decreasing the number of queries needed to converge by up to 72%.

Topic models · MoDELS · Topics · Large language models · Language modeling ·

December 15, 2023

Prompting Large Language Models for Topic Modeling

Han Wang, Nirmalendu Prakash, Nguyen Khoi Hoang, Ming Shan Hee, Usman Naseem, Roy Ka-Wei Lee

from arxiv, 6 pages, 3 figures, IEEE International Conference on Big Data

Topic modeling is a widely used technique for revealing underlying thematic structures within textual data. However, existing models have certain limitations, particularly when dealing with short text datasets that lack co-occurring words. Moreover, these models often neglect sentence-level semantics, focusing primarily on token-level semantics. In this paper, we propose PromptTopic, a novel topic modeling approach that harnesses the advanced language understanding of large language models (LLMs) to address these challenges. It involves extracting topics at the sentence level from individual documents, then aggregating and condensing these topics into a predefined quantity, ultimately providing coherent topics for texts of varying lengths. This approach eliminates the need for manual parameter tuning and improves the quality of extracted topics. We benchmark PromptTopic against the state-of-the-art baselines on three vastly diverse datasets, establishing its proficiency in discovering meaningful topics. Furthermore, qualitative analysis showcases PromptTopic's ability to uncover relevant topics in multiple datasets.

Subspace · state-of-the-art · Samples · Linear · Decomposition ·

December 15, 2023

Decomposed Diffusion Sampler for Accelerating Large-Scale Inverse Problems

Hyungjin Chung, Suhyeon Lee, Jong Chul Ye

from arxiv, 28 pages, 9 figures

The Krylov subspace, which is generated by multiplying a given vector by the matrix of a linear transformation and its successive powers, has been extensively studied in classical optimization literature to design algorithms that converge quickly for large linear inverse problems. For example, the conjugate gradient method (CG), one of the most popular Krylov subspace methods, is based on the idea of minimizing the residual error in the Krylov subspace. However, with the recent advancement of high-performance diffusion solvers for inverse problems, it is not clear how classical wisdom can be synergistically combined with modern diffusion models. In this study, we propose a novel and efficient diffusion sampling strategy that synergistically combines diffusion sampling and Krylov subspace methods. Specifically, we prove that if the tangent space at a denoised sample by Tweedie's formula forms a Krylov subspace, then the CG initialized with the denoised data ensures the data consistency update to remain in the tangent space. This negates the need to compute the manifold-constrained gradient (MCG), leading to a more efficient diffusion sampling method. Our method is applicable regardless of the parametrization and setting (i.e., VE, VP). Notably, we achieve state-of-the-art reconstruction quality on challenging real-world medical inverse imaging problems, including multi-coil MRI reconstruction and 3D CT reconstruction. Moreover, our proposed method achieves more than 80 times faster inference time than the previous state-of-the-art method.
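For readers unfamiliar with the conjugate gradient method mentioned in this abstract, the following is a minimal textbook sketch for a symmetric positive definite system Ax = b, written in plain Python. It is not the paper's diffusion sampler; the function names and the 2x2 example system are assumptions made for illustration. After k steps the iterate lies in x0 plus the Krylov subspace spanned by r0, A r0, ..., A^(k-1) r0, where r0 = b - A x0.

```python
from typing import List, Optional

Vector = List[float]
Matrix = List[List[float]]


def matvec(a: Matrix, v: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def dot(u: Vector, v: Vector) -> float:
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient(a: Matrix, b: Vector, x0: Vector,
                       tol: float = 1e-10,
                       max_iter: Optional[int] = None) -> Vector:
    """Solve a x = b for symmetric positive definite a, starting from x0."""
    x = list(x0)
    r = [bi - axi for bi, axi in zip(b, matvec(a, x))]  # initial residual
    p = list(r)                                         # first search direction
    rs_old = dot(r, r)
    steps = max_iter if max_iter is not None else 10 * len(b)
    for _ in range(steps):
        if rs_old ** 0.5 < tol:        # residual norm small enough: stop
            break
        ap = matvec(a, p)
        alpha = rs_old / dot(p, ap)    # exact line search along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        # New direction is conjugate to the previous ones with respect to a.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Illustrative system (assumed): the exact solution is [1/11, 7/11].
solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [0.0, 0.0])
print(solution)
```

The initial guess `x0` matters in the setting the abstract describes, where CG is initialized with the denoised data; here it is simply the zero vector.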

Distillation · Datasets · Origin · INFORMS · state-of-the-art ·

December 14, 2023

Dataset Distillation via Adversarial Prediction Matching

Mingyang Chen, Bo Huang, Junda Lu, Bing Li, Yi Wang, Minhao Cheng, Wei Wang

Dataset distillation is the technique of synthesizing smaller condensed datasets from large original datasets while retaining necessary information to persist the effect. In this paper, we approach the dataset distillation problem from a novel perspective: we regard minimizing the prediction discrepancy on the real data distribution between models, which are respectively trained on the large original dataset and on the small distilled dataset, as a conduit for condensing information from the raw data into the distilled version. An adversarial framework is proposed to solve the problem efficiently. In contrast to existing distillation methods involving nested optimization or long-range gradient unrolling, our approach hinges on single-level optimization. This ensures the memory efficiency of our method and provides a flexible tradeoff between time and memory budgets, allowing us to distil ImageNet-1K using a minimum of only 6.5GB of GPU memory. Under the optimal tradeoff strategy, it requires only 2.5$\times$ less memory and 5$\times$ less runtime compared to the state-of-the-art. Empirically, our method can produce synthetic datasets just 10% the size of the original, yet achieve, on average, 94% of the test accuracy of models trained on the full original datasets including ImageNet-1K, significantly surpassing state-of-the-art. Additionally, extensive tests reveal that our distilled datasets excel in cross-architecture generalization capabilities.

Approximation · Linear · state-of-the-art · Constraints · Sampling methods ·

December 14, 2023

Approximate Integer Solution Counts over Linear Arithmetic Constraints

Cunjing Ge

Counting integer solutions of linear constraints has found interesting applications in various fields. It is equivalent to the problem of counting lattice points inside a polytope. However, state-of-the-art algorithms for this problem become too slow for even a modest number of variables. In this paper, we propose a new framework to approximate the lattice counts inside a polytope with a new random-walk sampling method. The counts computed by our approach have been proved to be approximately bounded by an $(\epsilon, \delta)$-bound. Experiments on extensive benchmarks show that our algorithm can solve polytopes with dozens of dimensions, significantly outperforming state-of-the-art counters.
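To make the counting problem concrete, the sketch below counts integer points satisfying Ax <= b in two ways: exact enumeration over a bounding box, and a naive Monte Carlo estimate that samples box points uniformly. This is not the random-walk method of the paper and carries no $(\epsilon, \delta)$ guarantee; the function names, the bounding box, and the example constraints are assumptions chosen for illustration.

```python
import itertools
import random
from typing import List, Sequence, Tuple

Box = Sequence[Tuple[int, int]]  # inclusive (low, high) bounds per variable


def satisfies(point: Sequence[int], a: Sequence[Sequence[int]],
              b: Sequence[int]) -> bool:
    """True when every linear constraint row . point <= bound holds."""
    return all(sum(aij * xj for aij, xj in zip(row, point)) <= bi
               for row, bi in zip(a, b))


def exact_count(a: Sequence[Sequence[int]], b: Sequence[int], box: Box) -> int:
    """Enumerate every lattice point in the box; exponential in dimension."""
    ranges = [range(lo, hi + 1) for lo, hi in box]
    return sum(1 for p in itertools.product(*ranges) if satisfies(p, a, b))


def sampled_count(a: Sequence[Sequence[int]], b: Sequence[int], box: Box,
                  n_samples: int, seed: int = 0) -> float:
    """Estimate the count as (box size) * (fraction of samples that hit)."""
    rng = random.Random(seed)
    box_size = 1
    for lo, hi in box:
        box_size *= hi - lo + 1
    hits = 0
    for _ in range(n_samples):
        point = [rng.randint(lo, hi) for lo, hi in box]
        if satisfies(point, a, b):
            hits += 1
    return box_size * hits / n_samples


# Illustrative polytope (assumed): x >= 0, y >= 0, x + y <= 4.
A: List[List[int]] = [[-1, 0], [0, -1], [1, 1]]
b_vec: List[int] = [0, 0, 4]
box: Box = [(0, 4), (0, 4)]

exact = exact_count(A, b_vec, box)
estimate = sampled_count(A, b_vec, box, n_samples=20000)
print(exact, estimate)
```

Enumeration scales exponentially with the number of variables, and the hit rate of box sampling typically collapses as the dimension grows, which is the kind of limitation that motivates more careful sampling schemes.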

Weight · CASE · Scenarios · ILP · SimPLe ·

December 13, 2023

Covering Rectilinear Polygons with Area-Weighted Rectangles

Kathrin Hanauer, Martin P. Seybold, Julian Unterweger

from arxiv, Accepted to ALENEX 2024

Representing a polygon using a set of simple shapes has numerous applications in different use-case scenarios. We consider the problem of covering the interior of a rectilinear polygon with holes by a set of area-weighted, axis-aligned rectangles such that the total weight of the rectangles in the cover is minimized. Already the unit-weight case is known to be NP-hard and the general problem has, to the best of our knowledge, not been studied experimentally before. We show a new basic property of optimal solutions of the weighted problem. This allows us to speed up existing algorithms for the unit-weight case, obtain an improved ILP formulation for both the weighted and unweighted problem, and develop several approximation algorithms and heuristics for the weighted case. All our algorithms are evaluated in a large experimental study on 186 837 polygons combined with six cost functions, which provides evidence that our algorithms are both fast and yield close-to-optimal solutions in practice.

Likelihood · Maximum likelihood · Estimation/Estimator · MoDELS · Confidence ·

December 13, 2023

Likelihood Based Inference for ARMA Models

Jesse Wheeler, Edward L. Ionides

from arxiv, 39 pages including citations and supplement, 25 figures, 7 tables. Submitted to the R Journal. The developmental version of the R package used in this paper is available at the following GitHub repository: :jeswheel/arima2.git

Autoregressive moving average (ARMA) models are frequently used to analyze time series data. Despite the popularity of these models, algorithms for fitting ARMA models have weaknesses that are not well known. We provide a summary of parameter estimation via maximum likelihood and discuss common pitfalls that may lead to sub-optimal parameter estimates. We propose a random restart algorithm for parameter estimation that frequently yields higher likelihoods than traditional maximum likelihood estimation procedures. We then investigate the parameter uncertainty of maximum likelihood estimates, and propose the use of profile confidence intervals as a superior alternative to intervals derived from the Fisher's information matrix. Through a series of simulation studies, we demonstrate the efficacy of our proposed algorithm and the improved nominal coverage of profile confidence intervals compared to the normal approximation based on Fisher's Information.

GPU · Neural Networks · MoDELS · Universal approximators · Graphs ·

September 9, 2021

Relating Graph Neural Networks to Structural Causal Models

Matej Zečević, Devendra Singh Dhami, Petar Veličković, Kristian Kersting

from arxiv, Main paper: 7 pages, References: 2 pages, Appendix: 10 pages; Main paper: 5 figures, Appendix: 3 figures

Causality can be described in terms of a structural causal model (SCM) that carries information on the variables of interest and their mechanistic relations. For most processes of interest the underlying SCM will only be partially observable, thus causal inference tries to leverage any exposed information. Graph neural networks (GNN) as universal approximators on structured input pose a viable candidate for causal learning, suggesting a tighter integration with SCM. To this effect we present a theoretical analysis from first principles that establishes a novel connection between GNN and SCM while providing an extended view on general neural-causal models. We then establish a new model class for GNN-based causal inference that is necessary and sufficient for causal effect identification. Our empirical illustration on simulations and standard benchmarks validates our theoretical proofs.

Long short-term memory networks · Named entity recognition · MoDELS · Better · Gating ·

May 15, 2018

Chinese NER Using Lattice LSTM

Yue Zhang, Jie Yang

from arxiv, Accepted at ACL 2018 as Long paper

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.