
We study case influence in Lasso regression using Cook's distance, which measures the overall change in the fitted values when one observation is deleted. Unlike in ordinary least squares regression, the estimated coefficients in the Lasso do not have a closed form due to the nondifferentiability of the $\ell_1$ penalty, and neither does Cook's distance. To find the case-deleted Lasso solution without refitting the model, we approach it from the full data solution by introducing a weight parameter ranging from 1 to 0 and generating a solution path indexed by this parameter. We show that the solution path is piecewise linear with respect to a simple function of the weight parameter under a fixed penalty. The resulting case influence is a function of the penalty and weight, and it becomes Cook's distance when the weight is 0. As the penalty parameter changes, the selected variables change, and the magnitude of Cook's distance for the same data point may vary with the subset of variables selected. In addition, we introduce a case influence graph to visualize how the contribution of each data point changes with the penalty parameter. From the graph, we can identify influential points at different penalty levels and make modeling decisions accordingly. Moreover, we find that case influence graphs exhibit different patterns between underfitting and overfitting phases, which can provide additional information for model selection.
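The idea of a case weight can be illustrated with a minimal sketch. It does not implement the piecewise-linear path algorithm described in the abstract: it simply refits a case-weighted Lasso by coordinate descent for a given weight and measures the change in fitted values. The function names, the absence of an intercept, the objective scaling, and the omitted normalizing constant are all assumptions made for illustration.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator that arises from the l1 penalty."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def weighted_lasso(X, y, weights, lam, n_iter=200):
    """Minimize 0.5 * sum_j w_j (y_j - x_j . beta)^2 + lam * ||beta||_1
    by cyclic coordinate descent (no intercept; an assumed simplification)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for k in range(p):
            rho, z = 0.0, 0.0
            for j in range(n):
                # Prediction for case j using every coordinate except k
                partial = sum(X[j][l] * beta[l] for l in range(p) if l != k)
                rho += weights[j] * X[j][k] * (y[j] - partial)
                z += weights[j] * X[j][k] ** 2
            beta[k] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return beta


def case_influence(X, y, i, lam, omega):
    """Sum of squared changes in all fitted values when case i gets
    weight omega instead of 1. omega = 0 corresponds to deleting the case.
    The normalizing constant of Cook's distance is deliberately omitted."""
    n = len(X)
    full = weighted_lasso(X, y, [1.0] * n, lam)
    w = [1.0] * n
    w[i] = omega
    reduced = weighted_lasso(X, y, w, lam)

    def fitted(beta):
        return [sum(xv * bv for xv, bv in zip(row, beta)) for row in X]

    return sum((a - b) ** 2 for a, b in zip(fitted(full), fitted(reduced)))
```

Evaluating `case_influence` over a grid of `lam` values for each case index would give a crude, refit-based analogue of the case influence graph mentioned in the abstract.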

Related content

INFORMS · CASE · Statistic · Functional · Information retrieval

July 14, 2024

Insecurity of Quantum Two-Party Computation with Applications to Cheat-Sensitive Protocols and Oblivious Transfer Reductions

Esther Hänggi,Severin Winkler

from arxiv, The main results are unchanged. We have added some explanations and corrected typos and a mistake in the calculation of the error terms of Theorems 3 and 4

Oblivious transfer (OT) is a fundamental primitive for secure two-party computation. It is well known that OT cannot be implemented with information-theoretic security if the two players only have access to noiseless communication channels, even in the quantum case. As a result, weaker variants of OT have been studied. In this work, we rigorously establish the impossibility of cheat-sensitive OT, where a dishonest party can cheat, but risks being detected. We construct a general attack on any quantum protocol that allows the receiver to compute all inputs of the sender and provide an explicit upper bound on the success probability of this attack. This implies that cheat-sensitive quantum Symmetric Private Information Retrieval cannot be implemented with statistical information-theoretic security. Leveraging the techniques devised for our proofs, we provide entropic bounds on primitives needed for secure function evaluation. They imply impossibility results for protocols where the players have access to OT as a resource. This result significantly improves upon existing bounds and yields tight bounds for reductions of 1-out-of-n OT to a resource primitive. Our results hold in particular for transformations between a finite number of primitives and for any error.

Estimation/Estimator · Inference · Identifiable · Extensibility · Comprehensibility

July 14, 2024

Semiparametric Efficient Inference for the Probability of Necessary and Sufficient Causation

Zhaoqing Tian,Peng Wu

Causal attribution, which aims to explain why events or behaviors occur, is crucial in causal inference and enhances our understanding of cause-and-effect relationships in scientific research. The probabilities of necessary causation (PN) and sufficient causation (PS) are two of the most common quantities for attribution in causal inference. While many works have explored the identification or bounds of PN and PS, efficient estimation remains unaddressed. To fill this gap, this paper focuses on obtaining semiparametric efficient estimators of PN and PS under two sets of identifiability assumptions: strong ignorability and monotonicity, and strong ignorability and conditional independence. We derive efficient influence functions and semiparametric efficiency bounds for PN and PS under the two sets of identifiability assumptions, respectively. Based on this, we propose efficient estimators for PN and PS, and show their large sample properties. Extensive simulations validate the superiority of our estimators compared to competing methods. We apply our methods to a real-world dataset to assess various risk factors affecting stroke.
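For orientation, the classical identification formulas for PN and PS under monotonicity can be written as a small plug-in calculation. This is a simplified sketch that ignores covariates (it assumes the treatment is exogenous) and is not the semiparametric efficient estimator proposed in the paper. The function name and argument names are hypothetical.

```python
def pn_ps(p_y_given_x, p_y_given_not_x):
    """Plug-in PN and PS under exogeneity and monotonicity.

    p_y_given_x     : P(Y = 1 | X = 1)
    p_y_given_not_x : P(Y = 1 | X = 0)

    PN = [P(y|x) - P(y|x')] / P(y|x)
    PS = [P(y|x) - P(y|x')] / [1 - P(y|x')]
    """
    if not (0.0 < p_y_given_x <= 1.0 and 0.0 <= p_y_given_not_x < 1.0):
        raise ValueError("probabilities out of range for these formulas")
    diff = p_y_given_x - p_y_given_not_x
    if diff < 0:
        raise ValueError("monotonicity implies P(y|x) >= P(y|x')")
    pn = diff / p_y_given_x
    ps = diff / (1.0 - p_y_given_not_x)
    return pn, ps
```

For example, with P(y|x) = 0.6 and P(y|x') = 0.2, the sketch gives PN of about 0.67 and PS of 0.5.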

Bayesian network · Networking · Performer · Analysis · MoDELS

July 12, 2024

Scalability of Bayesian Network Structure Elicitation with Large Language Models: a Novel Methodology and Comparative Analysis

Nikolay Babakov,Ehud Reiter,Alberto Bugarin

from arxiv, 27 pages

In this work, we propose a novel method for Bayesian Networks (BNs) structure elicitation that is based on the initialization of several LLMs with different experiences, independently querying them to create a structure of the BN, and further obtaining the final structure by majority voting. We compare the method with one alternative method on various widely and not widely known BNs of different sizes and study the scalability of both methods on them. We also propose an approach to check the contamination of BNs in LLM, which shows that some widely known BNs are inapplicable for testing the LLM usage for BNs structure elicitation. We also show that some BNs may be inapplicable for such experiments because their node names are indistinguishable. The experiments on the other BNs show that our method performs better than the existing method with one of the three studied LLMs; however, the performance of both methods significantly decreases with the increase in BN size.
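The aggregation step can be sketched as a majority vote over proposed edge sets. The sketch assumes each LLM's answer has already been parsed into a list of directed edges. The strict-majority threshold and the function name are assumptions, and no LLM is called here.

```python
from collections import Counter


def majority_vote_structure(proposals):
    """Keep each directed edge proposed by more than half of the models.

    proposals: list of edge lists, one per queried model; each edge is a
    (parent, child) tuple. Acyclicity of the result is NOT checked here.
    """
    counts = Counter(edge for edges in proposals for edge in set(edges))
    threshold = len(proposals) / 2
    return sorted(edge for edge, c in counts.items() if c > threshold)
```

Because edges are voted on independently, the result may contain cycles, so a real pipeline would need an additional consistency check.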

Vision · Computer vision · Reducible · state-of-the-art · Correlation coefficient

July 12, 2024

Leveraging Computer Vision in the Intensive Care Unit (ICU) for Examining Visitation and Mobility

Scott Siegel,Jiaqing Zhang,Sabyasachi Bandyopadhyay,Subhash Nerella,Brandon Silva,Tezcan Baslanti,Azra Bihorac,Parisa Rashidi

Despite the importance of closely monitoring patients in the Intensive Care Unit (ICU), many aspects are still assessed in a limited manner due to the time constraints imposed on healthcare providers. For example, although excessive visitations during rest hours can potentially exacerbate the risk of circadian rhythm disruption and delirium, visitation is not captured in the ICU. Likewise, while mobility can be an important indicator of recovery or deterioration in ICU patients, it is only captured sporadically or not captured at all. In the past few years, the computer vision field has found application in many domains by reducing the human burden. Using computer vision systems in the ICU can also potentially enable non-existing assessments or enhance the frequency and accuracy of existing assessments while reducing the staff workload. In this study, we leverage a state-of-the-art noninvasive computer vision system based on depth imaging to characterize ICU visitations and patients' mobility. We then examine the relationship between visitation and several patient outcomes, such as pain, acuity, and delirium. We found an association between deteriorating patient acuity and the incidence of delirium with increased visitations. In contrast, self-reported pain, reported using the Defense and Veteran Pain Rating Scale (DVPRS), was correlated with decreased visitations. Our findings highlight the feasibility and potential of using noninvasive autonomous systems to monitor ICU patients.

Normalized · Learning · Score · Estimation/Estimator · Score matching

July 12, 2024

Learning Distances from Data with Normalizing Flows and Score Matching

Peter Sorrenson,Daniel Behrend-Uriarte,Christoph Schnörr,Ullrich Köthe

Density-based distances (DBDs) offer an elegant solution to the problem of metric learning. By defining a Riemannian metric which increases with decreasing probability density, shortest paths naturally follow the data manifold and points are clustered according to the modes of the data. We show that existing methods to estimate Fermat distances, a particular choice of DBD, suffer from poor convergence in both low and high dimensions due to i) inaccurate density estimates and ii) reliance on graph-based paths which are increasingly rough in high dimensions. To address these issues, we propose learning the densities using a normalizing flow, a generative model with tractable density estimation, and employing a smooth relaxation method using a score model initialized from a graph-based proposal. Additionally, we introduce a dimension-adapted Fermat distance that exhibits more intuitive behavior when scaled to high dimensions and offers better numerical properties. Our work paves the way for practical use of density-based distances, especially in high-dimensional spaces.
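The graph-based baseline that the abstract criticizes can be sketched as a shortest-path computation in which each edge costs the Euclidean length raised to a power. This is an illustration of the sample Fermat distance only, not the normalizing-flow or score-based method the authors propose. The complete graph, the exponent `alpha`, and the function name are assumptions.

```python
import heapq
import math


def sample_fermat_distance(points, source, target, alpha=2.0):
    """Shortest-path cost between two sample points on the complete graph,
    where an edge (u, v) costs ||x_u - x_v|| ** alpha (Dijkstra's algorithm).

    With alpha > 1, long jumps are penalized, so optimal paths prefer many
    short hops through densely sampled regions.
    """
    n = len(points)
    dist = [math.inf] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        if u == target:
            return d
        for v in range(n):
            if v == u:
                continue
            w = math.dist(points[u], points[v]) ** alpha
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist[target]
```

On four equally spaced points on a line with `alpha=2`, hopping through the intermediate points costs 1 + 1 + 1 = 3, whereas the direct jump would cost 9, which shows how the path is pulled toward the data.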

Agent · Multimodal · Paper · INTERACT · Language modeling

July 12, 2024

Security Matrix for Multimodal Agents on Mobile Devices: A Systematic and Proof of Concept Study

Yulong Yang,Xinshan Yang,Shuaidong Li,Chenhao Lin,Zhengyu Zhao,Chao Shen,Tianwei Zhang

from arxiv, Preprint. Work in progress

The rapid progress in the reasoning capability of the Multi-modal Large Language Models (MLLMs) has triggered the development of autonomous agent systems on mobile devices. MLLM-based mobile agent systems consist of perception, reasoning, memory, and multi-agent collaboration modules, enabling automatic analysis of user instructions and the design of task pipelines with only natural language and device screenshots as inputs. Despite the increased human-machine interaction efficiency, the security risks of MLLM-based mobile agent systems have not been systematically studied. Existing security benchmarks for agents mainly focus on Web scenarios, and the attack techniques against MLLMs are also limited in the mobile agent scenario. To close these gaps, this paper proposes a mobile agent security matrix covering 3 functional modules of the agent systems. Based on the security matrix, this paper proposes 4 realistic attack paths and verifies these attack paths through 8 attack methods. By analyzing the attack results, this paper reveals that MLLM-based mobile agent systems are not only vulnerable to multiple traditional attacks, but also raise new security concerns previously unconsidered. This paper highlights the need for security awareness in the design of MLLM-based systems and paves the way for future research on attacks and defense methods.

State space · INFORMS · IMDB · Learning · Representation

July 12, 2024

Exploring State Space and Reasoning by Elimination in Tsetlin Machine

Ahmed K. Kadhim,Ole-Christoffer Granmo,Lei Jiao,Rishad Shafik

from arxiv, 8 pages, 8 figures

The Tsetlin Machine (TM) has gained significant attention in Machine Learning (ML). By employing logical fundamentals, it facilitates pattern learning and representation, offering an alternative approach for developing comprehensible Artificial Intelligence (AI) with a specific focus on pattern classification in the form of conjunctive clauses. In the domain of Natural Language Processing (NLP), TM is utilised to construct word embedding and describe target words using clauses. To enhance the descriptive capacity of these clauses, we study the concept of Reasoning by Elimination (RbE) in clauses' formulation, which involves incorporating feature negations to provide a more comprehensive representation. In more detail, this paper employs the Tsetlin Machine Auto-Encoder (TM-AE) architecture to generate dense word vectors, aiming at capturing contextual information by extracting feature-dense vectors for a given vocabulary. Thereafter, the principle of RbE is explored to improve descriptivity and optimise the performance of the TM. Specifically, the specificity parameter s and the voting margin parameter T are leveraged to regulate feature distribution in the state space, resulting in a dense representation of information for each clause. In addition, we investigate the state spaces of TM-AE, especially for the forgotten/excluded features. Empirical investigations on artificially generated data, the IMDB dataset, and the 20 Newsgroups dataset showcase the robustness of the TM, with accuracy reaching 90.62\% for the IMDB.
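To make the role of feature negations concrete, the following minimal sketch evaluates a single conjunctive clause over binary features, where some literals require a feature to be present and others require it to be absent. It illustrates clause evaluation only and does not model TM learning, the TM-AE architecture, or the parameters s and T. The function name and the example words are hypothetical.

```python
def clause_output(features, include_pos, include_neg):
    """Evaluate one conjunctive clause over binary features.

    features    : dict mapping feature name -> 0 or 1
    include_pos : features that must be 1 (plain literals)
    include_neg : features that must be 0 (negated literals)

    Returns 1 if every included literal is satisfied, else 0.
    """
    positives_ok = all(features[name] == 1 for name in include_pos)
    negatives_ok = all(features[name] == 0 for name in include_neg)
    return int(positives_ok and negatives_ok)
```

For instance, a clause with the literal `good` and the negated literal `not` fires on a document containing "good" but not "not"; the negated literal describes the target by what it excludes.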

Performer · Functional · Discretization · Precision/Accuracy · Color

July 12, 2024

Finite Blocklength Performance of Capacity-achieving Codes in the Light of Complexity Theory

Holger Boche,Andrea Grigorescu,Rafael F. Schaefer,H. Vincent Poor

from arxiv, The results were presented at ISIT 2024 in the recent result session. The ISIT 2024 poster for the extended abstract is attached to the paper

Since the work of Polyanskiy, Poor and Verd\'u on the finite blocklength performance of capacity-achieving codes for discrete memoryless channels, many papers have attempted to find further results for more practically relevant channels. However, it seems that the complexity of computing capacity-achieving codes has not been investigated until now. We study this question for the simplest non-trivial Gaussian channels, i.e., the additive colored Gaussian noise channel. To assess the computational complexity, we consider the classes $\mathrm{FP}_1$ and $\#\mathrm{P}_1$. $\mathrm{FP}_1$ includes functions computable by a deterministic Turing machine in polynomial time, whereas $\#\mathrm{P}_1$ encompasses functions that count the number of solutions verifiable in polynomial time. It is widely assumed that $\mathrm{FP}_1\neq\#\mathrm{P}_1$. It is of interest to determine the conditions under which, for a given $M \in \mathbb{N}$, where $M$ describes the precision of the deviation of $C(P,N)$, for a certain blocklength $n_M$ and a decoding error $\epsilon > 0$ with $\epsilon\in\mathbb{Q}$, the following holds: $R_{n_M}(\epsilon)>C(P,N)-\frac{1}{2^M}$. It is shown that there is a polynomial-time computable $N_*$ such that for sufficiently large $P_*\in\mathbb{Q}$, the sequences $\{R_{n_M}(\epsilon)\}_{{n_M}\in\mathbb{N}}$, where each $R_{n_M}(\epsilon)$ satisfies the previous condition, cannot be computed in polynomial time if $\mathrm{FP}_1\neq\#\mathrm{P}_1$. Hence, the complexity of computing the sequence $\{R_{n_M}(\epsilon)\}_{n_M\in\mathbb{N}}$ grows faster than any polynomial as $M$ increases. Consequently, it is shown that either the sequence of achievable rates $\{R_{n_M}(\epsilon)\}_{n_M\in\mathbb{N}}$ as a function of the blocklength, or the sequence of blocklengths $\{n_M\}_{M\in\mathbb{N}}$ corresponding to the achievable rates, is not a polynomial-time computable sequence.

Statistic · Output · Processing (programming language) · Functional · Probability density function

July 12, 2024

Exploring the Statistical Properties of Outputs from a Process Inspired by Geometrical Interpretation of Newton's Method

Taki Kirouani

In this paper, the statistical properties of the output of Newton's method in a specific case have been studied. The relative frequency density of this sample converges to a well-defined function, prompting us to explore its distribution. Through rigorous mathematical proof, we demonstrate that the probability density function follows a Cauchy distribution. Additionally, a new method to generate a uniform distribution is proposed. To further confirm our findings, we employed statistical tests, including the Kolmogorov-Smirnov test and Anderson-Darling test, which showed high p-values. Furthermore, we show that the distribution of the distance between two successive outputs can be obtained through a transformation method applied to the Cauchy distribution.
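The abstract does not say which specific case is studied. As an assumption for illustration, the sketch below uses Newton's method on f(x) = x^2 + 1, which has no real root and for which the standard Cauchy distribution is known to be invariant under the iteration. The function names are hypothetical, and the mapping through the Cauchy CDF is only one plausible reading of how uniform values could be obtained.

```python
import math


def newton_step(x):
    """One Newton iteration for f(x) = x**2 + 1 (an assumed example):
    x - f(x)/f'(x) = (x - 1/x) / 2."""
    return 0.5 * (x - 1.0 / x)


def cauchy_cdf(x):
    """CDF of the standard Cauchy distribution."""
    return 0.5 + math.atan(x) / math.pi


def newton_orbit(x0, n_steps):
    """Collect successive Newton outputs starting from x0."""
    xs = []
    x = x0
    for _ in range(n_steps):
        x = newton_step(x)
        xs.append(x)
    return xs
```

Writing x = cot(theta), the step becomes cot(2 * theta), so the iteration acts as angle doubling. If the outputs are Cauchy distributed, passing them through `cauchy_cdf` would yield values that are uniform on (0, 1).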

Gamma distribution · Projection · Pair · Analysis · Approximation

July 12, 2024

An Analysis of the Johnson-Lindenstrauss Lemma with the Bivariate Gamma Distribution

Jason Bernstein,Alec M. Dunton,Benjamin W. Priest

from arxiv, 20 pages, 5 figures. This revision improves figure 2 and format of citations

Probabilistic proofs of the Johnson-Lindenstrauss lemma imply that random projection can reduce the dimension of a data set and approximately preserve pairwise distances. If a distance being approximately preserved is called a success, and the complement of this event is called a failure, then such a random projection likely results in no failures. Assuming a Gaussian random projection, the lemma is proved by showing that the no-failure probability is positive using a combination of Bonferroni's inequality and Markov's inequality. This paper modifies this proof in two ways to obtain a greater lower bound on the no-failure probability. First, Bonferroni's inequality is applied to pairs of failures instead of individual failures. Second, since a pair of projection errors has a bivariate gamma distribution, the probability of a pair of successes is bounded using an inequality from Jensen (1969). If $n$ is the number of points to be embedded and $\mu$ is the probability of a success, then this leads to an increase in the lower bound on the no-failure probability of $\frac{1}{2}\binom{n}{2}(1-\mu)^2$ if $\binom{n}{2}$ is even and $\frac{1}{2}\left(\binom{n}{2}-1\right)(1-\mu)^2$ if $\binom{n}{2}$ is odd. For example, if $n=10^5$ points are to be embedded in $k=10^4$ dimensions with a tolerance of $\epsilon=0.1$, then the improvement in the lower bound is on the order of $10^{-14}$. We also show that further improvement is possible if the inequality in Jensen (1969) extends to three successes, though we do not have a proof of this result.
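The stated increase in the lower bound can be evaluated directly from the formula in the abstract. The sketch below takes the per-pair success probability `mu` as an input rather than deriving it from `k` and `epsilon`, since the abstract does not give that relationship. The function name is hypothetical.

```python
import math


def lower_bound_improvement(n, mu):
    """Increase in the lower bound on the no-failure probability:
    0.5 * C(n, 2) * (1 - mu)**2        if C(n, 2) is even,
    0.5 * (C(n, 2) - 1) * (1 - mu)**2  if C(n, 2) is odd.

    n  : number of points to embed
    mu : probability that a single pairwise distance is preserved
    """
    pairs = math.comb(n, 2)
    if pairs % 2 == 1:
        pairs -= 1  # odd case: one failure cannot be paired up
    return 0.5 * pairs * (1.0 - mu) ** 2
```

For `n = 4` and `mu = 0.9` there are 6 pairs, giving 0.5 * 6 * 0.01 = 0.03. For `n = 3` there are 3 pairs, so the odd case applies and the value is 0.5 * 2 * 0.01 = 0.01.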