
With the precipitous decline in response rates, researchers and pollsters have been left with highly non-representative samples, relying on constructed weights to make these samples representative of the desired target population. Though practitioners employ valuable expert knowledge to choose which variables, $X$, must be adjusted for, they rarely defend particular functional forms relating these variables to the response process or the outcome. Unfortunately, commonly used calibration weights -- which make the weighted mean of $X$ in the sample equal that of the population -- only ensure correct adjustment when the portions of the outcome and of the response process left unexplained by linear functions of $X$ are independent. To alleviate this functional form dependency, we describe kernel balancing for population weighting (kpop). This approach replaces the design matrix $\mathbf{X}$ with a kernel matrix, $\mathbf{K}$, encoding high-order information about $\mathbf{X}$. Weights are then found to make the weighted average row of $\mathbf{K}$ among sampled units approximately equal to that of the target population. This produces good calibration on a wide range of smooth functions of $X$, without relying on the user to decide which $X$ or what functions of them to include. We describe the method and illustrate it by application to polling data from the 2016 U.S. presidential election.
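
A minimal numerical sketch of the idea, not the authors' kpop implementation: it assumes a Gaussian kernel, uses the hypothetical helpers gaussian_kernel and kernel_balance, and looks for nonnegative weights summing to one whose weighted average kernel row among sampled units matches the population's average row.

```python
import numpy as np
from scipy.optimize import minimize

def gaussian_kernel(A, B, bandwidth):
    # pairwise Gaussian kernel between rows of A and rows of B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * bandwidth ** 2))

def kernel_balance(X_sample, X_pop, bandwidth=1.0):
    # evaluate kernel rows for sample and population against the same set of units
    basis = np.vstack([X_sample, X_pop])
    K_s = gaussian_kernel(X_sample, basis, bandwidth)
    K_p = gaussian_kernel(X_pop, basis, bandwidth)
    target = K_p.mean(axis=0)                       # average kernel row in the population

    n = X_sample.shape[0]
    def imbalance(w):                               # distance between weighted sample row and target
        return ((w @ K_s - target) ** 2).sum()

    res = minimize(imbalance, np.full(n, 1.0 / n),
                   bounds=[(0.0, None)] * n,
                   constraints=({"type": "eq", "fun": lambda w: w.sum() - 1.0},),
                   method="SLSQP")
    return res.x                                    # calibration weights for sampled units
```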

Related content

Generalized singular values (GSVs) play an essential role in comparative analysis. In real-world data for comparative analysis, both data matrices are usually numerically low-rank. This paper proposes a randomized algorithm that first approximately extracts bases and then calculates GSVs efficiently. The accuracy of both the basis extraction and the comparative-analysis quantities -- angular distances, generalized fractions of eigenexpression, and generalized normalized Shannon entropy -- is rigorously analyzed. The proposed algorithm is applied to both synthetic data sets and genome-scale expression data sets. Compared with other GSV algorithms, the proposed algorithm achieves the fastest runtime while preserving sufficient accuracy in comparative analysis.
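
The following is a generic randomized sketch-and-solve illustration of the basic idea, not the paper's algorithm or its error analysis. It assumes the two matrices share their column dimension and that the reduced matrix B_r has full column rank, and it uses the standard identification of squared GSVs with the generalized eigenvalues of the pair (A^T A, B^T B); randomized_right_basis and approx_gsv are illustrative names.

```python
import numpy as np
from scipy.linalg import qr, eigh

def randomized_right_basis(M, rank, oversample=10, seed=0):
    # approximate an orthonormal basis for the dominant row space of M
    rng = np.random.default_rng(seed)
    Y = M.T @ rng.standard_normal((M.shape[0], rank + oversample))
    Q, _ = qr(Y, mode="economic")
    return Q[:, :rank]

def approx_gsv(A, B, rank):
    # shared right basis from the stacked matrix, then a small generalized
    # eigenproblem: A^T A v = sigma^2 B^T B v gives the squared GSVs
    V = randomized_right_basis(np.vstack([A, B]), rank)
    A_r, B_r = A @ V, B @ V
    evals = eigh(A_r.T @ A_r, B_r.T @ B_r, eigvals_only=True)
    return np.sqrt(np.clip(evals, 0.0, None))[::-1]     # largest first
```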

We investigate one- and two-sample mean tests for high-dimensional compositional data when the number of variables is comparable with the sample size, as commonly encountered in microbiome research. Existing methods mainly focus on max-type test statistics, which are suitable for detecting sparse signals. In this paper, we introduce a novel approach using sum-type test statistics, which are capable of detecting weak but dense signals. By establishing the asymptotic independence between the max-type and sum-type test statistics, we further propose a combined max-sum type test to cover both cases. We derive the asymptotic null distributions and power functions for these test statistics. Simulation studies demonstrate the superiority of our max-sum type test statistics, which exhibit robust performance regardless of data sparsity.
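
As an illustration only (the paper derives asymptotic null distributions, whereas this sketch calibrates both statistics by permutation), one could compute max-type and sum-type statistics on centered log-ratio transformed data and combine the two asymptotically independent tests with a minimum-p rule; clr, two_sample_stats, and max_sum_test are hypothetical helper names.

```python
import numpy as np

def clr(X):
    # centered log-ratio transform for compositional rows (all entries assumed > 0)
    L = np.log(X)
    return L - L.mean(axis=1, keepdims=True)

def two_sample_stats(Z1, Z2):
    diff = Z1.mean(axis=0) - Z2.mean(axis=0)
    se = np.sqrt(Z1.var(axis=0, ddof=1) / len(Z1) + Z2.var(axis=0, ddof=1) / len(Z2))
    t = diff / se
    return np.max(t ** 2), np.sum(t ** 2)          # max-type and sum-type statistics

def max_sum_test(X1, X2, n_perm=2000, seed=0):
    rng = np.random.default_rng(seed)
    Z, n1 = np.vstack([clr(X1), clr(X2)]), len(X1)
    m_obs, s_obs = two_sample_stats(Z[:n1], Z[n1:])
    hits_m = hits_s = 0
    for _ in range(n_perm):                        # permutation calibration (illustration)
        idx = rng.permutation(len(Z))
        m, s = two_sample_stats(Z[idx[:n1]], Z[idx[n1:]])
        hits_m += m >= m_obs
        hits_s += s >= s_obs
    p_m, p_s = (hits_m + 1) / (n_perm + 1), (hits_s + 1) / (n_perm + 1)
    # combine the two (asymptotically independent) tests via a minimum-p rule
    return 1.0 - (1.0 - min(p_m, p_s)) ** 2
```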

We introduce a high-dimensional cubical complex, for any dimension t>0, and apply it to the design of quantum locally testable codes. Our complex is a natural generalization of the square-complex constructions (the case t=2) by Panteleev and Kalachev and by Dinur et al., which have been applied to the design of classical locally testable codes (LTCs) and quantum low-density parity-check (qLDPC) codes, respectively. We turn the geometric (cubical) complex into a chain complex by relying on constant-sized local codes $h_1,\ldots,h_t$ as gadgets. A recent result of Panteleev and Kalachev on the existence of tuples of codes that are product expanding enables us to prove lower bounds on the cycle and co-cycle expansion of our chain complex. For t=4 our construction gives a new family of "almost-good" quantum LTCs -- with constant relative rate, inverse-polylogarithmic relative distance and soundness, and constant-size parity checks. Both the distance of the quantum code and its local testability are proven directly from the cycle and co-cycle expansion of our chain complex.

Determining which parameters of a nonlinear model best describe a set of experimental data is a fundamental problem in science, and it has gained much traction lately with the rise of complex large-scale simulators (a.k.a. black-box simulators). The likelihood of such models is typically intractable, which is why classical MCMC methods cannot be used. Simulation-based inference (SBI) stands out in this context by only requiring a dataset of simulations to train deep generative models capable of approximating the posterior distribution that relates input parameters to a given observation. In this work, we consider a tall-data extension in which multiple observations are available and one wishes to leverage their shared information to better infer the parameters of the model. The method we propose is built upon recent developments from the flourishing score-based diffusion literature and allows us to estimate the tall-data posterior distribution using only information from the score network trained on individual observations. We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
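
A rough sketch of the score-composition step under a simplifying assumption: to first order, Bayes' rule implies that the tall-data posterior score is the sum of the individual posterior scores minus the prior score counted n - 1 extra times. The function names and the neglect of diffusion-time corrections are illustrative simplifications, not the paper's estimator.

```python
def tall_posterior_score(theta, t, observations, indiv_score, prior_score):
    """First-order composition of individual posterior scores.

    indiv_score(theta, x, t)  ~ grad_theta log p_t(theta | x)  (trained score network)
    prior_score(theta, t)     ~ grad_theta log p_t(theta)
    """
    n = len(observations)
    total = sum(indiv_score(theta, x, t) for x in observations)
    # Bayes' rule in score form: sum the per-observation posterior scores and
    # remove the prior contribution that was counted n - 1 extra times.
    return total - (n - 1) * prior_score(theta, t)
```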

Recently, the area of adversarial attacks on image quality assessment (IQA) metrics has begun to be explored, whereas the area of defences remains under-researched. In this study, we aim to help fill that gap by checking the transferability of adversarial purification defences from image classifiers to IQA methods. We apply several widespread attacks on IQA models and examine how well the defences withstand them. The purification methodologies cover different preprocessing techniques, including geometrical transformations, compression, denoising, and modern neural-network-based methods. We also address the challenge of assessing the efficacy of a defensive methodology by proposing ways to estimate output visual quality and the success of neutralizing attacks. Defences were tested against attacks on three IQA metrics -- Linearity, MetaIQA and SPAQ. The code for attacks and defences is available at: (link is hidden for a blind review).
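
As one example of a purification preprocessing step, here is a hedged sketch of JPEG re-encoding (the hypothetical jpeg_purify below), which can attenuate high-frequency adversarial perturbations before an IQA model scores the image.

```python
import io
import numpy as np
from PIL import Image

def jpeg_purify(image, quality=50):
    """Re-encode an (H, W, 3) uint8 array as JPEG to wash out adversarial noise."""
    buf = io.BytesIO()
    Image.fromarray(np.asarray(image, dtype=np.uint8)).save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf))

# usage: score = iqa_model(jpeg_purify(adversarial_image))   # iqa_model is assumed
```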

Quantifying relationships between components of a complex system is critical to understanding the rich network of interactions that characterize the behavior of the system. Traditional methods for detecting pairwise dependence of time series, such as Pearson correlation, Granger causality, and mutual information, are computed directly in the space of measured time-series values. But for systems in which interactions are mediated by statistical properties of the time series ('time-series features') over longer timescales, this approach can fail to capture the underlying dependence from limited and noisy time-series data, and can be challenging to interpret. Addressing these issues, here we introduce an information-theoretic method for detecting dependence between time series mediated by time-series features that provides interpretable insights into the nature of the interactions. Our method extracts a candidate set of time-series features from sliding windows of the source time series and assesses their role in mediating a relationship to values of the target process. Across simulations of three different generative processes, we demonstrate that our feature-based approach can outperform a traditional inference approach based on raw time-series values, especially in challenging scenarios characterized by short time-series lengths, high noise levels, and long interaction timescales. Our work introduces a new tool for inferring and interpreting feature-mediated interactions from time-series data, contributing to the broader landscape of quantitative analysis in complex systems research, with potential applications in domains including neuroscience, finance, climate science, and engineering.
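
A simplified sketch of the pipeline, not the paper's information-theoretic estimator: extract a few candidate features from sliding windows of the source series and score each feature's dependence with the target using a generic mutual-information estimator; window_features and feature_mediated_mi are illustrative names.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def window_features(x, width):
    # candidate features per sliding window: mean, standard deviation, lag-1 autocorrelation
    rows = []
    for i in range(len(x) - width + 1):
        w = x[i:i + width]
        rows.append([w.mean(), w.std(), np.corrcoef(w[:-1], w[1:])[0, 1]])
    return np.asarray(rows)

def feature_mediated_mi(source, target, width=50):
    F = window_features(source, width)
    y = target[width - 1:]                  # align target values with window endpoints
    return mutual_info_regression(F, y)     # one dependence estimate per candidate feature
```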

Solving partial differential equations (PDEs) on fine spatio-temporal scales for high-fidelity solutions is critical for numerous scientific breakthroughs. Yet, this process can be prohibitively expensive, owing to the inherent complexities of the problems, including nonlinearity and multiscale phenomena. To speed up large-scale computations, a process known as downscaling is employed, which generates high-fidelity approximate solutions from their low-fidelity counterparts. In this paper, we propose a novel Physics-Guided Diffusion Model (PGDM) for downscaling. Our model, initially trained on a dataset of paired low- and high-fidelity solutions across coarse and fine scales, generates new high-fidelity approximations from any new low-fidelity inputs. These outputs are subsequently refined through fine-tuning, aimed at minimizing the physical discrepancies as defined by the discretized PDEs at the finer scale. We evaluate and benchmark our model's performance against other downscaling baselines on three categories of nonlinear PDEs. Our numerical experiments demonstrate that our model not only outperforms the baselines but also achieves a computational acceleration exceeding tenfold, while maintaining the same level of accuracy as conventional fine-scale solvers.
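
A hedged sketch of the refinement step only, with the 1D heat equation standing in for the paper's discretized PDEs: the generated high-fidelity field is treated as an optimization variable and its finite-difference residual is driven down by gradient descent.

```python
import torch

def heat_residual(u, dx, dt, nu=0.1):
    # finite-difference residual of u_t = nu * u_xx on a (time, space) grid
    u_t = (u[1:, 1:-1] - u[:-1, 1:-1]) / dt
    u_xx = (u[:-1, 2:] - 2.0 * u[:-1, 1:-1] + u[:-1, :-2]) / dx ** 2
    return u_t - nu * u_xx

def refine(u_generated, dx, dt, steps=200, lr=1e-3):
    # fine-tune the diffusion-model output by shrinking the discretized-PDE residual
    u = u_generated.clone().requires_grad_(True)
    opt = torch.optim.Adam([u], lr=lr)
    for _ in range(steps):
        loss = heat_residual(u, dx, dt).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return u.detach()

# usage: u_refined = refine(u_generated, dx=0.01, dt=0.001)   # u_generated is assumed
```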

During multiple testing, researchers often adjust their alpha level to control the familywise error rate for a statistical inference about a joint union alternative hypothesis (e.g., "H1,1 or H1,2"). However, in some cases, they do not make this inference. Instead, they make separate inferences about each of the individual hypotheses that comprise the joint hypothesis (e.g., H1,1 and H1,2). For example, a researcher might use a Bonferroni correction to adjust their alpha level from the conventional level of 0.050 to 0.025 when testing H1,1 and H1,2, find a significant result for H1,1 (p < 0.025) and not for H1,2 (p > 0.025), and so claim support for H1,1 and not for H1,2. However, these separate individual inferences do not require an alpha adjustment. Only a statistical inference about the union alternative hypothesis "H1,1 or H1,2" requires an alpha adjustment because it is based on "at least one" significant result among the two tests, and so it refers to the familywise error rate. Hence, an inconsistent correction occurs when a researcher corrects their alpha level during multiple testing but does not make an inference about a union alternative hypothesis. In the present article, I discuss this inconsistent correction problem, including its reduction in statistical power for tests of individual hypotheses and its potential causes vis-à-vis error rate confusions and the alpha adjustment ritual. I also provide three illustrations of inconsistent corrections from recent psychology studies. I conclude that inconsistent corrections represent a symptom of statisticism, and I call for a more nuanced inference-based approach to multiple testing corrections.
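
A small simulation, with illustrative numbers only, of the power cost that an unnecessary adjustment imposes on an individual test: the same one-sample t-test is evaluated at the conventional alpha of 0.050 and at the two-test Bonferroni-adjusted level of 0.025.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, effect, reps = 50, 0.4, 20_000
hits_unadjusted = hits_bonferroni = 0
for _ in range(reps):
    x = rng.normal(effect, 1.0, n)             # H1,1 is true with a modest effect
    p = stats.ttest_1samp(x, 0.0).pvalue
    hits_unadjusted += p < 0.050               # individual inference, no adjustment
    hits_bonferroni += p < 0.025               # same inference after a 2-test Bonferroni correction
print(hits_unadjusted / reps, hits_bonferroni / reps)   # estimated power at each alpha level
```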

We propose a method for obtaining parsimonious decompositions of networks into higher order interactions, which can take the form of arbitrary motifs. The method is based on a class of analytically solvable generative models in which vertices are connected via explicit copies of motifs and which, in combination with non-parametric priors, allow us to infer higher order interactions from dyadic graph data without any prior knowledge of the types or frequencies of such interactions. Crucially, we also consider 'degree-corrected' models that correctly reflect the degree distribution of the network and consequently prove to be a better fit for many real-world networks than non-degree-corrected models. We test the presented approach on simulated data, for which we recover the set of underlying higher order interactions to a high degree of accuracy. For empirical networks, the method identifies concise sets of atomic subgraphs from within thousands of candidates that cover a large fraction of edges and include higher order interactions of known structural and functional significance. The method not only produces an explicit higher order representation of the network but also a fit of the network to analytically tractable models, opening new avenues for the systematic study of higher order network structures.
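
As a toy illustration of one ingredient only (not the non-parametric Bayesian inference the paper develops), the sketch below computes the fraction of edges covered by at least one copy of a single candidate motif, the triangle; edge_coverage_by_triangles is an illustrative name.

```python
import networkx as nx

def edge_coverage_by_triangles(G):
    # fraction of edges lying in at least one triangle: a crude indication of how
    # much of the dyadic edge data explicit triangle motifs could account for
    covered = set()
    for u, v in G.edges():
        if set(G[u]) & set(G[v]):              # u and v share at least one neighbour
            covered.add(frozenset((u, v)))
    return len(covered) / G.number_of_edges()

# usage: edge_coverage_by_triangles(nx.karate_club_graph())
```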

Progress in the realm of quantum technologies is paving the way for a multitude of potential applications across different sectors. However, the limited number of available quantum computers, their technical limitations, and the high demand for their use pose problems for developers and researchers. In particular, users trying to execute quantum circuits on these devices usually face long waiting times in the task queues. In this context, this work proposes a technique to reduce waiting times and optimize quantum computer usage by scheduling circuits from different users into combined circuits that are executed at the same time. To validate this proposal, several widely known quantum algorithms were selected and executed in combined circuits. The obtained results are then compared with the results of executing the same algorithms in isolation, which allowed us to measure the impact of using the scheduler. Among other findings, we verified that the noise incurred by executing a combination of circuits through the proposed scheduler does not critically affect the outcomes.
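
A minimal sketch of the circuit-combination idea using Qiskit's QuantumCircuit.compose; the combine_circuits helper is hypothetical, and the paper's scheduler presumably also handles queuing, result de-multiplexing, and device constraints, which are omitted here.

```python
from qiskit import QuantumCircuit

def combine_circuits(circuits):
    # place independent users' circuits side by side on disjoint qubit/clbit ranges
    n_qubits = sum(c.num_qubits for c in circuits)
    n_clbits = sum(c.num_clbits for c in circuits)
    combined = QuantumCircuit(n_qubits, n_clbits)
    q_off = c_off = 0
    for c in circuits:
        combined.compose(c,
                         qubits=list(range(q_off, q_off + c.num_qubits)),
                         clbits=list(range(c_off, c_off + c.num_clbits)),
                         inplace=True)
        q_off += c.num_qubits
        c_off += c.num_clbits
    return combined        # submit once; each user's results live on their clbit range
```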
