The widespread use of black box prediction methods has sparked an increasing interest in algorithm/model-agnostic approaches for quantifying goodness-of-fit, with direct ties to specification testing, model selection and variable importance assessment. A commonly used framework involves defining a predictiveness criterion, applying a cross-fitting procedure to estimate the predictiveness, and utilizing the difference in estimated predictiveness between two models as the test statistic. However, even after standardization, the test statistic typically fails to converge to a non-degenerate distribution under the null hypothesis of equal goodness, leading to what is known as the degeneracy issue. To address this degeneracy issue, we present a simple yet effective device, Zipper. It draws inspiration from the strategy of additional splitting of testing data, but encourages an overlap between two testing data splits in predictiveness evaluation. Zipper binds together the two overlapping splits using a slider parameter that controls the proportion of overlap. Our proposed test statistic follows an asymptotically normal distribution under the null hypothesis for any fixed slider value, guaranteeing valid size control while enhancing power by effective data reuse. Finite-sample experiments demonstrate that our procedure, with a simple choice of the slider, works well across a wide range of settings.

Related content

Statistic

Sample · Networking · MoDELS · Recurrent neural network · Output

August 22, 2023

Expressive probabilistic sampling in recurrent neural networks

Shirui Chen,Linxin Preston Jiang,Rajesh P. N. Rao,Eric Shea-Brown

In sampling-based Bayesian models of brain function, neural activities are assumed to be samples from probability distributions that the brain uses for probabilistic computation. However, a comprehensive understanding of how mechanistic models of neural dynamics can sample from arbitrary distributions is still lacking. We use tools from functional analysis and stochastic differential equations to explore the minimum architectural requirements for $\textit{recurrent}$ neural circuits to sample from complex distributions. We first consider the traditional sampling model consisting of a network of neurons whose outputs directly represent the samples (sampler-only network). We argue that synaptic current and firing-rate dynamics in the traditional model have limited capacity to sample from a complex probability distribution. We show that the firing rate dynamics of a recurrent neural circuit with a separate set of output units can sample from an arbitrary probability distribution. We call such circuits reservoir-sampler networks (RSNs). We propose an efficient training procedure based on denoising score matching that finds recurrent and output weights such that the RSN implements Langevin sampling. We empirically demonstrate our model's ability to sample from several complex data distributions using the proposed neural dynamics and discuss its applicability to developing the next generation of sampling-based brain models.
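The Langevin sampling that the abstract refers to can be illustrated with a minimal sketch. It assumes a one-dimensional standard normal target whose score is known in closed form, whereas the abstract describes learning the dynamics through denoising score matching. The function names, step size and chain counts below are illustrative choices, not taken from the paper.

```python
import math
import random


def score_standard_normal(x):
    # Score (derivative of the log-density) of N(0, 1): d/dx log p(x) = -x.
    # Assumption: a hand-written score stands in for a learned one.
    return -x


def langevin_sample(score, n_steps, step_size, rng, x0=0.0):
    # Unadjusted Langevin update:
    #   x <- x + step_size * score(x) + sqrt(2 * step_size) * gaussian_noise
    x = x0
    noise_scale = math.sqrt(2.0 * step_size)
    for _ in range(n_steps):
        x = x + step_size * score(x) + noise_scale * rng.gauss(0.0, 1.0)
    return x


def run_chains(n_chains=2000, n_steps=200, step_size=0.1, seed=0):
    # Run independent chains and keep only the final state of each one.
    rng = random.Random(seed)
    return [
        langevin_sample(score_standard_normal, n_steps, step_size, rng)
        for _ in range(n_chains)
    ]


samples = run_chains()
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

Because the update is not corrected by an accept/reject step, the chain's stationary variance is slightly above 1 for a finite step size; a smaller step reduces this bias at the cost of slower mixing.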

MCMC · Microsoft Surface · Approximation · Model evaluation · Markov chain Monte Carlo

August 22, 2023

Evaluating the accuracy of Gaussian approximations in VSWIR imaging spectroscopy retrievals

Kelvin M. Leung,David R. Thompson,Jouni Susiluoto,Jayanth Jagalur-Mohan,Amy Braverman,Youssef Marzouk

The joint retrieval of surface reflectances and atmospheric parameters in VSWIR imaging spectroscopy is a computationally challenging high-dimensional problem. With NASA's Surface Biology and Geology mission as the motivating context, the uncertainty associated with the retrievals is crucial for further use of the retrieved results in environmental applications. Although Markov chain Monte Carlo (MCMC) is a Bayesian method ideal for uncertainty quantification, the full-dimensional implementation of MCMC for the retrieval is computationally intractable. In this work, we developed a block Metropolis MCMC algorithm for the high-dimensional VSWIR surface reflectance retrieval that leverages the structure of the forward radiative transfer model to enable tractable fully Bayesian computation. We use the posterior distribution from this MCMC algorithm to assess the limitations of optimal estimation, the state-of-the-art Bayesian algorithm in operational retrievals, which is more computationally efficient but uses a Gaussian approximation to characterize the posterior. An analysis of the differences between the posteriors computed by each method shows that the MCMC algorithm gives more physically sensible results and reveals the non-Gaussian structure of the posterior, specifically in the atmospheric aerosol optical depth parameter and the low-wavelength surface reflectances.

TransAct · Performer · Commentator · Performance · Periodic

August 22, 2023

Albatross: An optimistic consensus algorithm

Pascal Berrang,Inês Cruz,Bruno França,Philipp von Styp-Rekowsky,Marvin Wissfeld

The consensus protocol is a critical component of distributed ledgers and blockchains. Achieving consensus over a decentralized network poses challenges to transaction finality and performance. Currently, the highest-performing consensus algorithms are speculative BFT algorithms, which, however, compromise on the transaction finality guarantees offered by their non-speculative counterparts. In this paper, we introduce Albatross, a Proof-of-Stake (PoS) blockchain consensus algorithm that aims to combine the best of both worlds. At its heart, Albatross is a high-performing, speculative BFT algorithm that offers strong probabilistic finality. We complement this by periodically guaranteeing finality through the Tendermint protocol. We prove our protocol to be secure under standard BFT assumptions and analyze its performance both on a theoretical and practical level. For that, we provide an open-source Rust implementation of Albatross. Our real-world measurements support that our protocol has a performance close to the theoretical maximum for single-chain Proof-of-Stake consensus algorithms.

CASE · Next · Statistical methods · Natural language processing · Machine learning

August 22, 2023

NLP-based detection of systematic anomalies among the narratives of consumer complaints

Peiheng Gao,Ning Sun,Xuefeng Wang,Chen Yang,Ričardas Zitikis

We develop an NLP-based procedure for detecting systematic nonmeritorious consumer complaints, simply called systematic anomalies, among complaint narratives. While classification algorithms are used to detect pronounced anomalies, in the case of smaller and frequent systematic anomalies, the algorithms may falter due to a variety of reasons, including technical ones as well as natural limitations of human analysts. Therefore, as the next step after classification, we convert the complaint narratives into quantitative data, which are then analyzed using an algorithm for detecting systematic anomalies. We illustrate the entire procedure using complaint narratives from the Consumer Complaint Database of the Consumer Financial Protection Bureau.

Analysis · INFORMS · Kronecker product · Tractable · Statistic

August 22, 2023

A novel analysis of utility in privacy pipelines, using Kronecker products and quantitative information flow

Mário S. Alvim,Natasha Fernandes,Annabelle McIver,Carroll Morgan,Gabriel H. Nunes

We combine Kronecker products and quantitative information flow to give a novel formal analysis for the fine-grained verification of utility in complex privacy pipelines. The combination explains a surprising anomaly in the behaviour of utility of privacy-preserving pipelines -- that sometimes a reduction in privacy results also in a decrease in utility. We use the standard measure of utility for Bayesian analysis, introduced by Ghosh et al., to produce tractable and rigorous proofs of the fine-grained statistical behaviour leading to the anomaly. More generally, we offer the prospect of formal-analysis tools for utility that complement extant formal analyses of privacy. We demonstrate our results on a number of common privacy-preserving designs.
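As a rough illustration of why Kronecker products arise here: in quantitative information flow a mechanism is commonly modelled as a row-stochastic channel matrix, and under the assumption that two mechanisms act independently on independent inputs, their joint channel is the Kronecker product of the two matrices. The abstract does not spell out its construction, so the helper `kron` and the randomized-response channel below are hypothetical examples.

```python
def kron(a, b):
    # Kronecker product of two matrices given as lists of rows.
    # Rows are indexed by (row of a, row of b),
    # columns by (column of a, column of b).
    return [
        [a_ij * b_kl for a_ij in row_a for b_kl in row_b]
        for row_a in a
        for row_b in b
    ]


# Hypothetical binary randomized-response channel:
# the true bit is reported with probability 0.75 and flipped otherwise.
channel = [
    [0.75, 0.25],
    [0.25, 0.75],
]

# Joint channel for two independent uses of the mechanism.
joint = kron(channel, channel)
for row in joint:
    print(row)
```

Each row of `joint` still sums to 1, so the product of two channels is again a channel over pairs of inputs and pairs of outputs.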

INFORMS · Path · Model selection · MoDELS · Markov

August 21, 2023

Information content and maximum entropy of compartmental systems in equilibrium

Holger Metzler,Carlos A. Sierra

from arxiv, Code repository: https://github.com/goujou/entropy_and_complexity_in_eq

Although compartmental dynamical systems are used in many different areas of science, model selection based on the maximum entropy principle (MaxEnt) is challenging because of the lack of methods for quantifying the entropy of this type of system. Here, we take advantage of the interpretation of compartmental systems as continuous-time Markov chains to obtain entropy measures that quantify model information content. In particular, we quantify the uncertainty of a single particle's path as it travels through the system as described by path entropy and entropy rates. Path entropy measures the uncertainty of the entire path of a traveling particle from its entry into the system until its exit, whereas entropy rates measure the average uncertainty of the instantaneous future of a particle while it is in the system. We derive explicit formulas for these two types of entropy for compartmental systems in equilibrium based on Shannon information entropy and show how they can be used to solve equifinality problems in the process of model selection by means of MaxEnt.
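To make the notion of an entropy rate concrete, the sketch below computes it for a discrete-time Markov chain with transition matrix P and stationary distribution pi, using the standard formula H = -sum_i pi_i sum_j P_ij log P_ij. This is a simplified discrete-time analogue chosen for illustration; it is not the continuous-time formula derived in the paper, and the function names are hypothetical.

```python
import math


def stationary_distribution(P, n_iter=1000):
    # Power iteration pi <- pi P, starting from the uniform distribution.
    # Assumes the chain is ergodic so that the iteration converges.
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi


def entropy_rate(P):
    # Average uncertainty (in nats) of the next state given the current one,
    # weighted by how often each current state is visited in equilibrium.
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0.0
    )


fair_coin_chain = [[0.5, 0.5], [0.5, 0.5]]
sticky_chain = [[0.9, 0.1], [0.5, 0.5]]
print(entropy_rate(fair_coin_chain))
print(entropy_rate(sticky_chain))
```

The first chain is maximally unpredictable for two states, so its entropy rate equals log 2; the second spends most of its time in a state it rarely leaves, which lowers the rate.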

MoDELS · Analysis · Directed · Functional · Nuance

August 20, 2023

A unified approach to radial, hyperbolic, and directional efficiency measurement in Data Envelopment Analysis

Margaréta Halická,Mária Trnovská,Aleš Černý

from arxiv, 36 pages

The paper analyses properties of a large class of "path-based" Data Envelopment Analysis models through a unifying general scheme. The scheme includes the well-known oriented radial models, the hyperbolic distance function model, the directional distance function models, and even permits their generalisations. The modelling is not constrained to non-negative data and is flexible enough to accommodate variants of standard models over arbitrary data. Mathematical tools developed in the paper allow systematic analysis of the models from the point of view of ten desirable properties. It is shown that some of the properties are satisfied (resp., fail) for all models in the general scheme, while others have a more nuanced behaviour and must be assessed individually in each model. Our results can help researchers and practitioners navigate among the different models and apply the models to mixed data.

Mutually independent · Linear · Diameter · Numerical analysis

August 19, 2023

Additive Schwarz methods for semilinear elliptic problems with convex energy functionals: Convergence rate independent of nonlinearity

Jongho Park

from arxiv, 19 pages, 1 figure

We investigate additive Schwarz methods for semilinear elliptic problems with convex energy functionals, which have wide scientific applications. A key observation is that the convergence rates of both one- and two-level additive Schwarz methods have bounds independent of the nonlinear term in the problem. That is, the convergence rates do not deteriorate in the presence of nonlinearity, so that solving a semilinear problem requires no more iterations than a linear problem. Moreover, the two-level method is scalable in the sense that the convergence rate of the method depends on $H/h$ and $H/\delta$ only, where $h$ and $H$ are the typical diameters of an element and a subdomain, respectively, and $\delta$ measures the overlap among the subdomains. Numerical results are provided to support our theoretical findings.

Full · Lyapunov · Inference · Long short-term memory network · Jacobian

August 18, 2023

Reconstruction, forecasting, and stability of chaotic dynamics from partial data

Elise Özalp,Georgios Margazoglou,Luca Magri

The forecasting and computation of the stability of chaotic systems from partial observations are tasks for which traditional equation-based methods may not be suitable. In this computational paper, we propose data-driven methods to (i) infer the dynamics of unobserved (hidden) chaotic variables (full-state reconstruction); (ii) time forecast the evolution of the full state; and (iii) infer the stability properties of the full state. The tasks are performed with long short-term memory (LSTM) networks, which are trained with observations (data) limited to only part of the state: (i) the low-to-high resolution LSTM (LH-LSTM), which takes partial observations as training input, and requires access to the full system state when computing the loss; and (ii) the physics-informed LSTM (PI-LSTM), which is designed to combine partial observations with the integral formulation of the dynamical system's evolution equations. First, we derive the Jacobian of the LSTMs. Second, we analyse a chaotic partial differential equation, the Kuramoto-Sivashinsky (KS), and the Lorenz-96 system. We show that the proposed networks can forecast the hidden variables, both time-accurately and statistically. The Lyapunov exponents and covariant Lyapunov vectors, which characterize the stability of the chaotic attractors, are correctly inferred from partial observations. Third, the PI-LSTM outperforms the LH-LSTM by successfully reconstructing the hidden chaotic dynamics when the input dimension is smaller than or similar to the Kaplan-Yorke dimension of the attractor. This work opens new opportunities for reconstructing the full state, inferring hidden variables, and computing the stability of chaotic systems from partial data.
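The link between a Jacobian and Lyapunov exponents can be seen in a one-dimensional toy case, where the Jacobian reduces to the derivative of the map and the largest exponent is the long-run average of log|f'(x)| along a trajectory. The sketch below uses the logistic map with r = 4, whose exponent is known to be log 2. The logistic map and all function names are illustrative assumptions; the paper itself works with LSTM Jacobians on the Kuramoto-Sivashinsky and Lorenz-96 systems.

```python
import math


def logistic(x, r=4.0):
    # Logistic map f(x) = r * x * (1 - x).
    return r * x * (1.0 - x)


def logistic_derivative(x, r=4.0):
    # f'(x) = r * (1 - 2x); in one dimension this is the whole Jacobian.
    return r * (1.0 - 2.0 * x)


def largest_lyapunov_exponent(x0=0.1, n_transient=1000, n_steps=50000):
    # Discard a transient so the trajectory settles on the attractor.
    x = x0
    for _ in range(n_transient):
        x = logistic(x)
    # Time-average the local stretching rate log|f'(x)|.
    total = 0.0
    for _ in range(n_steps):
        # The floor guards against log(0) if x lands exactly on 0.5.
        total += math.log(max(abs(logistic_derivative(x)), 1e-300))
        x = logistic(x)
    return total / n_steps


estimate = largest_lyapunov_exponent()
print(f"estimated exponent {estimate:.4f}, log 2 = {math.log(2):.4f}")
```

A positive exponent indicates that nearby trajectories separate exponentially fast, which is the sense in which the exponents characterize the stability of a chaotic attractor.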

Performer · Discretization · Sampling methods · Continuity · Adaptive sampling

August 18, 2023

FunQuant: A R package to perform quantization in the context of rare events and time-consuming simulations

Charlie Sire,Yann Richet,Rodolphe Le Riche,Didier Rullière,Jérémy Rohmer,Lucie Pheulpin

from arxiv, 7 pages, 4 figures. Submitted to Journal Of Open Source Software

Quantization summarizes continuous distributions by calculating a discrete approximation. Among the widely adopted methods for data quantization is Lloyd's algorithm, which partitions the space into Voronoï cells, which can be seen as clusters, and constructs a discrete distribution based on their centroids and probabilistic masses. Lloyd's algorithm estimates the optimal centroids in a minimal expected distance sense, but this approach poses significant challenges in scenarios where data evaluation is costly and relates to rare events. In such cases, the single cluster associated with no event takes the majority of the probability mass. In this context, a metamodel is required and adapted sampling methods are necessary to increase the precision of the computations on the rare clusters.
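The basic Lloyd iteration described above can be sketched in a few lines for one-dimensional data. This is plain Lloyd's algorithm on a made-up sample, not the FunQuant package's interface and not its adapted sampling scheme; the function names and data are hypothetical.

```python
def assign(points, centroids):
    # Voronoi step: put each point in the cell of its nearest centroid.
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
        clusters[nearest].append(p)
    return clusters


def lloyd_1d(points, initial_centroids, n_iter=20):
    centroids = list(initial_centroids)
    for _ in range(n_iter):
        clusters = assign(points, centroids)
        # Centroid step: move each centroid to the mean of its cell,
        # keeping the old centroid if the cell is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[k]
            for k, c in enumerate(clusters)
        ]
    # Probability mass of each cell under the empirical distribution.
    clusters = assign(points, centroids)
    masses = [len(c) / len(points) for c in clusters]
    return centroids, masses


# Made-up sample: 98 "no event" outputs at 0.0 and two rare events near 5.
points = [0.0] * 98 + [5.0, 5.2]
centroids, masses = lloyd_1d(points, initial_centroids=[0.0, 5.0])
print(centroids, masses)
```

In this sample the rare cluster's centroid rests on only two points, which illustrates why the abstract calls for sampling methods that place more evaluations in the rare clusters.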