
This paper studies inference in randomized controlled trials with multiple treatments, where treatment status is determined according to a "matched tuples" design. Here, by a matched tuples design, we mean an experimental design where units are sampled i.i.d. from the population of interest, grouped into "homogeneous" blocks with cardinality equal to the number of treatments, and finally, within each block, each treatment is assigned exactly once uniformly at random. We first study estimation and inference for matched tuples designs in the general setting where the parameter of interest is a vector of linear contrasts over the collection of average potential outcomes for each treatment. Parameters of this form include standard average treatment effects used to compare one treatment relative to another, but also include parameters which may be of interest in the analysis of factorial designs. We establish conditions under which a sample analogue estimator is asymptotically normal and construct a consistent estimator of its corresponding asymptotic variance. Combining these results establishes the asymptotic exactness of tests based on these estimators. In contrast, we show that, for two common testing procedures based on t-tests constructed from linear regressions, one test is generally conservative while the other is generally invalid. We go on to apply our results to study the asymptotic properties of what we call "fully-blocked" 2^K factorial designs, which are simply matched tuples designs applied to a full factorial experiment. Leveraging our previous results, we establish that our estimator achieves a lower asymptotic variance under the fully-blocked design than that under any stratified factorial design which stratifies the experimental sample into a finite number of "large" strata. A simulation study and empirical application illustrate the practical relevance of our results.
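As a rough illustration of the design described above, the following sketch (Python/NumPy, with hypothetical variable names; not the authors' code) forms "homogeneous" blocks by sorting units on a scalar baseline covariate, assigns each treatment exactly once per block uniformly at random, and evaluates a sample-analogue estimator of a linear contrast of the treatment-specific means.

```python
import numpy as np

rng = np.random.default_rng(0)

def matched_tuples_assignment(x, num_treatments):
    """Group units into blocks of size `num_treatments` by sorting on a scalar
    baseline covariate x, then assign each treatment exactly once per block,
    uniformly at random (illustrative sketch of a matched tuples design)."""
    n = len(x)
    assert n % num_treatments == 0
    order = np.argsort(x)                        # homogeneous blocks: adjacent in x
    blocks = order.reshape(-1, num_treatments)   # one row per block
    d = np.empty(n, dtype=int)
    for block in blocks:
        d[block] = rng.permutation(num_treatments)
    return d

def contrast_estimator(y, d, weights):
    """Sample analogue of a linear contrast of average potential outcomes:
    sum_k weights[k] * mean(y | d == k)."""
    means = np.array([y[d == k].mean() for k in range(len(weights))])
    return weights @ means

# Toy example: 3 treatments, contrast comparing treatment 2 to treatment 0.
x = rng.normal(size=300)                         # baseline covariate
d = matched_tuples_assignment(x, num_treatments=3)
y = 1.0 + 0.5 * x + 0.3 * (d == 2) + rng.normal(scale=0.1, size=300)
print(contrast_estimator(y, d, weights=np.array([-1.0, 0.0, 1.0])))
```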

Related Content

Advanced science and technology provide a wealth of big data from different sources for extreme value analysis. Classical extreme value theory was extended to an accelerated max-stable distribution family for modelling competing-risk-based extreme data in Cao and Zhang (2021). In this paper, we establish probability models for power-normalized maxima and minima from competing risks. The limit distributions form a new, extended accelerated max-stable and min-stable distribution family (termed the accelerated p-max/p-min stable distribution), together with its left-truncated version. The limit types are determined principally by the sample-generating process and the interplay among the competing risks, which we illustrate with common examples. We further investigate statistical inference for this model, covering maximum likelihood estimation and model diagnostics. Numerical studies first show efficient approximation in all limit scenarios, with convergence rates comparable to those under linear normalization, and then present maximum likelihood estimation and diagnostics of accelerated p-max/p-min stable models for simulated data sets. Finally, two real datasets, annual maxima of ground-level ozone and survival times from the Stanford heart transplant study, demonstrate the performance of our accelerated p-max and accelerated p-min stable models.

We propose a sparse algebra for samplet-compressed kernel matrices to enable efficient scattered data analysis. We show that compressing kernel matrices by means of samplets produces optimally sparse matrices in a certain S-format. For kernels of finite differentiability, the compression, together with addition and multiplication of S-formatted matrices, can be performed with cost and memory that scale essentially linearly with the matrix size $N$. We prove and exploit the fact that the inverse of a kernel matrix (if it exists) is compressible in the S-format as well. Selected inversion allows one to compute the entries within the corresponding sparsity pattern directly. The S-formatted matrix operations enable the efficient, approximate computation of more complicated matrix functions such as ${\bm A}^\alpha$ or $\exp({\bm A})$. The matrix algebra is justified mathematically by pseudodifferential calculus. As an application, efficient Gaussian process learning algorithms for spatial statistics are considered. Numerical results are presented to illustrate and quantify our findings.
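The following sketch (Python with SciPy; illustrative only) mimics the resulting workflow by assembling a kernel matrix on scattered data, dropping entries below a tolerance, and operating on the result in a sparse format. Plain entrywise thresholding is a crude stand-in for the samplet compression, which thresholds in a multiresolution samplet basis and is what yields the provably optimal sparsity.

```python
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(0)

# Scattered data and an exponential kernel of finite smoothness.
pts = rng.random((2000, 2))

def kernel_matrix(x, y, length_scale=0.01):
    d = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    return np.exp(-d / length_scale)

A = kernel_matrix(pts, pts)

# Crude stand-in for samplet compression: drop entries below a tolerance and
# store the result in a sparse (CSR) format.
tol = 1e-4
A_s = csr_matrix(np.where(np.abs(A) >= tol, A, 0.0))
print(f"stored entries: {A_s.nnz} of {A.size} ({A_s.nnz / A.size:.1%})")

# Sparse-format algebra: addition and multiplication stay in the sparse format.
B_s = A_s + A_s
C_s = A_s @ A_s
print(type(C_s), C_s.nnz)
```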

Score-based generative modeling (SGM) is a highly successful approach for learning a probability distribution from data and generating further samples. We prove the first polynomial convergence guarantees for the core mechanic behind SGM: drawing samples from a probability density $p$ given a score estimate (an estimate of $\nabla \ln p$) that is accurate in $L^2(p)$. Compared to previous works, we do not incur error that grows exponentially in time or that suffers from a curse of dimensionality. Our guarantee works for any smooth distribution and depends polynomially on its log-Sobolev constant. Using our guarantee, we give a theoretical analysis of score-based generative modeling, which transforms white-noise input into samples from a learned data distribution given score estimates at different noise scales. Our analysis gives theoretical grounding to the observation that an annealed procedure is required in practice to generate good samples, as our proof depends essentially on using annealing to obtain a warm start at each step. Moreover, we show that a predictor-corrector algorithm gives better convergence than using either portion alone.
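To make the annealed procedure concrete, here is a minimal sketch of annealed Langevin ("corrector") sampling in Python/NumPy, assuming the data distribution is a standard Gaussian so that the scores at every noise scale are known in closed form; the schedule and step sizes are illustrative, not those analysed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def score(x, sigma):
    """Score of the data distribution N(0, 1) convolved with N(0, sigma^2),
    i.e. grad log p_sigma(x) for p_sigma = N(0, 1 + sigma^2)."""
    return -x / (1.0 + sigma ** 2)

def annealed_langevin(n_samples, sigmas, steps_per_level=100, eps=0.1):
    """Annealed Langevin dynamics: white-noise initialisation at the largest
    noise scale, then Langevin 'corrector' steps at decreasing noise scales."""
    x = rng.normal(scale=sigmas[0], size=n_samples)    # warm start at largest scale
    for sigma in sigmas:
        step = eps * sigma ** 2                        # step size shrinks with the noise level
        for _ in range(steps_per_level):
            x = x + step * score(x, sigma) + np.sqrt(2 * step) * rng.normal(size=n_samples)
    return x

samples = annealed_langevin(10_000, sigmas=np.geomspace(10.0, 0.01, num=10))
print(samples.mean(), samples.std())   # should be roughly 0 and 1
```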

The core of many cryptocurrencies is the decentralised validation network operating on proof-of-work technology. In these systems, validation is done by so-called miners who can digitally sign blocks once they solve a computationally hard problem. Conventional wisdom generally considers this protocol secure and stable, as miners are incentivised to follow the behaviour of the majority. However, whether some strategic mining behaviours occur in practice is still a major concern. In this paper we target this question by focusing on a security threat: the selfish mining attack, in which malicious miners deviate from the protocol by not immediately revealing their newly mined blocks. We propose a statistical test to analyse each miner's behaviour in five popular cryptocurrencies: Bitcoin, Litecoin, Monacoin, Ethereum and Bitcoin Cash. Our method is based on the realisation that selfish mining behaviour causes identifiable anomalies in the statistics of a miner's successive block discoveries. In addition, we apply heuristics-based address clustering to improve the detectability of this kind of behaviour. We find a marked presence of abnormal miners in Monacoin and Bitcoin Cash, and, to a lesser extent, in Ethereum. Finally, we extend our method to detect coordinated selfish mining attacks, finding mining cartels in Monacoin where miners might secretly share information about newly mined blocks in advance. Our analysis contributes to the research on security in cryptocurrency systems by providing the first empirical evidence that the aforementioned strategic mining behaviours do take place in practice.
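As a purely illustrative stand-in for the proposed test (not the paper's actual procedure), the sketch below checks one signature such anomalies can leave: under honest mining with a constant hash-rate share, blocks are won independently, so an excess of consecutive own-block pairs relative to the binomial benchmark is suspicious.

```python
import numpy as np
from scipy.stats import norm

def streak_excess_test(block_miners, miner):
    """One-sided z-test for an excess of consecutive blocks mined by `miner`
    (illustrative stand-in, not the paper's test). Under honest mining with
    constant hash-rate share p, each block is won independently with
    probability p, so the expected number of adjacent own-block pairs among
    the n-1 adjacent pairs is (n - 1) * p**2."""
    own = np.asarray([m == miner for m in block_miners], dtype=float)
    n = own.size
    p = own.mean()                              # estimated hash-rate share
    pairs = np.sum(own[1:] * own[:-1])          # observed adjacent own-block pairs
    expected = (n - 1) * p ** 2
    var = (n - 1) * p ** 2 * (1 - p ** 2)       # crude variance (ignores pair overlap)
    z = (pairs - expected) / np.sqrt(var)
    return z, 1.0 - norm.cdf(z)                 # one-sided p-value

# Toy chain: miner "A" holds ~30% of blocks but releases one long burst.
rng = np.random.default_rng(1)
chain = np.where(rng.random(5000) < 0.3, "A", "B")
chain[100:160] = "A"                            # injected burst of consecutive blocks
print(streak_excess_test(chain, "A"))
```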

Optimal model reduction for large-scale linear dynamical systems is studied. In contrast to most existing works, the systems under consideration are not required to be stable, in either discrete or continuous time. As a consequence, the underlying rational transfer functions are allowed to have poles in general domains of the complex plane. In particular, this covers specific conservative partial differential equations such as the linear Schr\"odinger and the undamped linear wave equation, whose spectra lie on the imaginary axis. By an appropriate modification of the classical continuous-time Hardy space $\mathcal{H}_2$, a new $\mathcal{H}_2$-like optimal model reduction problem is introduced and first-order optimality conditions are derived. As in the classical $\mathcal{H}_2$ case, these conditions exhibit a rational Hermite interpolation structure, for which an iterative model reduction algorithm is proposed. Numerical examples demonstrate the effectiveness of the new method.

Many experimental time series measurements share unobserved causal drivers. Examples include genes targeted by transcription factors, ocean flows influenced by large-scale atmospheric currents, and motor circuits steered by descending neurons. Reliably inferring this unseen driving force is necessary to understand the intermittent nature of top-down control schemes in diverse biological and engineered systems. Here, we introduce a new unsupervised learning algorithm that uses recurrences in time series measurements to gradually reconstruct an unobserved driving signal. Drawing on the mathematical theory of skew-product dynamical systems, we identify recurrence events shared across response time series, which implicitly define a recurrence graph with glass-like structure. As the amount or quality of observed data improves, this recurrence graph undergoes a percolation transition manifesting as weak ergodicity breaking for random walks on the induced landscape -- revealing the shared driver's dynamics, even in the presence of strongly corrupted or noisy measurements. Across several thousand random dynamical systems, we empirically quantify the dependence of reconstruction accuracy on the rate of information transfer from a chaotic driver to the response systems, and we find that effective reconstruction proceeds through gradual approximation of the driver's dominant orbit topology. Through extensive benchmarks against classical and neural-network-based signal processing techniques, we demonstrate our method's strong ability to extract causal driving signals from diverse real-world datasets spanning ecology, genomics, fluid dynamics, and physiology.

This work studies an experimental design problem where the values of a predictor variable, denoted by $x$, are to be determined with the goal of estimating a function $m(x)$, which is observed with noise. A linear model is fitted to $m(x)$, but it is not assumed that the model is correctly specified. It follows that the quantity of interest is the best linear approximation of $m(x)$, denoted by $\ell(x)$. It is shown that in this framework the ordinary least squares estimator typically leads to inconsistent estimation of $\ell(x)$, and that weighted least squares should be considered instead. An asymptotic minimax criterion is formulated for this estimator, and a design that minimizes the criterion is constructed. An important feature of this problem is that the $x$'s should be random rather than fixed; otherwise, the minimax risk is infinite. It is shown that the optimal random minimax design differs from its deterministic counterpart, which was studied previously, and a simulation study indicates that it generally performs better when $m(x)$ is a quadratic or a cubic function. Another finding is that when the variance of the noise goes to infinity, the random and deterministic minimax designs coincide. The results are illustrated for polynomial regression models, and the general case is also discussed.
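A short sketch of the OLS-versus-WLS contrast, under illustrative assumptions (the target $\ell(x)$ is taken as the best linear approximation of $m(x) = x^2$ under uniform weighting on $[0,1]$, and the design density is $\xi(x) = 2x$): OLS converges to the $\xi$-weighted projection, while weighting by $1/\xi(x)$ recovers the intended projection.

```python
import numpy as np

rng = np.random.default_rng(0)

m = lambda x: x ** 2                     # true (nonlinear) regression function
# Target: best linear approximation of m on [0, 1] under uniform weighting,
# ell(x) = a + b*x minimising \int_0^1 (m(x) - a - b*x)^2 dx.
# For m(x) = x^2 this gives a = -1/6, b = 1.

n = 200_000
x = rng.beta(2.0, 1.0, size=n)           # design density xi(x) = 2x on [0, 1]
y = m(x) + rng.normal(scale=0.1, size=n)
X = np.column_stack([np.ones(n), x])

# OLS: converges to the xi-weighted projection, not to (a, b) above.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# WLS with weights 1/xi(x): consistent for the uniform-weighted projection.
w = 1.0 / (2.0 * x)
beta_wls = np.linalg.lstsq(X * np.sqrt(w)[:, None], y * np.sqrt(w), rcond=None)[0]

print("OLS:", beta_ols)                  # noticeably away from (-1/6, 1)
print("WLS:", beta_wls)                  # close to (-1/6, 1)
```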

We employ a toolset -- dubbed Dr. Frankenstein -- to analyse the similarity of representations in deep neural networks. With this toolset, we aim to match the activations on given layers of two trained neural networks by joining them with a stitching layer. We demonstrate that the inner representations emerging in deep convolutional neural networks with the same architecture but different initializations can be matched with a surprisingly high degree of accuracy even with a single, affine stitching layer. We choose the stitching layer from several possible classes of linear transformations and investigate their performance and properties. The task of matching representations is closely related to notions of similarity. Using this toolset, we also provide a novel viewpoint on the current line of research regarding similarity indices of neural network representations: the perspective of the performance on a task.
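The following sketch illustrates the stitching idea with two toy NumPy networks of identical architecture but different random weights (stand-ins for independently trained models): an affine stitching layer is fitted by least squares between their hidden activations, and the stitched model runs network A's lower layers, the stitching layer, and network B's upper layers.

```python
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(z, 0.0)

def make_net(seed, d_in=10, d_hidden=32, d_out=5):
    """Toy two-layer network split into lower layers (up to the stitched
    activation) and upper layers (from that activation to the output)."""
    r = np.random.default_rng(seed)
    W1, b1 = r.normal(size=(d_in, d_hidden)), r.normal(size=d_hidden)
    W2, b2 = r.normal(size=(d_hidden, d_out)), r.normal(size=d_out)
    lower = lambda x: relu(x @ W1 + b1)
    upper = lambda h: h @ W2 + b2
    return lower, upper

lower_A, upper_A = make_net(1)
lower_B, upper_B = make_net(2)

# Fit an affine stitching layer h_B ~ h_A @ M + c by least squares on a batch.
x = rng.normal(size=(2048, 10))
hA, hB = lower_A(x), lower_B(x)
hA_aug = np.column_stack([hA, np.ones(len(hA))])   # append bias column
sol, *_ = np.linalg.lstsq(hA_aug, hB, rcond=None)
M, c = sol[:-1], sol[-1]

# A's lower layers -> affine stitching layer -> B's upper layers.
stitched = lambda x: upper_B(lower_A(x) @ M + c)

x_test = rng.normal(size=(512, 10))
ref = upper_B(lower_B(x_test))
err = np.linalg.norm(stitched(x_test) - ref) / np.linalg.norm(ref)
print("relative error of stitched model vs. network B:", err)
```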

Seeking the equivalent entities among multi-source Knowledge Graphs (KGs) is the pivotal step in KG integration, also known as \emph{entity alignment} (EA). However, most existing EA methods are inefficient and scale poorly. A recent summary points out that some of them even require several days to process a dataset containing 200,000 nodes (DWY100K). We believe an over-complex graph encoder and an inefficient negative sampling strategy are the two main reasons. In this paper, we propose a novel KG encoder, the Dual Attention Matching Network (Dual-AMN), which not only models both intra-graph and cross-graph information smartly, but also greatly reduces computational complexity. Furthermore, we propose the Normalized Hard Sample Mining Loss to smoothly select hard negative samples with reduced loss shift. Experimental results on widely used public datasets indicate that our method achieves both high accuracy and high efficiency. On DWY100K, the whole running process of our method finishes in 1,100 seconds, at least 10× faster than previous work. Our method also outperforms previous works across all datasets, with Hits@1 and MRR improved by 6% to 13%.
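As an illustration of smoothly selecting hard negatives (a generic LogSumExp-weighted margin loss, not the paper's Normalized Hard Sample Mining Loss), the sketch below upweights negative pairs whose similarity approaches that of the aligned pair:

```python
import numpy as np

def smooth_hard_negative_loss(sim_pos, sim_neg, margin=0.4, gamma=10.0):
    """Illustrative LogSumExp-based loss in the spirit of smooth hard-negative
    mining: negatives whose similarity approaches the aligned pair's are
    weighted up exponentially, so no explicit hardest-negative selection is needed.

    sim_pos: (n,) similarities of aligned entity pairs
    sim_neg: (n, k) similarities of k sampled negative pairs per anchor
    """
    z = gamma * (sim_neg - sim_pos[:, None] + margin)   # margin violations
    return np.mean(np.log1p(np.exp(z).sum(axis=1)) / gamma)

rng = np.random.default_rng(0)
emb_src = rng.normal(size=(100, 16)); emb_src /= np.linalg.norm(emb_src, axis=1, keepdims=True)
emb_tgt = rng.normal(size=(100, 16)); emb_tgt /= np.linalg.norm(emb_tgt, axis=1, keepdims=True)
sim = emb_src @ emb_tgt.T                                # cosine similarities
sim_pos = np.diag(sim)                                   # aligned pairs on the diagonal
neg_idx = (np.arange(100)[:, None] + rng.integers(1, 100, size=(100, 5))) % 100  # non-aligned targets
sim_neg = np.take_along_axis(sim, neg_idx, axis=1)
print(smooth_hard_negative_loss(sim_pos, sim_neg))
```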

This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook confounding effects, and hence the estimation error can be substantial. We therefore propose an alternative approach to constructing the estimators so that this error is greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the classical and proposed estimators in their ability to estimate the causal quantities. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural-network-based models, under simulated datasets exhibiting different levels of causality, degrees of nonlinearity, and distributional properties. Most importantly, we apply our approach to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly substantial when the causal effects are accounted for correctly.
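To illustrate the general point about confounding (with a generic inverse-propensity-weighted estimator, not the estimators proposed in the paper), the sketch below simulates lending data in which a credit score drives both the decision and repayment; the naive difference in means is badly biased, while the weighted estimator recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulated lending data with confounding (illustrative, not the paper's model):
# a credit score u drives both the lender's decision d and repayment y.
u = rng.normal(size=n)                               # confounder (e.g. credit score)
p = 1.0 / (1.0 + np.exp(-2.0 * u))                   # propensity of a favourable decision
d = rng.random(n) < p                                # credit decision
y = 1.0 + 0.5 * u + 0.2 * d + rng.normal(scale=0.5, size=n)   # repayment; true effect 0.2

# Naive estimator: difference in mean repayment, ignoring confounding.
naive = y[d].mean() - y[~d].mean()

# Inverse-propensity-weighted estimator (propensity assumed known here).
ipw = np.mean(d * y / p) - np.mean((~d) * y / (1.0 - p))

print(f"naive: {naive:.3f}   IPW: {ipw:.3f}   truth: 0.200")
```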
