
Suppose that a random variable $X$ of interest is observed. This paper concerns "the least favorable noise" $\hat{Y}_{\epsilon}$, which maximizes the prediction error $E\left[X - E[X|X+Y]\right]^2$ (equivalently, minimizes the variance of $E[X|X+Y]$) over the class of noises $Y$ that are independent of $X$ and satisfy $\mathrm{var}\, Y \leq \epsilon^2$. This problem was first studied by Ernst, Kagan, and Rogers ([3]). In the present manuscript, we show that the least favorable noise $\hat{Y}_{\epsilon}$ must exist and that its variance must be $\epsilon^2$. The proof of existence relies on a convergence result we develop for variances of conditional expectations. Further, we show that the function $\epsilon \mapsto \inf_{\mathrm{var}\, Y \leq \epsilon^2} \mathrm{var}\, E[X|X+Y]$ is both strictly decreasing and right continuous in $\epsilon$.
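
As an illustration of the quantity being optimized, the following sketch (ours, not from the paper) works out the Gaussian case, where $E[X|X+Y]$ has a closed form: $\mathrm{var}\, E[X|X+Y]$ shrinks as the noise variance grows, consistent with the least favorable noise exhausting its full budget $\epsilon^2$. Gaussian noise need not itself be least favorable; the example only illustrates the monotonicity.

```python
def var_cond_exp_gaussian(sig_x2, sig_y2):
    """For independent X ~ N(0, sig_x2) and Y ~ N(0, sig_y2),
    E[X | X+Y] = sig_x2 / (sig_x2 + sig_y2) * (X + Y), hence
    var E[X | X+Y] = sig_x2**2 / (sig_x2 + sig_y2)."""
    return sig_x2 ** 2 / (sig_x2 + sig_y2)

# The variance of the conditional expectation strictly decreases in the
# noise variance, so within this family the full budget eps^2 is used.
for eps2 in (0.25, 1.0, 4.0):
    print(eps2, var_cond_exp_gaussian(1.0, eps2))
```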

Related Content

A random algebraic graph is defined by a group ${G}$ with a "uniform" distribution over it and a connection $\sigma:{G}\longrightarrow [0,1]$ with expectation $p,$ satisfying $\sigma({g}) = \sigma({g}^{-1}).$ The random graph $\mathsf{RAG}(n,{G},p,\sigma)$ with vertex set $[n]$ is formed as follows. First, $n$ independent latent vectors ${x}_1, \ldots, {x}_n$ are sampled uniformly from ${G}.$ Then, vertices $i,j$ are connected with probability $\sigma({x}_i{x}_j^{-1}).$ This model captures random geometric graphs with latent space the unit sphere or the hypercube, certain regimes of the stochastic block model, and random subgraphs of Cayley graphs. The main question of interest in the current paper is: when is a random algebraic graph statistically and/or computationally distinguishable from $\mathsf{G}(n,p)$? Our results fall into two main categories. 1) Geometric. We focus on the case ${G} =\{\pm1\}^d$ and use Fourier-analytic tools. For hard threshold connections, we match [LMSY22b] for $p = \omega(1/n)$, and for connections that are $\frac{1}{r\sqrt{d}}$-Lipschitz we extend the results of [LR21b] to the non-monotone setting when $d = \Omega(n\log n)$. 2) Algebraic. We provide evidence for an exponential statistical-computational gap. Consider any finite group ${G}$ and let $A\subseteq {G}$ be a set of elements formed by including each set of the form $\{{g}, {g}^{-1}\}$ independently with probability $1/2.$ Let $\Gamma_n({G},A)$ be the distribution of random graphs formed by taking a uniformly random induced subgraph of size $n$ of the Cayley graph $\Gamma({G},A).$ Then, $\Gamma_n({G}, A)$ and $\mathsf{G}(n,1/2)$ are statistically indistinguishable with high probability over $A$ if and only if $\log |{G}| \gtrsim n.$ However, low-degree polynomial tests fail to distinguish $\Gamma_n({G}, A)$ and $\mathsf{G}(n,1/2)$ with high probability over $A$ when $\log |{G}| = \log^{\Omega(1)}n.$
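
To make the model concrete, here is a small sampler sketch (ours, with an assumed hard threshold $\tau$ rather than a construction from the paper) for the hypercube case ${G} = \{\pm1\}^d$: the group operation is the coordinatewise product and every element is its own inverse, so $\sigma({g}) = \sigma({g}^{-1})$ holds automatically.

```python
import numpy as np

def sample_rag_hypercube(n, d, tau, seed=None):
    """Sample RAG(n, {+-1}^d, p, sigma) with a hard threshold connection
    sigma(g) = 1{sum_k g_k >= tau}; tau implicitly determines p."""
    rng = np.random.default_rng(seed)
    x = rng.choice((-1, 1), size=(n, d))   # latent vectors x_1, ..., x_n
    # x_i * x_j^{-1} is the coordinatewise product, so the sum of its
    # coordinates is the inner product <x_i, x_j>.
    s = x @ x.T
    adj = s >= tau                         # deterministic given the latents
    np.fill_diagonal(adj, False)           # no self-loops
    return adj
```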

According to Aistleitner and Weimar, there exist two-dimensional (double) infinite matrices whose star-discrepancy $D_N^{*s}$ of the first $N$ rows and $s$ columns, interpreted as $N$ points in $[0,1]^s$, satisfies an inequality of the form $$D_N^{*s} \leq \sqrt{\alpha} \sqrt{A+B\frac{\ln(\log_2(N))}{s}}\sqrt{\frac{s}{N}}$$ with $\alpha = \zeta^{-1}(2) \approx 1.73$, $A=1165$, and $B=178$. These matrices are obtained by using i.i.d. sequences, and the parameters $s$ and $N$ refer to the dimension and the sample size, respectively. In this paper, we improve their result in two directions: First, we modify the inequality so that the constant $A$ is replaced by a value $A_s$ depending on the dimension $s$, with $A_s<A$ for all $s>1$. Second, we generalize the result to the case of the (extreme) discrepancy. The paper is complemented by a section presenting numerical results for the dependence of the parameter $A_s$ on $s$.
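
The quoted bound is straightforward to evaluate numerically; the sketch below (our own, plugging in the constants stated above) computes its right-hand side for given $N$ and $s$.

```python
import math

def star_discrepancy_bound(N, s, alpha=1.7286, A=1165.0, B=178.0):
    """Right-hand side of the Aistleitner-Weimar-type bound
    sqrt(alpha) * sqrt(A + B * ln(log2(N)) / s) * sqrt(s / N),
    with alpha = zeta^{-1}(2) ~ 1.7286.  Requires N >= 2."""
    return (math.sqrt(alpha)
            * math.sqrt(A + B * math.log(math.log2(N)) / s)
            * math.sqrt(s / N))

print(star_discrepancy_bound(N=2**20, s=10))
```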

Let a polytope $P$ be defined by a system $A x \leq b$. We consider the problem of counting the number of integer points inside $P$, assuming that $P$ is $\Delta$-modular; a polytope $P$ is called $\Delta$-modular if all subdeterminants of $A$ of order equal to the rank of $A$ are bounded by $\Delta$ in absolute value. We present a new FPT-algorithm, parameterized by $\Delta$ and by the maximal number of vertices of $P$, where the maximum is taken over all r.h.s. vectors $b$. We show that our algorithm is more efficient for $\Delta$-modular problems than the approach of A. Barvinok et al. In particular, we do not directly compute the short rational generating function for $P \cap \mathbb{Z}^n$, which is commonly used for this problem. Instead, we use the dynamic programming principle to compute a particular representation of it in the form of an exponential series that depends on a single variable. Our approach does not rely on Barvinok's unimodular sign decomposition technique at all. Using our new complexity bound, we consider different special cases that may be of independent interest. For example, we give FPT-algorithms for counting the number of integer points in $\Delta$-modular simplices and similar polytopes that have $n + O(1)$ facets. As a special case, for any fixed $m$, we give an FPT-algorithm to count solutions of the unbounded $m$-dimensional $\Delta$-modular subset-sum problem.
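
For contrast with the FPT-algorithm above, the following brute-force counter (a baseline sketch of the problem statement, not the paper's method) enumerates a bounding box and tests $Ax \leq b$ directly; its cost is exponential in the dimension $n$, which is precisely what the generating-function and dynamic-programming machinery avoids.

```python
import itertools
import numpy as np

def count_integer_points(A, b, lo, hi):
    """Count integer points of {x : A x <= b} inside the box [lo, hi]^n
    by exhaustive enumeration -- exponential in the dimension n."""
    A, b = np.asarray(A), np.asarray(b)
    n = A.shape[1]
    return sum(
        1
        for x in itertools.product(range(lo, hi + 1), repeat=n)
        if np.all(A @ np.array(x) <= b)
    )

# e.g. the triangle {x >= 0, y >= 0, x + y <= 4} contains 15 integer points
print(count_integer_points([[-1, 0], [0, -1], [1, 1]], [0, 0, 4], -1, 5))
```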

The central space of a joint distribution $(\boldsymbol{X},Y)$ is the minimal subspace $\mathcal S$ such that $Y \perp\!\!\!\perp \boldsymbol{X} \mid P_{\mathcal S}\boldsymbol{X}$, where $P_{\mathcal S}$ is the projection onto $\mathcal S$. Sliced inverse regression (SIR), one of the most popular methods for estimating the central space, often performs poorly when the structural dimension $d=\operatorname{dim}\left( \mathcal S \right)$ is large (e.g., $\geq 5$). In this paper, we demonstrate that the generalized signal-to-noise ratio (gSNR) tends to be extremely small for a general multiple-index model when $d$ is large. We then determine the minimax rate for estimating the central space over a large class of high-dimensional distributions with a large structural dimension $d$ (i.e., with no constant upper bound on $d$) in the low-gSNR regime. This result not only extends the existing minimax rate results for estimating the central space from fixed $d$ to large $d$, but also clarifies that the degradation in SIR performance is caused by the decay of signal strength. The technical tools developed here may be of independent interest for studying other central space estimation methods.
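
For readers unfamiliar with the estimator under discussion, here is a minimal SIR sketch (the standard algorithm in our own simplified implementation, assuming a nonsingular covariance): slice on the response, average the standardized predictors within each slice, and take the top eigenvectors of the weighted covariance of the slice means.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, d=1):
    """Minimal sliced inverse regression: returns d estimated central
    space directions (columns), assuming cov(X) is nonsingular."""
    n, p = X.shape
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T   # cov^{-1/2}
    Z = (X - mu) @ inv_sqrt                               # standardize
    order = np.argsort(y)                                 # slice on y
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)              # weighted slice means
    _, v = np.linalg.eigh(M)
    beta = inv_sqrt @ v[:, -d:]                           # back to X scale
    return beta / np.linalg.norm(beta, axis=0)

# e.g. a single-index model: the recovered direction is ~ (1,1,0,0,0)/sqrt(2)
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 5))
y = (X @ np.array([1.0, 1.0, 0.0, 0.0, 0.0])) ** 3 + 0.1 * rng.standard_normal(2000)
print(sir_directions(X, y, d=1).ravel())
```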

We study the connections between sorting and the binary search tree (BST) model, with an aim towards showing that the fields are connected more deeply than is currently appreciated. While any BST can be used to sort by inserting the keys one-by-one, this is a very limited relationship and importantly says nothing about parallel sorting. We show what we believe to be the first formal relationship between the BST model and sorting. Namely, we show that a large class of sorting algorithms, which includes mergesort, quicksort, insertion sort, and almost every instance-optimal sorting algorithm, are equivalent in cost to offline BST algorithms. Our main theoretical tool is the geometric interpretation of the BST model introduced by Demaine et al., which finds an equivalence between searches on a BST and point sets in the plane satisfying a certain property. To give an example of the utility of our approach, we introduce the log-interleave bound, a measure of the information-theoretic complexity of a permutation $\pi$, which is within a $\lg \lg n$ multiplicative factor of a known lower bound in the BST model; we also devise a parallel sorting algorithm with polylogarithmic span that sorts a permutation $\pi$ using comparisons proportional to its log-interleave bound. Our aforementioned result on sorting and offline BST algorithms can be used to show existence of an offline BST algorithm whose cost is within a constant factor of the log-interleave bound of any permutation $\pi$.
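
The geometric interpretation referenced above represents BST executions as point sets in the plane that are "arborally satisfied": every axis-aligned rectangle spanned by two points not sharing a row or column must contain a third point of the set. A small checker for this property (our own sketch of the definition from Demaine et al., unoptimized) looks as follows.

```python
def is_arborally_satisfied(points):
    """Check the rectangle condition of Demaine et al.: for every pair of
    points not on a common horizontal or vertical line, the closed
    axis-aligned rectangle they span must contain a third point."""
    pts = list(set(points))
    for i, a in enumerate(pts):
        for b in pts[i + 1:]:
            if a[0] == b[0] or a[1] == b[1]:
                continue                       # shared row/column: no constraint
            x_lo, x_hi = sorted((a[0], b[0]))
            y_lo, y_hi = sorted((a[1], b[1]))
            if not any(
                p != a and p != b
                and x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi
                for p in pts
            ):
                return False
    return True

# the diagonal pair {(1,1), (2,2)} spans an empty rectangle...
print(is_arborally_satisfied([(1, 1), (2, 2)]))           # False
# ...and adding (1, 2) satisfies it
print(is_arborally_satisfied([(1, 1), (2, 2), (1, 2)]))   # True
```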

We investigate expansions of Presburger arithmetic, i.e., the theory of the integers with addition and order, with additional structure related to exponentiation: either a function mapping each number $x$ to $2^x$, or a predicate for the powers of $2$. The latter theory, denoted $\mathrm{PresPower}$, was introduced by B\"uchi as a first attempt at characterising the sets of tuples of numbers that can be expressed using finite automata; B\"uchi's method does not give an elementary upper bound, and the complexity of this theory has been open. The former theory, denoted $\mathrm{PresExp}$, was shown decidable by Semenov; while the decision procedure for this theory differs radically from the automata-based method proposed by B\"uchi, the method is also non-elementary. In fact, the theory with the power function has a non-elementary lower bound. In this paper, we show that while Semenov's and B\"uchi's approaches yield non-elementary blow-ups for $\mathrm{PresPower}$, the theory is in fact decidable in triply exponential time, similar to the best known quantifier-elimination algorithm for Presburger arithmetic. We also provide a $\mathrm{NExpTime}$ upper bound for the existential fragment of $\mathrm{PresExp}$, a step towards a finer-grained analysis of its complexity. Both these results are established by analysing a single parameterized satisfiability algorithm for $\mathrm{PresExp}$, which can be specialized to either the setting of $\mathrm{PresPower}$ or the existential theory of $\mathrm{PresExp}$. Besides the new upper bounds for the existential theory of $\mathrm{PresExp}$ and $\mathrm{PresPower}$, we believe our algorithm provides new intuition for the decidability of these theories, and for the features that lead to non-elementary blow-ups.
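
To make the two signatures concrete (with $P_2$ as our assumed notation for the powers-of-2 predicate), the set of numbers expressible as a sum of two powers of $2$ is defined in $\mathrm{PresPower}$ by $$\varphi(z) \;\equiv\; \exists x\,\exists y\,\big(P_2(x) \wedge P_2(y) \wedge z = x + y\big),$$ whereas $\mathrm{PresExp}$ can write $z = 2^{x} + 2^{y}$ directly, quantifying over the exponents themselves; this extra expressive power is consistent with the non-elementary lower bound for the theory with the power function mentioned above.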

This paper explores the Ziv-Zakai bound (ZZB), a well-known Bayesian lower bound on the Minimum Mean Squared Error (MMSE). First, it is shown that the ZZB holds without any assumption on the distribution of the estimand; that is, the estimand does not need to have a probability density function. The ZZB is then further analyzed in the high-noise and low-noise regimes and shown to always tensorize. Finally, the tightness of the ZZB is investigated under several aspects, such as the number of hypotheses and the usefulness of the valley-filling function. In particular, a necessary and sufficient condition for the tightness of the bound with continuous inputs is provided, and it is shown that the bound is never tight for discrete input distributions whose support set does not have an accumulation point at zero.
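
For reference, one common textbook statement of the bound for a scalar estimand $X$ with density $f_X$ observed through $U$ (the density assumption is exactly what the paper removes) is $$\mathrm{mmse}(X \mid U) \;\geq\; \frac{1}{2}\int_0^{\infty} t\;\mathcal{V}\!\left\{\int_{-\infty}^{\infty}\big(f_X(x)+f_X(x+t)\big)\,P_e(x,x+t)\,dx\right\} dt,$$ where $P_e(x,x+t)$ is the minimum error probability of the binary hypothesis test between $X=x$ and $X=x+t$ based on $U$, with priors proportional to $f_X(x)$ and $f_X(x+t)$, and $\mathcal{V}\{f\}(t)=\sup_{s\geq 0} f(t+s)$ is the valley-filling function whose usefulness is examined above.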

The weight distribution of error-correcting codes is a critical determinant of their error-correcting performance, making its enumeration of utmost importance. In the case of polar codes, the minimum weight $w_{\min}$ (which equals the minimum distance $d$) is the only weight for which an explicit enumerator formula is currently available. Having closed-form weight enumerators for polar codewords with weights greater than the minimum weight not only simplifies the enumeration process but also provides valuable insights towards constructing better polar-like codes. In this paper, we contribute towards understanding the algebraic structure underlying higher weights by analyzing Minkowski sums of orbits. Our approach builds upon the lower triangular affine (LTA) group of decreasing monomial codes. Specifically, we propose a closed-form expression for the enumeration of codewords with weight $1.5\,w_{\min}$. Our simulations demonstrate the potential for extending this method to higher weights.
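
A closed-form enumerator replaces exactly the kind of exhaustive enumeration sketched below (our own brute-force baseline, feasible only for tiny codes): generator rows are the non-frozen rows of the $n$-fold Kronecker power of $\left[\begin{smallmatrix}1 & 0\\ 1 & 1\end{smallmatrix}\right]$, and every codeword's weight is tallied.

```python
import itertools
import numpy as np

def polar_weight_distribution(n_log2, frozen):
    """Brute-force weight enumerator of a small polar code whose
    generator rows are the non-frozen rows of F^{(x) n_log2},
    F = [[1, 0], [1, 1]].  Cost is 2^k codewords: tiny codes only."""
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    G = F
    for _ in range(n_log2 - 1):
        G = np.kron(G, F)
    rows = [G[i] for i in range(G.shape[0]) if i not in frozen]
    dist = {}
    for msg in itertools.product((0, 1), repeat=len(rows)):
        cw = np.zeros(G.shape[1], dtype=np.uint8)
        for bit, row in zip(msg, rows):
            if bit:
                cw ^= row
        w = int(cw.sum())
        dist[w] = dist.get(w, 0) + 1
    return dict(sorted(dist.items()))

# (8,4) code with frozen rows {0,1,2,4} (this is RM(1,3)):
print(polar_weight_distribution(3, {0, 1, 2, 4}))   # {0: 1, 4: 14, 8: 1}
```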

This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook the confounding effects, and hence the estimation error can be significant. We therefore propose another approach to constructing the estimators such that the error can be greatly reduced. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly substantial when the causal effects are accounted for correctly.
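
The paper's estimators are not specified in this summary, but the flavor of the comparison can be sketched with one standard confounding correction, inverse propensity weighting (our choice purely for illustration), against the naive difference in means.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def naive_estimate(t, y):
    """Naive difference in mean repayment; biased under confounding."""
    return y[t == 1].mean() - y[t == 0].mean()

def ipw_estimate(X, t, y):
    """Inverse-propensity-weighted effect estimate: reweight each
    observation by the inverse estimated probability of its decision."""
    ps = LogisticRegression(max_iter=1000).fit(X, t).predict_proba(X)[:, 1]
    ps = np.clip(ps, 1e-3, 1 - 1e-3)            # avoid extreme weights
    return (t * y / ps).mean() - ((1 - t) * y / (1 - ps)).mean()

# a toy lending simulation where the decision depends on a confounder
rng = np.random.default_rng(0)
n = 20_000
conf = rng.standard_normal(n)                        # e.g. credit score
t = (conf + rng.standard_normal(n) > 0).astype(int)  # credit decision
y = 1.0 * t + 2.0 * conf + rng.standard_normal(n)    # true effect is 1.0
X = conf.reshape(-1, 1)
print(naive_estimate(t, y), ipw_estimate(X, t, y))   # naive is biased upward
```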

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
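
A minimal sketch of the re-weighting scheme (following the formula above; the normalization choice is ours) computes per-class weights inversely proportional to the effective number of samples.

```python
import numpy as np

def class_balanced_weights(counts, beta=0.999):
    """Per-class weights from the effective number of samples
    E_n = (1 - beta**n) / (1 - beta), normalized to sum to the number
    of classes.  beta = 0 gives uniform weights (no re-weighting);
    beta -> 1 recovers inverse class-frequency weighting."""
    counts = np.asarray(counts, dtype=float)
    effective_num = (1.0 - beta ** counts) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(counts) / weights.sum()

# e.g. a long-tailed 3-class problem: rare classes get larger weights
print(class_balanced_weights([10000, 100, 10]))
```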
