
In this paper, we consider a class of symmetry groups associated to communication channels, which can informally be viewed as the transformations of the set of inputs that ``commute'' with the action of the channel. These groups were first studied by Polyanskiy (IEEE Trans. Inf. Theory, 2013). We show the simple result that the input distribution attaining the maximum mutual information for a given channel is a ``fixed point'' of its group. We conjecture (and give empirical evidence) that the channel group of the deletion channel is extremely small: it contains a number of elements that is constant in the blocklength. We prove a special case of this conjecture. This serves as some formal justification for why the analysis of the binary deletion channel has proved much more difficult than that of its memoryless counterparts.
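As a concrete illustration, here is a brute-force sketch that recovers such a group for a toy discrete memoryless channel, pairing each input permutation with an output permutation that leaves the transition matrix invariant. This is one informal reading of "commuting with the channel" rather than the paper's exact definition, and all names below are ours; the search is exponential, so it only makes sense for tiny alphabets.

```python
import itertools
import numpy as np

def channel_input_symmetries(W, tol=1e-9):
    """Input permutations pi for which some output permutation sigma
    satisfies W[pi(x), sigma(y)] == W[x, y] for all x, y. W is a
    row-stochastic |X| x |Y| transition matrix; brute force, toy-sized only."""
    nx, ny = W.shape
    group = []
    for pi in itertools.permutations(range(nx)):
        for sigma in itertools.permutations(range(ny)):
            if np.allclose(W[np.ix_(pi, sigma)], W, atol=tol):
                group.append((pi, sigma))
                break  # one witness sigma suffices for pi to belong
    return group

# Binary symmetric channel: swapping both input and output bits is a symmetry.
# The capacity-achieving (uniform) input is fixed by both input permutations,
# consistent with the "fixed point" result stated above.
bsc = np.array([[0.9, 0.1],
                [0.1, 0.9]])
print(channel_input_symmetries(bsc))  # [((0, 1), (0, 1)), ((1, 0), (1, 0))]
```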

Related Content

The Gaussian kernel and its traditional normalizations (e.g., row-stochastic) are popular approaches for assessing similarities between data points, commonly used for manifold learning and clustering, as well as supervised and semi-supervised learning on graphs. In many practical situations, the data can be corrupted by noise that prohibits traditional affinity matrices from correctly assessing similarities, especially if the noise magnitudes vary considerably across the data, e.g., under heteroskedasticity or outliers. An alternative approach that provides a more stable behavior under noise is the doubly stochastic normalization of the Gaussian kernel. In this work, we investigate this normalization in a setting where points are sampled from an unknown density on a low-dimensional manifold embedded in high-dimensional space and corrupted by possibly strong, non-identically distributed, sub-Gaussian noise. We establish the pointwise concentration of the doubly stochastic affinity matrix and its scaling factors around certain population forms. We then utilize these results to develop several tools for robust inference. First, we derive a robust density estimator that can substantially outperform the standard kernel density estimator under high-dimensional noise. Second, we provide estimators for the pointwise noise magnitudes, the pointwise signal magnitudes, and the pairwise Euclidean distances between clean data points. Lastly, we derive robust graph Laplacian normalizations that approximate popular manifold Laplacians, including the Laplace-Beltrami operator, showing that the local geometry of the manifold can be recovered under high-dimensional noise. We exemplify our results in simulations and on real single-cell RNA-sequencing data. In the latter, we show that our proposed normalizations are robust to technical variability associated with different cell types.
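For concreteness, a minimal sketch of the symmetric doubly stochastic normalization via Sinkhorn-type scaling appears below. The bandwidth `eps`, the damped fixed-point update, and all names are our illustrative choices, not the paper's algorithm; the returned scaling vector `d` plays the role of the pointwise scaling factors discussed above.

```python
import numpy as np

def doubly_stochastic_kernel(X, eps, n_iter=500, tol=1e-10):
    """Find a positive vector d so that diag(d) @ K @ diag(d) has unit
    row (and, by symmetry, column) sums, where K is a Gaussian kernel
    with bandwidth eps. A minimal symmetric-Sinkhorn sketch."""
    D2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # squared distances
    K = np.exp(-D2 / eps)
    d = np.ones(len(K))
    for _ in range(n_iter):
        d_new = np.sqrt(d / (K @ d))  # damped fixed point of d = 1 / (K @ d)
        if np.max(np.abs(d_new - d)) < tol:
            d = d_new
            break
        d = d_new
    return d[:, None] * K * d[None, :], d

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
W, d = doubly_stochastic_kernel(X, eps=20.0)
print(W.sum(axis=1)[:5])  # all approximately 1
```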

The ultimate purpose of the statistical analysis of ordinal patterns is to characterize the distribution of the features they induce. In particular, knowing the joint distribution of the (Entropy, Statistical Complexity) pair for a large class of time series models would allow statistical tests that are unavailable to date. Working in this direction, we characterize the asymptotic distribution of the empirical Shannon's Entropy for any model under which the true normalized Entropy is neither zero nor one. We obtain the asymptotic distribution from the Central Limit Theorem (assuming long time series), the Multivariate Delta Method, and a third-order correction of its mean value. We discuss the applicability of other results (exact, first-, and second-order corrections) regarding their accuracy and numerical stability. Within a general framework for building test statistics about Shannon's Entropy, we present a bilateral test that verifies if there is enough evidence to reject the hypothesis that two signals produce ordinal patterns with the same Shannon's Entropy. We applied this bilateral test to the daily maximum temperature time series from three cities (Dublin, Edinburgh, and Miami) and obtained sensible results.
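The empirical quantity underlying this analysis is easy to compute directly. The sketch below estimates the normalized Shannon entropy of ordinal patterns (permutation entropy); the embedding order and all naming are our own illustrative choices.

```python
import itertools
import math
import numpy as np

def permutation_entropy(x, order=3):
    """Normalized Shannon entropy of ordinal patterns of length 'order'."""
    x = np.asarray(x)
    counts = {p: 0 for p in itertools.permutations(range(order))}
    for i in range(len(x) - order + 1):
        counts[tuple(np.argsort(x[i:i + order]))] += 1  # ordinal pattern of window
    p = np.array([c for c in counts.values() if c > 0], dtype=float)
    p /= p.sum()
    return -(p * np.log(p)).sum() / math.log(math.factorial(order))  # in [0, 1]

rng = np.random.default_rng(0)
print(permutation_entropy(rng.normal(size=10_000)))             # close to 1 (white noise)
print(permutation_entropy(np.sin(np.linspace(0, 60, 10_000))))  # well below 1
```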

Linear computation broadcast (LCBC) refers to a setting with $d$ dimensional data stored at a central server, where $K$ users, each with some prior linear side-information, wish to retrieve various linear combinations of the data. The goal is to determine the minimum amount of information that must be broadcast to satisfy all the users. The reciprocal of the optimal broadcast cost is the capacity of LCBC. The capacity is known for up to $K=3$ users. Since LCBC includes index coding as a special case, large-$K$ settings of LCBC are at least as hard as the index coding problem. Instead of the general setting (all instances), by focusing on the generic setting (almost all instances) this work shows that the generic capacity of the symmetric LCBC (where every user has $m'$ dimensions of side-information and $m$ dimensions of demand) for a large number of users ($K>d$ suffices) is $C_g=1/\Delta_g$, where $\Delta_g=\min\left\{\max\{0,d-m'\}, Km, \frac{dm}{m+m'}\right\}$ is the broadcast cost that is both achievable and unbeatable asymptotically almost surely for large $n$, among all LCBC instances with the given parameters $p,K,d,m,m'$. Relative to baseline schemes of random coding or separate transmissions, $C_g$ shows an extremal gain by a factor of $K$ as a function of the number of users, and by a factor of $\approx d/4$ as a function of the data dimension, when optimized over the remaining parameters. For an arbitrary number of users, the generic capacity of the symmetric LCBC is characterized within a factor of $2$.
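A quick numerical reading of the broadcast-cost formula, with parameters chosen by us purely for illustration:

```python
def generic_broadcast_cost(d, K, m, m_prime):
    """Delta_g = min{max{0, d - m'}, K*m, d*m / (m + m')} from the formula
    above; the generic capacity is C_g = 1 / Delta_g. Stated for the
    large-user regime (K > d suffices)."""
    return min(max(0, d - m_prime), K * m, d * m / (m + m_prime))

# d = 12 data dimensions, K = 20 users, m' = 4 dimensions of side-information
# and m = 2 demanded dimensions per user:
print(generic_broadcast_cost(d=12, K=20, m=2, m_prime=4))  # min(8, 40, 4.0) = 4.0
```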

We introduce a two-person non-zero-sum random-turn game that is a variant of the stake-governed games introduced recently in [HP2022]. We call the new game the Trail of Lost Pennies. At time zero, a counter is placed at a given integer location: $X_0 = k \in \mathbb{Z}$, say. At the $i$-th turn (for $i \in \mathbb{N}_+$), Maxine and Mina place non-negative stakes, $a_i$ and $b_i$, each paid from her own savings. Maxine is declared to be the turn victor with probability $\tfrac{a_i}{a_i+b_i}$; otherwise, Mina is. If Maxine wins the turn, the counter moves one place to the right, so that $X_i = X_{i-1} +1$; if Mina wins it, the counter moves one place to the left, so that $X_i = X_{i-1} -1$. If $\liminf X_i = \infty$, then Maxine wins the game; if $\limsup X_i = -\infty$, then Mina does. (A special rule is needed to treat the remaining, indeterminate, case.) When Maxine wins, she receives a terminal payment of $m_\infty$, while Mina receives $n_\infty$. If Mina wins, these respective receipts are $m_{-\infty}$ and $n_{-\infty}$. The four terminal payment values are assumed to be real numbers satisfying $m_\infty > m_{-\infty}$ and $n_\infty < n_{-\infty}$, bounds that accord with the notion that Maxine wins when the counter ends far to the right, and that Mina does so when it reaches far to the left. Each player is motivated to offer stakes at each turn in order to secure the higher terminal payment that her victory would bring; but since these stake amounts accumulate into a cost that depletes the profit arising from victory, each player must also seek to control these expenses. In this article, we study the Trail of Lost Pennies, formulating strategies for the two players and defining and analysing Nash equilibria in the game.
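To make the mechanics concrete, the following Monte Carlo sketch plays the game with fixed, arbitrary stake functions. These strategies and the finite boundary (standing in for the counter escaping to plus or minus infinity) are purely illustrative and are unrelated to the Nash equilibria analysed in the article.

```python
import numpy as np

def play_trail(k, stake_a, stake_b, boundary=20, rng=None):
    """One truncated play of the Trail of Lost Pennies from X_0 = k.
    stake_a, stake_b map the counter location to positive stakes.
    Returns (winner, Maxine's total spend, Mina's total spend)."""
    rng = rng or np.random.default_rng()
    x, spend_a, spend_b = k, 0.0, 0.0
    while abs(x) < boundary:
        a, b = stake_a(x), stake_b(x)
        spend_a, spend_b = spend_a + a, spend_b + b
        x += 1 if rng.random() < a / (a + b) else -1  # Maxine wins the turn w.p. a/(a+b)
    return ("Maxine" if x >= boundary else "Mina"), spend_a, spend_b

# Both players stake amounts decaying geometrically away from the origin;
# Mina stakes 1.5x Maxine's amount, tilting the walk leftward.
wins = sum(play_trail(0, lambda x: 2.0 ** -abs(x), lambda x: 1.5 * 2.0 ** -abs(x))[0] == "Maxine"
           for _ in range(2000))
print(wins / 2000)  # close to 0: Mina's heavier stakes dominate
```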

The precision matrix that encodes conditional linear dependency relations among a set of variables forms an important object of interest in multivariate analysis. Sparse estimation procedures for precision matrices, such as the graphical lasso (Glasso), have gained popularity because they facilitate interpretability: they separate pairs of variables that are conditionally dependent from those that are conditionally independent (given all other variables). The Glasso, however, lacks robustness to outliers. To overcome this problem, one typically applies a robust plug-in procedure in which the Glasso is computed from a robust covariance estimate instead of the sample covariance, thereby providing protection against outliers. In this paper, we study such estimators theoretically by deriving and comparing their influence functions, sensitivity curves, and asymptotic variances.
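The robust plug-in procedure itself is short to sketch. Below, the Minimum Covariance Determinant serves as one convenient robust covariance estimate; the choice of robust estimator and the regularization level are our illustrative assumptions, not necessarily those studied in the paper.

```python
import numpy as np
from sklearn.covariance import MinCovDet, graphical_lasso

rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(5), np.eye(5), size=200)
X[:10] += 10.0  # inject a few gross outliers

# Plug-in robust Glasso: feed a robust scatter estimate, rather than the
# sample covariance, into the graphical-lasso step.
robust_cov = MinCovDet(random_state=0).fit(X).covariance_
cov_, prec_ = graphical_lasso(robust_cov, alpha=0.1)
print(np.round(prec_, 2))  # sparse precision estimate, resistant to the outliers
```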

Variational Bayesian posterior inference often requires simplifying approximations such as mean-field parametrisation to ensure tractability. However, prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes. In this work, we show that invariances in the likelihood function of over-parametrised models contribute to this phenomenon because these invariances complicate the structure of the posterior by introducing discrete and/or continuous modes which cannot be well approximated by Gaussian mean-field distributions. In particular, we show that the mean-field approximation has an additional gap in the evidence lower bound compared to a purpose-built posterior that takes into account the known invariances. Importantly, this invariance gap is not constant; it vanishes as the approximation reverts to the prior. We proceed by first considering translation invariances in a linear model with a single data point in detail. We show that, while the true posterior can be constructed from a mean-field parametrisation, this is achieved only if the objective function takes into account the invariance gap. Then, we transfer our analysis of the linear model to neural networks. Our analysis provides a framework for future work to explore solutions to the invariance problem.
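The flavour of such a likelihood invariance can be seen in a toy over-parametrised linear model, sketched below; the parametrisation f(x) = (w1 + w2) x and all numbers are our illustrative choices. The likelihood depends on the weights only through their sum, so the posterior has a continuous ridge of equally likely parameters that a fully factorised Gaussian cannot represent.

```python
import numpy as np

def log_lik(w1, w2, x=1.0, y=0.3, noise=0.1):
    """Gaussian log-likelihood (up to a constant) of one observation (x, y)
    under f(x) = (w1 + w2) * x; invariant under (w1, w2) -> (w1 + c, w2 - c)."""
    return -0.5 * ((y - (w1 + w2) * x) / noise) ** 2

for c in [0.0, 1.0, -3.7]:
    print(log_lik(0.2 + c, 0.1 - c))  # identical for every translation c
```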

Parallel-in-time methods for partial differential equations (PDEs) have been the subject of intense development over recent decades, particularly for diffusion-dominated problems. It has been widely reported in the literature, however, that many of these methods perform quite poorly for advection-dominated problems. Here we analyze the particular iterative parallel-in-time algorithm of multigrid reduction-in-time (MGRIT) for discretizations of constant-wave-speed linear advection problems. We focus on common method-of-lines discretizations that employ upwind finite differences in space and Runge-Kutta methods in time. Using a convergence framework we developed in previous work, we prove for a subclass of these discretizations that, if using the standard approach of rediscretizing the fine-grid problem on the coarse grid, robust MGRIT convergence with respect to CFL number and coarsening factor is not possible. This poor convergence and non-robustness is caused, at least in part, by an inadequate coarse-grid correction for smooth Fourier modes known as characteristic components. We propose an alternative coarse-grid operator that provides a better correction of these modes. This operator is related to previous work and uses a semi-Lagrangian discretization combined with an implicitly treated truncation error correction. Theory and numerical experiments show that the proposed coarse-grid operator yields fast MGRIT convergence for many of the method-of-lines discretizations considered, including high-order implicit and explicit discretizations.
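For reference, here is a minimal method-of-lines sketch of the kind of fine-grid discretization considered: first-order upwind in space with a two-stage Runge-Kutta method in time, on a periodic domain. MGRIT itself and the proposed semi-Lagrangian coarse-grid operator are not reproduced; the grid sizes and CFL number are illustrative.

```python
import numpy as np

def advect(u0, c, dx, dt, n_steps):
    """Solve u_t + c u_x = 0 (c > 0, periodic) with first-order upwind in
    space and Heun's two-stage Runge-Kutta method in time."""
    u = u0.copy()
    def rhs(v):
        return -c * (v - np.roll(v, 1)) / dx  # upwind difference (u_i - u_{i-1}) / dx
    for _ in range(n_steps):
        k1 = rhs(u)
        k2 = rhs(u + dt * k1)
        u = u + 0.5 * dt * (k1 + k2)
    return u

nx = 200
x = np.linspace(0.0, 1.0, nx, endpoint=False)
u = advect(np.exp(-100 * (x - 0.3) ** 2), c=1.0, dx=1 / nx, dt=0.8 / nx, n_steps=500)  # CFL 0.8
```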

The absolute value equations (AVE) problem is the algebraic problem of solving $Ax+|x|=b$. So far, most research has focused on methods for solving AVEs; we instead address the problem itself by analysing properties of the AVE and the corresponding solution set. In particular, we investigate topological properties of the solution set, such as convexity, boundedness, connectedness, or whether it consists of finitely many solutions. Further, we address problems related to nonnegativity of solutions, such as solvability and unique solvability. The AVE can be formulated by means of different optimization problems, and in this regard we are interested in how the solutions of the AVE are related to the optima, Karush-Kuhn-Tucker points, and feasible solutions of these optimization problems. We characterize the matrix classes associated with the above-mentioned properties and inspect the computational complexity of the corresponding recognition problems; some of the classes are polynomially recognizable, while others are proved to be NP-hard. For the intractable cases, we propose various sufficient conditions. We also pose new challenging problems that arose during our investigation.
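As one concrete example of a solution method for this problem, the sketch below runs a generalized-Newton-style fixed-point iteration; the convergence condition mentioned in the comment (all singular values of $A$ exceeding 1) is a known sufficient condition from the AVE literature, and the example matrix is only a heuristic attempt to satisfy it.

```python
import numpy as np

def solve_ave(A, b, max_iter=100, tol=1e-10):
    """Solve Ax + |x| = b by writing |x| = D(x) x with D(x) = diag(sign(x))
    and iterating x_{k+1} = (A + D(x_k))^{-1} b. A sketch, not a general
    solver; known to converge e.g. when all singular values of A exceed 1."""
    x = np.zeros(len(b))
    for _ in range(max_iter):
        x_new = np.linalg.solve(A + np.diag(np.sign(x)), b)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x_new

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)) + 3.0 * np.eye(4)  # heuristic: keep singular values large
b = rng.normal(size=4)
x = solve_ave(A, b)
print(np.linalg.norm(A @ x + np.abs(x) - b))  # residual, ~0 on success
```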

Let $L$ be an $n\times n$ array whose top left $r\times r$ subarray is filled with $k$ different symbols, each occurring at most once in each row and at most once in each column. We establish necessary and sufficient conditions that ensure the remaining cells of $L$ can be filled so that each symbol occurs at most once in each row and at most once in each column, $L$ is symmetric with respect to the main diagonal, and each symbol occurs a prescribed number of times in $L$. The case where the prescribed number of occurrences of each symbol is $n$ was solved by Cruse (J. Combin. Theory Ser. A 16 (1974), 18-22), and the case where the top left subarray is $r\times n$ and symmetry is not required was settled by Goldwasser et al. (J. Combin. Theory Ser. A 130 (2015), 26-41). Our result allows the entries of the main diagonal to be specified as well, which leads to an extension of the Andersen-Hoffman Theorem (Annals of Disc. Math. 15 (1982), 9-26; European J. Combin. 4 (1983), 33-35).
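The constraints on such a completed array are easy to state in code. The following checker (a verification sketch only, with our own naming and the convention that 0 marks an empty cell) validates symmetry, the row/column condition, and the prescribed symbol counts; constructing a completion is, of course, the content of the theorem.

```python
import numpy as np

def valid_completion(L, counts):
    """Check that the n x n integer array L is symmetric, that every symbol
    occurs at most once per row and per column, and that each symbol s
    occurs exactly counts[s] times in L (0 denotes an empty cell)."""
    L = np.asarray(L)
    if not np.array_equal(L, L.T):
        return False
    for line in list(L) + list(L.T):
        filled = line[line != 0]
        if len(set(filled)) != len(filled):  # repeated symbol in a row/column
            return False
    return all(np.count_nonzero(L == s) == c for s, c in counts.items())

# A 3 x 3 symmetric array on symbols {1, 2, 3}, each occurring n = 3 times:
print(valid_completion([[1, 2, 3],
                        [2, 3, 1],
                        [3, 1, 2]], {1: 3, 2: 3, 3: 3}))  # True
```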

This book develops an effective theory approach to understanding deep neural networks of practical relevance. Beginning from a first-principles component-level picture of networks, we explain how to determine an accurate description of the output of trained networks by solving layer-to-layer iteration equations and nonlinear learning dynamics. A main result is that the predictions of networks are described by nearly-Gaussian distributions, with the depth-to-width aspect ratio of the network controlling the deviations from the infinite-width Gaussian description. We explain how these effectively-deep networks learn nontrivial representations from training and more broadly analyze the mechanism of representation learning for nonlinear models. From a nearly-kernel-methods perspective, we find that the dependence of such models' predictions on the underlying learning algorithm can be expressed in a simple and universal way. To obtain these results, we develop the notion of representation group flow (RG flow) to characterize the propagation of signals through the network. By tuning networks to criticality, we give a practical solution to the exploding and vanishing gradient problem. We further explain how RG flow leads to near-universal behavior and lets us categorize networks built from different activation functions into universality classes. Altogether, we show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks. By using information-theoretic techniques, we estimate the optimal aspect ratio at which we expect the network to be practically most useful and show how residual connections can be used to push this scale to arbitrary depths. With these tools, we can learn in detail about the inductive bias of architectures, hyperparameters, and optimizers.
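The criticality tuning can be illustrated with a one-line recursion for the preactivation variance in a wide ReLU network: with weights drawn as N(0, C_W / fan-in) and a preactivation z ~ N(0, q), the standard identity E[max(0, z)^2] = q / 2 makes the layer-to-layer map q -> (C_W / 2) q, so variances vanish for C_W < 2, explode for C_W > 2, and are preserved exactly at the critical (He) value C_W = 2. The sketch below (our notation, not the book's code) traces this flow.

```python
import numpy as np

def variance_flow(C_W, depth=30, q0=1.0):
    """Layer-to-layer preactivation variance in a wide ReLU network:
    q_{l+1} = (C_W / 2) * q_l, using E[relu(z)^2] = q / 2 for z ~ N(0, q)."""
    qs = [q0]
    for _ in range(depth):
        qs.append(C_W * qs[-1] / 2.0)
    return np.array(qs)

for C_W in [1.5, 2.0, 2.5]:
    print(C_W, variance_flow(C_W)[-1])  # vanishing, preserved, exploding
```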
