亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

This paper considers the performance of long Reed-Muller (RM) codes transmitted over binary memoryless symmetric (BMS) channels under bitwise maximum-a-posteriori decoding. Its main result is that the family of binary RM codes achieves capacity on any BMS channel with respect to bit-error rate. This resolves a long-standing open problem that connects information theory and error-correcting codes. In contrast with the earlier result for the binary erasure channel, the new proof does not rely on hypercontractivity. Instead, it combines a nesting property of RM codes with new information inequalities relating the generalized extrinsic information transfer function and the extrinsic minimum mean-squared error.

相關內容

《計算機信息》雜志發表高質量的論文,擴大了運籌學和計算的范圍,尋求有關理論、方法、實驗、系統和應用方面的原創研究論文、新穎的調查和教程論文,以及描述新的和有用的軟件工具的論文。官網鏈接: · INFORMS · 優化器 · 極大 · 二次規劃 ·
2021 年 12 月 27 日

In this paper, we study a general multi-cluster wireless powered communication network (WPCN) with user cooperation under harvest-then-transmit (HTT) protocol where the hybrid access point (HAP) as well as each user is equipped with multiple antennas. In the downlink phase of HTT, the HAP employs beamforming to transfer energy to the users. In the uplink phase, users in each cluster transmit their signals to the HAP and to their cluster heads (CHs). Afterward, the CHs first relay the signals of their cluster users and then transmit their own information signals to the HAP. The aim is to design the energy beamforming (EB) matrix, transmit covariance matrices of the users and time allocations among energy transfer and cooperation phases in order to optimize the max-min and sum throughputs of the network. The corresponding maximization problems are non-convex and NP-hard in general. We devise iterative algorithm based on alternating optimization (AO) and then the minorization-maximization (MM) technique is used to deal with the non-convex sub-problems with respect to (w.r.t.) the EB and covariance matrices in each iteration. We recast the resulting sub-problems as a convex second order cone programming (SOCP) and quadratic constraint quadratic programming (QCQP) for the max-min and sum throughput maximization problems, respectively. We also consider imperfect channel state information (CSI) at the HAP and CHs and non-linearity in energy harvesting (EH) circuits. Numerical examples show that the proposed cooperative method can effectively improve the achievable throughput in the multi-cluster wireless powered communication under various setups.

In this paper, we derive asymptotic expressions for the ergodic capacity of the multiple-input multiple-output (MIMO) keyhole channel at low SNR in independent and identically distributed (i.i.d.) Nakagami-$m$ fading conditions with perfect channel state information available at both the transmitter (CSI-T) and the receiver (CSI-R). We show that the low-SNR capacity of this keyhole channel scales proportionally as $\frac{\mathrm{SNR}}{4} \log^2 \left(1/{\mathrm{SNR}}\right)$. With this asymptotic low-SNR capacity formula, we find a very surprising result that contrary to popular belief, the capacity of the MIMO fading channel at low SNR increases in the presence of keyhole degenerate condition. Additionally, we show that a simple one-bit CSI-T based On-Off power scheme achieves this low-SNR capacity; surprisingly, it is robust against both moderate and severe fading conditions for a wide range of low SNR values. These results also extend to the Rayleigh keyhole MIMO channel as a special case.

Optical wireless communication (OWC) over atmospheric turbulence and pointing errors is a well-studied topic. Still, there is limited research on signal fading due to random fog in an outdoor environment for terrestrial wireless communications. In this paper, we analyze the performance of a decode-and-forward (DF) relaying under the combined effect of random fog, pointing errors, and atmospheric turbulence with a negligible line-of-sight (LOS) direct link. We consider a generalized model for the end-to-end channel with independent and not identically distributed (i.ni.d.) pointing errors, random fog with Gamma distributed attenuation coefficient, double generalized gamma (DGG) atmospheric turbulence, and asymmetrical distance between the source and destination. We develop density and distribution functions of signal-to-noise ratio (SNR) under the combined effect of random fog, pointing errors, and atmospheric turbulence (FPT) channel and distribution function for the combined channel with random fog and pointing errors (FP). Using the derived statistical results, we present analytical expressions of the outage probability, average SNR, ergodic rate, and average bit error rate (BER) for both FP and FPT channels in terms of OWC system parameters. We also develop simplified and asymptotic performance analysis to provide insight on the system behavior analytically under various practically relevant scenarios. We demonstrate the mutual effects of channel impairments and pointing errors on the OWC performance, and show that the relaying system provides significant performance improvement compared with the direct transmissions, especially when pointing errors and fog becomes more pronounced.

One of the key steps in Neural Architecture Search (NAS) is to estimate the performance of candidate architectures. Existing methods either directly use the validation performance or learn a predictor to estimate the performance. However, these methods can be either computationally expensive or very inaccurate, which may severely affect the search efficiency and performance. Moreover, as it is very difficult to annotate architectures with accurate performance on specific tasks, learning a promising performance predictor is often non-trivial due to the lack of labeled data. In this paper, we argue that it may not be necessary to estimate the absolute performance for NAS. On the contrary, we may need only to understand whether an architecture is better than a baseline one. However, how to exploit this comparison information as the reward and how to well use the limited labeled data remains two great challenges. In this paper, we propose a novel Contrastive Neural Architecture Search (CTNAS) method which performs architecture search by taking the comparison results between architectures as the reward. Specifically, we design and learn a Neural Architecture Comparator (NAC) to compute the probability of candidate architectures being better than a baseline one. Moreover, we present a baseline updating scheme to improve the baseline iteratively in a curriculum learning manner. More critically, we theoretically show that learning NAC is equivalent to optimizing the ranking over architectures. Extensive experiments in three search spaces demonstrate the superiority of our CTNAS over existing methods.

The Capsule Network is widely believed to be more robust than Convolutional Networks. However, there are no comprehensive comparisons between these two networks, and it is also unknown which components in the CapsNet affect its robustness. In this paper, we first carefully examine the special designs in CapsNet that differ from that of a ConvNet commonly used for image classification. The examination reveals five major new/different components in CapsNet: a transformation process, a dynamic routing layer, a squashing function, a marginal loss other than cross-entropy loss, and an additional class-conditional reconstruction loss for regularization. Along with these major differences, we conduct comprehensive ablation studies on three kinds of robustness, including affine transformation, overlapping digits, and semantic representation. The study reveals that some designs, which are thought critical to CapsNet, actually can harm its robustness, i.e., the dynamic routing layer and the transformation process, while others are beneficial for the robustness. Based on these findings, we propose enhanced ConvNets simply by introducing the essential components behind the CapsNet's success. The proposed simple ConvNets can achieve better robustness than the CapsNet.

Neural architecture search has attracted wide attentions in both academia and industry. To accelerate it, researchers proposed weight-sharing methods which first train a super-network to reuse computation among different operators, from which exponentially many sub-networks can be sampled and efficiently evaluated. These methods enjoy great advantages in terms of computational costs, but the sampled sub-networks are not guaranteed to be estimated precisely unless an individual training process is taken. This paper owes such inaccuracy to the inevitable mismatch between assembled network layers, so that there is a random error term added to each estimation. We alleviate this issue by training a graph convolutional network to fit the performance of sampled sub-networks so that the impact of random errors becomes minimal. With this strategy, we achieve a higher rank correlation coefficient in the selected set of candidates, which consequently leads to better performance of the final architecture. In addition, our approach also enjoys the flexibility of being used under different hardware constraints, since the graph convolutional network has provided an efficient lookup table of the performance of architectures in the entire search space.

Self-attention network (SAN) has recently attracted increasing interest due to its fully parallelized computation and flexibility in modeling dependencies. It can be further enhanced with multi-headed attention mechanism by allowing the model to jointly attend to information from different representation subspaces at different positions (Vaswani et al., 2017). In this work, we propose a novel convolutional self-attention network (CSAN), which offers SAN the abilities to 1) capture neighboring dependencies, and 2) model the interaction between multiple attention heads. Experimental results on WMT14 English-to-German translation task demonstrate that the proposed approach outperforms both the strong Transformer baseline and other existing works on enhancing the locality of SAN. Comparing with previous work, our model does not introduce any new parameters.

Asynchronous distributed machine learning solutions have proven very effective so far, but always assuming perfectly functioning workers. In practice, some of the workers can however exhibit Byzantine behavior, caused by hardware failures, software bugs, corrupt data, or even malicious attacks. We introduce \emph{Kardam}, the first distributed asynchronous stochastic gradient descent (SGD) algorithm that copes with Byzantine workers. Kardam consists of two complementary components: a filtering and a dampening component. The first is scalar-based and ensures resilience against $\frac{1}{3}$ Byzantine workers. Essentially, this filter leverages the Lipschitzness of cost functions and acts as a self-stabilizer against Byzantine workers that would attempt to corrupt the progress of SGD. The dampening component bounds the convergence rate by adjusting to stale information through a generic gradient weighting scheme. We prove that Kardam guarantees almost sure convergence in the presence of asynchrony and Byzantine behavior, and we derive its convergence rate. We evaluate Kardam on the CIFAR-100 and EMNIST datasets and measure its overhead with respect to non Byzantine-resilient solutions. We empirically show that Kardam does not introduce additional noise to the learning procedure but does induce a slowdown (the cost of Byzantine resilience) that we both theoretically and empirically show to be less than $f/n$, where $f$ is the number of Byzantine failures tolerated and $n$ the total number of workers. Interestingly, we also empirically observe that the dampening component is interesting in its own right for it enables to build an SGD algorithm that outperforms alternative staleness-aware asynchronous competitors in environments with honest workers.

In this work, we consider the distributed optimization of non-smooth convex functions using a network of computing units. We investigate this problem under two regularity assumptions: (1) the Lipschitz continuity of the global objective function, and (2) the Lipschitz continuity of local individual functions. Under the local regularity assumption, we provide the first optimal first-order decentralized algorithm called multi-step primal-dual (MSPD) and its corresponding optimal convergence rate. A notable aspect of this result is that, for non-smooth functions, while the dominant term of the error is in $O(1/\sqrt{t})$, the structure of the communication network only impacts a second-order term in $O(1/t)$, where $t$ is time. In other words, the error due to limits in communication resources decreases at a fast rate even in the case of non-strongly-convex objective functions. Under the global regularity assumption, we provide a simple yet efficient algorithm called distributed randomized smoothing (DRS) based on a local smoothing of the objective function, and show that DRS is within a $d^{1/4}$ multiplicative factor of the optimal convergence rate, where $d$ is the underlying dimension.

One of the most significant bottleneck in training large scale machine learning models on parameter server (PS) is the communication overhead, because it needs to frequently exchange the model gradients between the workers and servers during the training iterations. Gradient quantization has been proposed as an effective approach to reducing the communication volume. One key issue in gradient quantization is setting the number of bits for quantizing the gradients. Small number of bits can significantly reduce the communication overhead while hurts the gradient accuracies, and vise versa. An ideal quantization method would dynamically balance the communication overhead and model accuracy, through adjusting the number bits according to the knowledge learned from the immediate past training iterations. Existing methods, however, quantize the gradients either with fixed number of bits, or with predefined heuristic rules. In this paper we propose a novel adaptive quantization method within the framework of reinforcement learning. The method, referred to as MQGrad, formalizes the selection of quantization bits as actions in a Markov decision process (MDP) where the MDP states records the information collected from the past optimization iterations (e.g., the sequence of the loss function values). During the training iterations of a machine learning algorithm, MQGrad continuously updates the MDP state according to the changes of the loss function. Based on the information, MDP learns to select the optimal actions (number of bits) to quantize the gradients. Experimental results based on a benchmark dataset showed that MQGrad can accelerate the learning of a large scale deep neural network while keeping its prediction accuracies.

北京阿比特科技有限公司