亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We propose a decomposition method for the spectral peaks in an observed frequency spectrum, which is efficiently acquired by utilizing the Fast Fourier Transform. In contrast to the traditional methods of waveform fitting on the spectrum, we optimize the problem from a more robust perspective. We model the peaks in spectrum as pseudo-symmetric functions, where the only constraint is a nonincreasing behavior around a central frequency when the distance increases. Our approach is more robust against arbitrary distortion, interference and noise on the spectrum that may be caused by an observation system. The time complexity of our method is linear, i.e., $O(N)$ per extracted spectral peak. Moreover, the decomposed spectral peaks show a pseudo-orthogonal behavior, where they conform to a power preserving equality.

相關內容

Non-rigid registration, which deforms a source shape in a non-rigid way to align with a target shape, is a classical problem in computer vision. Such problems can be challenging because of imperfect data (noise, outliers and partial overlap) and high degrees of freedom. Existing methods typically adopt the $\ell_{p}$ type robust norm to measure the alignment error and regularize the smoothness of deformation, and use a proximal algorithm to solve the resulting non-smooth optimization problem. However, the slow convergence of such algorithms limits their wide applications. In this paper, we propose a formulation for robust non-rigid registration based on a globally smooth robust norm for alignment and regularization, which can effectively handle outliers and partial overlaps. The problem is solved using the majorization-minimization algorithm, which reduces each iteration to a convex quadratic problem with a closed-form solution. We further apply Anderson acceleration to speed up the convergence of the solver, enabling the solver to run efficiently on devices with limited compute capability. Extensive experiments demonstrate the effectiveness of our method for non-rigid alignment between two shapes with outliers and partial overlaps, with quantitative evaluation showing that it outperforms state-of-the-art methods in terms of registration accuracy and computational speed. The source code is available at //github.com/yaoyx689/AMM_NRR.

Cell-free massive MIMO is a variant of multiuser MIMO and massive MIMO, in which the total number of antennas $LM$ is distributed among the $L$ remote radio units (RUs) in the system, enabling macrodiversity and joint processing. Due to pilot contamination and system scalability, each RU can only serve a limited number of users. Obtaining the optimal number of users simultaneously served on one resource block (RB) by the $L$ RUs regarding the sum spectral efficiency (SE) is not a simple challenge though, as many of the system parameters are intertwined. For example, the dimension $\tau_p$ of orthogonal Demodulation Reference Signal (DMRS) pilots limits the number of users that an RU can serve. Thus, depending on $\tau_p$, the optimal user load yielding the maximum sum SE will vary. Another key parameter is the users' uplink transmit power $P^{\rm ue}_{\rm tx}$, where a trade-off between users in outage, interference and energy inefficiency exists. We study the effect of multiple parameters in cell-free massive MIMO on the sum SE and user outage, as well as the performance of different levels of RU antenna distribution. We provide extensive numerical investigations to illuminate the behavior of the system SE with respect to the various parameters, including the effect of the system load, i.e., the number of active users to be served on any RB. The results show that in general a system with many RUs and few RU antennas yields the largest sum SE, where the benefits of distributed antennas reduce in very dense networks.

A recent goal in the theory of deep learning is to identify how neural networks can escape the "lazy training," or Neural Tangent Kernel (NTK) regime, where the network is coupled with its first order Taylor expansion at initialization. While the NTK is minimax optimal for learning dense polynomials (Ghorbani et al, 2021), it cannot learn features, and hence has poor sample complexity for learning many classes of functions including sparse polynomials. Recent works have thus aimed to identify settings where gradient based algorithms provably generalize better than the NTK. One such example is the "QuadNTK" approach of Bai and Lee (2020), which analyzes the second-order term in the Taylor expansion. Bai and Lee (2020) show that the second-order term can learn sparse polynomials efficiently; however, it sacrifices the ability to learn general dense polynomials. In this paper, we analyze how gradient descent on a two-layer neural network can escape the NTK regime by utilizing a spectral characterization of the NTK (Montanari and Zhong, 2020) and building on the QuadNTK approach. We first expand upon the spectral analysis to identify "good" directions in parameter space in which we can move without harming generalization. Next, we show that a wide two-layer neural network can jointly use the NTK and QuadNTK to fit target functions consisting of a dense low-degree term and a sparse high-degree term -- something neither the NTK nor the QuadNTK can do on their own. Finally, we construct a regularizer which encourages our parameter vector to move in the "good" directions, and show that gradient descent on the regularized loss will converge to a global minimizer, which also has low test error. This yields an end to end convergence and generalization guarantee with provable sample complexity improvement over both the NTK and QuadNTK on their own.

In this work we investigate a 1D evolution equation involving a divergence form operator where the diffusion coefficient inside the divergence is sign changing. Equivalently the evolution equation of interest can be interpreted as behaving locally like a heat equation, and involving a transmission condition at some interface that prescribes in particular a change of sign of the first order space derivatives across the interface. We especially focus on the construction of fundamental solutions for the evolution equation. As the second order operator involved in the evolution equation is not elliptic, this cannot be performed by standard tools for parabolic PDEs. However we manage in a first time to provide a spectral representation of the semigroup associated to the equation, which leads to a first expression of the fundamental solution. In a second time, examining the case when the diffusion coefficient is piecewise constant but remains positive, we do probabilistic computations involving the killed Skew Brownian Motion (SBM), that provide a certain explicit expression of the fundamental solution for the positive case. It turns out that this expression also provides a fundamental solution for the case when the coefficient is sign changing, and can be interpreted as defining a pseudo SBM. This pseudo SBM can be approached by a rescaled pseudo asymmetric random walk. We infer from these different results various approximation schemes that we test numerically.

Low-rank approximations are essential in modern data science. The interpolative decomposition provides one such approximation. Its distinguishing feature is that it reuses columns from the original matrix. This enables it to preserve matrix properties such as sparsity and non-negativity. It also helps save space in memory. In this work, we introduce two optimized algorithms to construct an interpolative decomposition along with numerical evidence that they outperform the current state of the art.

We study sequential decision making problems aimed at maximizing the expected total reward while satisfying a constraint on the expected total utility. We employ the natural policy gradient method to solve the discounted infinite-horizon optimal control problem for Constrained Markov Decision Processes (constrained MDPs). Specifically, we propose a new Natural Policy Gradient Primal-Dual (NPG-PD) method that updates the primal variable via natural policy gradient ascent and the dual variable via projected sub-gradient descent. Although the underlying maximization involves a nonconcave objective function and a nonconvex constraint set, under the softmax policy parametrization we prove that our method achieves global convergence with sublinear rates regarding both the optimality gap and the constraint violation. Such convergence is independent of the size of the state-action space, i.e., it is~dimension-free. Furthermore, for log-linear and general smooth policy parametrizations, we establish sublinear convergence rates up to a function approximation error caused by restricted policy parametrization. We also provide convergence and finite-sample complexity guarantees for two sample-based NPG-PD algorithms. Finally, we use computational experiments to showcase the merits and the effectiveness of our approach.

Analyzing time series in the frequency domain enables the development of powerful tools for investigating the second-order characteristics of multivariate stochastic processes. Parameters like the spectral density matrix and its inverse, the coherence or the partial coherence, encode comprehensively the complex linear relations between the component processes of the multivariate system. In this paper, we develop inference procedures for such parameters in a high-dimensional, time series setup. In particular, we first focus on the derivation of consistent estimators of the coherence and, more importantly, of the partial coherence which possess manageable limiting distributions that are suitable for testing purposes. Statistical tests of the hypothesis that the maximum over frequencies of the coherence, respectively, of the partial coherence, do not exceed a prespecified threshold value are developed. Our approach allows for testing hypotheses for individual coherences and/or partial coherences as well as for multiple testing of large sets of such parameters. In the latter case, a consistent procedure to control the false discovery rate is developed. The finite sample performance of the inference procedures proposed is investigated by means of simulations and applications to the construction of graphical interaction models for brain connectivity based on EEG data are presented.

Hierarchical matrices provide a powerful representation for significantly reducing the computational complexity associated with dense kernel matrices. For general kernel functions, interpolation-based methods are widely used for the efficient construction of hierarchical matrices. In this paper, we present a fast hierarchical data reduction (HiDR) procedure with $O(n)$ complexity for the memory-efficient construction of hierarchical matrices with nested bases where $n$ is the number of data points. HiDR aims to reduce the given data in a hierarchical way so as to obtain $O(1)$ representations for all nearfield and farfield interactions. Based on HiDR, a linear complexity $\mathcal{H}^2$ matrix construction algorithm is proposed. The use of data-driven methods enables {better efficiency than other general-purpose methods} and flexible computation without accessing the kernel function. Experiments demonstrate significantly improved memory efficiency of the proposed data-driven method compared to interpolation-based methods over a wide range of kernels. Though the method is not optimized for any special kernel, benchmark experiments for the Coulomb kernel show that the proposed general-purpose algorithm offers competitive performance for hierarchical matrix construction compared to several state-of-the-art algorithms for the Coulomb kernel.

Gradient based meta-learning methods are prone to overfit on the meta-training set, and this behaviour is more prominent with large and complex networks. Moreover, large networks restrict the application of meta-learning models on low-power edge devices. While choosing smaller networks avoid these issues to a certain extent, it affects the overall generalization leading to reduced performance. Clearly, there is an approximately optimal choice of network architecture that is best suited for every meta-learning problem, however, identifying it beforehand is not straightforward. In this paper, we present MetaDOCK, a task-specific dynamic kernel selection strategy for designing compressed CNN models that generalize well on unseen tasks in meta-learning. Our method is based on the hypothesis that for a given set of similar tasks, not all kernels of the network are needed by each individual task. Rather, each task uses only a fraction of the kernels, and the selection of the kernels per task can be learnt dynamically as a part of the inner update steps. MetaDOCK compresses the meta-model as well as the task-specific inner models, thus providing significant reduction in model size for each task, and through constraining the number of active kernels for every task, it implicitly mitigates the issue of meta-overfitting. We show that for the same inference budget, pruned versions of large CNN models obtained using our approach consistently outperform the conventional choices of CNN models. MetaDOCK couples well with popular meta-learning approaches such as iMAML. The efficacy of our method is validated on CIFAR-fs and mini-ImageNet datasets, and we have observed that our approach can provide improvements in model accuracy of up to 2% on standard meta-learning benchmark, while reducing the model size by more than 75%.

Since hardware resources are limited, the objective of training deep learning models is typically to maximize accuracy subject to the time and memory constraints of training and inference. We study the impact of model size in this setting, focusing on Transformer models for NLP tasks that are limited by compute: self-supervised pretraining and high-resource machine translation. We first show that even though smaller Transformer models execute faster per iteration, wider and deeper models converge in significantly fewer steps. Moreover, this acceleration in convergence typically outpaces the additional computational overhead of using larger models. Therefore, the most compute-efficient training strategy is to counterintuitively train extremely large models but stop after a small number of iterations. This leads to an apparent trade-off between the training efficiency of large Transformer models and the inference efficiency of small Transformer models. However, we show that large models are more robust to compression techniques such as quantization and pruning than small models. Consequently, one can get the best of both worlds: heavily compressed, large models achieve higher accuracy than lightly compressed, small models.

北京阿比特科技有限公司