
We study the problem of learning nonparametric distributions in a finite mixture, and establish tight bounds on the sample complexity for learning the component distributions in such models. Namely, we are given i.i.d. samples from a pdf $f$ where $$ f=w_1f_1+w_2f_2, \quad w_1+w_2=1, \quad w_1,w_2>0 $$ and we are interested in learning each component $f_i$. Without any assumptions on $f_i$, this problem is ill-posed. In order to identify the components $f_i$, we assume that each $f_i$ can be written as a convolution of a Gaussian and a compactly supported density $\nu_i$ with $\text{supp}(\nu_1)\cap \text{supp}(\nu_2)=\emptyset$. Our main result shows that $(\frac{1}{\varepsilon})^{\Omega(\log\log \frac{1}{\varepsilon})}$ samples are required for estimating each $f_i$. The proof relies on a quantitative Tauberian theorem that yields a fast rate of approximation with Gaussians, which may be of independent interest. To show that this bound is tight, we also propose an algorithm that uses $(\frac{1}{\varepsilon})^{O(\log\log \frac{1}{\varepsilon})}$ samples to estimate each $f_i$. Unlike existing approaches to learning latent variable models based on moment-matching and tensor methods, our proof instead involves a delicate analysis of an ill-conditioned linear system via orthogonal functions. Combining these bounds, we conclude that the optimal sample complexity of this problem lies strictly between polynomial and exponential, which is uncommon in learning theory.
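
As a concrete illustration of this generative model, the following minimal sketch draws samples from such a mixture, assuming (purely for illustration) that each $\nu_i$ is uniform on a compact interval and that the two intervals are disjoint; the weights and supports below are arbitrary choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_mixture(n, w1=0.4, supports=((-2.0, -1.0), (1.0, 2.0))):
    """Draw n i.i.d. samples from f = w1*f1 + w2*f2, where each
    f_i is a standard Gaussian convolved with nu_i, taken here to be
    uniform on a compact interval; the two intervals are disjoint."""
    comp2 = rng.random(n) >= w1          # False -> component 1, True -> component 2
    lo = np.where(comp2, supports[1][0], supports[0][0])
    hi = np.where(comp2, supports[1][1], supports[0][1])
    u = rng.uniform(lo, hi)              # draw the location from nu_i
    return u + rng.standard_normal(n)    # convolve with a standard Gaussian

x = sample_mixture(10_000)
```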

Related Content

We study composite convex optimization problems with a Quasi-Self-Concordant smooth component. This problem class naturally interpolates between classic Self-Concordant functions and functions with Lipschitz continuous Hessian. Previously, the best complexity bounds for this problem class were associated with trust-region schemes and implementations of a ball-minimization oracle. In this paper, we show that for minimizing Quasi-Self-Concordant functions we can instead use the basic Newton Method with Gradient Regularization. For unconstrained minimization, each step involves only a simple matrix inversion (solving one linear system). We prove a fast global linear rate for this algorithm, matching the complexity bound of the trust-region scheme, while our method remains especially simple to implement. Then, we introduce the Dual Newton Method, and based on it, develop the corresponding Accelerated Newton Scheme for this problem class, which further improves the complexity factor of the basic method. As a direct consequence of our results, we establish fast global linear rates for simple variants of the Newton Method applied to several practical problems, including Logistic Regression, Soft Maximum, and Matrix Scaling, without requiring additional assumptions such as strong or uniform convexity of the target objective.
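
The step structure is easy to sketch. Below is a minimal Python illustration of a gradient-regularized Newton step on logistic regression, where the regularization weight is taken proportional to the current gradient norm; the scaling $\lambda_k = M\|\nabla f(x_k)\|$ and the constant $M$ are illustrative assumptions, not the paper's precise schedule.

```python
import numpy as np

def grad_reg_newton(grad, hess, x0, M=1.0, tol=1e-8, max_iter=100):
    """Newton's method with gradient regularization: each step solves
    (H(x) + lam*I) d = -g(x) with lam proportional to ||g(x)||.
    The choice lam = M*||g|| is illustrative."""
    x = x0.copy()
    for _ in range(max_iter):
        g = grad(x)
        gn = np.linalg.norm(g)
        if gn < tol:
            break
        H = hess(x) + M * gn * np.eye(x.size)
        x = x + np.linalg.solve(H, -g)   # one linear system per step
    return x

# Example: logistic regression loss on synthetic data.
rng = np.random.default_rng(1)
A, b = rng.standard_normal((200, 5)), rng.choice([-1.0, 1.0], 200)

def grad(x):
    s = 1.0 / (1.0 + np.exp(b * (A @ x)))    # sigmoid of the margins
    return -(A.T @ (b * s)) / len(b)

def hess(x):
    s = 1.0 / (1.0 + np.exp(b * (A @ x)))
    return (A.T * (s * (1 - s))) @ A / len(b)

x_star = grad_reg_newton(grad, hess, np.zeros(5))
```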

The theory of finite simple groups is a (rather unexplored) area likely to provide interesting computational problems and modelling tools useful in a cryptographic context. In this note, we review some applications of finite non-abelian simple groups to cryptography and discuss different scenarios in which this theory is clearly central, providing the relevant definitions to make the material accessible to both cryptographers and group theorists, in the hope of stimulating further interaction between these two (non-disjoint) communities. In particular, we look at constructions based on various group-theoretic factorization problems, review group theoretical hash functions, and discuss fully homomorphic encryption using simple groups. The Hidden Subgroup Problem is also briefly discussed in this context.
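
To make the group-theoretic hash idea concrete, here is a toy Cayley-style hash that walks through $SL(2,\mathbb{F}_p)$ by multiplying one of two generator matrices per input bit. The modulus and generators are illustrative stand-ins; the constructions reviewed in the note, and secure parameter choices, differ.

```python
# A toy Cayley-style group hash over SL(2, F_p). Illustrative only.
P = 1_000_003  # an illustrative prime modulus
A = ((1, 1), (0, 1))
B = ((1, 0), (1, 1))

def mat_mul(X, Y, p=P):
    """Multiply two 2x2 matrices modulo p."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) % p for j in range(2))
        for i in range(2)
    )

def cayley_hash(bits):
    """Hash a bit sequence as a product of generators, one per bit."""
    H = ((1, 0), (0, 1))                 # identity matrix
    for bit in bits:
        H = mat_mul(H, A if bit else B)
    return H

print(cayley_hash([1, 0, 1, 1, 0]))
```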

We present a novel deep learning method for computing eigenvalues of the fractional Schr\"odinger operator. Our approach combines a newly developed loss function with an innovative neural network architecture that incorporates prior knowledge of the problem. These improvements enable our method to handle both high-dimensional problems and problems posed on irregular bounded domains. We successfully compute up to the first 30 eigenvalues for various fractional Schr\"odinger operators. As an application, we put forward a conjecture on the fractional-order isospectral problem, which has not yet been studied.
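
For orientation, a classical (non-deep-learning) baseline in one dimension is easy to state: under the spectral definition of the fractional Dirichlet Laplacian, the eigenvalues of $(-\Delta)^s$ are simply $\lambda_k^s$ for the eigenvalues $\lambda_k$ of $-\Delta$. The finite-difference sketch below uses this fact; it is a reference computation, not the paper's method, and the spectral definition is an assumption (other definitions of the fractional Laplacian on bounded domains differ).

```python
import numpy as np

n, s = 500, 0.5                       # grid size and fractional order (assumed)
h = 1.0 / (n + 1)
L = (np.diag(2.0 * np.ones(n)) + np.diag(-np.ones(n - 1), 1)
     + np.diag(-np.ones(n - 1), -1)) / h**2    # -Delta on (0,1), Dirichlet BCs
lam = np.linalg.eigvalsh(L)           # eigenvalues of -Delta, ascending
frac_eigs = lam**s                    # eigenvalues of the spectral (-Delta)^s
print(frac_eigs[:5])                  # compare with (k*pi)**(2*s), k = 1, 2, ...
```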

Software systems continuously evolve due to new functionalities, requirements, or maintenance activities. In the context of software evolution, software refactoring has gained strategic relevance. The space of possible refactorings is usually very large, as it arises from combinations of refactoring actions, each of which can produce an alternative version of the system. Multi-objective algorithms have shown the ability to discover such alternatives by pursuing different objectives simultaneously. The performance of these algorithms in the context of software model refactoring is of paramount importance. Therefore, in this paper, we compare three genetic algorithms in terms of performance and quality of solutions. Our results show that there are significant differences in performance among the algorithms (e.g., PESA2 seems to be the fastest one, while NSGA-II shows the least memory usage).
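
For readers unfamiliar with these algorithms, the sketch below runs NSGA-II on a standard benchmark problem, assuming a recent version of the pymoo library; the benchmark (ZDT1) is a stand-in, since the paper's refactoring search space is not reproduced here.

```python
# Requires: pip install pymoo (import paths assume pymoo >= 0.6)
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.optimize import minimize
from pymoo.problems import get_problem

problem = get_problem("zdt1")              # standard bi-objective benchmark
algorithm = NSGA2(pop_size=100)
res = minimize(problem, algorithm, ("n_gen", 200), seed=1, verbose=False)
print(res.F[:5])                           # a few points on the Pareto front
```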

It is shown that the psychometric test reliability, based on any true-score model with randomly sampled items and conditionally independent errors, converges to 1 as the test length goes to infinity, assuming some fairly general regularity conditions. The asymptotic rate of convergence is given by the Spearman-Brown formula, and this does not require the items to be parallel, or latent unidimensional, or even finite dimensional. Simulations with the 2-parameter logistic item response theory model reveal that there can be a positive bias in the reliability of short multidimensional tests, meaning that applying the Spearman-Brown formula in these cases would overpredict the reliability resulting from lengthening the tests. For short unidimensional tests under the 2-parameter logistic model, the reliabilities are almost unbiased, meaning that applying the Spearman-Brown formula in these cases leads to predictions that are approximately unbiased.
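
The Spearman-Brown prediction formula itself is a one-liner: lengthening a test with reliability $\rho$ by a factor $k$ predicts reliability $\rho_k = k\rho/(1+(k-1)\rho)$. A small worked example:

```python
def spearman_brown(rho, k):
    """Predicted reliability after lengthening a test by factor k,
    given current reliability rho: rho_k = k*rho / (1 + (k-1)*rho)."""
    return k * rho / (1.0 + (k - 1.0) * rho)

# Doubling a test with reliability 0.70 predicts about 0.82.
print(round(spearman_brown(0.70, 2), 3))   # 0.824
```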

We study the problem of estimating the derivatives of a regression function, which has a wide range of applications as a key nonparametric functional of unknown functions. Standard analyses are often tailored to specific derivative orders, and parameter tuning remains a daunting challenge, particularly for high-order derivatives. In this article, we propose a simple plug-in kernel ridge regression (KRR) estimator in nonparametric regression with random design that is broadly applicable for multi-dimensional support and arbitrary mixed-partial derivatives. We provide a non-asymptotic analysis that studies the behavior of the proposed estimator in a unified manner encompassing the regression function and its derivatives, leading to two error bounds for a general class of kernels under the strong $L_\infty$ norm. In a concrete example specialized to kernels with polynomially decaying eigenvalues, the proposed estimator recovers the minimax optimal rate up to a logarithmic factor for estimating derivatives of functions in H\"older and Sobolev classes. Interestingly, the proposed estimator achieves the optimal rate of convergence with the same choice of tuning parameter for any order of derivatives. Hence, the proposed estimator enjoys a \textit{plug-in property} for derivatives in that it automatically adapts to the order of derivatives to be estimated, enabling easy tuning in practice. Our simulation studies show favorable finite sample performance of the proposed method relative to several existing methods and corroborate the theoretical findings on its minimax optimality.
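
The plug-in idea is simple to sketch in one dimension: fit KRR with a smooth kernel, then differentiate the fitted function analytically rather than re-tuning for the derivative. In the sketch below the Gaussian kernel, bandwidth, and ridge parameter are illustrative choices, not the paper's tuned values.

```python
import numpy as np

rng = np.random.default_rng(2)
n, lam, h = 200, 1e-3, 0.1                   # sample size, ridge, bandwidth (assumed)
X = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * X) + 0.1 * rng.standard_normal(n)

K = np.exp(-(X[:, None] - X[None, :])**2 / (2 * h**2))
alpha = np.linalg.solve(K + n * lam * np.eye(n), y)   # KRR coefficients

def f_hat(x):                                # fitted regression function
    k = np.exp(-(x - X)**2 / (2 * h**2))
    return k @ alpha

def f_hat_prime(x):                          # plug-in derivative: differentiate the kernel
    k = np.exp(-(x - X)**2 / (2 * h**2))
    return (-(x - X) / h**2 * k) @ alpha

print(f_hat(0.25), f_hat_prime(0.25))        # truth: 1.0 and 0.0, approximately
```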

Activation functions are the linchpins of deep learning, profoundly influencing both the representational capacity and training dynamics of neural networks. They shape not only the nature of representations but also optimize convergence rates and enhance generalization potential. Appreciating this critical role, we present the Linear Oscillation (LoC) activation function, defined as $f(x) = x \times \sin(\alpha x + \beta)$. Distinct from conventional activation functions, which primarily introduce non-linearity, LoC seamlessly blends linear trajectories with oscillatory deviations. The nomenclature ``Linear Oscillation'' is a nod to its unique attribute of infusing linear activations with harmonious oscillations, capturing the essence of the ``Importance of Confusion''. This concept of ``controlled confusion'' within network activations is posited to foster more robust learning, particularly in contexts that necessitate discerning subtle patterns. Our empirical studies reveal that, when integrated into diverse neural architectures, the LoC activation function consistently outperforms established counterparts like ReLU and Sigmoid. The strong performance exhibited by the Vision Transformer model using LoC further validates its efficacy. This study illuminates the notable benefits of LoC over other prominent activation functions and champions the notion that intermittently introducing deliberate complexity, or ``confusion'', during training can spur more profound and nuanced learning. This underscores the pivotal role of judiciously selected activation functions in shaping the future of neural network training.
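
Since the definition is a single formula, LoC drops into any framework as a one-line module. Below is a minimal PyTorch version; treating $\alpha$ and $\beta$ as learnable scalars is an assumption here, as the abstract does not state whether they are fixed or trained.

```python
import torch
import torch.nn as nn

class LinearOscillation(nn.Module):
    """LoC activation f(x) = x * sin(alpha*x + beta). Making alpha and
    beta learnable scalars is an illustrative choice."""
    def __init__(self, alpha=1.0, beta=0.0):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(float(alpha)))
        self.beta = nn.Parameter(torch.tensor(float(beta)))

    def forward(self, x):
        return x * torch.sin(self.alpha * x + self.beta)

net = nn.Sequential(nn.Linear(16, 32), LinearOscillation(), nn.Linear(32, 1))
out = net(torch.randn(8, 16))
```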

We provide methods which recover planar scene geometry by utilizing the transient histograms captured by a class of close-range time-of-flight (ToF) distance sensors. A transient histogram is a one-dimensional temporal waveform which encodes the arrival times of photons incident on the ToF sensor. Typically, a sensor processes the transient histogram using a proprietary algorithm to produce distance estimates, which are commonly used in several robotics applications. Our methods utilize the transient histogram directly to enable recovery of planar geometry more accurately than is possible using only proprietary distance estimates, and consistent recovery of the albedo of the planar surface, which is not possible with proprietary distance estimates alone. This is accomplished via a differentiable rendering pipeline, which simulates the transient imaging process, allowing direct optimization of scene geometry to match observations. To validate our methods, we capture 3,800 measurements of eight planar surfaces from a wide range of viewpoints, and show that our method outperforms the proprietary-distance-estimate baseline by an order of magnitude in most scenarios. We demonstrate a simple robotics application which uses our method to sense the distance to and slope of a planar surface from a sensor mounted on the end effector of a robot arm.
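
The analysis-by-synthesis loop can be sketched with a heavily simplified forward model: a plane at distance $d$ with albedo $a$ yields a pulse-blurred return centred at the histogram bin for time $2d/c$, with $1/d^2$ falloff. The bin width, pulse width, initialization, and falloff model below are all illustrative assumptions; the paper's renderer models the sensor in far more detail.

```python
import torch

c = 3e8                                   # speed of light, m/s
bins = torch.arange(128, dtype=torch.float64)
bin_time = 100e-12                        # 100 ps per histogram bin (assumed)
sigma = 3.0                               # pulse width in bins (assumed)

def render(d, a):
    """Toy transient: Gaussian return centred at the bin for time 2d/c,
    scaled by albedo a with inverse-square falloff."""
    t0 = 2 * d / c / bin_time
    return a / d**2 * torch.exp(-(bins - t0)**2 / (2 * sigma**2))

# Fit distance and albedo to an observed histogram by gradient descent.
obs = render(torch.tensor(1.5, dtype=torch.float64),
             torch.tensor(0.8, dtype=torch.float64))
d = torch.tensor(1.4, dtype=torch.float64, requires_grad=True)  # init near the
a = torch.tensor(0.5, dtype=torch.float64, requires_grad=True)  # truth: toy loss is nonconvex
opt = torch.optim.Adam([d, a], lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    loss = ((render(d, a) - obs)**2).sum()
    loss.backward()
    opt.step()
print(d.item(), a.item())                 # should approach 1.5 and 0.8
```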

The Bregman proximal point algorithm (BPPA) has found emerging applications in machine learning, yet its theoretical understanding remains largely unexplored. We study the computational properties of BPPA through learning linear classifiers with separable data, and demonstrate provable algorithmic regularization of BPPA. For any BPPA instantiated with a fixed Bregman divergence, we provide a lower bound on the margin obtained by BPPA with respect to an arbitrarily chosen norm. The obtained margin lower bound differs from the maximal margin by a multiplicative factor, which inversely depends on the condition number of the distance-generating function measured in the dual norm. We show that the dependence on the condition number is tight, thus demonstrating the importance of the choice of divergence for the quality of the learned classifiers. We then extend our findings to mirror descent, for which we establish similar connections between the margin and the Bregman divergence, together with a non-asymptotic analysis. Numerical experiments on both synthetic and real-world datasets are provided to support our theoretical findings. To the best of our knowledge, these findings appear to be new in the literature on algorithmic regularization.
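
The mirror-descent side of this setting is straightforward to simulate. The sketch below runs mirror descent on the logistic loss over separable synthetic data with the distance-generating function $\psi(x)=\tfrac12\|x\|_p^2$ (for $p=2$ this reduces to gradient descent) and reports the attained $p$-norm margin; the exponent, step size, and iteration count are illustrative choices, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(3)
n, dim, p = 100, 20, 3.0
q = p / (p - 1.0)                            # dual exponent, 1/p + 1/q = 1
w_star = rng.standard_normal(dim)
A = rng.standard_normal((n, dim))
b = np.sign(A @ w_star)                      # labels separable by construction

def grad(x):                                 # gradient of the mean logistic loss
    s = 1.0 / (1.0 + np.exp(b * (A @ x)))
    return -(A.T @ (b * s)) / n

def grad_psi(x, r):                          # mirror map for 0.5*||x||_r^2
    nx = np.linalg.norm(x, r)
    if nx == 0:
        return np.zeros_like(x)
    return np.sign(x) * np.abs(x)**(r - 1) * nx**(2 - r)

x, eta = np.zeros(dim), 0.5
theta = grad_psi(x, p)
for _ in range(5000):
    theta -= eta * grad(x)                   # gradient step in the dual space
    x = grad_psi(theta, q)                   # map back via the inverse mirror map
print(np.min(b * (A @ x)) / np.linalg.norm(x, p))  # attained p-norm margin
```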

In this paper we investigate computational properties of the Diophantine problem for spherical equations in some classes of finite groups. We classify the complexity of different variations of the problem, e.g., when $G$ is fixed and when $G$ is part of the input. When the group $G$ is constant or given as a multiplication table, we show that the problem can always be solved in polynomial time. On the other hand, for the permutation groups $S_n$ (with $n$ part of the input), the problem is NP-complete. The situation for matrix groups is quite involved: while we exhibit sequences of groups of 2-by-2 matrices for which the problem is NP-complete, in the full group $GL(2,p)$ ($p$ prime and part of the input) it can be solved in polynomial time. We also find similar behaviour for subgroups of matrices of arbitrary dimension over a constant ring.
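
Concretely, a spherical equation asks for a solution to $z_1^{-1}c_1z_1\cdots z_k^{-1}c_kz_k = 1$ with constants $c_i$; this standard form is assumed here. The brute-force check below enumerates solutions for $k=2$ in a tiny symmetric group, which is feasible only because $|S_4|=24$; the NP-completeness result concerns $S_n$ with $n$ part of the input.

```python
from itertools import permutations

n = 4
IDENT = tuple(range(n))

def compose(p, q):                  # (p*q)(i) = p(q(i))
    return tuple(p[q[i]] for i in range(n))

def inverse(p):
    inv = [0] * n
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def conj(c, z):                     # z^-1 * c * z
    return compose(inverse(z), compose(c, z))

c1, c2 = (1, 0, 2, 3), (0, 1, 3, 2)   # two transpositions as constants
solutions = [
    (z1, z2)
    for z1 in permutations(range(n))
    for z2 in permutations(range(n))
    if compose(conj(c1, z1), conj(c2, z2)) == IDENT
]
print(len(solutions))               # number of solutions (z1, z2) in S_4 x S_4
```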
