亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Recently, there has been a growing interest in efficient numerical algorithms based on tensor networks and low-rank techniques to approximate high-dimensional functions and the numerical solution to high-dimensional PDEs. In this paper, we propose a new tensor rank reduction method that leverages coordinate flows and can greatly increase the efficiency of high-dimensional tensor approximation algorithms. The idea is very simple: given a multivariate function, determine a coordinate transformation so that the function in the new coordinate system has smaller tensor rank. We restrict our analysis to linear coordinate transformations, which give rise to a new class of functions that we refer to as tensor ridge functions. By leveraging coordinate flows and tensor ridge functions, we develop an optimization method based on Riemannian gradient descent for determining a quasi-optimal linear coordinate transformation for tensor rank reduction. The theoretical results we present for rank reduction via linear coordinate transformations can be generalized to larger classes of nonlinear transformations. We demonstrate the effectiveness of the proposed new tensor rank reduction method on prototype function approximation problems, and in computing the numerical solution of the Liouville equation in dimensions three and five.

相關內容

Distributed privacy-preserving regression schemes have been developed and extended in various fields, where multiparty collaboratively and privately run optimization algorithms, e.g., Gradient Descent, to learn a set of optimal parameters. However, traditional Gradient-Descent based methods fail to solve problems which contains objective functions with L1 regularization, such as Lasso regression. In this paper, we present Federated Coordinate Descent, a new distributed scheme called FCD, to address this issue securely under multiparty scenarios. Specifically, through secure aggregation and added perturbations, our scheme guarantees that: (1) no local information is leaked to other parties, and (2) global model parameters are not exposed to cloud servers. The added perturbations can eventually be eliminated by each party to derive a global model with high performance. We show that the FCD scheme fills the gap of multiparty secure Coordinate Descent methods and is applicable for general linear regressions, including linear, ridge and lasso regressions. Theoretical security analysis and experimental results demonstrate that FCD can be performed effectively and efficiently, and provide as low MAE measure as centralized methods under tasks of three types of linear regressions on real-world UCI datasets.

The Strong Exponential Time Hypothesis (SETH) asserts that for every $\varepsilon>0$ there exists $k$ such that $k$-SAT requires time $(2-\varepsilon)^n$. The field of fine-grained complexity has leveraged SETH to prove quite tight conditional lower bounds for dozens of problems in various domains and complexity classes, including Edit Distance, Graph Diameter, Hitting Set, Independent Set, and Orthogonal Vectors. Yet, it has been repeatedly asked in the literature whether SETH-hardness results can be proven for other fundamental problems such as Hamiltonian Path, Independent Set, Chromatic Number, MAX-$k$-SAT, and Set Cover. In this paper, we show that fine-grained reductions implying even $\lambda^n$-hardness of these problems from SETH for any $\lambda>1$, would imply new circuit lower bounds: super-linear lower bounds for Boolean series-parallel circuits or polynomial lower bounds for arithmetic circuits (each of which is a four-decade open question). We also extend this barrier result to the class of parameterized problems. Namely, for every $\lambda>1$ we conditionally rule out fine-grained reductions implying SETH-based lower bounds of $\lambda^k$ for a number of problems parameterized by the solution size $k$. Our main technical tool is a new concept called polynomial formulations. In particular, we show that many problems can be represented by relatively succinct low-degree polynomials, and that any problem with such a representation cannot be proven SETH-hard (without proving new circuit lower bounds).

Outlier detection (OD) is a key learning task for finding rare and deviant data samples, with many time-critical applications such as fraud detection and intrusion detection. In this work, we propose TOD, the first tensor-based system for efficient and scalable outlier detection on distributed multi-GPU machines. A key idea behind TOD is decomposing complex OD applications into a small collection of basic tensor algebra operators. This decomposition enables TOD to accelerate OD computations by leveraging recent advances in deep learning infrastructure in both hardware and software. Moreover, to deploy memory-intensive OD applications on modern GPUs with limited on-device memory, we introduce two key techniques. First, provable quantization speeds up OD computations and reduces its memory footprint by automatically performing specific floating-point operations in lower precision while provably guaranteeing no accuracy loss. Second, to exploit the aggregated compute resources and memory capacity of multiple GPUs, we introduce automatic batching, which decomposes OD computations into small batches for both sequential execution on a single GPU and parallel execution on multiple GPUs. TOD supports a diverse set of OD algorithms. Extensive evaluation on 11 real and 3 synthetic OD datasets shows that TOD is on average 10.9x faster than the leading CPU-based OD system PyOD (with a maximum speedup of 38.9x), and can handle much larger datasets than existing GPU-based OD systems. In addition, TOD allows easy integration of new OD operators, enabling fast prototyping of emerging and yet-to-discovered OD algorithms.

In this paper, we investigate the matrix estimation problem in the multi-response regression model with measurement errors. A nonconvex error-corrected estimator based on a combination of the amended loss function and the nuclear norm regularizer is proposed to estimate the matrix parameter. Then under the (near) low-rank assumption, we analyse statistical and computational theoretical properties of global solutions of the nonconvex regularized estimator from a general point of view. In the statistical aspect, we establish the nonasymptotic recovery bound for any global solution of the nonconvex estimator, under restricted strong convexity on the loss function. In the computational aspect, we solve the nonconvex optimization problem via the proximal gradient method. The algorithm is proved to converge to a near-global solution and achieve a linear convergence rate. In addition, we also verify sufficient conditions for the general results to be held, in order to obtain probabilistic consequences for specific types of measurement errors, including the additive noise and missing data. Finally, theoretical consequences are demonstrated by several numerical experiments on corrupted errors-in-variables multi-response regression models. Simulation results reveal excellent consistency with our theory under high-dimensional scaling.

We present an algorithm for compressing the radiosity view factor model commonly used in radiation heat transfer and computer graphics. We use a format inspired by the hierarchical off-diagonal low rank format, where elements are recursively partitioned using a quadtree or octree and blocks are compressed using a sparse singular value decomposition -- the hierarchical matrix is assembled using dynamic programming. The motivating application is time-dependent thermal modeling on vast planetary surfaces, with a focus on permanently shadowed craters which receive energy through indirect irradiance. In this setting, shape models are comprised of a large number of triangular facets which conform to a rough surface. At each time step, a quadratic number of triangle-to-triangle scattered fluxes must be summed; that is, as the sun moves through the sky, we must solve the same view factor system of equations for a potentially unlimited number of time-varying righthand sides. We first conduct numerical experiments with a synthetic spherical cap-shaped crater, where the equilibrium temperature is analytically available. We also test our implementation with triangle meshes of planetary surfaces derived from digital elevation models recovered by orbiting spacecrafts. Our results indicate that the compressed view factor matrix can be assembled in quadratic time, which is comparable to the time it takes to assemble the full view matrix itself. Memory requirements during assembly are reduced by a large factor. Finally, for a range of compression tolerances, the size of the compressed view factor matrix and the speed of the resulting matrix vector product both scale linearly (as opposed to quadratically for the full matrix), resulting in orders of magnitude savings in processing time and memory space.

The increasing size of recently proposed Neural Networks makes it hard to implement them on embedded devices, where memory, battery and computational power are a non-trivial bottleneck. For this reason during the last years network compression literature has been thriving and a large number of solutions has been been published to reduce both the number of operations and the parameters involved with the models. Unfortunately, most of these reducing techniques are actually heuristic methods and usually require at least one re-training step to recover the accuracy. The need of procedures for model reduction is well-known also in the fields of Verification and Performances Evaluation, where large efforts have been devoted to the definition of quotients that preserve the observable underlying behaviour. In this paper we try to bridge the gap between the most popular and very effective network reduction strategies and formal notions, such as lumpability, introduced for verification and evaluation of Markov Chains. Elaborating on lumpability we propose a pruning approach that reduces the number of neurons in a network without using any data or fine-tuning, while completely preserving the exact behaviour. Relaxing the constraints on the exact definition of the quotienting method we can give a formal explanation of some of the most common reduction techniques.

Stochastic kriging has been widely employed for simulation metamodeling to predict the response surface of complex simulation models. However, its use is limited to cases where the design space is low-dimensional because, in general, the sample complexity (i.e., the number of design points required for stochastic kriging to produce an accurate prediction) grows exponentially in the dimensionality of the design space. The large sample size results in both a prohibitive sample cost for running the simulation model and a severe computational challenge due to the need to invert large covariance matrices. Based on tensor Markov kernels and sparse grid experimental designs, we develop a novel methodology that dramatically alleviates the curse of dimensionality. We show that the sample complexity of the proposed methodology grows only slightly in the dimensionality, even under model misspecification. We also develop fast algorithms that compute stochastic kriging in its exact form without any approximation schemes. We demonstrate via extensive numerical experiments that our methodology can handle problems with a design space of more than 10,000 dimensions, improving both prediction accuracy and computational efficiency by orders of magnitude relative to typical alternative methods in practice.

We study the problem of high-dimensional sparse linear regression in a distributed setting under both computational and communication constraints. Specifically, we consider a star topology network whereby several machines are connected to a fusion center, with whom they can exchange relatively short messages. Each machine holds noisy samples from a linear regression model with the same unknown sparse $d$-dimensional vector of regression coefficients $\theta$. The goal of the fusion center is to estimate the vector $\theta$ and its support using few computations and limited communication at each machine. In this work, we consider distributed algorithms based on Orthogonal Matching Pursuit (OMP) and theoretically study their ability to exactly recover the support of $\theta$. We prove that under certain conditions, even at low signal-to-noise-ratios where individual machines are unable to detect the support of $\theta$, distributed-OMP methods correctly recover it with total communication sublinear in $d$. In addition, we present simulations that illustrate the performance of distributed OMP-based algorithms and show that they perform similarly to more sophisticated and computationally intensive methods, and in some cases even outperform them.

Variational Bayesian posterior inference often requires simplifying approximations such as mean-field parametrisation to ensure tractability. However, prior work has associated the variational mean-field approximation for Bayesian neural networks with underfitting in the case of small datasets or large model sizes. In this work, we show that invariances in the likelihood function of over-parametrised models contribute to this phenomenon because these invariances complicate the structure of the posterior by introducing discrete and/or continuous modes which cannot be well approximated by Gaussian mean-field distributions. In particular, we show that the mean-field approximation has an additional gap in the evidence lower bound compared to a purpose-built posterior that takes into account the known invariances. Importantly, this invariance gap is not constant; it vanishes as the approximation reverts to the prior. We proceed by first considering translation invariances in a linear model with a single data point in detail. We show that, while the true posterior can be constructed from a mean-field parametrisation, this is achieved only if the objective function takes into account the invariance gap. Then, we transfer our analysis of the linear model to neural networks. Our analysis provides a framework for future work to explore solutions to the invariance problem.

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.

北京阿比特科技有限公司