亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We are concerned with the problem of decomposing the parameter space of a parametric system of polynomial equations, and possibly some polynomial inequality constraints, with respect to the number of real solutions that the system attains. Previous studies apply a two step approach to this problem, where first the discriminant variety of the system is computed via a Groebner Basis (GB), and then a Cylindrical Algebraic Decomposition (CAD) of this is produced to give the desired computation. However, even on some reasonably small applied examples this process is too expensive, with computation of the discriminant variety alone infeasible. In this paper we develop new approaches to build the discriminant variety using resultant methods (the Dixon resultant and a new method using iterated univariate resultants). This reduces the complexity compared to GB and allows for a previous infeasible example to be tackled. We demonstrate the benefit by giving a symbolic solution to a problem from population dynamics -- the analysis of the steady states of three connected populations which exhibit Allee effects - which previously could only be tackled numerically.

相關內容

We study a new two-time-scale stochastic gradient method for solving optimization problems, where the gradients are computed with the aid of an auxiliary variable under samples generated by time-varying Markov random processes parameterized by the underlying optimization variable. These time-varying samples make gradient directions in our update biased and dependent, which can potentially lead to the divergence of the iterates. In our two-time-scale approach, one scale is to estimate the true gradient from these samples, which is then used to update the estimate of the optimal solution. While these two iterates are implemented simultaneously, the former is updated "faster" (using bigger step sizes) than the latter (using smaller step sizes). Our first contribution is to characterize the finite-time complexity of the proposed two-time-scale stochastic gradient method. In particular, we provide explicit formulas for the convergence rates of this method under different structural assumptions, namely, strong convexity, convexity, the Polyak-Lojasiewicz condition, and general non-convexity. We apply our framework to two problems in control and reinforcement learning. First, we look at the standard online actor-critic algorithm over finite state and action spaces and derive a convergence rate of O(k^(-2/5)), which recovers the best known rate derived specifically for this problem. Second, we study an online actor-critic algorithm for the linear-quadratic regulator and show that a convergence rate of O(k^(-2/3)) is achieved. This is the first time such a result is known in the literature. Finally, we support our theoretical analysis with numerical simulations where the convergence rates are visualized.

Operator learning for complex nonlinear operators is increasingly common in modeling physical systems. However, training machine learning methods to learn such operators requires a large amount of expensive, high-fidelity data. In this work, we present a composite Deep Operator Network (DeepONet) for learning using two datasets with different levels of fidelity, to accurately learn complex operators when sufficient high-fidelity data is not available. Additionally, we demonstrate that the presence of low-fidelity data can improve the predictions of physics-informed learning with DeepONets.

Covariance estimation for matrix-valued data has received an increasing interest in applications. Unlike previous works that rely heavily on matrix normal distribution assumption and the requirement of fixed matrix size, we propose a class of distribution-free regularized covariance estimation methods for high-dimensional matrix data under a separability condition and a bandable covariance structure. Under these conditions, the original covariance matrix is decomposed into a Kronecker product of two bandable small covariance matrices representing the variability over row and column directions. We formulate a unified framework for estimating bandable covariance, and introduce an efficient algorithm based on rank one unconstrained Kronecker product approximation. The convergence rates of the proposed estimators are established, and the derived minimax lower bound shows our proposed estimator is rate-optimal under certain divergence regimes of matrix size. We further introduce a class of robust covariance estimators and provide theoretical guarantees to deal with heavy-tailed data. We demonstrate the superior finite-sample performance of our methods using simulations and real applications from a gridded temperature anomalies dataset and a S&P 500 stock data analysis.

We propose a novel framework for learning a low-dimensional representation of data based on nonlinear dynamical systems, which we call dynamical dimension reduction (DDR). In the DDR model, each point is evolved via a nonlinear flow towards a lower-dimensional subspace; the projection onto the subspace gives the low-dimensional embedding. Training the model involves identifying the nonlinear flow and the subspace. Following the equation discovery method, we represent the vector field that defines the flow using a linear combination of dictionary elements, where each element is a pre-specified linear/nonlinear candidate function. A regularization term for the average total kinetic energy is also introduced and motivated by optimal transport theory. We prove that the resulting optimization problem is well-posed and establish several properties of the DDR method. We also show how the DDR method can be trained using a gradient-based optimization method, where the gradients are computed using the adjoint method from optimal control theory. The DDR method is implemented and compared on synthetic and example datasets to other dimension reductions methods, including PCA, t-SNE, and Umap.

Stability certification and identification of the stabilizable operating region of a dynamical system are two important concerns to ensure its operational safety/security and robustness. With the advent of machine-learning tools, these issues are especially important for systems with machine-learned components in the feedback loop. Here, in presence of unknown discrete variation (DV) of its parameters within a bounded range, a system controlled by a static feedback controller in which the closed-loop (CL) equilibria are subject to variation-induced drift is equivalently represented using a class of time-invariant systems, each with the same control policy. To develop a general theory for stability and stabilizability of such a class of neural-network (NN) controlled nonlinear systems, a Lyapunov-based convex stability certificate is proposed and is further used to devise an estimate of a local Lipschitz upper bound for the NN and a corresponding operating domain in the state space containing an initialization set, starting from where the CL local asymptotic stability of each system in the class is guaranteed, while the trajectory of the original system remains confined to the domain if the DV of the parameters satisfies a certain quasi-stationarity condition. To compute such a robustly stabilizing NN controller, a stability-guaranteed training (SGT) algorithm is also proposed. The effectiveness of the proposed framework is demonstrated using illustrative examples.

The stochastic gradient Langevin Dynamics is one of the most fundamental algorithms to solve sampling problems and non-convex optimization appearing in several machine learning applications. Especially, its variance reduced versions have nowadays gained particular attention. In this paper, we study two variants of this kind, namely, the Stochastic Variance Reduced Gradient Langevin Dynamics and the Stochastic Recursive Gradient Langevin Dynamics. We prove their convergence to the objective distribution in terms of KL-divergence under the sole assumptions of smoothness and Log-Sobolev inequality which are weaker conditions than those used in prior works for these algorithms. With the batch size and the inner loop length set to $\sqrt{n}$, the gradient complexity to achieve an $\epsilon$-precision is $\tilde{O}((n+dn^{1/2}\epsilon^{-1})\gamma^2 L^2\alpha^{-2})$, which is an improvement from any previous analyses. We also show some essential applications of our result to non-convex optimization.

Given a matrix $A$ and vector $b$ with polynomial entries in $d$ real variables $\delta=(\delta_1,\ldots,\delta_d)$ we consider the following notion of feasibility: the pair $(A,b)$ is locally feasible if there exists an open neighborhood $U$ of $0$ such that for every $\delta\in U$ there exists $x$ satisfying $A(\delta)x\ge b(\delta)$ entry-wise. For $d=1$ we construct a polynomial time algorithm for deciding local feasibility. For $d \ge 2$ we show local feasibility is NP-hard. As an application (which was the primary motivation for this work) we give a computer-assisted proof of ergodicity of the following elementary 1D cellular automaton: given the current state $\eta_t \in \{0,1\}^{\mathbb{Z}}$ the next state $\eta_{t+1}(n)$ at each vertex $n\in \mathbb{Z}$ is obtained by $\eta_{t+1}(n)= \text{NAND}\big(\text{BSC}_\delta(\eta_t(n-1)), \text{BSC}_\delta(\eta_t(n))\big)$. Here the binary symmetric channel $\text{BSC}_\delta$ takes a bit as input and flips it with probability $\delta$ (and leaves it unchanged with probability $1-\delta$). We also consider the problem of broadcasting information on the 2D-grid of noisy binary-symmetric channels $\text{BSC}_\delta$, where each node may apply an arbitrary processing function to its input bits. We prove that there exists $\delta_0'>0$ such that for all noise levels $0<\delta<\delta_0'$ it is impossible to broadcast information for any processing function, as conjectured in Makur, Mossel, Polyanskiy (ISIT 2021).

Multi-fidelity models are of great importance due to their capability of fusing information coming from different simulations and sensors. In the context of Gaussian process regression we can exploit low-fidelity models to better capture the latent manifold thus improving the accuracy of the model. We focus on the approximation of high-dimensional scalar functions with low intrinsic dimensionality. By introducing a low dimensional bias in a chain of Gaussian processes with different fidelities we can fight the curse of dimensionality affecting these kind of quantities of interest, especially for many-query applications. In particular we seek a gradient-based reduction of the parameter space through linear active subspaces or a nonlinear transformation of the input space. Then we build a low-fidelity response surface based on such reduction, thus enabling multi-fidelity Gaussian process regression without the need of running new simulations with simplified physical models. This has a great potential in the data scarcity regime affecting many engineering applications. In this work we present a new multi-fidelity approach -- starting from the preliminary analysis conducted in Romor et al. 2020 -- involving active subspaces and nonlinear level-set learning method. The proposed numerical method is tested on two high-dimensional benchmark functions, and on a more complex car aerodynamics problem. We show how a low intrinsic dimensionality bias can increase the accuracy of Gaussian process response surfaces.

In 1954, Alston S. Householder published Principles of Numerical Analysis, one of the first modern treatments on matrix decomposition that favored a (block) LU decomposition-the factorization of a matrix into the product of lower and upper triangular matrices. And now, matrix decomposition has become a core technology in machine learning, largely due to the development of the back propagation algorithm in fitting a neural network. The sole aim of this survey is to give a self-contained introduction to concepts and mathematical tools in numerical linear algebra and matrix analysis in order to seamlessly introduce matrix decomposition techniques and their applications in subsequent sections. However, we clearly realize our inability to cover all the useful and interesting results concerning matrix decomposition and given the paucity of scope to present this discussion, e.g., the separated analysis of the Euclidean space, Hermitian space, Hilbert space, and things in the complex domain. We refer the reader to literature in the field of linear algebra for a more detailed introduction to the related fields.

Since deep neural networks were developed, they have made huge contributions to everyday lives. Machine learning provides more rational advice than humans are capable of in almost every aspect of daily life. However, despite this achievement, the design and training of neural networks are still challenging and unpredictable procedures. To lower the technical thresholds for common users, automated hyper-parameter optimization (HPO) has become a popular topic in both academic and industrial areas. This paper provides a review of the most essential topics on HPO. The first section introduces the key hyper-parameters related to model training and structure, and discusses their importance and methods to define the value range. Then, the research focuses on major optimization algorithms and their applicability, covering their efficiency and accuracy especially for deep learning networks. This study next reviews major services and toolkits for HPO, comparing their support for state-of-the-art searching algorithms, feasibility with major deep learning frameworks, and extensibility for new modules designed by users. The paper concludes with problems that exist when HPO is applied to deep learning, a comparison between optimization algorithms, and prominent approaches for model evaluation with limited computational resources.

北京阿比特科技有限公司