
A new variant of Newton's method for empirical risk minimization is studied, in which, at each iteration of the optimization algorithm, the gradient and Hessian of the objective function are replaced by robust estimators drawn from the existing literature on robust mean estimation for multivariate data. After proving a general theorem about the convergence of successive iterates to a small ball around the population-level minimizer, consequences of the theory are derived for generalized linear models when data are generated from Huber's epsilon-contamination model and/or heavy-tailed distributions. An algorithm for obtaining robust Newton directions based on the conjugate gradient method, which may be more appropriate for high-dimensional settings, is also proposed, together with conjectures about the convergence of the resulting algorithm. Compared to robust gradient descent, the proposed algorithm enjoys the faster rates of convergence for successive iterates often achieved by second-order algorithms for convex problems, i.e., quadratic convergence in a neighborhood of the optimum, with a step size that may be chosen adaptively via backtracking line search.
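As a rough illustration of the idea (not the paper's exact estimators), the sketch below replaces the empirical mean of per-observation gradients and Hessians with a coordinate-wise median-of-means estimator, one standard robust mean estimator from that literature. The backtracking line search is omitted, a direct solve stands in for the conjugate gradient step, and `grad_samples`/`hess_samples` are hypothetical per-observation arrays.

```python
import numpy as np

def coordinatewise_median_of_means(samples, n_blocks=10):
    """Robust mean estimate: split the rows into blocks, average each
    block, then take the coordinate-wise median of the block means."""
    blocks = np.array_split(samples, n_blocks)
    return np.median(np.stack([b.mean(axis=0) for b in blocks]), axis=0)

def robust_newton_step(theta, grad_samples, hess_samples, step=1.0):
    """One Newton step with robust gradient/Hessian estimates.
    grad_samples: (n, d) per-observation gradients at theta;
    hess_samples: (n, d, d) per-observation Hessians at theta."""
    d = theta.size
    g = coordinatewise_median_of_means(grad_samples)
    H = coordinatewise_median_of_means(
        hess_samples.reshape(len(hess_samples), -1)).reshape(d, d)
    H = 0.5 * (H + H.T)                      # symmetrize the robust estimate
    return theta - step * np.linalg.solve(H, g)
```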

Related Content

Besov priors are nonparametric priors that can model spatially inhomogeneous functions. They are routinely used in inverse problems and imaging, where they exhibit attractive sparsity-promoting and edge-preserving features. A recent line of work has initiated the study of their asymptotic frequentist convergence properties. In the present paper, we consider the theoretical recovery performance of the posterior distributions associated with Besov-Laplace priors in the density estimation model, under the assumption that the observations are generated by a possibly spatially inhomogeneous true density belonging to a Besov space. We improve on existing results and show that carefully tuned Besov-Laplace priors attain optimal posterior contraction rates. Furthermore, we show that hierarchical procedures involving a hyper-prior on the regularity parameter lead to adaptation to any smoothness level.
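For intuition, a draw from a Besov-Laplace prior can be sketched in wavelet coordinates: the level-l coefficients are i.i.d. Laplace variables shrunk geometrically in the level. The Haar-wavelet sketch below is illustrative only; the exact scaling exponent depends on the Besov-norm convention, and p = q = 1 is assumed here.

```python
import numpy as np

rng = np.random.default_rng(0)

def besov_laplace_draw(s=1.0, max_level=8, n_grid=2**10):
    """Draw from a Besov-Laplace prior via a Haar wavelet expansion
    on [0, 1): level-l coefficients are scaled i.i.d. Laplace draws."""
    x = np.linspace(0.0, 1.0, n_grid, endpoint=False)
    f = rng.laplace() * np.ones(n_grid)           # scaling (father) coefficient
    for l in range(max_level):
        scale = 2.0 ** (-l * (s - 0.5))           # 2^{-l(s+1/2-1/p)}, p = 1 assumed
        for k in range(2**l):
            # L2-normalized Haar mother wavelet psi_{lk}.
            supp = (x >= k * 2.0**-l) & (x < (k + 1) * 2.0**-l)
            mid = supp & (x >= (k + 0.5) * 2.0**-l)
            psi = np.zeros(n_grid)
            psi[supp] = 2.0 ** (l / 2)
            psi[mid] = -(2.0 ** (l / 2))
            f += scale * rng.laplace() * psi
    return x, f
```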

A new hybridizable discontinuous Galerkin method, named the CHDG method, is proposed for solving time-harmonic scalar wave propagation problems. This method relies on a standard discontinuous Galerkin scheme with upwind numerical fluxes and high-order polynomial bases. Auxiliary unknowns corresponding to characteristic variables are defined at the interfaces between elements, and the physical fields are eliminated to obtain a reduced system. The reduced system can be written as a fixed-point problem that can be solved with stationary iterative schemes. Numerical results with 2D benchmarks are presented to study the performance of the approach. Compared to the standard HDG approach, the properties of the reduced system are improved with CHDG, making it better suited to iterative solution procedures: the condition number of the reduced system is smaller with CHDG than with the standard HDG method, and iterative solution with CGNR or GMRES requires fewer iterations.
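The fixed-point structure can be sketched generically: with the reduced system written as g = S g + b, a stationary scheme simply iterates the map, while a Krylov method solves (I - S) g = b. The operator S below is a random contraction used as a stand-in for the CHDG exchange operator, purely for illustration.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

def fixed_point_solve(apply_S, b, tol=1e-8, max_iter=500):
    """Stationary iteration g_{k+1} = S g_k + b for a reduced system
    written in fixed-point form g = S g + b."""
    g = np.zeros_like(b)
    for k in range(max_iter):
        g_new = apply_S(g) + b
        if np.linalg.norm(g_new - g) <= tol * np.linalg.norm(b):
            return g_new, k + 1
        g = g_new
    return g, max_iter

n = 200
rng = np.random.default_rng(1)
S = 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)   # stand-in contraction
b = rng.standard_normal(n)

g_fp, iters = fixed_point_solve(lambda v: S @ v, b)
# Same problem via GMRES on (I - S) g = b:
A = LinearOperator((n, n), matvec=lambda v: v - S @ v)
g_gmres, info = gmres(A, b)
```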

Over the past decades, cognitive neuroscientists and behavioral economists have recognized the value of describing the process of decision making in detail and modeling the emergence of decisions over time. For example, the time it takes to decide can reveal more about an agent's true hidden preferences than the decision itself. Similarly, data that track the ongoing decision process, such as eye movements or neural recordings, contain critical information that can be exploited, even if no decision is made. Here, we argue that artificial intelligence (AI) research would benefit from a stronger focus on insights about how decisions emerge over time and from incorporating related process data to improve AI predictions in general and human-AI interactions in particular. First, we introduce a well-established computational framework that models decisions as emerging from the noisy accumulation of evidence, and we present related empirical work in psychology, neuroscience, and economics. Next, we discuss to what extent current approaches in multi-agent AI do or do not incorporate process data and models of decision making. Finally, we outline how a more principled inclusion of the evidence-accumulation framework into the training and use of AI can help to improve human-AI interactions in the future.
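A minimal simulation of the evidence-accumulation (drift-diffusion) framework referenced above, producing both choices and response times; all parameter values are illustrative.

```python
import numpy as np

def simulate_ddm(drift=0.3, threshold=1.0, noise=1.0, dt=1e-3,
                 max_t=5.0, n_trials=1000, seed=0):
    """Drift-diffusion model: evidence accumulates noisily until it hits
    +threshold (choice +1) or -threshold (choice -1). Returns choices
    (0 means no decision within max_t) and response times."""
    rng = np.random.default_rng(seed)
    n_steps = int(max_t / dt)
    choices = np.zeros(n_trials, dtype=int)
    rts = np.full(n_trials, np.nan)
    for i in range(n_trials):
        x = 0.0
        for t in range(n_steps):
            x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            if abs(x) >= threshold:
                choices[i] = 1 if x > 0 else -1
                rts[i] = (t + 1) * dt
                break
    return choices, rts
```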

A general class of almost instantaneous fixed-to-variable-length (AIFV) codes is proposed, which contains every binary code that can be constructed when a finite number of bits of decoding delay is allowed. The contributions of the paper are as follows. (i) Introducing $N$-bit-delay AIFV codes, constructed from multiple code trees with higher flexibility than the conventional AIFV codes. (ii) Proving that the proposed codes can represent any uniquely encodable and uniquely decodable variable-to-variable-length code. (iii) Showing how to express codes as multiple code trees with minimum decoding delay. (iv) Formulating the decodability constraints as comparisons of intervals on the real number line. The theoretical results in this paper are expected to be useful for further study of AIFV codes.
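Point (iv) can be illustrated in the simplest, zero-delay case: mapping each binary codeword to a dyadic subinterval of [0, 1), prefix-freeness is exactly pairwise disjointness of those intervals. The sketch below covers only this special case, not the paper's general $N$-bit-delay constraints.

```python
from fractions import Fraction

def codeword_interval(word):
    """Map a binary codeword w of length L to its dyadic subinterval
    of [0, 1): the interval [0.w, 0.w + 2^-L)."""
    L = len(word)
    lo = Fraction(int(word, 2), 2**L)
    return lo, lo + Fraction(1, 2**L)

def is_prefix_free(code):
    """A code is prefix-free (decodable with zero delay) iff its
    codeword intervals are pairwise disjoint."""
    ivals = sorted(codeword_interval(w) for w in code)
    return all(a_hi <= b_lo for (_, a_hi), (b_lo, _) in zip(ivals, ivals[1:]))

print(is_prefix_free(["0", "10", "11"]))   # True
print(is_prefix_free(["0", "01", "11"]))   # False: "0" is a prefix of "01"
```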

We utilize a discrete version of the notion of degree of freedom to prove a sharp min-entropy-variance inequality for integer-valued log-concave random variables. More specifically, we show that the geometric distribution minimizes the min-entropy within the class of log-concave probability sequences with fixed variance. As an application, we obtain a discrete R\'enyi entropy power inequality in the log-concave case, which improves a result of Bobkov, Marsiglietti and Melbourne (2022).
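Numerically, the extremal claim is easy to probe: for the geometric distribution p(k) = q(1-q)^k on the nonnegative integers, the min-entropy is -log q and the variance is (1-q)/q^2, so q can be solved from a target variance. The comparison below against a discretized Gaussian of approximately the same variance is illustrative only.

```python
import numpy as np

def geometric_min_entropy(variance):
    """Min-entropy (in nats) of the geometric distribution with the
    given variance: solve (1 - q)/q^2 = v for q; H_inf = -log q."""
    q = (-1.0 + np.sqrt(1.0 + 4.0 * variance)) / (2.0 * variance)
    return -np.log(q)

def min_entropy(pmf):
    """Min-entropy of an arbitrary probability sequence."""
    return -np.log(np.max(pmf))

v = 4.0
k = np.arange(-50, 51)
gauss = np.exp(-k**2 / (2 * v))
gauss /= gauss.sum()                       # discretized Gaussian, variance ~ v
# The geometric value should be the smaller of the two:
print(geometric_min_entropy(v), min_entropy(gauss))
```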

We give a near-optimal sample-pass trade-off for pure exploration in multi-armed bandits (MABs) via multi-pass streaming algorithms: any streaming algorithm with sublinear memory that uses the optimal sample complexity of $O(\frac{n}{\Delta^2})$ requires $\Omega(\frac{\log{(1/\Delta)}}{\log\log{(1/\Delta)}})$ passes. Here, $n$ is the number of arms and $\Delta$ is the reward gap between the best and the second-best arms. Our result matches the $O(\log(\frac{1}{\Delta}))$-pass algorithm of Jin et al. [ICML'21] (up to lower order terms) that only uses $O(1)$ memory and answers an open question posed by Assadi and Wang [STOC'20].
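For intuition about the multi-pass regime, the toy sketch below keeps a single candidate arm in memory and challenges it with geometrically growing sample budgets on each pass. It conveys the O(1)-memory, multi-pass flavor of the Jin et al. algorithm but carries none of its guarantees; the arm means in the example are assumed for illustration.

```python
import numpy as np

def multipass_best_arm(pull, n_arms, n_passes=8):
    """Toy multi-pass streaming sketch: keep one candidate arm; on pass
    r, replay the stream and challenge the candidate with each arm using
    4**r samples apiece, keeping whichever has the higher empirical mean.
    pull(i, m) returns m i.i.d. rewards of arm i as a numpy array."""
    best = 0
    for r in range(1, n_passes + 1):
        m = 4 ** r
        for i in range(n_arms):            # one pass over the stream
            if i != best and pull(i, m).mean() > pull(best, m).mean():
                best = i
    return best

rng = np.random.default_rng(0)
means = np.array([0.5, 0.6, 0.55])
print(multipass_best_arm(lambda i, m: rng.binomial(1, means[i], m), 3))
```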

Next Point-of-Interest (POI) recommendation is a critical task in location-based services that aims to provide personalized suggestions for the user's next destination. Previous works on POI recommendation have focused on modeling the user's spatial preference. However, existing works that leverage spatial information rely only on the aggregation of users' previously visited positions, which discourages the model from recommending POIs in novel areas. This trait of position-based methods harms the model's performance in many situations. Additionally, incorporating sequential information into the user's spatial preference remains a challenge. In this paper, we propose Diff-POI: a diffusion-based model that samples the user's spatial preference for next POI recommendation. Inspired by the wide application of diffusion algorithms in sampling from complex distributions, Diff-POI encodes the user's visiting sequence and spatial characteristics with two tailor-designed graph encoding modules, followed by a diffusion-based sampling strategy to explore the user's spatial visiting trends. We leverage the diffusion process and its reverse form to sample from the posterior distribution and optimize the corresponding score function. We design a joint training and inference framework to optimize and evaluate the proposed Diff-POI. Extensive experiments on four real-world POI recommendation datasets demonstrate the superiority of Diff-POI over state-of-the-art baseline methods. Further ablation and parameter studies reveal the functionality and effectiveness of the proposed diffusion-based sampling strategy in addressing the limitations of existing methods.
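The sampling component can be sketched generically as annealed Langevin dynamics driven by a score function; `score_fn` below is a stand-in for Diff-POI's trained score network, and the step-size schedule is an arbitrary assumption.

```python
import numpy as np

def reverse_diffusion_sample(score_fn, dim, n_steps=1000, seed=0):
    """Annealed Langevin-style sampling: start from Gaussian noise and
    follow the score of the target distribution with a shrinking step
    size, so samples drift toward high-density regions."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)
    for t in range(n_steps, 0, -1):
        eps = 0.01 * t / n_steps           # step-size schedule (assumed)
        x = x + eps * score_fn(x, t) + np.sqrt(2 * eps) * rng.standard_normal(dim)
    return x

# Sanity check on a standard Gaussian target, whose score is -x:
x = reverse_diffusion_sample(lambda x, t: -x, dim=8)
```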

We introduce a new class of Discontinuous Galerkin (DG) methods for solving nonlinear conservation laws on unstructured Voronoi meshes that use a nonconforming Virtual Element basis defined within each polygonal control volume. The basis functions are evaluated as an L2 projection of the virtual basis, which remains unknown, along the lines of the Virtual Element Method (VEM). In contrast to the VEM approach, the new basis functions lead to a nonconforming representation of the solution, with discontinuous data across element boundaries, as typically employed in DG discretizations. To improve the condition number of the resulting mass matrix, an orthogonalization of the full basis is proposed. The discretization in time is carried out following the ADER (Arbitrary order DERivative Riemann problem) methodology, which yields one-step fully discrete schemes that make use of a coupled space-time representation of the numerical solution. The space-time basis functions are constructed as a tensor product of the virtual basis in space and a one-dimensional Lagrange nodal basis in time. The resulting space-time stiffness matrix is stabilized by an extension of the dof-dof stabilization technique adopted in the VEM framework, allowing an element-local space-time Galerkin finite element predictor to be evaluated. The novel methods are referred to as VEM-DG schemes, and they are arbitrarily high-order accurate in space and time. The new VEM-DG algorithms are rigorously validated against a series of benchmarks in the context of the compressible Euler and Navier-Stokes equations. Numerical results are verified with respect to literature reference solutions and compared in terms of accuracy and computational efficiency with those obtained using a standard modal DG scheme with Taylor basis functions. An analysis of the condition numbers of the mass and space-time stiffness matrices is also provided.
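The effect of basis orthogonalization on mass-matrix conditioning is easy to reproduce in miniature. Below, a 1D monomial basis (a stand-in for the projected virtual basis) yields the notoriously ill-conditioned Hilbert mass matrix, which a weighted Gram-Schmidt pass turns into the identity.

```python
import numpy as np

def orthogonalize_basis(phi, w):
    """Orthonormalize basis functions sampled at quadrature points
    against the discrete L2 inner product <f, g> = sum_q w_q f_q g_q
    (modified Gram-Schmidt). phi has shape (n_basis, n_quad)."""
    q = phi.astype(float).copy()
    for i in range(len(q)):
        for j in range(i):
            q[i] -= np.sum(w * q[i] * q[j]) * q[j]
        q[i] /= np.sqrt(np.sum(w * q[i] ** 2))
    return q

# Monomials on [0, 1]: the mass matrix is the Hilbert matrix.
n, nq = 6, 64
x, w = np.polynomial.legendre.leggauss(nq)
x, w = 0.5 * (x + 1.0), 0.5 * w              # map quadrature to [0, 1]
phi = np.stack([x**k for k in range(n)])
M = (phi * w) @ phi.T
Q = orthogonalize_basis(phi, w)
MQ = (Q * w) @ Q.T
print(np.linalg.cond(M), np.linalg.cond(MQ))  # huge vs. ~1
```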

Model-based sequential approaches to discrete "black-box" optimization, including Bayesian optimization techniques, often access the same points multiple times for a given objective function of interest, resulting in many steps to find the global optimum. Here, we numerically study the effect of a postprocessing method for Bayesian optimization that strictly prohibits duplicated samples in the dataset. We find that the postprocessing method significantly reduces the number of sequential steps needed to find the global optimum, especially when the acquisition function is based on maximum a posteriori estimation. Our results provide a simple but general strategy for mitigating the slow convergence of Bayesian optimization in high-dimensional problems.
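A minimal sketch of the postprocessing idea on a discrete candidate set: after fitting a surrogate, the acquisition values of already-sampled points are set to minus infinity, so the argmax can never revisit them. The GP surrogate and UCB acquisition here are generic stand-ins, not the paper's exact setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def bo_no_duplicates(objective, candidates, n_init=5, n_steps=50, seed=0):
    """Bayesian optimization over a discrete candidate set (N, d) that
    strictly prohibits duplicated samples in the dataset."""
    rng = np.random.default_rng(seed)
    idx = list(rng.choice(len(candidates), n_init, replace=False))
    ys = [objective(candidates[i]) for i in idx]
    for _ in range(n_steps):
        gp = GaussianProcessRegressor().fit(candidates[idx], np.array(ys))
        mu, sigma = gp.predict(candidates, return_std=True)
        acq = mu + 2.0 * sigma             # UCB acquisition (assumed)
        acq[idx] = -np.inf                 # forbid duplicated samples
        nxt = int(np.argmax(acq))
        idx.append(nxt)
        ys.append(objective(candidates[nxt]))
    return candidates[idx[int(np.argmax(ys))]]

# Example on a 1D grid (objective assumed for illustration):
grid = np.linspace(-2, 2, 201).reshape(-1, 1)
x_best = bo_no_duplicates(lambda x: -np.sum((x - 0.7)**2), grid, n_steps=20)
```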

In this paper, two novel classes of implicit exponential Runge-Kutta (ERK) methods are studied for solving highly oscillatory systems. First, we analyze the symplectic conditions for two kinds of exponential integrators and obtain a symplectic method. To solve highly oscillatory problems effectively, we design highly accurate implicit ERK integrators. By comparing the Taylor series expansion of the numerical solution with that of the exact solution, it can be verified that the order conditions of the two new kinds of exponential methods are identical to those of classical Runge-Kutta (RK) methods, which implies that highly accurate numerical methods can be formulated directly from the coefficients of RK methods. Furthermore, we investigate the linear stability properties of these exponential methods. Finally, numerical results not only display the long-time energy preservation of the symplectic method, but also demonstrate the accuracy and efficiency of the formulated methods in comparison with standard ERK methods.
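For context, the simplest member of the ERK family is the explicit, first-order exponential Euler method, which treats the stiff linear part of y' = Ay + g(y) exactly; the implicit symplectic schemes studied here are higher-order refinements of this idea. The sketch below assumes A is invertible, and the oscillator parameters are chosen purely for illustration.

```python
import numpy as np
from scipy.linalg import expm

def exponential_euler(A, g, y0, h, n_steps):
    """Exponential Euler for y' = A y + g(y):
    y_{n+1} = e^{hA} y_n + h * phi1(hA) g(y_n), phi1(z) = (e^z - 1)/z.
    The linear (oscillatory) part is propagated exactly by e^{hA}."""
    E = expm(h * A)
    phi1 = np.linalg.solve(h * A, E - np.eye(len(y0)))  # phi1(hA)
    ys = [np.asarray(y0, dtype=float)]
    for _ in range(n_steps):
        y = ys[-1]
        ys.append(E @ y + h * phi1 @ g(y))
    return np.array(ys)

# Highly oscillatory test: harmonic oscillator, frequency omega = 50,
# with a weak nonlinear forcing; y = (q, p).
omega = 50.0
A = np.array([[0.0, 1.0], [-omega**2, 0.0]])
g = lambda y: np.array([0.0, 0.1 * np.sin(y[0])])
traj = exponential_euler(A, g, y0=[1.0, 0.0], h=0.01, n_steps=1000)
```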
