亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Coverage of an inaccessible or challenging region with potential health and safety hazards, such as in a volcanic region, is difficult yet crucial from scientific and meteorological perspectives. Areas contained within the region often provide valuable information of varying importance. We present an algorithm to optimally cover a volcanic region in Hawai`i with an unmanned aerial vehicle (UAV). The target region is assigned with a nonuniform coverage importance score distribution. For a specified battery capacity of the UAV, the optimization problem seeks the path that maximizes the total coverage area and the accumulated importance score while penalizing the revisiting of the same area. Trajectories are generated offline for the UAV based on the available power and coverage information map. The optimal trajectory minimizes the unspent battery power while enforcing that the UAV returns to its starting location. This multi-objective optimization problem is solved by using sequential quadratic programming. The details of the competitive optimization problem are discussed along with the analysis and simulation results to demonstrate the applicability of the proposed algorithm.

相關內容

Min-max optimization problems arise in several key machine learning setups, including adversarial learning and generative modeling. In their general form, in absence of convexity/concavity assumptions, finding pure equilibria of the underlying two-player zero-sum game is computationally hard [Daskalakis et al., 2021]. In this work we focus instead in finding mixed equilibria, and consider the associated lifted problem in the space of probability measures. By adding entropic regularization, our main result establishes global convergence towards the global equilibrium by using simultaneous gradient ascent-descent with respect to the Wasserstein metric -- a dynamics that admits efficient particle discretization in high-dimensions, as opposed to entropic mirror descent. We complement this positive result with a related entropy-regularized loss which is not bilinear but still convex-concave in the Wasserstein geometry, and for which simultaneous dynamics do not converge yet timescale separation does. Taken together, these results showcase the benign geometry of bilinear games in the space of measures, enabling particle dynamics with global qualitative convergence guarantees.

Real-time planning for a combined problem of target assignment and path planning for multiple agents, also known as the unlabeled version of Multi-Agent Path Finding (MAPF), is crucial for high-level coordination in multi-agent systems, e.g., pattern formation by robot swarms. This paper studies two aspects of unlabeled-MAPF: (1) offline scenario: solving large instances by centralized approaches with small computation time, and (2) online scenario: executing unlabeled-MAPF despite timing uncertainties of real robots. For this purpose, we propose TSWAP, a novel sub-optimal complete algorithm, which takes an arbitrary initial target assignment then repeats one-timestep path planning with target swapping. TSWAP can adapt to both offline and online scenarios. We empirically demonstrate that Offline TSWAP is highly scalable; providing near-optimal solutions while reducing runtime by orders of magnitude compared to existing approaches. In addition, we present the benefits of Online TSWAP, such as delay tolerance, through real-robot demos.

Age of Information (AoI), which measures the time elapsed since the generation of the last received packet at the destination, is a new metric for real-time status update tracking applications. In this paper, we consider a status-update system in which a source node samples updates and sends them to an edge server over a delay channel. The received updates are processed by the server with an infinite buffer and then delivered to a destination. The channel can send only one update at a time, and the server can process one at a time as well. The source node applies generate-at-will model according to the state of the channel, the edge server, and the buffer. We aim to minimize the average AoI with \emph{independent and identically distributed} transmission time and processing time. We consider three online scheduling policies. The first one is the optimal long wait policy, under which the source node only transmits a new packet after the old one is delivered. Secondly, we propose a peak age threshold policy, under which the source node determines the sending time based on the estimated peak age of information (PAoI). Finally, we improve the peak age threshold policy by considering a postponed plan to reduce the waiting time in the buffer. The AoI performance under these policies is illustrated by numerical results with different parameters.

We initiate a systematic study of algorithms that are both differentially private and run in sublinear time for several problems in which the goal is to estimate natural graph parameters. Our main result is a differentially-private $(1+\rho)$-approximation algorithm for the problem of computing the average degree of a graph, for every $\rho>0$. The running time of the algorithm is roughly the same as its non-private version proposed by Goldreich and Ron (Sublinear Algorithms, 2005). We also obtain the first differentially-private sublinear-time approximation algorithms for the maximum matching size and the minimum vertex cover size of a graph. An overarching technique we employ is the notion of coupled global sensitivity of randomized algorithms. Related variants of this notion of sensitivity have been used in the literature in ad-hoc ways. Here we formalize the notion and develop it as a unifying framework for privacy analysis of randomized approximation algorithms.

Motivated by robust dynamic resource allocation in operations research, we study the Online Learning to Transport (OLT) problem where the decision variable is a probability measure, an infinite-dimensional object. We draw connections between online learning, optimal transport, and partial differential equations through an insight called the minimal selection principle, originally studied in the Wasserstein gradient flow setting by Ambrosio et al. (2005). This allows us to extend the standard online learning framework to the infinite-dimensional setting seamlessly. Based on our framework, we derive a novel method called the minimal selection or exploration (MSoE) algorithm to solve OLT problems using mean-field approximation and discretization techniques. In the displacement convex setting, the main theoretical message underpinning our approach is that minimizing transport cost over time (via the minimal selection principle) ensures optimal cumulative regret upper bounds. On the algorithmic side, our MSoE algorithm applies beyond the displacement convex setting, making the mathematical theory of optimal transport practically relevant to non-convex settings common in dynamic resource allocation.

Offroad vehicle movement has to contend with uneven and uncertain terrain which present challenges to path planning and motion control for both manned and unmanned ground vehicles. Knowledge of terrain properties can allow a vehicle to adapt its control and motion planning algorithms. Terrain properties, however, can change on time scales of days or even hours, necessitating their online estimation. The kinematics and, in particular the oscillations experienced by an offroad vehicle carry a signature of the terrain properties. These terrain properties can thus be estimated from proprioceptive sensing of the vehicle dynamics with an appropriate model and estimation algorithm. In this paper, we show that knowledge of the vertical dynamics of a vehicle due to its suspension can enable faster and more accurate estimation of terrain parameters. The paper considers a five degree of freedom model that combines the well known half-car and bicycle models. We show through simulation that the sinkage exponent, a parameter that can significantly influence the wheel forces from the terrain and thus greatly impact the vehicle trajectory, can be estimated from measurements of the vehicle's linear acceleration and rotational velocity, which can be readily obtained from an onboard IMU. We show that modelling the vertical vehicle dynamics can lead to significant improvement in both the estimation of terrain parameters and the prediction of the vehicle trajectory.

Retrosynthetic planning is a fundamental problem in chemistry for finding a pathway of reactions to synthesize a target molecule. Recently, search algorithms have shown promising results for solving this problem by using deep neural networks (DNNs) to expand their candidate solutions, i.e., adding new reactions to reaction pathways. However, the existing works on this line are suboptimal; the retrosynthetic planning problem requires the reaction pathways to be (a) represented by real-world reactions and (b) executable using "building block" molecules, yet the DNNs expand reaction pathways without fully incorporating such requirements. Motivated by this, we propose an end-to-end framework for directly training the DNNs towards generating reaction pathways with the desirable properties. Our main idea is based on a self-improving procedure that trains the model to imitate successful trajectories found by itself. We also propose a novel reaction augmentation scheme based on a forward reaction model. Our experiments demonstrate that our scheme significantly improves the success rate of solving the retrosynthetic problem from 86.84% to 96.32% while maintaining the performance of DNN for predicting valid reactions.

Interpretation of Deep Neural Networks (DNNs) training as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we make an attempt along this line by reformulating the training procedure from the trajectory optimization perspective. We first show that most widely-used algorithms for training DNNs can be linked to the Differential Dynamic Programming (DDP), a celebrated second-order trajectory optimization algorithm rooted in the Approximate Dynamic Programming. In this vein, we propose a new variant of DDP that can accept batch optimization for training feedforward networks, while integrating naturally with the recent progress in curvature approximation. The resulting algorithm features layer-wise feedback policies which improve convergence rate and reduce sensitivity to hyper-parameter over existing methods. We show that the algorithm is competitive against state-ofthe-art first and second order methods. Our work opens up new avenues for principled algorithmic design built upon the optimal control theory.

Querying graph structured data is a fundamental operation that enables important applications including knowledge graph search, social network analysis, and cyber-network security. However, the growing size of real-world data graphs poses severe challenges for graph databases to meet the response-time requirements of the applications. Planning the computational steps of query processing - Query Planning - is central to address these challenges. In this paper, we study the problem of learning to speedup query planning in graph databases towards the goal of improving the computational-efficiency of query processing via training queries.We present a Learning to Plan (L2P) framework that is applicable to a large class of query reasoners that follow the Threshold Algorithm (TA) approach. First, we define a generic search space over candidate query plans, and identify target search trajectories (query plans) corresponding to the training queries by performing an expensive search. Subsequently, we learn greedy search control knowledge to imitate the search behavior of the target query plans. We provide a concrete instantiation of our L2P framework for STAR, a state-of-the-art graph query reasoner. Our experiments on benchmark knowledge graphs including DBpedia, YAGO, and Freebase show that using the query plans generated by the learned search control knowledge, we can significantly improve the speed of STAR with negligible loss in accuracy.

The Deep Q-Network proposed by Mnih et al. [2015] has become a benchmark and building point for much deep reinforcement learning research. However, replicating results for complex systems is often challenging since original scientific publications are not always able to describe in detail every important parameter setting and software engineering solution. In this paper, we present results from our work reproducing the results of the DQN paper. We highlight key areas in the implementation that were not covered in great detail in the original paper to make it easier for researchers to replicate these results, including termination conditions and gradient descent algorithms. Finally, we discuss methods for improving the computational performance and provide our own implementation that is designed to work with a range of domains, and not just the original Arcade Learning Environment [Bellemare et al., 2013].

北京阿比特科技有限公司