亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

We consider prediction with expert advice when data are generated from distributions varying arbitrarily within an unknown constraint set. This semi-adversarial setting includes (at the extremes) the classical i.i.d. setting, when the unknown constraint set is restricted to be a singleton, and the unconstrained adversarial setting, when the constraint set is the set of all distributions. The Hedge algorithm -- long known to be minimax (rate) optimal in the adversarial regime -- was recently shown to be simultaneously minimax optimal for i.i.d. data. In this work, we propose to relax the i.i.d. assumption by seeking adaptivity at all levels of a natural ordering on constraint sets. We provide matching upper and lower bounds on the minimax regret at all levels, show that Hedge with deterministic learning rates is suboptimal outside of the extremes, and prove that one can adaptively obtain minimax regret at all levels. We achieve this optimal adaptivity using the follow-the-regularized-leader (FTRL) framework, with a novel adaptive regularization scheme that implicitly scales as the square root of the entropy of the current predictive distribution, rather than the entropy of the initial predictive distribution. Finally, we provide novel technical tools to study the statistical performance of FTRL along the semi-adversarial spectrum.

相關內容

We analyze the finite element discretization of distributed elliptic optimal control problems with variable energy regularization, where the usual $L^2(\Omega)$ norm regularization term with a constant regularization parameter $\varrho$ is replaced by a suitable representation of the energy norm in $H^{-1}(\Omega)$ involving a variable, mesh-dependent regularization parameter $\varrho(x)$. It turns out that the error between the computed finite element state $\widetilde{u}_{\varrho h}$ and the desired state $\bar{u}$ (target) is optimal in the $L^2(\Omega)$ norm provided that $\varrho(x)$ behaves like the local mesh size squared. This is especially important when adaptive meshes are used in order to approximate discontinuous target functions. The adaptive scheme can be driven by the computable and localizable error norm $\| \widetilde{u}_{\varrho h} - \bar{u}\|_{L^2(\Omega)}$ between the finite element state $\widetilde{u}_{\varrho h}$ and the target $\bar{u}$. The numerical results not only illustrate our theoretical findings, but also show that the iterative solvers for the discretized reduced optimality system are very efficient and robust.

Conventional multi-user multiple-input multiple-output (MU-MIMO) mainly focused on Gaussian signaling, independent and identically distributed (IID) channels, and a limited number of users. It will be laborious to cope with the heterogeneous requirements in next-generation wireless communications, such as various transmission data, complicated communication scenarios, and massive user access. Therefore, this paper studies a generalized MU-MIMO (GMU-MIMO) system with more practical constraints, i.e., non-Gaussian signaling, non-IID channel, and massive users and antennas. These generalized assumptions bring new challenges in theory and practice. For example, there is no accurate capacity analysis for GMU-MIMO. In addition, it is unclear how to achieve the capacity optimal performance with practical complexity. To address these challenges, a unified framework is proposed to derive the GMU-MIMO capacity and design a capacity optimal transceiver, which jointly considers encoding, modulation, detection, and decoding. Group asymmetry is developed to make a tradeoff between user rate allocation and implementation complexity. Specifically, the capacity region of group asymmetric GMU-MIMO is characterized by using the celebrated mutual information and minimum mean-square error (MMSE) lemma and the MMSE optimality of orthogonal approximate message passing (OAMP)/vector AMP (VAMP). Furthermore, a theoretically optimal multi-user OAMP/VAMP receiver and practical multi-user low-density parity-check (MU-LDPC) codes are proposed to achieve the capacity region of group asymmetric GMU-MIMO. Numerical results verify that the gaps between theoretical detection thresholds of the proposed framework with optimized MU-LDPC codes and QPSK modulation and the sum capacity of GMU-MIMO are about 0.2 dB. Moreover, their finite-length performances are about 1~2 dB away from the associated sum capacity.

We study the offline reinforcement learning (RL) in the face of unmeasured confounders. Due to the lack of online interaction with the environment, offline RL is facing the following two significant challenges: (i) the agent may be confounded by the unobserved state variables; (ii) the offline data collected a prior does not provide sufficient coverage for the environment. To tackle the above challenges, we study the policy learning in the confounded MDPs with the aid of instrumental variables. Specifically, we first establish value function (VF)-based and marginalized importance sampling (MIS)-based identification results for the expected total reward in the confounded MDPs. Then by leveraging pessimism and our identification results, we propose various policy learning methods with the finite-sample suboptimality guarantee of finding the optimal in-class policy under minimal data coverage and modeling assumptions. Lastly, our extensive theoretical investigations and one numerical study motivated by the kidney transplantation demonstrate the promising performance of the proposed methods.

The significant presence of demand charges in electric bills motivates large-load customers to utilize energy storage to reduce the peak procurement from the grid. We herein study the problem of energy storage allocation for peak minimization, under the online setting where irrevocable decisions are sequentially made without knowing future demands. The problem is uniquely challenging due to (i) the coupling of online decisions across time imposed by the inventory constraints and (ii) the noncumulative nature of the peak procurement. We apply the CR-Pursuit framework and address the challenges unique to our minimization problem to design an online algorithm achieving the optimal competitive ratio (CR) among all online algorithms. We show that the optimal CR can be computed in polynomial time by solving a linear number of linear-fractional problems. More importantly, we generalize our approach to develop an \emph{anytime-optimal} online algorithm that achieves the best possible CR at any epoch, given the inputs and online decisions so far. The algorithm retains the optimal worst-case performance and attains adaptive average-case performance. Trace-driven simulations show that our algorithm can decrease the peak demand by an extra 19% compared to baseline alternatives under typical settings.

Prophet inequalities for rewards maximization are fundamental results from optimal stopping theory with several applications to mechanism design and online optimization. We study the cost minimization counterpart of the classical prophet inequality, where one is facing a sequence of costs $X_1, X_2, \dots, X_n$ in an online manner and must ''stop'' at some point and take the last cost seen. Given that the $X_i$'s are independent, drawn from known distributions, the goal is to devise a stopping strategy $S$ (online algorithm) that minimizes the expected cost. We first observe that if the $X_i$'s are not identically distributed, then no strategy can achieve a bounded approximation, no matter if the arrival order is adversarial or random. This leads us to consider the case where the $X_i$'s are I.I.D.. For the I.I.D. case, we give a complete characterization of the optimal stopping strategy. We show that it achieves a (distribution-dependent) constant-factor approximation to the prophet's cost for almost all distributions and that this constant is tight. In particular, for distributions for which the integral of the hazard rate is a polynomial $H(x) = \sum_{i=1}^k a_i x^{d_i}$, where $d_1 < \dots < d_k$, the approximation factor is $\lambda(d_1)$, a decreasing function of $d_1$. Furthermore, for MHR distributions, we show that this constant is at most $2$, and this is again tight. We also analyze single-threshold strategies for the cost prophet inequality problem. We design a threshold that achieves a $\operatorname{O}(\operatorname{polylog}n)$-factor approximation, where the exponent in the logarithmic factor is a distribution-dependent constant, and we show a matching lower bound. We believe that our results are of independent interest for analyzing approximately optimal (posted price-style) mechanisms for procuring items.

A goodness-of-fit test for one-parameter count distributions with finite second moment is proposed. The test statistic is derived from the $L_1$-distance of a function of the probability generating function of the model under the null hypothesis and that of the random variable actually generating data, when the latter belongs to a suitable wide class of alternatives. The test statistic has a rather simple form and it is asymptotically normally distributed under the null hypothesis, allowing a straightforward implementation of the test. Moreover, the test is consistent for alternative distributions belonging to the class, but also for all the alternative distributions whose probability of zero is different from that under the null hypothesis. Thus, the use of the test is proposed and investigated also for alternatives not in the class. The finite-sample properties of the test are assessed by means of an extensive simulation study.

In multirotor systems, guaranteeing safety while considering unknown disturbances is essential for robust trajectory planning. Computing the forward reachable set (FRS), the set of all possible states with bounded disturbances, can be a viable solution to find robust and collision-free trajectories. However, in many cases, the FRS is not calculated in real time and is too conservative to be used in actual applications. In this paper, we mitigate these problems by applying a nonlinear disturbance observer (NDOB) and an adaptive controller to the multirotor system. We formulate the FRS of the closed-loop system combined with the adaptive controller in augmented state space by exploiting the Hamilton-Jacobi reachability analysis and then present the ellipsoidal approximation in a closed-form expression to compute the small FRS in real time. Moreover, tighter disturbance bounds in the prediction horizon are inferred from the NDOB so that a much smaller FRS can be generated. Numerical examples validate the computational efficiency and the smaller scale of the proposed FRS compared to the baseline.

In this paper, we propose a novel architecture for direct extractive speech-to-speech summarization, ESSumm, which is an unsupervised model without dependence on intermediate transcribed text. Different from previous methods with text presentation, we are aimed at generating a summary directly from speech without transcription. First, a set of smaller speech segments are extracted based on speech signal's acoustic features. For each candidate speech segment, a distance-based summarization confidence score is designed for latent speech representation measure. Specifically, we leverage the off-the-shelf self-supervised convolutional neural network to extract the deep speech features from raw audio. Our approach automatically predicts the optimal sequence of speech segments that capture the key information with a target summary length. Extensive results on two well-known meeting datasets (AMI and ICSI corpora) show the effectiveness of our direct speech-based method to improve the summarization quality with untranscribed data. We also observe that our unsupervised speech-based method even performs on par with recent transcript-based summarization approaches, where extra speech recognition is required.

We consider a potential outcomes model in which interference may be present between any two units but the extent of interference diminishes with spatial distance. The causal estimand is the global average treatment effect, which compares outcomes under the counterfactuals that all or no units are treated. We study a class of designs in which space is partitioned into clusters that are randomized into treatment and control. For each design, we estimate the treatment effect using a Horvitz-Thompson estimator that compares the average outcomes of units with all or no neighbors treated, where the neighborhood radius is of the same order as the cluster size dictated by the design. We derive the estimator's rate of convergence as a function of the design and degree of interference and use this to obtain estimator-design pairs that achieve near-optimal rates of convergence under relatively minimal assumptions on interference. We prove that the estimators are asymptotically normal and provide a variance estimator. For practical implementation of the designs, we suggest partitioning space using clustering algorithms.

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.

北京阿比特科技有限公司