
The $h$-index is a metric used to measure the impact of a user in a publication setting, such as a member of a social network with many highly liked posts or a researcher in an academic domain with many highly cited publications. Specifically, the $h$-index of a user is the largest integer $h$ such that at least $h$ publications of the user have at least $h$ units of positive feedback. We design an algorithm that, given query access to the $n$ publications of a user and each publication's corresponding positive feedback number, outputs a $(1\pm \varepsilon)$-approximation of the $h$-index of this user with probability at least $1-\delta$ in time \[ O\!\left(\frac{n \ln(1/\delta)}{\varepsilon^2 \cdot h}\right), \] where $h$ is the actual $h$-index, which is unknown to the algorithm a priori. We then design a novel lower bound technique that allows us to prove that this bound is in fact asymptotically optimal for this problem in all parameters $n$, $h$, $\varepsilon$, and $\delta$. Our work is among the first sublinear-time results to pursue asymptotically optimal bounds, especially in the error and confidence parameters, and we therefore focus on designing novel techniques for this task. In particular, our lower bound technique appears quite general: to showcase this, we also use our approach to prove an asymptotically optimal lower bound for the problem of estimating the number of triangles in a graph in sublinear time, which is now also optimal in the error and confidence parameters. This result improves upon prior lower bounds of Eden, Levi, Ron, and Seshadhri (FOCS'15) for this problem, as well as multiple follow-ups that extended this lower bound to other subgraph counting problems.
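To make the quantity concrete, here is a minimal Python sketch: an exact $h$-index computation alongside a naive uniform-sampling estimator. The sample-size constant and the estimator are illustrative assumptions only; the paper's algorithm is more refined and is what achieves the stated optimal runtime.

```python
import math
import random

def exact_h_index(citations):
    """Exact h-index: largest h such that at least h papers have >= h citations."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
    return h

def approx_h_index(citations, eps=0.1, delta=0.05, rng=random):
    """Naive sampling sketch of a (1 +/- eps)-approximation.

    For each candidate threshold t, estimate |{i : c_i >= t}| from a uniform
    sample and return the largest t whose estimated count is at least t.
    The constant 8 and the uniform-sampling scheme are illustrative; this
    sketch does not achieve the paper's optimal n*ln(1/delta)/(eps^2 * h) bound.
    """
    n = len(citations)
    m = min(n, math.ceil(8 * math.log(1 / delta) / eps**2))
    sample = [citations[rng.randrange(n)] for _ in range(m)]
    h = 0
    for t in range(1, n + 1):
        count_est = n * sum(1 for c in sample if c >= t) / m
        if count_est >= t:
            h = t
    return h
```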

Related content

In the modern paradigm of federated learning, a large number of users collaborate on a global learning task, alternating local computations with two-way communication with a distant orchestrating server. Communication, which can be slow and costly, is the main bottleneck in this setting. To reduce the communication load and thereby accelerate distributed gradient descent, two strategies are popular: 1) communicate less frequently, that is, perform several iterations of local computations between communication rounds; and 2) communicate compressed information instead of full-dimensional vectors. In this paper, we propose the first algorithm for distributed optimization and federated learning that harnesses these two strategies jointly and converges linearly to an exact solution, with a doubly accelerated rate: our algorithm benefits from the two acceleration mechanisms provided by local training and compression, namely a better dependency on the condition number of the functions and on the dimension of the model, respectively.
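For orientation, the following is a minimal sketch of the two communication-saving mechanisms in their plain, unaccelerated form (local gradient steps plus unbiased rand-$k$ compression of the transmitted update); it illustrates the setting, not the paper's doubly accelerated algorithm, and all names and constants are illustrative.

```python
import numpy as np

def rand_k(v, k, rng):
    """Unbiased rand-k sparsification: keep k random coordinates, rescale by d/k."""
    d = v.size
    out = np.zeros_like(v)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = v[idx] * (d / k)
    return out

def fed_local_compressed(grads, x0, rounds=100, local_steps=5, lr=0.1, k=10, seed=0):
    """Sketch of distributed GD with local steps and compressed updates.

    `grads` is a list of per-client gradient oracles. Each round, every client
    runs `local_steps` gradient steps from the server model, compresses its
    total update with rand-k, and the server averages the compressed updates.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    for _ in range(rounds):
        updates = []
        for g in grads:
            y = x.copy()
            for _ in range(local_steps):
                y -= lr * g(y)          # local training between communications
            updates.append(rand_k(y - x, k, rng))  # compressed communication
        x += sum(updates) / len(updates)
    return x
```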

In practical applications, data is used to make decisions in two steps: estimation and optimization. First, a machine learning model estimates parameters for a structural model relating decisions to outcomes. Second, a decision is chosen to optimize the structural model's predicted outcome as if its parameters were correctly estimated. Due to its flexibility and simple implementation, this ``estimate-then-optimize'' approach is often used for data-driven decision-making. Errors in the estimation step can lead estimate-then-optimize to sub-optimal decisions that result in regret, i.e., a difference in value between the decision made and the best decision available with knowledge of the structural model's parameters. We provide a novel bound on this regret for smooth and unconstrained optimization problems. Using this bound, in settings where estimated parameters are linear transformations of sub-Gaussian random vectors, we provide a general procedure for experimental design to minimize the regret resulting from estimate-then-optimize. We demonstrate our approach on simple examples and a pandemic control application.
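A hedged sketch of the two-step pipeline for a linear structural model (the linear model, names, and regret computation below are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def estimate_then_optimize(X, y, decision_set):
    """Step 1 (estimate): fit theta by least squares so outcome(d) ~ d @ theta.
    Step 2 (optimize): pick the decision maximizing the *estimated* outcome,
    as if theta_hat were the true parameter."""
    theta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    return max(decision_set, key=lambda d: d @ theta_hat)

def regret(decision_set, theta_true, d_chosen):
    """Regret: value gap between the best decision under the true parameters
    and the decision actually chosen."""
    best = max(d @ theta_true for d in decision_set)
    return best - d_chosen @ theta_true
```

Estimation error in step 1 propagates into step 2 exactly as the abstract describes: a misestimated theta_hat can tilt the argmax toward a suboptimal decision, and the regret above quantifies the resulting loss.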

We consider the problem of estimating (diagonally dominant) M-matrices as precision matrices in Gaussian graphical models. Such models have received increasing attention in recent years, and have shown interesting properties, e.g., the maximum likelihood estimator exists with as few as two observations regardless of the underlying dimension. In this paper, we propose an adaptive estimation method, which consists of multiple stages: In the first stage, we solve an $\ell_1$-regularized maximum likelihood estimation problem, which leads to an initial estimate; in the subsequent stages, we iteratively refine the initial estimate by solving a sequence of weighted $\ell_1$-regularized problems. We further establish theoretical guarantees on the estimation error, which decomposes into an optimization error and a statistical error. The optimization error decays to zero at a linear rate, indicating that the estimate is refined iteratively across stages, while the statistical error characterizes the statistical rate. The proposed method outperforms state-of-the-art methods in estimating precision matrices and identifying graph edges, as evidenced by synthetic and financial time-series data sets.
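A minimal sketch of such a multi-stage scheme, assuming cvxpy is available. The reweighting rule $w_{ij} \propto 1/(|\hat\Theta_{ij}| + \epsilon)$ is a standard heuristic used here purely for illustration; the M-matrix structure is imposed through nonpositive off-diagonal entries, and all constants are illustrative.

```python
import cvxpy as cp
import numpy as np

def adaptive_mle(S, lam=0.1, stages=3, eps=1e-3):
    """Sketch of multi-stage weighted l1-regularized Gaussian MLE.

    S is the sample covariance. Stage 1 solves a plain l1-regularized
    problem; each later stage reweights the penalty entrywise by
    1/(|previous estimate| + eps) and re-solves.
    """
    d = S.shape[0]
    W = lam * (np.ones((d, d)) - np.eye(d))   # no penalty on the diagonal
    Theta_prev = None
    for _ in range(stages):
        Theta = cp.Variable((d, d), symmetric=True)
        obj = cp.Minimize(-cp.log_det(Theta) + cp.trace(S @ Theta)
                          + cp.sum(cp.multiply(W, cp.abs(Theta))))
        # M-matrix structure: nonpositive off-diagonal entries.
        cons = [Theta[i, j] <= 0 for i in range(d) for j in range(d) if i != j]
        cp.Problem(obj, cons).solve()
        Theta_prev = Theta.value
        W = lam / (np.abs(Theta_prev) + eps)  # adaptive reweighting
        np.fill_diagonal(W, 0.0)
    return Theta_prev
```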

Queries with aggregation and arithmetic operations, as well as incomplete data, are common in real-world databases, but we lack a good understanding of how they should interact. On the one hand, systems based on SQL provide ad-hoc rules for numerical nulls; on the other, theoretical research largely concentrates on the standard notions of certain and possible answers. In the presence of numerical attributes and aggregates, however, these answers are often meaningless, returning either too little or too much. Our goal is to define a principled framework for databases with numerical nulls and answering queries with arithmetic and aggregations over them. Towards this goal, we assume that missing values in numerical attributes are given by probability distributions associated with marked nulls. This yields a model of probabilistic bag databases in which tuples are not necessarily independent, since nulls can repeat. We provide a general compositional framework for query answering, and then concentrate on queries that resemble standard SQL with arithmetic and aggregation. We show that these queries are measurable, and that their outputs have a finite representation. Moreover, since the classical forms of answers provide little information in the numerical setting, we look at the probability that numerical values in output tuples belong to specific intervals. Even though their exact computation is intractable, we give efficient approximation algorithms to compute such probabilities.
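The semantics can be illustrated with a Monte Carlo sketch: each marked null is drawn once per trial from its distribution (so a repeated mark induces correlated tuples), the aggregate is evaluated on the resulting concrete column, and the interval probability is the empirical hit rate. The paper's approximation algorithms are analytical; names and the setup below are illustrative assumptions.

```python
import random

def interval_probability(rows, null_samplers, aggregate, interval,
                         trials=10_000, seed=0):
    """P(aggregate of the column lies in `interval`), estimated by sampling.

    `rows` may contain marked nulls (here, string marks); `null_samplers`
    maps each mark to a sampler for its distribution. One draw per mark per
    trial models correlation from repeated nulls.
    """
    rng = random.Random(seed)
    lo, hi = interval
    hits = 0
    for _ in range(trials):
        draw = {mark: s(rng) for mark, s in null_samplers.items()}
        concrete = [draw.get(v, v) for v in rows]
        if lo <= aggregate(concrete) <= hi:
            hits += 1
    return hits / trials

# Example: SUM over a column where the marked null '_1' repeats in two tuples.
p = interval_probability(
    rows=[3, '_1', 5, '_1'],
    null_samplers={'_1': lambda rng: rng.gauss(4, 1)},
    aggregate=sum,
    interval=(10, 20),
)
```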

In machine learning, the selection of a promising model from a potentially large number of competing models and the assessment of its generalization performance are critical tasks that need careful consideration. Typically, model selection and evaluation are strictly separated endeavors: the sample at hand is split into a training, validation, and evaluation set, and only a single confidence interval is computed for the prediction performance of the final selected model. We propose, however, an algorithm for computing valid lower confidence bounds for multiple models that have been selected based on their prediction performances in the evaluation set, by interpreting the selection problem as a simultaneous inference problem. We use bootstrap tilting and a maxT-type multiplicity correction. The approach is universally applicable for any combination of prediction models, any model selection strategy, and any prediction performance measure that accepts weights. We conducted various simulation experiments which show that our proposed approach yields lower confidence bounds that are at least as good as bounds from standard approaches, and that reliably reach the nominal coverage probability. In addition, especially when the sample size is small, our proposed approach yields better-performing prediction models than the default of selecting only one model for evaluation.
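A simplified sketch of the maxT idea: bootstrap the studentized deviations of all candidate models' evaluation scores, take the $(1-\alpha)$ quantile of their maximum, and subtract. The paper uses bootstrap tilting; the plain nonparametric bootstrap below is a simplified stand-in, and all names are illustrative.

```python
import numpy as np

def simultaneous_lower_bounds(scores, alpha=0.05, B=2000, seed=0):
    """maxT-style simultaneous lower confidence bounds (sketch).

    `scores` has shape (n_samples, n_models): per-observation performance of
    each candidate model on the evaluation set. The bounds returned are valid
    simultaneously for all models, which is what licenses selecting a model
    based on these very scores.
    """
    rng = np.random.default_rng(seed)
    n, k = scores.shape
    mean = scores.mean(axis=0)
    se = scores.std(axis=0, ddof=1) / np.sqrt(n)
    max_t = np.empty(B)
    for b in range(B):
        res = scores[rng.integers(0, n, size=n)]   # bootstrap resample
        m = res.mean(axis=0)
        s = res.std(axis=0, ddof=1) / np.sqrt(n)
        max_t[b] = np.max((m - mean) / s)          # studentized deviations
    c = np.quantile(max_t, 1 - alpha)
    return mean - c * se
```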

We develop a linear time algorithm for finding the diameter of an AT-free graph. Furthermore, we update the definition of polar sets and develop new properties of polar sets for (weak) dominating pair graphs. We prove that the problems of finding simplicial vertices and finding triangles in general graphs can be solved in $O(n^2)$ time, based on existing reductions of these problems to the problem of finding the diameter in AT-free graphs. We thereby improve the best-known run-time complexities of several graph-theoretic problems.
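To convey the flavor of BFS-sweep diameter algorithms on such graph classes, here is a minimal double-sweep sketch. In general graphs it only yields a lower bound on the diameter; on AT-free graphs, sweeps started from a LexBFS endpoint are known to come within one of the true diameter. This sketch uses a plain BFS start and is purely illustrative, not the paper's algorithm.

```python
from collections import deque

def bfs_farthest(adj, src):
    """BFS from src in an adjacency-dict graph; return (farthest vertex, distance)."""
    dist = {src: 0}
    q = deque([src])
    far, far_d = src, 0
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                if dist[v] > far_d:
                    far, far_d = v, dist[v]
                q.append(v)
    return far, far_d

def double_sweep(adj):
    """Two BFS sweeps: a linear-time lower bound on the diameter."""
    start = next(iter(adj))
    u, _ = bfs_farthest(adj, start)
    _, d = bfs_farthest(adj, u)
    return d
```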

The Unsplittable Flow on a Path (UFP) problem has garnered considerable attention as a challenging combinatorial optimization problem with notable practical implications. Motivated by its pivotal applications in power engineering, the present work formulates a novel generalization of UFP, wherein demands and capacities in the input instance are monotone step functions over the set of edges. As an initial step towards tackling this generalization, we draw on and extend ideas from prior research to devise a quasi-polynomial time approximation scheme (QPTAS) under the premise that the demands and capacities lie in a quasi-polynomial range. Second, retaining the same assumption, an efficient logarithmic approximation is introduced for the single-source variant of the problem. Finally, we round out the contributions by designing a (kind of) black-box reduction that, under some mild conditions, allows us to translate LP-based approximation algorithms for the studied problem into their counterparts for the Alternating Current Optimal Power Flow (AC OPF) problem -- a fundamental problem in the operation and control of power systems.
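To fix the base problem in mind, here is a feasibility check for classical UFP: each selected task occupies its full demand on every edge of its subpath, and no edge capacity may be exceeded. In the paper's generalization, demands and capacities are monotone step functions over the edges; here both are plain per-edge numbers, purely for illustration.

```python
def feasible_on_path(num_edges, capacity, tasks, selected):
    """Check whether a set of selected UFP tasks fits on the path.

    capacity[e] is the capacity of edge e; each task is a triple
    (first_edge, last_edge, demand) and, if selected, consumes its full
    demand on every edge of its subpath.
    """
    load = [0.0] * num_edges
    for i in selected:
        first, last, demand = tasks[i]
        for e in range(first, last + 1):
            load[e] += demand
    return all(load[e] <= capacity[e] for e in range(num_edges))
```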

We prove new lower bounds for statistical estimation tasks under the constraint of $(\varepsilon, \delta)$-differential privacy. First, we provide tight lower bounds for private covariance estimation of Gaussian distributions. We show that estimating the covariance matrix in Frobenius norm requires $\Omega(d^2)$ samples, and in spectral norm requires $\Omega(d^{3/2})$ samples, both matching upper bounds up to logarithmic factors. We prove these bounds via our main technical contribution, a broad generalization of the fingerprinting method to exponential families. Additionally, using the private Assouad method of Acharya, Sun, and Zhang, we show a tight $\Omega(d/(\alpha^2 \varepsilon))$ lower bound for estimating the mean of a distribution with bounded covariance to $\alpha$-error in $\ell_2$-distance. Previously known lower bounds for all these problems were either polynomially weaker or held under the stricter condition of $(\varepsilon,0)$-differential privacy.
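These are lower bounds; the matching upper bounds are typically achieved by noise-addition mechanisms. For orientation, a minimal sketch of the Gaussian mechanism, the standard $(\varepsilon, \delta)$-DP primitive on the upper-bound side; the noise calibration below is the textbook one (valid for $\varepsilon \le 1$), used here only to ground the setting.

```python
import numpy as np

def gaussian_mechanism(value, l2_sensitivity, eps, delta, rng=None):
    """Release `value` with (eps, delta)-differential privacy.

    Classic calibration: sigma = sensitivity * sqrt(2 ln(1.25/delta)) / eps.
    `value` can be a scalar, a mean vector, or a flattened covariance matrix;
    `l2_sensitivity` is the worst-case l2 change from one individual's data.
    """
    rng = rng or np.random.default_rng()
    sigma = l2_sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / eps
    return value + rng.normal(0.0, sigma, size=np.shape(value))
```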

The problem of fair division of indivisible goods has been receiving much attention recently. The prominent fairness criterion of envy-freeness can always be satisfied in the divisible goods setting (see for example \cite{BT95}), but often cannot be satisfied in the indivisible goods setting. This has led to many relaxations thereof being introduced. We study the existence of {\em maximin share (MMS)} allocations, which is one such relaxation. Previous work has shown that MMS allocations are guaranteed to exist for all instances with $n$ players and $m$ goods if $m \leq n+4$. We extend this guarantee to the case of $m = n+5$ and show that the same guarantee fails for $m = n+6$.
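The definition is easy to make concrete: a player's maximin share is the best worst-bundle value she can guarantee by partitioning the goods into $n$ bundles herself. A brute-force sketch (exponential in $m$, so only for tiny instances):

```python
from itertools import product

def mms_value(values, n):
    """Maximin share of a player with additive `values` over the goods,
    among n players: maximize, over all partitions into n bundles, the
    value of the worst bundle. Brute force over n^m assignments."""
    m = len(values)
    best = float('-inf')
    for assignment in product(range(n), repeat=m):
        bundles = [0] * n
        for good, bundle in enumerate(assignment):
            bundles[bundle] += values[good]
        best = max(best, min(bundles))
    return best

# Example: n = 3 players, m = 8 = n + 5 goods (the regime settled here).
print(mms_value([8, 7, 6, 5, 4, 3, 2, 1], n=3))  # -> 12
```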

This paper focuses on the expected difference in a borrower's repayment when there is a change in the lender's credit decisions. Classical estimators overlook confounding effects, and hence their estimation error can be substantial. We therefore propose an alternative approach to constructing the estimators that greatly reduces this error. The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of the classical and the proposed estimators in estimating the causal quantities. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction in estimation error is strikingly substantial when the causal effects are accounted for correctly.
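The abstract does not specify the estimators; as one standard way to correct for confounding in such settings, here is a hedged sketch of an inverse-propensity-weighted (IPW) contrast. It illustrates why a naive comparison of group means is biased when credit decisions depend on borrower features, and is not the authors' construction.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_effect(X, decision, repayment):
    """IPW estimate of the expected repayment change from switching the
    credit decision from 0 to 1.

    A naive contrast of group means is confounded whenever the decision
    depends on borrower features X; reweighting each observation by its
    estimated propensity corrects for that under the standard assumptions
    of unconfoundedness given X and overlap.
    """
    ps = LogisticRegression(max_iter=1000).fit(X, decision).predict_proba(X)[:, 1]
    d = np.asarray(decision, dtype=float)
    y = np.asarray(repayment, dtype=float)
    treated = np.mean(d * y / ps)
    control = np.mean((1 - d) * y / (1 - ps))
    return treated - control
```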
