Temporal graphs have recently been introduced to model changes to a given network that occur throughout a fixed period of time. We introduce and investigate the Temporal $\Delta$ Independent Set problem, a temporal variant of the well-known Independent Set problem. This problem is motivated, for example, by the task of finding conflict-free schedules for maximum subsets of tasks that have certain (changing) constraints on each day they need to be performed. We are specifically interested in the case where each task needs to be performed in a certain time interval on each day and two tasks are in conflict on a day if their time intervals overlap on that day. This leads us to consider Temporal $\Delta$ Independent Set on the restricted class of temporal unit interval graphs, i.e., temporal graphs where each layer is a unit interval graph. We present several hardness results for this problem, as well as two algorithms. The first is a constant-factor approximation algorithm for instances where $\tau$, the total number of time steps (layers) of the temporal graph, and $\Delta$, a parameter that allows us to model some tolerance in the conflicts, are constants. For the second result we use the notion of order preservation for temporal unit interval graphs that, informally, requires the intervals of every layer to obey a common ordering. We provide an FPT algorithm parameterized by the size of a minimum vertex deletion set to order preservation.

Related content

Consider a set $P$ of $n$ points in $\mathbb{R}^d$. In the discrete median line segment problem, the objective is to find a line segment bounded by a pair of points in $P$ such that the sum of the Euclidean distances from $P$ to the line segment is minimized. In the continuous median line segment problem, a real number $\ell>0$ is given, and the goal is to locate a line segment of length $\ell$ in $\mathbb{R}^d$ such that the sum of the Euclidean distances between $P$ and the line segment is minimized. We show how to compute $(1+\epsilon\Delta)$- and $(1+\epsilon)$-approximations to a discrete median line segment in time $O(n\epsilon^{-2d}\log n)$ and $O(n^2\epsilon^{-d})$, respectively, where $\Delta$ is the spread of line segments spanned by pairs of points. While developing our algorithms, by using the principle of pair decomposition, we derive new data structures that allow us to quickly approximate the sum of the distances from a set of points to a given line segment or point. To our knowledge, our utilization of pair decompositions for solving minsum facility location problems is the first of its kind; it is versatile and easily implementable. We prove that it is impossible to construct a continuous median line segment for $n\geq3$ non-collinear points in the plane by using only ruler and compass. In view of this, we present an $O(n^d\epsilon^{-d})$-time algorithm for approximating a continuous median line segment in $\mathbb{R}^d$ within a factor of $1+\epsilon$. The algorithm is based upon generalizing the point-segment pair decomposition from the discrete to the continuous domain. Last but not least, we give a $(1+\epsilon)$-approximation algorithm, whose time complexity is sub-quadratic in $n$, for solving the constrained median line segment problem in $\mathbb{R}^2$ where an endpoint or the slope of the median line segment is given as input.
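To make the discrete objective concrete, the following is a minimal brute-force sketch, not the approximation algorithm described in the abstract. It evaluates the sum of point-to-segment distances for every pair of input points and keeps the cheapest pair. The function names `point_segment_distance` and `discrete_median_segment` are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the closed segment ab (any dimension)."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(c * c for c in ab)
    if denom == 0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Project p onto the line through a and b, then clamp to the segment.
        t = sum(x * y for x, y in zip(ap, ab)) / denom
        t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return math.sqrt(sum((pi - ci) ** 2 for pi, ci in zip(p, closest)))


def discrete_median_segment(points):
    """Exhaustive search over all pairs; uses O(n^3) distance evaluations."""
    best = None
    for a, b in combinations(points, 2):
        cost = sum(point_segment_distance(p, a, b) for p in points)
        if best is None or cost < best[0]:
            best = (cost, a, b)
    return best


points = [(0, 0), (2, 0), (1, 1)]
cost, a, b = discrete_median_segment(points)
print(cost, a, b)
```

The exhaustive search only serves as an exact baseline for small inputs; the running times quoted in the abstract rely on pair decompositions, which this sketch does not implement.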

Generative adversarial networks (GANs) are so complex that existing learning theories do not provide a satisfactory explanation for why GANs have great success in practice. The same question remains largely open for deep neural networks. To fill this gap, we introduce a Lipschitz theory to analyze generalization. We demonstrate its simplicity by analyzing generalization and consistency of overparameterized neural networks. We then use this theory to derive Lipschitz-based generalization bounds for GANs. Our bounds show that penalizing the Lipschitz constant of the GAN loss can improve generalization. This result answers the long-standing mystery of why the popular use of Lipschitz constraints for GANs often leads to great empirical success despite the lack of a solid theory. Finally, and surprisingly, we show that, when using Dropout or spectral normalization, both \emph{truly deep} neural networks and GANs can generalize well without the curse of dimensionality.

In this paper, we investigate the problem of computing Bayesian estimators using Langevin Monte-Carlo type approximation. The novelty of this paper is to consider together the statistical and numerical counterparts (in a general log-concave setting). More precisely, we address the following question: given $n$ observations in $\mathbb{R}^q$ distributed under an unknown probability $\mathbb{P}_{\theta^\star}$ with $\theta^\star \in \mathbb{R}^d$ , what is the optimal numerical strategy and its cost for the approximation of $\theta^\star$ with the Bayesian posterior mean? To answer this question, we establish some quantitative statistical bounds related to the underlying Poincar\'e constant of the model and establish new results about the numerical approximation of Gibbs measures by Cesaro averages of Euler schemes of (over-damped) Langevin diffusions. These last results include in particular some quantitative controls in the weakly convex case based on new bounds on the solution of the related Poisson equation of the diffusion.

We study norm-based uniform convergence bounds for neural networks, aiming at a tight understanding of how these are affected by the architecture and type of norm constraint, for the simple class of scalar-valued one-hidden-layer networks, and inputs bounded in Euclidean norm. We begin by proving that in general, controlling the spectral norm of the hidden layer weight matrix is insufficient to get uniform convergence guarantees (independent of the network width), while a stronger Frobenius norm control is sufficient, extending and improving on previous work. Motivated by the proof constructions, we identify and analyze two important settings where a mere spectral norm control turns out to be sufficient: First, when the network's activation functions are sufficiently smooth (with the result extending to deeper networks); and second, for certain types of convolutional networks. In the latter setting, we study how the sample complexity is additionally affected by parameters such as the amount of overlap between patches and the overall number of patches.

Benign overfitting demonstrates that overparameterized models can perform well on test data while fitting noisy training data. However, it only considers the final min-norm solution in linear regression, which ignores the algorithm information and the corresponding training procedure. In this paper, we generalize the idea of benign overfitting to the whole training trajectory instead of the min-norm solution and derive a time-variant bound based on the trajectory analysis. Starting from the time-variant bound, we further derive a time interval that suffices to guarantee a consistent generalization error for a given feature covariance. Unlike existing approaches, the newly proposed generalization bound is characterized by a time-variant effective dimension of the feature covariance. By introducing the time factor, we relax the strict assumption on the feature covariance matrix required in previous benign overfitting analyses in the regime of overparameterized linear regression with gradient descent. This paper extends the scope of benign overfitting, and experimental results indicate that the proposed bound accords better with empirical evidence.

In this paper we study temporal design problems of undirected temporally connected graphs. The basic setting of these optimization problems is as follows: given an undirected graph $G$, what is the smallest number $|\lambda|$ of time-labels that we need to add to the edges of $G$ such that the resulting temporal graph $(G,\lambda)$ is temporally connected? As we prove, this basic problem, called MINIMUM LABELING, can be optimally solved in polynomial time, thus resolving an open question. However, the situation becomes more complicated if we strengthen, or even if we relax a bit, the requirement of temporal connectivity of $(G,\lambda)$. One way to strengthen the temporal connectivity requirements is to upper-bound the allowed age (i.e., maximum label) of the obtained temporal graph $(G,\lambda)$. On the other hand, we can relax temporal connectivity by only requiring that there exists a temporal path between any pair of ``important'' vertices which lie in a subset $R\subseteq V$, which we call the terminals. This relaxed problem, called MINIMUM STEINER LABELING, resembles the problem STEINER TREE in static (i.e., non-temporal) graphs; however, as it turns out, STEINER TREE is not a special case of MINIMUM STEINER LABELING. We prove that MINIMUM STEINER LABELING is NP-hard and in FPT with respect to the number $|R|$ of terminals. Moreover, we prove that adding the age restriction makes the above problems strictly harder (unless P=NP or W[1]=FPT). More specifically, we prove that the age-restricted version of MINIMUM LABELING becomes NP-complete on undirected graphs, while the age-restricted version of MINIMUM STEINER LABELING becomes W[1]-hard with respect to the number $|R|$ of terminals.
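As an illustration of the temporal connectivity requirement, here is a minimal sketch of a checker for a given labeling. It is not an algorithm for MINIMUM LABELING itself. It assumes that a temporal path must use strictly increasing, positive integer labels; the abstract does not state this convention, so treat it as an assumption. The names `earliest_arrival` and `is_temporally_connected` are hypothetical.

```python
def earliest_arrival(n, labeled_edges, source):
    """Earliest arrival time at every vertex from `source`.

    labeled_edges: list of (u, v, t) with undirected edge {u, v} available at time t.
    Assumption: labels along a temporal path must strictly increase.
    """
    inf = float("inf")
    arrival = [inf] * n
    arrival[source] = 0  # labels are assumed positive, so 0 precedes all of them
    by_time = {}
    for u, v, t in labeled_edges:
        by_time.setdefault(t, []).append((u, v))
    for t in sorted(by_time):
        # Collect updates first so two edges with the same label cannot be chained.
        reached = []
        for u, v in by_time[t]:
            if arrival[u] < t and arrival[v] > t:
                reached.append(v)
            if arrival[v] < t and arrival[u] > t:
                reached.append(u)
        for w in reached:
            arrival[w] = t
    return arrival


def is_temporally_connected(n, labeled_edges):
    """True if every vertex reaches every other vertex by a temporal path."""
    return all(
        a != float("inf")
        for s in range(n)
        for a in earliest_arrival(n, labeled_edges, s)
    )


# Path 0 - 1 - 2: two labels are not enough, a third label on edge {0, 1} suffices.
print(is_temporally_connected(3, [(0, 1, 1), (1, 2, 2)]))
print(is_temporally_connected(3, [(0, 1, 1), (1, 2, 2), (0, 1, 3)]))
```

On the three-vertex path, vertex 2 cannot reach vertex 0 with labels 1 and 2 alone, because the only edge into vertex 0 is available before vertex 1 has been reached from vertex 2.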

This paper studies the consistency and statistical inference of simulated Ising models in the high-dimensional setting. Our estimators are based on the Markov chain Monte Carlo maximum likelihood estimation (MCMC-MLE) method penalized by the Elastic-net. Under mild conditions that ensure a specific convergence rate of the MCMC method, the $\ell_{1}$ consistency of the Elastic-net-penalized MCMC-MLE is proved. We further propose a decorrelated score test based on the decorrelated score function and prove the asymptotic normality of the score function without the influence of many nuisance parameters under an assumption that accelerates the convergence of the MCMC method. The one-step estimator for a single parameter of interest is proposed by linearizing the decorrelated score function to solve for its root, and its asymptotic normality and a confidence interval for the true value are thereby established. Finally, we use different algorithms to control the false discovery rate (FDR) via traditional p-values and novel e-values.

We study full Bayesian procedures for high-dimensional linear regression. We adopt data-dependent empirical priors introduced in [1]. In their paper, these priors have nice posterior contraction properties and are easy to compute. Our paper extends their theoretical results to the case of unknown error variance. Under a proper sparsity assumption, we achieve model selection consistency, posterior contraction rates, and a Bernstein-von Mises theorem by analyzing the multivariate t-distribution.

Modern neural network architectures can leverage large amounts of data to generalize well within the training distribution. However, they are less capable of systematic generalization to data drawn from unseen but related distributions, a feat that is hypothesized to require compositional reasoning and reuse of knowledge. In this work, we present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules, which we call \emph{functions}. Inputs to the model are routed through a sequence of functions in a way that is end-to-end learned. The proposed architecture can flexibly compose computation along width and depth, and lends itself well to capacity extension after training. To demonstrate the versatility of Neural Interpreters, we evaluate it in two distinct settings: image classification and visual abstract reasoning on Raven Progressive Matrices. In the former, we show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner. In the latter, we find that Neural Interpreters are competitive with the state of the art in terms of systematic generalization.

We study the link between generalization and interference in temporal-difference (TD) learning. Interference is defined as the inner product of two different gradients, representing their alignment. This quantity emerges as being of interest from a variety of observations about neural networks, parameter sharing and the dynamics of learning. We find that TD easily leads to low-interference, under-generalizing parameters, while the effect seems reversed in supervised learning. We hypothesize that the cause can be traced back to the interplay between the dynamics of interference and bootstrapping. This is supported empirically by several observations: the negative relationship between the generalization gap and interference in TD, the negative effect of bootstrapping on interference and the local coherence of targets, and the contrast between the propagation rate of information in TD(0) versus TD($\lambda$) and regression tasks such as Monte-Carlo policy evaluation. We hope that these new findings can guide the future discovery of better bootstrapping methods.
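The definition of interference as an inner product of two gradients can be sketched in a few lines. The linear model and squared loss below are illustrative assumptions only and are not the networks or TD objectives studied in the abstract; the function names are hypothetical.

```python
def grad_squared_loss(w, x, y):
    """Gradient with respect to w of 0.5 * (w . x - y)^2 for a linear model."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]


def interference(g1, g2):
    """Inner product of two gradients; positive values mean the updates align."""
    return sum(a * b for a, b in zip(g1, g2))


w = [1.0, 0.0]
g_a = grad_squared_loss(w, [1.0, 0.0], 0.0)  # sample A
g_b = grad_squared_loss(w, [0.0, 1.0], 1.0)  # sample B, orthogonal features
g_c = grad_squared_loss(w, [2.0, 0.0], 0.0)  # sample C, shares a feature with A

print(interference(g_a, g_b))  # orthogonal features give zero interference
print(interference(g_a, g_c))  # shared feature gives positive interference
```

In this toy setting, a gradient step on sample A leaves the prediction for sample B unchanged (zero interference) but also changes the prediction for sample C (positive interference).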
