亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

For any small positive real $\varepsilon$ and integer $t > \frac{1}{\varepsilon}$, we build a graph with a vertex deletion set of size $t$ to a tree, and twin-width greater than $2^{(1-\varepsilon) t}$. In particular, this shows that the twin-width is sometimes exponential in the treewidth, in the so-called oriented twin-width and grid number, and that adding an apex may multiply the twin-width by at least $2-\varepsilon$. Except for the one in oriented twin-width, these lower bounds are essentially tight.

相關內容

We establish the following two main results on order types of points in general position in the plane (realizable simple planar order types, realizable uniform acyclic oriented matroids of rank $3$): (a) The number of extreme points in an $n$-point order type, chosen uniformly at random from all such order types, is on average $4+o(1)$. For labeled order types, this number has average $4- \frac{8}{n^2 - n +2}$ and variance at most $3$. (b) The (labeled) order types read off a set of $n$ points sampled independently from the uniform measure on a convex planar domain, smooth or polygonal, or from a Gaussian distribution are concentrated, i.e. such sampling typically encounters only a vanishingly small fraction of all order types of the given size. Result (a) generalizes to arbitrary dimension $d$ for labeled order types with the average number of extreme points $2d+o(1)$ and constant variance. We also discuss to what extent our methods generalize to the abstract setting of uniform acyclic oriented matroids. Moreover, our methods allow to show the following relative of the Erd\H{o}s-Szekeres theorem: for any fixed $k$, as $n \to \infty$, a proportion $1 - O(1/n)$ of the $n$-point simple order types contain a triangle enclosing a convex $k$-chain over an edge. For the unlabeled case in (a), we prove that for any antipodal, finite subset of the $2$-dimensional sphere, the group of orientation preserving bijections is cyclic, dihedral or one of $A_4$, $S_4$ or $A_5$ (and each case is possible). These are the finite subgroups of $SO(3)$ and our proof follows the lines of their characterization by Felix Klein.

Neural networks with the Rectified Linear Unit (ReLU) nonlinearity are described by a vector of parameters $\theta$, and realized as a piecewise linear continuous function $R_{\theta}: x \in \mathbb R^{d} \mapsto R_{\theta}(x) \in \mathbb R^{k}$. Natural scalings and permutations operations on the parameters $\theta$ leave the realization unchanged, leading to equivalence classes of parameters that yield the same realization. These considerations in turn lead to the notion of identifiability -- the ability to recover (the equivalence class of) $\theta$ from the sole knowledge of its realization $R_{\theta}$. The overall objective of this paper is to introduce an embedding for ReLU neural networks of any depth, $\Phi(\theta)$, that is invariant to scalings and that provides a locally linear parameterization of the realization of the network. Leveraging these two key properties, we derive some conditions under which a deep ReLU network is indeed locally identifiable from the knowledge of the realization on a finite set of samples $x_{i} \in \mathbb R^{d}$. We study the shallow case in more depth, establishing necessary and sufficient conditions for the network to be identifiable from a bounded subset $\mathcal X \subseteq \mathbb R^{d}$.

We consider a high-dimensional mean estimation problem over a binary hidden Markov model, which illuminates the interplay between memory in data, sample size, dimension, and signal strength in statistical inference. In this model, an estimator observes $n$ samples of a $d$-dimensional parameter vector $\theta_{*}\in\mathbb{R}^{d}$, multiplied by a random sign $ S_i $ ($1\le i\le n$), and corrupted by isotropic standard Gaussian noise. The sequence of signs $\{S_{i}\}_{i\in[n]}\in\{-1,1\}^{n}$ is drawn from a stationary homogeneous Markov chain with flip probability $\delta\in[0,1/2]$. As $\delta$ varies, this model smoothly interpolates two well-studied models: the Gaussian Location Model for which $\delta=0$ and the Gaussian Mixture Model for which $\delta=1/2$. Assuming that the estimator knows $\delta$, we establish a nearly minimax optimal (up to logarithmic factors) estimation error rate, as a function of $\|\theta_{*}\|,\delta,d,n$. We then provide an upper bound to the case of estimating $\delta$, assuming a (possibly inaccurate) knowledge of $\theta_{*}$. The bound is proved to be tight when $\theta_{*}$ is an accurately known constant. These results are then combined to an algorithm which estimates $\theta_{*}$ with $\delta$ unknown a priori, and theoretical guarantees on its error are stated.

Practical data assimilation algorithms often contain hyper-parameters, which may arise due to, for instance, the use of certain auxiliary techniques like covariance inflation and localization in an ensemble Kalman filter, the re-parameterization of certain quantities such as model and/or observation error covariance matrices, and so on. Given the richness of the established assimilation algorithms, and the abundance of the approaches through which hyper-parameters are introduced to the assimilation algorithms, one may ask whether it is possible to develop a sound and generic method to efficiently choose various types of (sometimes high-dimensional) hyper-parameters. This work aims to explore a feasible, although likely partial, answer to this question. Our main idea is built upon the notion that a data assimilation algorithm with hyper-parameters can be considered as a parametric mapping that links a set of quantities of interest (e.g., model state variables and/or parameters) to a corresponding set of predicted observations in the observation space. As such, the choice of hyper-parameters can be recast as a parameter estimation problem, in which our objective is to tune the hyper-parameters in such a way that the resulted predicted observations can match the real observations to a good extent. From this perspective, we propose a hyper-parameter estimation workflow and investigate the performance of this workflow in an ensemble Kalman filter. In a series of experiments, we observe that the proposed workflow works efficiently even in the presence of a relatively large amount (up to $10^3$) of hyper-parameters, and exhibits reasonably good and consistent performance under various conditions.

Two recent and seemingly-unrelated techniques for proving mixing bounds for Markov chains are: (i) the framework of Spectral Independence, introduced by Anari, Liu and Oveis Gharan, and its numerous extensions, which have given rise to several breakthroughs in the analysis of mixing times of discrete Markov chains and (ii) the Stochastic Localization technique which has proven useful in establishing mixing and expansion bounds for both log-concave measures and for measures on the discrete hypercube. In this paper, we introduce a framework which connects ideas from both techniques. Our framework unifies, simplifies and extends those two techniques. In its center is the concept of a localization scheme which, to every probability measure, assigns a martingale of probability measures which localize in space as time evolves. As it turns out, to every such scheme corresponds a Markov chain, and many chains of interest appear naturally in this framework. This viewpoint provides tools for deriving mixing bounds for the dynamics through the analysis of the corresponding localization process. Generalizations of concepts of Spectral Independence and Entropic Independence naturally arise from our definitions, and in particular we recover the main theorems in the spectral and entropic independence frameworks via simple martingale arguments (completely bypassing the need to use the theory of high-dimensional expanders). We demonstrate the strength of our proposed machinery by giving short and (arguably) simpler proofs to many mixing bounds in the recent literature, including giving the first $O(n \log n)$ bound for the mixing time of Glauber dynamics on the hardcore-model (of arbitrary degree) in the tree-uniqueness regime.

Higher-order tensor data are prevailing in a wide range of fields including high-resolution videos, multimodality imaging such as MRI and fMRI scans, commercial networks, engineering such as signal processing, and elsewhere. Tucker decomposition may be the most general low-rank approximation method among versatile decompositions of higher-order tensors owning to its strong compression ability, whilst statistical properties of the induced Tucker tensor factor model (TuTFaM) remains a big challenge and yet critical before it provides justification for applications in machine learning and beyond. Existing theoretical developments mainly focus on the field of time series with the assumption of strong auto-correlation among temporally ordered observations, which is ineffective for independent and weakly dependent tensor observations. Under quite mild assumptions, this article kicks off matricization of raw weakly correlated tensor observations within the TuTFaM setting, and proposes two sets of PCA based estimation procedures, moPCA and its refinement IPmoPCA, the latter of which is enhanced in rate of convergence. We develop their asymptotic behaviors, including mainly convergence rates and asymptotic distributions of estimators of loading matrices, latent tensor factors and signal parts. The theoretical results can reduce to those in low-order tensor factor models in existing literature. The proposed approaches outperform existing auto-covariance based methods for tensor time series in terms of effects of estimation and tensor reconstruction, in both simulation experiments and two real data examples.

We show that many graphs with bounded treewidth can be described as subgraphs of the strong product of a graph with smaller treewidth and a bounded-size complete graph. To this end, define the "underlying treewidth" of a graph class $\mathcal{G}$ to be the minimum non-negative integer $c$ such that, for some function $f$, for every graph ${G \in \mathcal{G}}$ there is a graph $H$ with ${\text{tw}(H) \leq c}$ such that $G$ is isomorphic to a subgraph of ${H \boxtimes K_{f(\text{tw}(G))}}$. We introduce disjointed coverings of graphs and show they determine the underlying treewidth of any graph class. Using this result, we prove that the class of planar graphs has underlying treewidth 3; the class of $K_{s,t}$-minor-free graphs has underlying treewidth $s$ (for ${t \geq \max\{s,3\}}$); and the class of $K_t$-minor-free graphs has underlying treewidth ${t-2}$. In general, we prove that a monotone class has bounded underlying treewidth if and only if it excludes some fixed topological minor. We also study the underlying treewidth of graph classes defined by an excluded subgraph or excluded induced subgraph. We show that the class of graphs with no $H$ subgraph has bounded underlying treewidth if and only if every component of $H$ is a subdivided star, and that the class of graphs with no induced $H$ subgraph has bounded underlying treewidth if and only if every component of $H$ is a star.

Measuring the stability of conclusions derived from Ordinary Least Squares linear regression is critically important, but most metrics either only measure local stability (i.e. against infinitesimal changes in the data), or are only interpretable under statistical assumptions. Recent work proposes a simple, global, finite-sample stability metric: the minimum number of samples that need to be removed so that rerunning the analysis overturns the conclusion, specifically meaning that the sign of a particular coefficient of the estimated regressor changes. However, besides the trivial exponential-time algorithm, the only approach for computing this metric is a greedy heuristic that lacks provable guarantees under reasonable, verifiable assumptions; the heuristic provides a loose upper bound on the stability and also cannot certify lower bounds on it. We show that in the low-dimensional regime where the number of covariates is a constant but the number of samples is large, there are efficient algorithms for provably estimating (a fractional version of) this metric. Applying our algorithms to the Boston Housing dataset, we exhibit regression analyses where we can estimate the stability up to a factor of $3$ better than the greedy heuristic, and analyses where we can certify stability to dropping even a majority of the samples.

Solving the time-dependent Schr\"odinger equation is an important application area for quantum algorithms. We consider Schr\"odinger's equation in the semi-classical regime. Here the solutions exhibit strong multiple-scale behavior due to a small parameter $\hbar$, in the sense that the dynamics of the quantum states and the induced observables can occur on different spatial and temporal scales. Such a Schr\"odinger equation finds many applications, including in Born-Oppenheimer molecular dynamics and Ehrenfest dynamics. This paper considers quantum analogues of pseudo-spectral (PS) methods on classical computers. Estimates on the gate counts in terms of $\hbar$ and the precision $\varepsilon$ are obtained. It is found that the number of required qubits, $m$, scales only logarithmically with respect to $\hbar$. When the solution has bounded derivatives up to order $\ell$, the symmetric Trotting method has gate complexity $\mathcal{O}\Big({ (\varepsilon \hbar)^{-\frac12} \mathrm{polylog}(\varepsilon^{-\frac{3}{2\ell}} \hbar^{-1-\frac{1}{2\ell}})}\Big),$ provided that the diagonal unitary operators in the pseudo-spectral methods can be implemented with $\mathrm{poly}(m)$ operations. When physical observables are the desired outcomes, however, the step size in the time integration can be chosen independently of $\hbar$. The gate complexity in this case is reduced to $\mathcal{O}\Big({\varepsilon^{-\frac12} \mathrm{polylog}( \varepsilon^{-\frac3{2\ell}} \hbar^{-1} )}\Big),$ with $\ell$ again indicating the smoothness of the solution.

Bonnet et al. (FOCS 2020) introduced the graph invariant twin-width and showed that many NP-hard problems are tractable for graphs of bounded twin-width, generalizing similar results for other width measures, including treewidth and clique-width. In this paper, we investigate the use of twin-width for solving the propositional satisfiability problem (SAT) and propositional model counting. We particularly focus on Bounded-ones Weighted Model Counting (BWMC), which takes as input a CNF formula $F$ along with a bound $k$ and asks for the weighted sum of all models with at most $k$ positive literals. BWMC generalizes not only SAT but also (weighted) model counting. We develop the notion of "signed" twin-width of CNF formulas and establish that BWMC is fixed-parameter tractable when parameterized by the certified signed twin-width of $F$ plus $k$. We show that this result is tight: it is neither possible to drop the bound $k$ nor use the vanilla twin-width instead if one wishes to retain fixed-parameter tractability, even for the easier problem SAT. Our theoretical results are complemented with an empirical evaluation and comparison of signed twin-width on various classes of CNF formulas.

北京阿比特科技有限公司