亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

With the rapid expansion of graphs and networks and the growing magnitude of data from all areas of science, effective treatment and compression schemes of context-dependent data is extremely desirable. A particularly interesting direction is to compress the data while keeping the "structural information" only and ignoring the concrete labelings. Under this direction, Choi and Szpankowski introduced the structures (unlabeled graphs) which allowed them to compute the structural entropy of the Erd\H{o}s--R\'enyi random graph model. Moreover, they also provided an asymptotically optimal compression algorithm that (asymptotically) achieves this entropy limit and runs in expectation in linear time. In this paper, we consider the Stochastic Block Models with an arbitrary number of parts. Indeed, we define a partitioned structural entropy for Stochastic Block Models, which generalizes the structural entropy for unlabeled graphs and encodes the partition information as well. We then compute the partitioned structural entropy of the Stochastic Block Models, and provide a compression scheme that asymptotically achieves this entropy limit.

相關內容

We present a novel class of locally conservative, entropy stable and well-balanced discontinuous Galerkin (DG) methods for the nonlinear shallow water equation with a non-flat bottom topography. The major novelty of our work is the use of velocity field as an independent solution unknown in the DG scheme, which is closely related to the entropy variable approach to entropy stable schemes for system of conservation laws proposed by Tadmor [22] back in 1986, where recall that velocity is part of the entropy variable for the shallow water equations. Due to the use of velocity as an independent solution unknown, no specific numerical quadrature rules are needed to achieve entropy stability of our scheme on general unstructured meshes in two dimensions. The proposed DG semi-discretization is then carefully combined with the classical explicit strong stability preserving Runge-Kutta (SSP-RK) time integrators [13] to yield a locally conservative, well-balanced, and positivity preserving fully discrete scheme. Here the positivity preservation property is enforced with the help of a simple scaling limiter. In the fully discrete scheme, we re-introduce discharge as an auxiliary unknown variable. In doing so, standard slope limiting procedures can be applied on the conservative variables (water height and discharge) without violating the local conservation property. Here we apply a characteristic-wise TVB limiter [5] on the conservative variables using the Fu-Shu troubled cell indicator [10] in each inner stage of the Runge-Kutta time stepping to suppress numerical oscillations.

We perform scalable approximate inference in continuous-depth Bayesian neural networks. In this model class, uncertainty about separate weights in each layer gives hidden units that follow a stochastic differential equation. We demonstrate gradient-based stochastic variational inference in this infinite-parameter setting, producing arbitrarily-flexible approximate posteriors. We also derive a novel gradient estimator that approaches zero variance as the approximate posterior over weights approaches the true posterior. This approach brings continuous-depth Bayesian neural nets to a competitive comparison against discrete-depth alternatives, while inheriting the memory-efficient training and tunable precision of Neural ODEs.

Relative entropy coding (REC) algorithms encode a sample from a target distribution $Q$ using a proposal distribution $P$, such that the expected codelength is $\mathcal{O}(D_{KL}[Q \,||\, P])$. REC can be seamlessly integrated with existing learned compression models since, unlike entropy coding, it does not assume discrete $Q$ or $P$, and does not require quantisation. However, general REC algorithms require an intractable $\Omega(e^{D_{KL}[Q \,||\, P]})$ runtime. We introduce AS* and AD* coding, two REC algorithms based on A* sampling. We prove that, for continuous distributions over $\mathbb{R}$, if the density ratio is unimodal, AS* has $\mathcal{O}(D_{\infty}[Q \,||\, P])$ expected runtime, where $D_{\infty}[Q \,||\, P]$ is the R\'enyi $\infty$-divergence. We provide experimental evidence that AD* also has $\mathcal{O}(D_{\infty}[Q \,||\, P])$ expected runtime. We prove that AS* and AD* achieve an expected codelength of $\mathcal{O}(D_{KL}[Q \,||\, P])$. Further, we introduce DAD*, an approximate algorithm based on AD* which retains its favourable runtime and has bias similar to that of alternative methods. Focusing on VAEs, we propose the IsoKL VAE (IKVAE), which can be used with DAD* to further improve compression efficiency. We evaluate A* coding with (IK)VAEs on MNIST, showing that it can losslessly compress images near the theoretically optimal limit.

Stochastic gradient Markov Chain Monte Carlo (SGMCMC) is considered the gold standard for Bayesian inference in large-scale models, such as Bayesian neural networks. Since practitioners face speed versus accuracy tradeoffs in these models, variational inference (VI) is often the preferable option. Unfortunately, VI makes strong assumptions on both the factorization and functional form of the posterior. In this work, we propose a new non-parametric variational approximation that makes no assumptions about the approximate posterior's functional form and allows practitioners to specify the exact dependencies the algorithm should respect or break. The approach relies on a new Langevin-type algorithm that operates on a modified energy function, where parts of the latent variables are averaged over samples from earlier iterations of the Markov chain. This way, statistical dependencies can be broken in a controlled way, allowing the chain to mix faster. This scheme can be further modified in a "dropout" manner, leading to even more scalability. We test our scheme for ResNet-20 on CIFAR-10, SVHN, and FMNIST. In all cases, we find improvements in convergence speed and/or final accuracy compared to SG-MCMC and VI.

This study introduces a novel computational framework for Robust Topology Optimization (RTO) considering imprecise random field parameters. Unlike the worst-case approach, the present method provides upper and lower bounds for the mean and standard deviation of compliance as well as the optimized topological layouts of a structure for various scenarios. In the proposed approach, the imprecise random field variables are determined utilizing parameterized p-boxes with different confidence intervals. The Karhunen-Lo\`eve (K-L) expansion is extended to provide a spectral description of the imprecise random field. The linear superposition method in conjunction with a linear combination of orthogonal functions is employed to obtain explicit mathematical expressions for the first and second order statistical moments of the structural compliance. Then, an interval sensitivity analysis is carried out, applying the Orthogonal Similarity Transformation (OST) method with the boundaries of each of the intermediate variable searched efficiently at every iteration using a Combinatorial Approach (CA). Finally, the validity, accuracy, and applicability of the work are rigorously checked by comparing the outputs of the proposed approach with those obtained using the particle swarm optimization (PSO) and Quasi-Monte-Carlo Simulation (QMCS) methods. Three different numerical examples with imprecise random field loads are presented to show the effectiveness and feasibility of the study.

This paper contains new identification results for undirected weighted stochastic blockmodels. They are sharper than the ones available to date and the arguments underlying them are constructive. A nonparametric estimation framework that is computationally attractive is presented and the associated distribution theory is derived. Numerical experiments are reported on.

Functional constrained optimization is becoming more and more important in machine learning and operations research. Such problems have potential applications in risk-averse machine learning, semisupervised learning, and robust optimization among others. In this paper, we first present a novel Constraint Extrapolation (ConEx) method for solving convex functional constrained problems, which utilizes linear approximations of the constraint functions to define the extrapolation (or acceleration) step. We show that this method is a unified algorithm that achieves the best-known rate of convergence for solving different functional constrained convex composite problems, including convex or strongly convex, and smooth or nonsmooth problems with a stochastic objective and/or stochastic constraints. Many of these rates of convergence were in fact obtained for the first time in the literature. In addition, ConEx is a single-loop algorithm that does not involve any penalty subproblems. Contrary to existing primal-dual methods, it does not require the projection of Lagrangian multipliers into a (possibly unknown) bounded set. Second, for nonconvex functional constrained problems, we introduce a new proximal point method that transforms the initial nonconvex problem into a sequence of convex problems by adding quadratic terms to both the objective and constraints. Under a certain MFCQ-type assumption, we establish the convergence and rate of convergence of this method to KKT points when the convex subproblems are solved exactly or inexactly. For large-scale and stochastic problems, we present a more practical proximal point method in which the approximate solutions of the subproblems are computed by the aforementioned ConEx method. To the best of our knowledge, most of these convergence and complexity results of the proximal point method for nonconvex problems also seem to be new in the literature.

In structured prediction, the goal is to jointly predict many output variables that together encode a structured object -- a path in a graph, an entity-relation triple, or an ordering of objects. Such a large output space makes learning hard and requires vast amounts of labeled data. Different approaches leverage alternate sources of supervision. One approach -- entropy regularization -- posits that decision boundaries should lie in low-probability regions. It extracts supervision from unlabeled examples, but remains agnostic to the structure of the output space. Conversely, neuro-symbolic approaches exploit the knowledge that not every prediction corresponds to a valid structure in the output space. Yet, they does not further restrict the learned output distribution. This paper introduces a framework that unifies both approaches. We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object. It is obtained by restricting entropy regularization to the distribution over only valid structures. This loss is efficiently computed when the output constraint is expressed as a tractable logic circuit. Moreover, it seamlessly integrates with other neuro-symbolic losses that eliminate invalid predictions. We demonstrate the efficacy of our approach on a series of semi-supervised and fully-supervised structured-prediction experiments, where we find that it leads to models whose predictions are more accurate and more likely to be valid.

Learning low-dimensional representations for entities and relations in knowledge graphs using contrastive estimation represents a scalable and effective method for inferring connectivity patterns. A crucial aspect of contrastive learning approaches is the choice of corruption distribution that generates hard negative samples, which force the embedding model to learn discriminative representations and find critical characteristics of observed data. While earlier methods either employ too simple corruption distributions, i.e. uniform, yielding easy uninformative negatives or sophisticated adversarial distributions with challenging optimization schemes, they do not explicitly incorporate known graph structure resulting in suboptimal negatives. In this paper, we propose Structure Aware Negative Sampling (SANS), an inexpensive negative sampling strategy that utilizes the rich graph structure by selecting negative samples from a node's k-hop neighborhood. Empirically, we demonstrate that SANS finds high-quality negatives that are highly competitive with SOTA methods, and requires no additional parameters nor difficult adversarial optimization.

Owing to the recent advances in "Big Data" modeling and prediction tasks, variational Bayesian estimation has gained popularity due to their ability to provide exact solutions to approximate posteriors. One key technique for approximate inference is stochastic variational inference (SVI). SVI poses variational inference as a stochastic optimization problem and solves it iteratively using noisy gradient estimates. It aims to handle massive data for predictive and classification tasks by applying complex Bayesian models that have observed as well as latent variables. This paper aims to decentralize it allowing parallel computation, secure learning and robustness benefits. We use Alternating Direction Method of Multipliers in a top-down setting to develop a distributed SVI algorithm such that independent learners running inference algorithms only require sharing the estimated model parameters instead of their private datasets. Our work extends the distributed SVI-ADMM algorithm that we first propose, to an ADMM-based networked SVI algorithm in which not only are the learners working distributively but they share information according to rules of a graph by which they form a network. This kind of work lies under the umbrella of `deep learning over networks' and we verify our algorithm for a topic-modeling problem for corpus of Wikipedia articles. We illustrate the results on latent Dirichlet allocation (LDA) topic model in large document classification, compare performance with the centralized algorithm, and use numerical experiments to corroborate the analytical results.

北京阿比特科技有限公司