
The choice of activation functions and their motivation is a long-standing issue within the neural network community. Neuronal representations within artificial neural networks are commonly understood as logits, representing the log-odds score of presence of features within the stimulus. We derive logit-space operators equivalent to the probabilistic Boolean logic gates AND, OR, and XNOR for independent probabilities. Such theories are important for formalizing more complex dendritic operations in real neurons, and these operations can be used as activation functions within a neural network, introducing probabilistic Boolean logic as the core operation of the network. Since these functions involve taking multiple exponentials and logarithms, they are computationally expensive and not well suited to be used directly within neural networks. Consequently, we construct efficient approximations named $\text{AND}_\text{AIL}$ (the AND operator Approximate for Independent Logits), $\text{OR}_\text{AIL}$, and $\text{XNOR}_\text{AIL}$, which utilize only comparison and addition operations, have well-behaved gradients, and can be deployed as activation functions in neural networks. Like MaxOut, $\text{AND}_\text{AIL}$ and $\text{OR}_\text{AIL}$ are generalizations of ReLU to two dimensions. While our primary aim is to formalize dendritic computations within a logit-space probabilistic-Boolean framework, we deploy these new activation functions, both in isolation and in combination, and demonstrate their effectiveness on a variety of tasks including image classification, transfer learning, abstract reasoning, and compositional zero-shot learning.
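For concreteness, the exact logit-space gates described above follow directly from the independence identities: with $p=\sigma(x)$ and $q=\sigma(y)$, the AND, OR, and XNOR probabilities are $pq$, $p+q-pq$, and $pq+(1-p)(1-q)$. A minimal numpy sketch of these exact operators (the efficient $\text{AND}_\text{AIL}$-family approximations, which avoid the exponentials and logarithms below, are defined in the paper itself and are not reproduced here):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    return np.log(p) - np.log1p(-p)

def and_logits(x, y):
    # P(A and B) = P(A) P(B) for independent events
    return logit(sigmoid(x) * sigmoid(y))

def or_logits(x, y):
    # P(A or B) = P(A) + P(B) - P(A) P(B)
    p, q = sigmoid(x), sigmoid(y)
    return logit(p + q - p * q)

def xnor_logits(x, y):
    # P(A xnor B) = P(A) P(B) + (1 - P(A)) (1 - P(B))
    p, q = sigmoid(x), sigmoid(y)
    return logit(p * q + (1 - p) * (1 - q))
```

The per-element exponentials and logarithms here are exactly the cost the AIL approximations remove by replacing these expressions with comparisons and additions.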

Related content

Neural Networks is the archival journal of the world's three oldest neural modeling societies: the International Neural Network Society (INNS), the European Neural Network Society (ENNS), and the Japanese Neural Network Society (JNNS). Neural Networks provides a forum for developing and nurturing an international community of scholars and practitioners interested in all aspects of neural networks and related approaches to computational intelligence. Neural Networks welcomes submissions of high-quality papers that contribute to the full range of neural network research, from behavioral and brain modeling and learning algorithms, through mathematical and computational analyses, to systems engineering and technological applications that make substantial use of neural network concepts and techniques. This unique and broad scope promotes the exchange of ideas between biological and technological research and helps foster the development of an interdisciplinary community interested in biologically inspired computational intelligence. Accordingly, the fields of expertise represented on the editorial board of Neural Networks include psychology, neurobiology, computer science, engineering, mathematics, and physics. The journal publishes articles, letters, and reviews, as well as letters to the editor, editorials, current events, software surveys, and patent information. Articles are published in one of five sections: cognitive science, neuroscience, learning systems, mathematics and computational analysis, or engineering and applications. Official website:

In statistics, independent, identically distributed random samples do not carry a natural ordering, and their statistics are typically invariant with respect to permutations of their order. Thus, an $n$-sample in a space $M$ can be considered as an element of the quotient space of $M^n$ modulo the permutation group. The present paper takes this definition of sample space and the related concept of orbit types as a starting point for developing a geometric perspective on statistics. We aim at deriving a general mathematical setting for studying the behavior of empirical and population means in spaces ranging from smooth Riemannian manifolds to general stratified spaces. We fully describe the orbifold and path-metric structure of the sample space when $M$ is a manifold or path-metric space, respectively. These results are non-trivial even when $M$ is Euclidean. We show that the infinite sample space exists in a Gromov-Hausdorff type sense and coincides with the Wasserstein space of probability distributions on $M$. We exhibit Fr\'echet means and $k$-means as metric projections onto 1-skeleta or $k$-skeleta in Wasserstein space, and we define a new and more general notion of polymeans. This geometric characterization via metric projections applies equally to sample and population means, and we use it to establish asymptotic properties of polymeans such as consistency and asymptotic normality.
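For orientation, the Fr\'echet mean and $k$-means invoked above are the standard variance-minimizing notions; for a sample $x_1,\dots,x_n$ in a metric space $(M,d)$ (a textbook definition, restated here, not a result of the paper):

$$\operatorname*{arg\,min}_{\mu\in M}\ \sum_{i=1}^{n} d(\mu,x_i)^2, \qquad \operatorname*{arg\,min}_{\mu_1,\dots,\mu_k\in M}\ \sum_{i=1}^{n}\ \min_{1\le j\le k} d(\mu_j,x_i)^2 .$$

The paper's geometric characterization recasts these minimizers as metric projections of the empirical distribution onto $1$-skeleta and $k$-skeleta in Wasserstein space.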

The purpose of this paper is to examine the sampling problem through Euler discretization, where the potential function is assumed to correspond to a mixture of locally smooth distributions and to be weakly dissipative. We introduce the notions of $\alpha_{G}$-mixture locally smooth and $\alpha_{H}$-mixture locally Hessian smooth potentials, which are novel and typically satisfied by mixtures of distributions. Under these conditions, we prove convergence in Kullback-Leibler (KL) divergence, with the number of iterations needed to reach an $\epsilon$-neighborhood of the target distribution depending only polynomially on the dimension. The convergence rate improves when the potential is $1$-smooth and $\alpha_{H}$-mixture locally Hessian smooth. Our result for potentials that are not strongly convex outside a ball of radius $R$ is obtained by convexifying the non-convex domains. In addition, we establish some useful theoretical properties of $p$-generalized Gaussian smoothing and prove convergence in the $L_{\beta}$-Wasserstein distance for stochastic gradients in a general setting.
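In its simplest form, the Euler discretization in question is the unadjusted Langevin algorithm $x_{k+1} = x_k - \eta\nabla U(x_k) + \sqrt{2\eta}\,\xi_k$ with $\xi_k \sim \mathcal{N}(0, I)$. A minimal numpy sketch for a generic potential $U$ (the paper's mixture-local-smoothness conditions and its stochastic-gradient variant are not modeled here):

```python
import numpy as np

def ula_sample(grad_U, x0, step, n_iters, seed=0):
    """Euler discretization of Langevin dynamics targeting exp(-U)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iters):
        x = x - step * grad_U(x) + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return x

# Example: standard Gaussian target, U(x) = ||x||^2 / 2, so grad U(x) = x.
sample = ula_sample(lambda x: x, x0=np.zeros(2), step=0.01, n_iters=5000)
```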

Summation-by-parts (SBP) operators are popular building blocks for systematically developing stable and high-order accurate numerical methods for time-dependent differential equations. The main idea behind existing SBP operators is that the solution is assumed to be well approximated by polynomials up to a certain degree, and the SBP operator should therefore be exact for them. However, polynomials might not provide the best approximation for some problems, and other approximation spaces may be more appropriate. In this paper, a theory for SBP operators based on general function spaces is developed. We demonstrate that most of the established results for polynomial-based SBP operators carry over to this general class of SBP operators. Our findings imply that the concept of SBP operators can be applied to a significantly larger class of methods than currently known. We exemplify the general theory by considering trigonometric, exponential, and radial basis functions.
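To make the SBP property concrete: with a diagonal norm matrix $H$ and difference operator $D$, it is the discrete integration-by-parts identity $HD + D^\top H = B$ with $B = \mathrm{diag}(-1,0,\dots,0,1)$. A small numpy check for the classical second-order polynomial-based operator (a standard example, not the paper's generalized construction):

```python
import numpy as np

N = 10
n, h = N + 1, 1.0 / N  # grid x_0, ..., x_N on [0, 1] with spacing h

# Classical 2nd-order first-derivative operator: one-sided differences at
# the boundaries, central differences in the interior.
D = np.zeros((n, n))
D[0, :2] = [-1, 1]
D[-1, -2:] = [-1, 1]
for i in range(1, n - 1):
    D[i, i - 1], D[i, i + 1] = -0.5, 0.5
D /= h

# Diagonal norm (quadrature) matrix and boundary matrix.
H = h * np.diag([0.5] + [1.0] * (n - 2) + [0.5])
B = np.zeros((n, n))
B[0, 0], B[-1, -1] = -1.0, 1.0

# Summation by parts: H D + (H D)^T = B, mimicking integration by parts.
assert np.allclose(H @ D + D.T @ H, B)

# Exactness for polynomials up to degree 1: D x = 1.
x = np.linspace(0.0, 1.0, n)
assert np.allclose(D @ x, np.ones(n))
```

In the generalization developed in the paper, the exactness requirement on polynomials is replaced by exactness on other function spaces, such as the trigonometric, exponential, and radial basis functions mentioned above, while an identity of this form is retained.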

Local search is an effective method for solving large-scale combinatorial optimization problems, and it has made remarkable progress in recent years through several subtle mechanisms. In this paper, we identify two ways to improve local search algorithms for Pseudo-Boolean Optimization (PBO). First, some of these mechanisms, such as unit propagation, have previously been used only for MaxSAT and can be generalized to PBO as well. Second, existing local search algorithms mainly guide the search with a heuristic on variables, the so-called score; we attempt to gain more insight into clauses, which act as intermediaries between the variables and the given formula. Hence, we first extend the unit-propagation-based decimation algorithm to PBO, giving a generalized definition of a unit clause for PBO, and apply it to the existing solver LS-PBO to construct an initial assignment; we then introduce a new heuristic on clauses, dubbed care, which assigns higher priority to clauses that have been satisfied less often in recent iterations. Experiments on three real-world application benchmarks, including minimum-width confidence band, wireless sensor network optimization, and seating arrangement problems, show that our algorithm DeciLS-PBO performs favorably compared to state-of-the-art algorithms.
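The abstract does not spell out the generalized unit-clause definition; the sketch below uses the standard slack-based propagation rule for a pseudo-Boolean constraint $\sum_i a_i \ell_i \ge b$ (an unassigned literal is forced once its coefficient exceeds the constraint's slack), which is one natural reading. The function name and the constraint encoding are illustrative only:

```python
def propagate(constraints, assignment):
    """Slack-based unit propagation for PB constraints sum(a_i * l_i) >= b.

    Each constraint is (lits, bound), where lits maps a literal
    (var, polarity) to its positive coefficient a_i.
    """
    changed = True
    while changed:
        changed = False
        for lits, bound in constraints:
            # Slack: max achievable left-hand side over literals that are
            # not yet falsified, minus the bound.
            slack = sum(a for (var, pol), a in lits.items()
                        if assignment.get(var, pol) == pol) - bound
            if slack < 0:
                return None  # conflict: constraint can no longer be met
            for (var, pol), a in lits.items():
                if var not in assignment and a > slack:
                    assignment[var] = pol  # forced: falsifying it kills the constraint
                    changed = True
    return assignment
```

In a decimation-style initialization, one would alternate propagation with heuristically fixing one of the remaining variables, then hand the completed assignment to the local search.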

We say that $\Gamma$, the boundary of a bounded Lipschitz domain, is locally dilation invariant if, at each $x\in \Gamma$, $\Gamma$ is either locally $C^1$ or locally coincides (in some coordinate system centred at $x$) with a Lipschitz graph $\Gamma_x$ such that $\Gamma_x=\alpha_x\Gamma_x$, for some $\alpha_x\in (0,1)$. In this paper we study, for such $\Gamma$, the essential spectrum of $D_\Gamma$, the double-layer (or Neumann-Poincar\'e) operator of potential theory, on $L^2(\Gamma)$. We show, via localisation and Floquet-Bloch-type arguments, that this essential spectrum is the union of the spectra of related continuous families of operators $K_t$, for $t\in [-\pi,\pi]$; moreover, each $K_t$ is compact if $\Gamma$ is $C^1$ except at finitely many points. For the 2D case where, additionally, $\Gamma$ is piecewise analytic, we construct convergent sequences of approximations to the essential spectrum of $D_\Gamma$; each approximation is the union of the eigenvalues of finitely many finite matrices arising from Nystr\"om-method approximations to the operators $K_t$. Through error estimates with explicit constants, we also construct functionals that determine whether any particular locally-dilation-invariant piecewise-analytic $\Gamma$ satisfies the well-known spectral radius conjecture, that the essential spectral radius of $D_\Gamma$ on $L^2(\Gamma)$ is $<1/2$ for all Lipschitz $\Gamma$. We illustrate this theory with examples; for each we show that the essential spectral radius is $<1/2$, providing additional support for the conjecture. We also, via new results on the invariance of the essential spectral radius under locally-conformal $C^{1,\beta}$ diffeomorphisms, show that the spectral radius conjecture holds for all Lipschitz curvilinear polyhedra.
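The Nystr\"om method referred to above discretizes an integral operator by a quadrature rule, so that eigenvalues of a finite matrix approximate the operator's spectrum. A generic numpy sketch (the operators $K_t$ and the explicit error constants of the paper are specific to the double-layer setting; this shows only the basic mechanism):

```python
import numpy as np

def nystrom_eigs(kernel, a, b, n):
    """Approximate the eigenvalues of (Ku)(x) = int_a^b kernel(x, y) u(y) dy
    by an n-point Nystrom discretization with trapezoidal weights."""
    x, h = np.linspace(a, b, n, retstep=True)
    w = np.full(n, h)
    w[0] = w[-1] = h / 2                              # trapezoidal rule
    K = kernel(x[:, None], x[None, :]) * w[None, :]   # K_ij = k(x_i, x_j) w_j
    return np.linalg.eigvals(K)

# Example: k(x, y) = exp(-|x - y|) on [0, 1].
lam = nystrom_eigs(lambda x, y: np.exp(-np.abs(x - y)), 0.0, 1.0, 200)
print(np.sort(lam.real)[::-1][:5])  # leading approximate eigenvalues
```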

Peridynamic (PD) theory is significant and promising in engineering and materials science; however, it imposes challenges owing to the enormous computational cost caused by its nonlocality. Our main contribution, which overcomes the restrictions of the existing fast method, is a general computational framework for linear bond-based peridynamic models based on the meshfree method, called the matrix-structure-based fast method (MSBFM). It is suitable for the general case, including 2D/3D problems, static/dynamic problems, and problems with general boundary conditions, in particular problems with crack propagation. Accordingly, we provide a general calculation flow chart. The proposed computational framework is practical and easily embedded into existing computational algorithms. With this framework, the computational cost is reduced from $O(N^2)$ to $O(N\log N)$, and the storage requirement is reduced from $O(N^2)$ to $O(N)$, where $N$ is the number of degrees of freedom. Finally, this substantial reduction in computational and memory requirements is verified by numerical examples.
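The abstract does not state which matrix structure the MSBFM exploits, but an $O(N\log N)$ matrix-vector product with $O(N)$ storage is characteristic of Toeplitz structure, which arises naturally for translation-invariant nonlocal kernels on uniform grids and is handled by FFT after circulant embedding. A generic sketch of that standard trick, purely as an assumption-laden illustration of how such cost reductions work:

```python
import numpy as np

def toeplitz_matvec(c, r, x):
    """Multiply the n x n Toeplitz matrix with first column c and first
    row r (r[0] == c[0]) by x in O(n log n) via circulant embedding."""
    n = len(x)
    # First column of the 2n x 2n circulant embedding.
    circ = np.concatenate([c, [0.0], r[:0:-1]])
    y = np.fft.ifft(np.fft.fft(circ) * np.fft.fft(x, 2 * n))
    return y[:n].real  # only O(n) vectors are ever stored

# Sanity check against a dense multiply.
rng = np.random.default_rng(0)
n = 256
c, x = rng.standard_normal(n), rng.standard_normal(n)
r = np.concatenate([c[:1], rng.standard_normal(n - 1)])
T = np.array([[c[i - j] if i >= j else r[j - i] for j in range(n)]
              for i in range(n)])
assert np.allclose(toeplitz_matvec(c, r, x), T @ x)
```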

Quantum error correction codes (QECC) are a key component for realizing the potential of quantum computing. QECC, like its classical counterpart (ECC), enables the reduction of error rates by distributing quantum logical information across redundant physical qubits, such that errors can be detected and corrected. In this work, we efficiently train novel deep quantum error decoders. We resolve the quantum measurement collapse by augmenting syndrome decoding to predict an initial estimate of the system noise, which is then refined iteratively through a deep neural network. The logical error rates calculated over finite fields are directly optimized via a differentiable objective, enabling efficient decoding under the constraints imposed by the code. Finally, our architecture is extended to support faulty syndrome measurements, allowing efficient decoding over repeated syndrome sampling. The proposed method demonstrates the power of neural decoders for QECC, achieving state-of-the-art accuracy and outperforming, for a broad range of topological codes, existing neural and classical decoders, which are often computationally prohibitive.
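As a caricature of the generic idea of neural syndrome decoding (syndrome in, per-qubit error estimate out), here is a toy PyTorch sketch for a classical bit-flip channel; the parity-check matrix `H`, the architecture, and the training loop are all hypothetical stand-ins, and the paper's iterative refinement, finite-field objective, and faulty-measurement support are not modeled:

```python
import torch
import torch.nn as nn

class SyndromeDecoder(nn.Module):
    """Toy MLP: maps a binary syndrome to per-qubit error logits."""
    def __init__(self, n_checks, n_qubits, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_checks, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_qubits),
        )

    def forward(self, syndrome):
        return self.net(syndrome)

def train_step(model, opt, H, p_err=0.05, batch=256):
    err = (torch.rand(batch, H.shape[1]) < p_err).float()  # sample errors
    syn = (err @ H.float().T) % 2                          # syndrome s = H e (mod 2)
    loss = nn.functional.binary_cross_entropy_with_logits(model(syn), err)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Hypothetical 3-qubit repetition code, H = [[1, 1, 0], [0, 1, 1]].
H = torch.tensor([[1, 1, 0], [0, 1, 1]])
model = SyndromeDecoder(n_checks=2, n_qubits=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(200):
    train_step(model, opt, H)
```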

We consider Group Control by Adding Individuals (GCAI) in the setting of group identification for two procedural rules -- the consensus-start-respecting rule and the liberal-start-respecting rule. It is known that GCAI is NP-hard for both rules, but whether these problems are fixed-parameter tractable with respect to the number of distinguished individuals remained open. We resolve both open problems in the affirmative. In addition, we strengthen the NP-hardness of GCAI by showing that, with respect to the natural parameter, the number of added individuals, GCAI is W[2]-hard for both rules. Notably, the W[2]-hardness for the liberal-start-respecting rule holds even when restricted to a very special case where the qualifications of individuals satisfy the so-called consecutive ones property. However, for the consensus-start-respecting rule, the problem becomes polynomial-time solvable in this special case. We also study a dual restriction where the disqualifications of individuals fulfill the consecutive ones property, and show that under this restriction GCAI for both rules turns out to be polynomial-time solvable. Our reductions for showing W[2]-hardness also imply several lower bounds concerning kernelization and exact algorithms.

Residual networks (ResNets) have displayed impressive results in pattern recognition and, recently, have garnered considerable theoretical interest due to a perceived link with neural ordinary differential equations (neural ODEs). This link relies on the convergence of network weights to a smooth function as the number of layers increases. We investigate the properties of weights trained by stochastic gradient descent and their scaling with network depth through detailed numerical experiments. We observe the existence of scaling regimes markedly different from those assumed in the neural ODE literature. Depending on certain features of the network architecture, such as the smoothness of the activation function, one may obtain an alternative ODE limit, a stochastic differential equation, or neither of these. These findings cast doubt on the validity of the neural ODE model as an adequate asymptotic description of deep ResNets and point to an alternative class of differential equations as a better description of the deep network limit.
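The ResNet/neural-ODE link at issue can be stated in two lines: with residual updates scaled by $1/L$ and weights sampled from a smooth function of depth, the forward pass is a forward Euler scheme for an ODE. A numpy caricature of exactly that assumed regime (the paper's point is that trained weights need not scale this way):

```python
import numpy as np

def resnet_forward(x, weights, scale):
    """Residual iteration x_{l+1} = x_l + scale * tanh(W_l @ x_l).

    With scale = 1/L and W_l = W(l/L) for a smooth W(t), this is the
    forward Euler discretization of dx/dt = tanh(W(t) x).
    """
    for W in weights:
        x = x + scale * np.tanh(W @ x)
    return x

# Depth-L discretization of a fixed smooth weight path W(t).
L, d = 1000, 4
rng = np.random.default_rng(1)
A, B = rng.standard_normal((d, d)), rng.standard_normal((d, d))
weights = [A * np.cos(np.pi * l / L) + B * np.sin(np.pi * l / L)
           for l in range(L)]
out = resnet_forward(np.ones(d), weights, scale=1.0 / L)
```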

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
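The resulting re-weighting scheme is a one-liner: weight each class by the inverse of its effective number of samples. A minimal numpy sketch (normalizing the weights to sum to the number of classes is a common convention, not something fixed by the abstract):

```python
import numpy as np

def class_balanced_weights(counts, beta=0.999):
    """Per-class weights from the effective number (1 - beta^n) / (1 - beta)."""
    counts = np.asarray(counts, dtype=float)
    effective_num = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(counts) / weights.sum()

# Example: a long-tailed 4-class problem; rare classes get larger weights.
print(class_balanced_weights([5000, 500, 50, 5]))
```

These weights then multiply the per-class loss terms to form the class-balanced loss.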
