亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

Equivariance of linear neural network layers is well studied. In this work, we relax the equivariance condition to only be true in a projective sense. We propose a way to construct a projectively equivariant neural network through building a standard equivariant network where the linear group representations acting on each intermediate feature space are "multiplicatively modified lifts" of projective group representations. By theoretically studying the relation of projectively and linearly equivariant linear layers, we show that our approach is the most general possible when building a network out of linear layers. The theory is showcased in two simple experiments.

相關內容

In this work, we present a nonlinear dynamics perspective on generating and connecting gaits for energetically conservative models of legged systems. In particular, we show that the set of conservative gaits constitutes a connected space of locally defined 1D submanifolds in the gait space. These manifolds are coordinate-free parameterized by energy level. We present algorithms for identifying such families of gaits through the use of numerical continuation methods, generating sets and bifurcation points. To this end, we also introduce several details for the numerical implementation. Most importantly, we establish the necessary condition for the Delassus' matrix to preserve energy across impacts. An important application of our work is with simple models of legged locomotion that are often able to capture the complexity of legged locomotion with just a few degrees of freedom and a small number of physical parameters. We demonstrate the efficacy of our framework on a one-legged hopper with four degrees of freedom.

We analyze the capabilities of Transformer language models on learning discrete algorithms. To this end, we introduce two new tasks demanding the composition of several discrete sub-tasks. On both training LLaMA models from scratch and prompting on GPT-4 and Gemini we measure learning compositions of learned primitives. We observe that the compositional capabilities of state-of-the-art Transformer language models are very limited and sample-wise scale worse than relearning all sub-tasks for a new algorithmic composition. We also present a theorem in complexity theory, showing that gradient descent on memorizing feedforward models can be exponentially data inefficient.

In this critical survey, we analyze typical claims on the relationship between explainable AI (XAI) and fairness to disentangle the multidimensional relationship between these two concepts. Based on a systematic literature review and a subsequent qualitative content analysis, we identify seven archetypal claims from 175 papers on the alleged fairness benefits of XAI. We present crucial caveats with respect to these claims and provide an entry point for future discussions around the potentials and limitations of XAI for specific fairness desiderata. Importantly, we notice that claims are often (i) vague and simplistic, (ii) lacking normative grounding, or (iii) poorly aligned with the actual capabilities of XAI. We encourage to conceive XAI not as an ethical panacea but as one of many tools to approach the multidimensional, sociotechnical challenge of algorithmic fairness. Moreover, when making a claim about XAI and fairness, we emphasize the need to be more specific about what kind of XAI method is used and which fairness desideratum it refers to, how exactly it enables fairness, and who is the stakeholder that benefits from XAI.

The biological functions of proteins often depend on dynamic structural ensembles. In this work, we develop a flow-based generative modeling approach for learning and sampling the conformational landscapes of proteins. We repurpose highly accurate single-state predictors such as AlphaFold and ESMFold and fine-tune them under a custom flow matching framework to obtain sequence-conditoned generative models of protein structure called AlphaFlow and ESMFlow. When trained and evaluated on the PDB, our method provides a superior combination of precision and diversity compared to AlphaFold with MSA subsampling. When further trained on ensembles from all-atom MD, our method accurately captures conformational flexibility, positional distributions, and higher-order ensemble observables for unseen proteins. Moreover, our method can diversify a static PDB structure with faster wall-clock convergence to certain equilibrium properties than replicate MD trajectories, demonstrating its potential as a proxy for expensive physics-based simulations. Code is available at //github.com/bjing2016/alphaflow.

In recent years, new regularization methods based on (deep) neural networks have shown very promising empirical performance for the numerical solution of ill-posed problems, such as in medical imaging and imaging science. Due to the nonlinearity of neural networks, these methods often lack satisfactory theoretical justification. In this work, we rigorously discuss the convergence of a successful unsupervised approach that utilizes untrained convolutional neural networks to represent solutions to linear ill-posed problems. Untrained neural networks have particular appeal for many applications because they do not require paired training data. The regularization property of the approach relies solely on the architecture of the neural network instead. Due to the vast over-parameterization of the employed neural network, suitable early stopping is essential for the success of the method. We establish that the classical discrepancy principle is an adequate method for early stopping of two-layer untrained convolutional neural networks learned by gradient descent, and furthermore, it yields an approximation with minimax optimal convergence rates. Numerical results are also presented to illustrate the theoretical findings.

In this work, we present some new results for compressed sensing and phase retrieval. For compressed sensing, it is shown that if the unknown $n$-dimensional vector can be expressed as a linear combination of $s$ unknown Vandermonde vectors (with Fourier vectors as a special case) and the measurement matrix is a Vandermonde matrix, exact recovery of the vector with $2s$ measurements and $O(\mathrm{poly}(s))$ complexity is possible when $n \geq 2s$. From these results, a measurement matrix is constructed from which it is possible to recover $s$-sparse $n$-dimensional vectors for $n \geq 2s$ with as few as $2s$ measurements and with a recovery algorithm of $O(\mathrm{poly}(s))$ complexity. In the second part of the work, these results are extended to the challenging problem of phase retrieval. The most significant discovery in this direction is that if the unknown $n$-dimensional vector is composed of $s$ frequencies with at least one being non-harmonic, $n \geq 4s - 1$ and we take at least $8s-3$ Fourier measurements, there are, remarkably, only two possible vectors producing the observed measurement values and they are easily obtainable from each other. The two vectors can be found by an algorithm with only $O(\mathrm{poly}(s))$ complexity. An immediate application of the new result is construction of a measurement matrix from which it is possible to recover almost all $s$-sparse $n$-dimensional signals (up to a global phase) from $O(s)$ magnitude-only measurements and $O(\mathrm{poly}(s))$ recovery complexity when $n \geq 4s - 1$.

We investigate the contraction properties of locally differentially private mechanisms. More specifically, we derive tight upper bounds on the divergence between $PK$ and $QK$ output distributions of an $\epsilon$-LDP mechanism $K$ in terms of a divergence between the corresponding input distributions $P$ and $Q$, respectively. Our first main technical result presents a sharp upper bound on the $\chi^2$-divergence $\chi^2(PK}\|QK)$ in terms of $\chi^2(P\|Q)$ and $\varepsilon$. We also show that the same result holds for a large family of divergences, including KL-divergence and squared Hellinger distance. The second main technical result gives an upper bound on $\chi^2(PK\|QK)$ in terms of total variation distance $\mathsf{TV}(P, Q)$ and $\epsilon$. We then utilize these bounds to establish locally private versions of the van Trees inequality, Le Cam's, Assouad's, and the mutual information methods, which are powerful tools for bounding minimax estimation risks. These results are shown to lead to better privacy analyses than the state-of-the-arts in several statistical problems such as entropy and discrete distribution estimation, non-parametric density estimation, and hypothesis testing.

Large Language Models (LLMs) have shown excellent generalization capabilities that have led to the development of numerous models. These models propose various new architectures, tweaking existing architectures with refined training strategies, increasing context length, using high-quality training data, and increasing training time to outperform baselines. Analyzing new developments is crucial for identifying changes that enhance training stability and improve generalization in LLMs. This survey paper comprehensively analyses the LLMs architectures and their categorization, training strategies, training datasets, and performance evaluations and discusses future research directions. Moreover, the paper also discusses the basic building blocks and concepts behind LLMs, followed by a complete overview of LLMs, including their important features and functions. Finally, the paper summarizes significant findings from LLM research and consolidates essential architectural and training strategies for developing advanced LLMs. Given the continuous advancements in LLMs, we intend to regularly update this paper by incorporating new sections and featuring the latest LLM models.

The concept of causality plays an important role in human cognition . In the past few decades, causal inference has been well developed in many fields, such as computer science, medicine, economics, and education. With the advancement of deep learning techniques, it has been increasingly used in causal inference against counterfactual data. Typically, deep causal models map the characteristics of covariates to a representation space and then design various objective optimization functions to estimate counterfactual data unbiasedly based on the different optimization methods. This paper focuses on the survey of the deep causal models, and its core contributions are as follows: 1) we provide relevant metrics under multiple treatments and continuous-dose treatment; 2) we incorporate a comprehensive overview of deep causal models from both temporal development and method classification perspectives; 3) we assist a detailed and comprehensive classification and analysis of relevant datasets and source code.

We consider the problem of explaining the predictions of graph neural networks (GNNs), which otherwise are considered as black boxes. Existing methods invariably focus on explaining the importance of graph nodes or edges but ignore the substructures of graphs, which are more intuitive and human-intelligible. In this work, we propose a novel method, known as SubgraphX, to explain GNNs by identifying important subgraphs. Given a trained GNN model and an input graph, our SubgraphX explains its predictions by efficiently exploring different subgraphs with Monte Carlo tree search. To make the tree search more effective, we propose to use Shapley values as a measure of subgraph importance, which can also capture the interactions among different subgraphs. To expedite computations, we propose efficient approximation schemes to compute Shapley values for graph data. Our work represents the first attempt to explain GNNs via identifying subgraphs explicitly and directly. Experimental results show that our SubgraphX achieves significantly improved explanations, while keeping computations at a reasonable level.

北京阿比特科技有限公司