
Recently, Bessa et al. (PODS 2023) showed that sketches based on coordinated weighted sampling theoretically and empirically outperform popular linear sketching methods like Johnson-Lindenstrauss projection and CountSketch for the ubiquitous problem of inner product estimation. We further develop this finding by introducing and analyzing two alternative sampling-based methods. In contrast to the computationally expensive algorithm in Bessa et al., our methods run in linear time (to compute the sketch) and perform better in practice, significantly beating linear sketching on a variety of tasks. For example, they provide state-of-the-art results for estimating the correlation between columns in unjoined tables, a problem that we show how to reduce to inner product estimation in a black-box way. While based on known sampling techniques (threshold and priority sampling), we introduce significant new theoretical analysis to prove approximation guarantees for our methods.
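As a rough illustration of the threshold-sampling idea mentioned above, the sketch below keeps coordinate i of a vector with probability proportional to its squared magnitude, using a hash shared by both vectors so that the samples are coordinated. The inclusion probability min(1, m·a_i²/‖a‖²), the hash construction, and all function names are assumptions made for this example and are not taken from the paper.

```python
import hashlib
from typing import Dict, Tuple

Sketch = Tuple[Dict[int, float], float, int]  # (kept entries, squared norm, target size m)


def shared_uniform(i: int, seed: int = 0) -> float:
    """Pseudo-random value in [0, 1) that depends only on the index and seed,
    so two vectors sketched separately make coordinated sampling decisions."""
    digest = hashlib.sha256(f"{seed}:{i}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def inclusion_prob(value: float, norm2: float, m: int) -> float:
    """Assumed inclusion probability: proportional to the squared entry, capped at 1."""
    return min(1.0, m * value * value / norm2)


def threshold_sketch(vec: Dict[int, float], m: int, seed: int = 0) -> Sketch:
    """One linear pass over a sparse vector; expected sketch size is at most m."""
    norm2 = sum(v * v for v in vec.values())
    if norm2 == 0.0:
        return {}, 0.0, m
    kept = {
        i: v
        for i, v in vec.items()
        if v != 0.0 and shared_uniform(i, seed) <= inclusion_prob(v, norm2, m)
    }
    return kept, norm2, m


def estimate_inner_product(sa: Sketch, sb: Sketch) -> float:
    """Sum a_i * b_i over indices kept in both sketches, reweighted by the
    probability that a shared index survives in both (the smaller of the two)."""
    kept_a, norm2_a, m_a = sa
    kept_b, norm2_b, m_b = sb
    total = 0.0
    for i, va in kept_a.items():
        if i in kept_b:
            vb = kept_b[i]
            p_both = min(inclusion_prob(va, norm2_a, m_a), inclusion_prob(vb, norm2_b, m_b))
            total += va * vb / p_both
    return total
```

When m is large enough that every inclusion probability is capped at 1, both sketches keep all nonzero entries and the estimate equals the exact inner product; smaller m trades accuracy for sketch size.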

Related content

Large language models (LLMs) often contain misleading content, emphasizing the need to align them with human values to ensure secure AI systems. Reinforcement learning from human feedback (RLHF) has been employed to achieve this alignment. However, it has two main drawbacks: (1) RLHF exhibits complexity, instability, and sensitivity to hyperparameters in contrast to SFT. (2) Despite massive trial-and-error, multiple sampling is reduced to pair-wise contrast, thus lacking contrasts from a macro perspective. In this paper, we propose Preference Ranking Optimization (PRO) as an efficient SFT algorithm to directly fine-tune LLMs for human alignment. PRO extends the pair-wise contrast to accommodate preference rankings of any length. By iteratively contrasting candidates, PRO instructs the LLM to prioritize the best response while progressively ranking the remaining responses. In this manner, PRO effectively transforms human alignment into aligning the probability ranking of n responses generated by the LLM with the preference ranking of humans towards these responses. Experiments have shown that PRO outperforms baseline algorithms, achieving comparable results to ChatGPT and human responses through automatic-based, reward-based, GPT-4, and human evaluations.

We study Dorfman's classical group testing protocol in a novel setting where individual specimen statuses are modeled as exchangeable random variables. We are motivated by infectious disease screening. In that case, specimens which arrive together for testing often originate from the same community and so their statuses may exhibit positive correlation. Dorfman's protocol screens a population of n specimens for a binary trait by partitioning it into non-overlapping groups, testing these, and only individually retesting the specimens of each positive group. The partition is chosen to minimize the expected number of tests under a probabilistic model of specimen statuses. We relax the typical assumption that these are independent and identically distributed and instead model them as exchangeable random variables. In this case, their joint distribution is symmetric in the sense that it is invariant under permutations. We give a characterization of such distributions in terms of a function q where q(h) is the marginal probability that any group of size h tests negative. We use this interpretable representation to show that the set partitioning problem arising in Dorfman's protocol can be reduced to an integer partitioning problem and efficiently solved. We apply these tools to an empirical dataset from the COVID-19 pandemic. The methodology helps explain the unexpectedly high empirical efficiency reported by the original investigators.
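The reduction to integer partitioning described above can be illustrated with a small dynamic program. Under Dorfman's protocol a group of size h ≥ 2 costs one pooled test plus h retests when the pool is positive, so its expected cost is 1 + h(1 − q(h)); a singleton costs one test. The sketch below is a minimal version under these assumptions; the function names and the O(n²) recursion are illustrative and not necessarily the authors' algorithm.

```python
from typing import Callable, List, Tuple


def group_cost(h: int, q: Callable[[int], float]) -> float:
    """Expected number of tests for one group of size h.

    q(h) is the probability that a group of size h tests negative.
    """
    if h == 1:
        return 1.0  # a single specimen is simply tested once
    return 1.0 + h * (1.0 - q(h))  # pooled test, plus h retests if positive


def best_partition(n: int, q: Callable[[int], float]) -> Tuple[float, List[int]]:
    """Choose group sizes summing to n that minimize the expected test count.

    Because specimens are exchangeable, only the group sizes matter, so the
    search runs over integer partitions of n rather than set partitions.
    """
    best = [0.0] + [float("inf")] * n
    choice = [0] * (n + 1)
    for m in range(1, n + 1):
        for h in range(1, m + 1):
            cost = group_cost(h, q) + best[m - h]
            if cost < best[m]:
                best[m] = cost
                choice[m] = h
    sizes = []
    m = n
    while m > 0:
        sizes.append(choice[m])
        m -= choice[m]
    return best[n], sorted(sizes, reverse=True)
```

For the classical independent case with prevalence p, one could pass `q = lambda h: (1 - p) ** h`; an exchangeable model with positive correlation would supply a different q.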

A core tension in the study of plurality elections is the clash between the classic Hotelling-Downs model, which predicts that two office-seeking candidates should position themselves at the median voter's policy, and the empirical observation that real-world democracies often have two major parties with divergent policies. Motivated by this tension and drawing from bounded rationality, we introduce a dynamic model of candidate positioning based on a simple behavioral heuristic: candidates imitate the policy of previous winners. The resulting model is closely connected to evolutionary replicator dynamics and exhibits complex behavior, despite its simplicity. For uniformly-distributed voters, we prove that when there are $k = 2$, $3$, or $4$ candidates per election, any symmetric candidate distribution converges over time to a concentration of candidates at the center. With $k \ge 5$, however, we prove that the candidate distribution does not converge to the center. For initial distributions without any extreme candidates, we prove a stronger statement than non-convergence, showing that the density in an interval around the center goes to zero when $k \ge 5$. As a matter of robustness, our conclusions are qualitatively unchanged if a small fraction of candidates are not winner-copiers and are instead positioned uniformly at random. Beyond our theoretical analysis, we illustrate our results in simulation; for five or more candidates, we find a tendency towards the emergence of two clusters, a mechanism suggestive of Duverger's Law, the empirical finding that plurality leads to two-party systems. Our simulations also explore several variations of the model, including non-uniform voter distributions and other forms of noise, which exhibit similar convergence patterns. Finally, we discuss the relationship between our model and prior work on strategic equilibria of candidate positioning games.

We present a study of a kernel-based two-sample test statistic related to the Maximum Mean Discrepancy (MMD) in the manifold data setting, assuming that high-dimensional observations are close to a low-dimensional manifold. We characterize the test level and power in relation to the kernel bandwidth, the number of samples, and the intrinsic dimensionality of the manifold. Specifically, when data densities $p$ and $q$ are supported on a $d$-dimensional sub-manifold ${M}$ embedded in an $m$-dimensional space and are H\"older with order $\beta$ (up to 2) on ${M}$, we prove a guarantee of the test power for finite sample size $n$ that exceeds a threshold depending on $d$, $\beta$, and $\Delta_2$ the squared $L^2$-divergence between $p$ and $q$ on the manifold, and with a properly chosen kernel bandwidth $\gamma$. For small density departures, we show that with large $n$ they can be detected by the kernel test when $\Delta_2$ is greater than $n^{- { 2 \beta/( d + 4 \beta ) }}$ up to a certain constant and $\gamma$ scales as $n^{-1/(d+4\beta)}$. The analysis extends to cases where the manifold has a boundary and the data samples contain high-dimensional additive noise. Our results indicate that the kernel two-sample test has no curse-of-dimensionality when the data lie on or near a low-dimensional manifold. We validate our theory and the properties of the kernel test for manifold data through a series of numerical experiments.
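For readers unfamiliar with the statistic, a minimal sketch of a squared-MMD estimate between two samples is given below. It assumes a Gaussian kernel with bandwidth gamma and the simple biased (V-statistic) form; the exact statistic and kernel normalization analyzed in the paper may differ, and the function names are illustrative.

```python
import math
from typing import Sequence

Point = Sequence[float]


def gaussian_kernel(x: Point, y: Point, gamma: float) -> float:
    """Gaussian kernel exp(-||x - y||^2 / (2 * gamma^2)) with bandwidth gamma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * gamma ** 2))


def mean_kernel(xs: Sequence[Point], ys: Sequence[Point], gamma: float) -> float:
    """Average kernel value over all pairs (x, y)."""
    total = sum(gaussian_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd2_biased(xs: Sequence[Point], ys: Sequence[Point], gamma: float) -> float:
    """Biased estimate of squared MMD: E k(x, x') + E k(y, y') - 2 E k(x, y)."""
    return (
        mean_kernel(xs, xs, gamma)
        + mean_kernel(ys, ys, gamma)
        - 2.0 * mean_kernel(xs, ys, gamma)
    )
```

The statistic is zero when the two samples coincide and grows as they separate; in a test it would be compared against a threshold, for example one calibrated by permuting the pooled sample.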

We show the effectiveness of automatic differentiation in efficiently and correctly computing and controlling the spectrum of implicitly linear operators, a rich family of layer types including all standard convolutional and dense layers. We provide the first clipping method which is correct for general convolution layers, and illuminate the representational limitation that caused correctness issues in prior work. We study the effect of batch normalization layers when concatenated with convolutional layers and show how our clipping method can be applied to their composition. By comparing the accuracy and performance of our algorithms to the state-of-the-art methods, using various experiments, we show they are more precise and efficient and lead to better generalization and adversarial robustness. We provide the code for using our methods at https://github.com/Ali-E/FastClip.

Bayesian Generative AI (BayesGen-AI) methods are developed and applied to Bayesian computation. BayesGen-AI reconstructs the posterior distribution by directly modeling the parameter of interest as a mapping (a.k.a. deep learner) from a large simulated dataset. This provides a generator that we can evaluate at the observed data to obtain draws from the posterior distribution. This method applies to all forms of Bayesian inference including parametric models, likelihood-free models, prediction and maximum expected utility problems. Bayesian computation is then equivalent to high-dimensional non-parametric regression. The main advantage of BayesGen-AI is that it is density-free and therefore provides an alternative to Markov Chain Monte Carlo. It has a number of advantages over vanilla generative adversarial networks (GAN) and approximate Bayesian computation (ABC) methods because the generator is simpler to learn than a GAN architecture and is more flexible than the kernel smoothing implicit in ABC methods. Design of the network architecture requires careful selection of features (a.k.a. dimension reduction) and nonlinear architecture for inference. As a generic architecture, we propose a deep quantile neural network and a uniform base distribution at which to evaluate the generator. To illustrate our methodology, we provide two real data examples, the first in traffic flow prediction and the second in building a surrogate for a satellite drag dataset. Finally, we conclude with directions for future research.

We propose a novel data-driven linear inverse model, called Colored-LIM, to extract the linear dynamics and diffusion matrix that define a linear stochastic process driven by Ornstein-Uhlenbeck colored noise. The Colored-LIM is a new variant of the classical linear inverse model (LIM), which relies on the white noise assumption. Similar to LIM, the Colored-LIM approximates the linear dynamics from a finite realization of a stochastic process and then solves for the diffusion matrix based on, for instance, a generalized fluctuation-dissipation relation, which can be done by solving a system of linear equations. The main difficulty is that in practice, the colored-noise process can hardly be observed while it is correlated with the stochastic process of interest. Nevertheless, we show that the local behavior of the correlation function of the observable encodes the dynamics of the stochastic process and the diffusive behavior of the colored noise. In this article, we review the classical LIM and develop Colored-LIM with a mathematical background and rigorous derivations. In the numerical experiments, we examine the performance of both LIM and Colored-LIM. Finally, we discuss some failed attempts to build a linear inverse model for colored-noise driven processes, and investigate the potential misuse of LIM and its consequences in the appendices.

2D-based industrial anomaly detection has been widely discussed; however, multimodal industrial anomaly detection based on 3D point clouds and RGB images still has many untouched fields. Existing multimodal industrial anomaly detection methods directly concatenate the multimodal features, which leads to a strong disturbance between features and harms the detection performance. In this paper, we propose Multi-3D-Memory (M3DM), a novel multimodal anomaly detection method with a hybrid fusion scheme: firstly, we design an unsupervised feature fusion with patch-wise contrastive learning to encourage the interaction of different modal features; secondly, we use a decision layer fusion with multiple memory banks to avoid loss of information and additional novelty classifiers to make the final decision. We further propose a point feature alignment operation to better align the point cloud and RGB features. Extensive experiments show that our multimodal industrial anomaly detection model outperforms the state-of-the-art (SOTA) methods on both detection and segmentation precision on the MVTec-3D AD dataset. Code is available at https://github.com/nomewang/M3DM.

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. E.g., we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.
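The decomposition described above rests on the chain rule for mutual information. As a sketch, with notation assumed here rather than taken from the paper, let $x$ and $y$ be the two views and let $y'$ be a less informed subview that is a function of $y$:

```latex
% Since y' is a function of y, the pair (y', y) carries the same information as y,
% so the chain rule splits the total MI into an unconditional and a conditional term.
\begin{align}
I(x; y) &= I(x; y', y) \\
        &= I(x; y') + I(x; y \mid y')
\end{align}
```

Each term on the right is at most the total, so each can be targeted by a separate contrastive bound; repeating the split with a chain of progressively more informed subviews gives a longer sum of the same form.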

We study the problem of textual relation embedding with distant supervision. To combat the wrong labeling problem of distant supervision, we propose to embed textual relations with global statistics of relations, i.e., the co-occurrence statistics of textual and knowledge base relations collected from the entire corpus. This approach turns out to be more robust to the training noise introduced by distant supervision. On a popular relation extraction dataset, we show that the learned textual relation embedding can be used to augment existing relation extraction models and significantly improve their performance. Most remarkably, for the top 1,000 relational facts discovered by the best existing model, the precision can be improved from 83.9% to 89.3%.
