We propose a novel semi-supervised active learning (SSAL) framework for monocular 3D object detection with LiDAR guidance (MonoLiG), which leverages all modalities of collected data during model development. We utilize LiDAR to guide the data selection and training of monocular 3D detectors without introducing any overhead in the inference phase. During training, we leverage the LiDAR-teacher, monocular-student cross-modal framework from semi-supervised learning to distill information from unlabeled data as pseudo-labels. To handle the differences in sensor characteristics, we propose a data noise-based weighting mechanism to reduce the effect of propagating noise from the LiDAR modality to the monocular one. For selecting which samples to label to improve the model performance, we propose a sensor consistency-based selection score that is also coherent with the training objective. Extensive experimental results on the KITTI and Waymo datasets verify the effectiveness of our proposed framework. In particular, our selection strategy consistently outperforms state-of-the-art active learning baselines, yielding up to a 17% better saving rate in labeling costs. Our training strategy attains the top place in the official KITTI 3D and bird's-eye-view (BEV) monocular object detection benchmarks by improving the BEV Average Precision (AP) by 2.02.

Related content

The latest advancements in AI and deep learning have led to a breakthrough in large language model (LLM)-based agents such as GPT-4. However, many commercial conversational agent development tools are pipeline-based and have limitations in holding a human-like conversation. This paper investigates the capabilities of LLMs to enhance pipeline-based conversational agents during two phases: 1) the design and development phase and 2) operations. In phase 1, LLMs can aid in generating training data, extracting entities and synonyms, localization, and persona design. In phase 2, LLMs can assist in contextualization, intent classification to prevent conversational breakdown and handle out-of-scope questions, auto-correcting utterances, rephrasing responses, formulating disambiguation questions, summarization, and enabling closed question-answering capabilities. We conducted informal experiments with GPT-4 in the private banking domain to demonstrate the scenarios above with a practical example. Companies may be hesitant to replace their pipeline-based agents with LLMs entirely due to privacy concerns and the need for deep integration within their existing ecosystems. A hybrid approach in which LLMs are integrated into the pipeline-based agents allows them to save the time and cost of building and running agents by capitalizing on the capabilities of LLMs while retaining the integration and privacy safeguards of their existing systems.

Patch robustness certification ensures that no patch within a given bound on a sample can manipulate a deep learning model into predicting a different label. However, existing techniques cannot certify samples that fail to meet their strict bars at the classifier or patch region levels. This paper proposes MajorCert. MajorCert first finds all possible label sets manipulatable by the same patch region on the same sample across the underlying classifiers, then enumerates their combinations element-wise, and finally checks whether the majority invariant of all these combinations is intact to certify samples.
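The three steps named in the abstract (collect label sets, enumerate combinations, check the majority invariant) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the input layout (one list of per-classifier label sets for each patch region), and the rule that a tie counts as a broken invariant are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def majority_label(labels):
    """Return the strictly most frequent label, or None if empty or tied."""
    top = Counter(labels).most_common(2)
    if not top:
        return None
    if len(top) > 1 and top[0][1] == top[1][1]:
        return None  # assumption: a tie is treated as "no majority"
    return top[0][0]


def certify_sample(clean_labels, label_sets_per_region):
    """Check that the majority vote survives every enumerated combination.

    clean_labels: labels predicted by each classifier on the unpatched sample.
    label_sets_per_region: for each patch region, a list holding, per
        classifier, the set of labels that region could force (hypothetical
        input; how these sets are obtained is outside this sketch).
    """
    target = majority_label(clean_labels)
    if target is None:
        return False
    for label_sets in label_sets_per_region:
        # Enumerate one label per classifier, element-wise across the sets.
        for combination in product(*label_sets):
            if majority_label(combination) != target:
                return False
    return True
```

With three classifiers that all predict label 0 on the clean sample, a region whose label sets are `[{0}, {0, 1}, {0}]` keeps the majority intact, whereas `[{0, 1}, {0, 1}, {0}]` admits the combination `(1, 1, 0)` and so fails the check.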

Deep learning combined with physics-based modeling represents an attractive and efficient approach for producing accurate and robust surrogate models. In this paper, we introduce a new framework that utilizes Physics Informed Neural Networks (PINN) to solve PDE-based problems for the creation of surrogate models for steady-state flow-thermal engineering design applications. The surrogate models developed through this framework are demonstrated on several use cases from electronics cooling to biomechanics. Additionally, it is demonstrated how these trained surrogate models can be combined with design optimization methods to improve the efficiency and reduce the cost of the design process. The former is shown through several realistic 3D examples and the latter via a detailed cost-benefit trade-off. Overall, the findings of this paper demonstrate that hybrid data-PINN surrogate models combined with optimization algorithms can solve realistic design optimization problems and have potential in a wide variety of application areas.

This study considers a novel full-duplex (FD) massive multiple-input multiple-output (mMIMO) system using a hybrid beamforming (HBF) architecture, which allows for simultaneous uplink (UL) and downlink (DL) transmission over the same frequency band. In particular, our objective is to mitigate the strong self-interference (SI) solely through the design of the UL and DL RF beamforming stages, jointly with sub-array selection (SAS) for the transmit (Tx) and receive (Rx) sub-arrays at the base station (BS). Based on the SI channel measured in an anechoic chamber, we propose a min-SI beamforming scheme with SAS, which applies perturbations to the beam directivity to enhance SI suppression in the UL and DL beam directions. To solve this challenging nonconvex optimization problem, we propose a swarm intelligence-based algorithmic solution to find the optimal perturbations as well as the Tx and Rx sub-arrays that minimize SI subject to the directivity degradation constraints for the UL and DL beams. The results show that the proposed min-SI BF scheme can achieve SI suppression as high as 78 dB in FD mMIMO systems.

Neural temporal point processes (TPPs) have shown promise for modeling continuous-time event sequences. However, capturing the interactions between events is challenging yet critical for performing inference tasks like forecasting on event sequence data. Existing TPP models have focused on parameterizing the conditional distribution of future events but struggle to model event interactions. In this paper, we propose a novel approach that leverages Neural Relational Inference (NRI) to learn a relation graph that infers interactions while simultaneously learning the dynamic patterns from observational data. Our approach, the Contrastive Relational Inference-based Hawkes Process (CRIHP), reasons about event interactions under a variational inference framework. It utilizes intensity-based learning to search for prototype paths to contrast relationship constraints. Extensive experiments on three real-world datasets demonstrate the effectiveness of our model in capturing event interactions for event sequence modeling tasks.

We introduce a Bayesian framework for mixed-type multivariate regression using continuous shrinkage priors. Our framework enables joint analysis of mixed continuous and discrete outcomes and facilitates variable selection from the $p$ covariates. Theoretical studies of Bayesian mixed-type multivariate response models have not been conducted previously and require more intricate arguments than the corresponding theory for univariate response models due to the correlations between the responses. In this paper, we investigate necessary and sufficient conditions for posterior contraction of our method when $p$ grows faster than sample size $n$. The existing literature on Bayesian high-dimensional asymptotics has focused only on cases where $p$ grows subexponentially with $n$. In contrast, we consider the asymptotic regime where $p$ is allowed to grow exponentially in terms of $n$. We develop a novel two-step approach for variable selection which possesses the sure screening property and provably achieves posterior contraction even under exponential growth of $p$. We demonstrate the utility of our method through simulation studies and applications to real datasets. The R code to implement our method is available at //github.com/raybai07/MtMBSP.

Delay alignment modulation (DAM) is an emerging technique for achieving inter-symbol interference (ISI)-free wideband communications using spatial-delay processing, without relying on channel equalization or multi-carrier transmission. However, existing works on DAM only consider multiple-input single-output (MISO) communication systems and assume time-invariant channels. In this paper, by extending DAM to time-variant frequency-selective multiple-input multiple-output (MIMO) channels, we propose a novel technique termed delay-Doppler alignment modulation (DDAM). Specifically, by leveraging delay-Doppler compensation and path-based beamforming, the Doppler effect of each multi-path can be eliminated and all multi-path signal components may reach the receiver concurrently and constructively. We first show that by applying path-based zero-forcing (ZF) precoding and receive combining, DDAM can transform the original time-variant frequency-selective channels into time-invariant ISI-free channels. The necessary and/or sufficient conditions to achieve such a transformation are derived. Then an asymptotic analysis is provided by showing that when the number of base station (BS) antennas is much larger than that of channel paths, DDAM enables time-invariant ISI-free channels with the simple delay-Doppler compensation and path-based maximal-ratio transmission (MRT) beamforming. Furthermore, for the general DDAM design with some tolerable ISI, the path-based transmit precoding and receive combining matrices are optimized to maximize the spectral efficiency. Numerical results are provided to compare the proposed DDAM technique with various benchmarking schemes, including MIMO-orthogonal time frequency space (OTFS), MIMO-orthogonal frequency-division multiplexing (OFDM) without or with carrier frequency offset (CFO) compensation, and beam alignment along the dominant path.

Recent contrastive representation learning methods rely on estimating mutual information (MI) between multiple views of an underlying context. For example, we can derive multiple views of a given image by applying data augmentation, or we can split a sequence into views comprising the past and future of some step in the sequence. Contrastive lower bounds on MI are easy to optimize, but have a strong underestimation bias when estimating large amounts of MI. We propose decomposing the full MI estimation problem into a sum of smaller estimation problems by splitting one of the views into progressively more informed subviews and by applying the chain rule on MI between the decomposed views. This expression contains a sum of unconditional and conditional MI terms, each measuring modest chunks of the total MI, which facilitates approximation via contrastive bounds. To maximize the sum, we formulate a contrastive lower bound on the conditional MI which can be approximated efficiently. We refer to our general approach as Decomposed Estimation of Mutual Information (DEMI). We show that DEMI can capture a larger amount of MI than standard non-decomposed contrastive bounds in a synthetic setting, and learns better representations in a vision domain and for dialogue generation.
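The chain-rule decomposition the abstract refers to can be written out for the simplest case of a single split. The notation here is an assumption for illustration, not taken from the abstract: $x$ and $y$ are the two views, and $x'$ is a less informed subview computed from $x$ (for instance, a crop of an image).

```latex
% Because x' is a function of x, the pair (x', x) carries the same
% information about y as x alone, so the chain rule on MI gives:
\begin{align}
  I(x; y) &= I(x', x; y) \\
          &= \underbrace{I(x'; y)}_{\text{unconditional term}}
           + \underbrace{I(x; y \mid x')}_{\text{conditional term}} .
\end{align}
% With nested subviews x^{1}, \dots, x^{K} = x, each a function of the next,
% the same argument applied repeatedly yields
\begin{equation}
  I(x; y) = I(x^{1}; y) + \sum_{k=2}^{K} I\!\left(x^{k}; y \mid x^{k-1}\right).
\end{equation}
```

Each term on the right-hand side is no larger than the total, which is the sense in which the sum consists of "modest chunks" that a contrastive bound can approximate more easily than the full quantity.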

This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framework. We show that (1) composition of data augmentations plays a critical role in defining effective predictive tasks, (2) introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and (3) contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning. By combining these findings, we are able to considerably outperform previous methods for self-supervised and semi-supervised learning on ImageNet. A linear classifier trained on self-supervised representations learned by SimCLR achieves 76.5% top-1 accuracy, which is a 7% relative improvement over previous state-of-the-art, matching the performance of a supervised ResNet-50. When fine-tuned on only 1% of the labels, we achieve 85.8% top-5 accuracy, outperforming AlexNet with 100X fewer labels.
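The abstract does not spell out the contrastive loss, so the following NumPy sketch assumes the normalized temperature-scaled cross-entropy form over a batch of paired views. The function name, the temperature value, and the input layout (two arrays of already projected embeddings) are illustrative choices, not details taken from the text above.

```python
import numpy as np


def nt_xent_loss(z1, z2, temperature=0.5):
    """Contrastive loss for two (N, D) arrays of embeddings.

    Row i of z1 and row i of z2 are assumed to come from two augmentations
    of the same sample; every other row in the batch acts as a negative.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)               # (2N, D)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # dot product = cosine
    sim = (z @ z.T) / temperature                      # (2N, 2N) logits
    np.fill_diagonal(sim, -np.inf)                     # drop self-similarity

    # Index of the positive partner for each of the 2N rows.
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable row-wise log-sum-exp over the remaining 2N - 1 logits.
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))

    log_prob_positive = sim[np.arange(2 * n), positives] - log_denominator
    return -log_prob_positive.mean()
```

Because every other item in the batch serves as a negative, a larger batch supplies more negatives per positive pair, which is one way to read finding (3) about batch size.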

We propose a new method for the event extraction (EE) task based on an imitation learning framework, specifically, inverse reinforcement learning (IRL) via a generative adversarial network (GAN). The GAN estimates proper rewards according to the difference between the actions committed by the expert (or ground truth) and the agent among complicated states in the environment. The EE task benefits from these dynamic rewards because instances and labels vary in difficulty and the gains are expected to be diverse (for example, an ambiguous but correctly detected trigger or argument should receive high gains), whereas traditional RL models usually neglect such differences and pay equal attention to all instances. Moreover, our experiments demonstrate that the proposed framework outperforms state-of-the-art methods, without explicit feature engineering.
