亚洲男人的天堂2018av,欧美草比,久久久久久免费视频精选,国色天香在线看免费,久久久久亚洲av成人片仓井空

This study introduces a reformed Sinc-convolution (Sincconv) framework tailored for the encoder component of deep networks for speech enhancement (SE). The reformed Sincconv, based on parametrized sinc functions as band-pass filters, offers notable advantages in terms of training efficiency, filter diversity, and interpretability. The reformed Sinc-conv is evaluated in conjunction with various SE models, showcasing its ability to boost SE performance. Furthermore, the reformed Sincconv provides valuable insights into the specific frequency components that are prioritized in an SE scenario. This opens up a new direction of SE research and improving our knowledge of their operating dynamics.

相關內容

語音增強是指當語音信號被各種各樣的噪聲干擾、甚至淹沒后,從噪聲背景中提取有用的語音信號,抑制、降低噪聲干擾的技術。一句話,從含噪語音中提取盡可能純凈的原始語音。

This study explores implementing a digital twin network (DTN) for efficient 6G wireless network management, aligning with the fault, configuration, accounting, performance, and security (FCAPS) model. The DTN architecture comprises the Physical Twin Layer, implemented using NS-3, and the Service Layer, featuring machine learning and reinforcement learning for optimizing carrier sensitivity threshold and transmit power control in wireless networks. We introduce a robust "What-if Analysis" module, utilizing conditional tabular generative adversarial network (CTGAN) for synthetic data generation to mimic various network scenarios. These scenarios assess four network performance metrics: throughput, latency, packet loss, and coverage. Our findings demonstrate the efficiency of the proposed what-if analysis framework in managing complex network conditions, highlighting the importance of the scenario-maker step and the impact of twinning intervals on network performance.

What makes waveform-based deep learning so hard? Despite numerous attempts at training convolutional neural networks (convnets) for filterbank design, they often fail to outperform hand-crafted baselines. These baselines are linear time-invariant systems: as such, they can be approximated by convnets with wide receptive fields. Yet, in practice, gradient-based optimization leads to suboptimal approximations. In our article, we approach this phenomenon from the perspective of initialization. We present a theory of large deviations for the energy response of FIR filterbanks with random Gaussian weights. We find that deviations worsen for large filters and locally periodic input signals, which are both typical for audio signal processing applications. Numerical simulations align with our theory and suggest that the condition number of a convolutional layer follows a logarithmic scaling law between the number and length of the filters, which is reminiscent of discrete wavelet bases.

This study devised a physics-informed neural network (PINN) framework to solve the wave equation for acoustic resonance analysis. The proposed analytical model, ResoNet, minimizes the loss function for periodic solutions and conventional PINN loss functions, thereby effectively using the function approximation capability of neural networks while performing resonance analysis. Additionally, it can be easily applied to inverse problems. The resonance in a one-dimensional acoustic tube, and the effectiveness of the proposed method was validated through the forward and inverse analyses of the wave equation with energy-loss terms. In the forward analysis, the applicability of PINN to the resonance problem was evaluated via comparison with the finite-difference method. The inverse analysis, which included identifying the energy loss term in the wave equation and design optimization of the acoustic tube, was performed with good accuracy.

Extended reality (XR) is one of the most important applications of beyond 5G and 6G networks. Real-time XR video transmission presents challenges in terms of data rate and delay. In particular, the frame-by-frame transmission mode of XR video makes real-time XR video very sensitive to dynamic network environments. To improve the users' quality of experience (QoE), we design a cross-layer transmission framework for real-time XR video. The proposed framework allows the simple information exchange between the base station (BS) and the XR server, which assists in adaptive bitrate and wireless resource scheduling. We utilize the cross-layer information to formulate the problem of maximizing user QoE by finding the optimal scheduling and bitrate adjustment strategies. To address the issue of mismatched time scales between two strategies, we decouple the original problem and solve them individually using a multi-agent-based approach. Specifically, we propose the multi-step Deep Q-network (MS-DQN) algorithm to obtain a frame-priority-based wireless resource scheduling strategy and then propose the Transformer-based Proximal Policy Optimization (TPPO) algorithm for video bitrate adaptation. The experimental results show that the TPPO+MS-DQN algorithm proposed in this study can improve the QoE by 3.6% to 37.8%. More specifically, the proposed MS-DQN algorithm enhances the transmission quality by 49.9%-80.2%.

This study aims to establish a computer-aided diagnosis system for endobronchial ultrasound (EBUS) surgery to assist physicians in the preliminary diagnosis of metastatic cancer. This involves arranging immediate examinations for other sites of metastatic cancer after EBUS surgery, eliminating the need to wait for reports, thereby shortening the waiting time by more than half and enabling patients to detect other cancers earlier, allowing for early planning and implementation of treatment plans. Unlike previous studies on cell image classification, which have abundant datasets for training, this study must also be able to make effective classifications despite the limited amount of case data for lung metastatic cancer. In the realm of small data set classification methods, Few-shot learning (FSL) has become mainstream in recent years. Through its ability to train on small datasets and its strong generalization capabilities, FSL shows potential in this task of lung metastatic cell image classification. This study will adopt the approach of Few-shot learning, referencing existing proposed models, and designing a model architecture for classifying lung metastases cell images. Batch Spectral Regularization (BSR) will be incorporated as a loss update parameter, and the Finetune method of PMF will be modified. In terms of test results, the addition of BSR and the modified Finetune method further increases the accuracy by 8.89% to 65.60%, outperforming other FSL methods. This study confirms that FSL is superior to supervised and transfer learning in classifying metastatic cancer and demonstrates that using BSR as a loss function and modifying Finetune can enhance the model's capabilities.

Energy efficiency and memory footprint of a convolutional neural network (CNN) implemented on a CNN inference accelerator depend on many factors, including a weight quantization strategy (i.e., data types and bit-widths) and mapping (i.e., placement and scheduling of DNN elementary operations on hardware units of the accelerator). We show that enabling rich mixed quantization schemes during the implementation can open a previously hidden space of mappings that utilize the hardware resources more effectively. CNNs utilizing quantized weights and activations and suitable mappings can significantly improve trade-offs among the accuracy, energy, and memory requirements compared to less carefully optimized CNN implementations. To find, analyze, and exploit these mappings, we: (i) extend a general-purpose state-of-the-art mapping tool (Timeloop) to support mixed quantization, which is not currently available; (ii) propose an efficient multi-objective optimization algorithm to find the most suitable bit-widths and mapping for each DNN layer executed on the accelerator; and (iii) conduct a detailed experimental evaluation to validate the proposed method. On two CNNs (MobileNetV1 and MobileNetV2) and two accelerators (Eyeriss and Simba) we show that for a given quality metric (such as the accuracy on ImageNet), energy savings are up to 37% without any accuracy drop.

We present a framework to integrate tensor network (TN) methods with reinforcement learning (RL) for solving dynamical optimisation tasks. We consider the RL actor-critic method, a model-free approach for solving RL problems, and introduce TNs as the approximators for its policy and value functions. Our "actor-critic with tensor networks" (ACTeN) method is especially well suited to problems with large and factorisable state and action spaces. As an illustration of the applicability of ACTeN we solve the exponentially hard task of sampling rare trajectories in two paradigmatic stochastic models, the East model of glasses and the asymmetric simple exclusion process (ASEP), the latter being particularly challenging to other methods due to the absence of detailed balance. With substantial potential for further integration with the vast array of existing RL methods, the approach introduced here is promising both for applications in physics and to multi-agent RL problems more generally.

The core network is experiencing bandwidth capacity constraints as internet traffic grows. As a result, the notion of a Multi-band flexible-grid optical network was established to increase the lifespan of an optical core network. In this paper, we use the C+L band for working traffic transmission and the S-band for protection against failure. Furthermore, we compare the proposed method with the existing ones.

Recent advancements in Text-to-SQL (Text2SQL) emphasize stimulating the large language models (LLM) on in-context learning, achieving significant results. Nevertheless, they face challenges when dealing with verbose database information and complex user intentions. This paper presents a two-stage framework to enhance the performance of current LLM-based natural language to SQL systems. We first introduce a novel prompt representation, called reference-enhanced representation, which includes schema information and randomly sampled cell values from tables to instruct LLMs in generating SQL queries. Then, in the first stage, question-SQL pairs are retrieved as few-shot demonstrations, prompting the LLM to generate a preliminary SQL (PreSQL). After that, the mentioned entities in PreSQL are parsed to conduct schema linking, which can significantly compact the useful information. In the second stage, with the linked schema, we simplify the prompt's schema information and instruct the LLM to produce the final SQL. Finally, as the post-refinement module, we propose using cross-consistency across different LLMs rather than self-consistency within a particular LLM. Our methods achieve new SOTA results on the Spider benchmark, with an execution accuracy of 87.6%.

In this article, we evaluate the first experience of computation offloading from drones to real fifth-generation (5G) operator systems, including commercial and private carrier-grade 5G networks. A follow-me drone service was implemented as a representative testbed of remote video analytics. In this application, an image of a person from a drone camera is processed at the edge, and image tracking displacements are translated into positioning commands that are sent back to the drone, so that the drone keeps the camera focused on the person at all times. The application is characterised to identify the processing and communication contributions to service delay. Then, we evaluate the latency of the application in a real non standalone 5G operator network, a standalone carrier-grade 5G private network, and, to compare these results with previous research, a Wi-Fi wireless local area network. We considered both multi-access edge computing (MEC) and cloud offloading scenarios. Onboard computing was also evaluated to assess the trade-offs with task offloading. The results determine the network configurations that are feasible for the follow-me application use case depending on the mobility of the end user, and to what extent MEC is advantageous over a state-of-the-art cloud service.

北京阿比特科技有限公司