Human speech production encompasses physiological processes that naturally react to physical stress. Stress caused by physical activity (PA), e.g., running, may lead to significant changes in a person's speech. The major changes concern pitch level, speaking rate, pause pattern, and breathiness. The extent of change presumably depends on the physical fitness and well-being of the person, as well as on the intensity of the PA. The general wellness of a person is further related to his/her physical literacy (PL), which refers to a holistic description of engagement in PA. This paper presents the development of a Cantonese speech database that contains audio recordings of speech before and after physical exercises of different intensity levels. The corpus design and data collection process are described. Preliminary results of acoustical analysis are presented to illustrate the impact of PA on pitch level, pitch range, speaking and articulation rate, and the duration of pauses. It is also noted that the effect of PA is correlated with some of the PA and PL measures.
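The distinction between speaking rate and articulation rate mentioned above can be sketched as follows; the syllable counts and pause durations are hypothetical illustration values, not measurements from the corpus described in the abstract.

```python
# Hedged sketch: speaking rate is computed over the total recording time,
# while articulation rate excludes pause time. Input values are toy numbers.

def speech_rates(n_syllables, total_dur_s, pause_dur_s):
    """Return (speaking rate, articulation rate) in syllables per second."""
    speaking_rate = n_syllables / total_dur_s
    articulation_rate = n_syllables / (total_dur_s - pause_dur_s)
    return speaking_rate, articulation_rate

# Example: 60 syllables over 20 s, of which 5 s are pauses.
sr, ar = speech_rates(60, 20.0, 5.0)
print(sr, ar)  # 3.0 syll/s vs. 4.0 syll/s
```

Comparing the two rates before and after exercise is one way the effect of PA on pause behavior could be separated from its effect on articulation itself.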

The COVID-19 pandemic in 2020 caused sudden shocks to transportation systems, specifically to subway ridership patterns in New York City. Understanding the temporal pattern of subway ridership through statistical models is crucial during such shocks. However, many existing statistical frameworks may not be well suited to analyzing ridership data from the pandemic, since some of their modeling assumptions may be violated during this time. In this paper, utilizing change-point detection procedures, we propose a piecewise-stationary time series model to capture the nonstationary structure of subway ridership. Specifically, the model consists of several independent station-based autoregressive integrated moving average (ARIMA) models concatenated at certain time points. Further, data-driven algorithms are used to detect changes in ridership patterns and to estimate the model parameters before and during the COVID-19 pandemic. The data sets of focus are daily ridership counts for randomly selected subway stations in New York City. Fitting the proposed model to these data sets enhances our understanding of ridership changes during external shocks, both in terms of mean (average) changes and temporal correlations.
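The first step of such a piecewise-stationary model is locating the break point where the series' behavior changes. A minimal sketch of single change-point detection in the mean, using only least squares, is shown below; a full treatment would then fit a separate ARIMA model on each segment, and the toy series is illustrative, not actual ridership data.

```python
# Minimal sketch: locate one change point by minimising the total
# within-segment sum of squared deviations from the segment means.

def find_change_point(x):
    """Return the split index k that best divides x into two level segments."""
    best_k, best_cost = None, float("inf")
    for k in range(1, len(x)):
        left, right = x[:k], x[k:]
        cost = sum((v - sum(left) / len(left)) ** 2 for v in left) \
             + sum((v - sum(right) / len(right)) ** 2 for v in right)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

# Ridership-like toy series: a sharp level drop mimicking a pandemic shock.
series = [100, 98, 101, 99, 102, 40, 42, 38, 41, 39]
print(find_change_point(series))  # → 5
```

Multiple change points can be found by applying the same search recursively to each detected segment (binary segmentation), which is one common data-driven procedure for this task.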

Social platforms such as Gab and Parler, branded as `free-speech' networks, have seen significant growth of their user base in recent years. This popularity is mainly attributed to the stricter moderation enforced by mainstream platforms such as Twitter, Facebook, and Reddit. In this work we provide the first large-scale analysis of hate speech on Parler. We experiment with an array of algorithms for hate-speech detection, demonstrating the limitations of transfer learning in this domain, given the elusive and ever-changing ways in which hate speech is delivered. In order to improve classification accuracy we annotated 10K Parler posts, which we use to fine-tune a BERT classifier. Classification of individual posts is then leveraged for the classification of millions of users via label propagation over the social network. Classifying users by their propensity to disseminate hate, we find that hate mongers make up 16.1\% of Parler's active users, and that they have distinct characteristics compared with other user groups. We find that hate mongers are more active, more central, and express distinct levels of sentiment, conveying a distinct array of emotions such as anger and sadness. We further complement our analysis by comparing the trends discovered on Parler with those found on Gab. To the best of our knowledge, this is among the first works to analyze hate speech on Parler quantitatively and at the user level, and ours is the first annotated dataset to be made available to the community.
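The label-propagation step described above can be sketched in a few lines: a handful of classified "seed" users spread scores to their neighbors over the social graph until the scores settle. The graph, seed labels, and update rule below are toy assumptions, not the paper's actual pipeline.

```python
# Hedged sketch of label propagation: unlabelled users repeatedly take the
# average score of their neighbours, while seed users keep fixed scores
# (1.0 = hateful, 0.0 = not). Graph and seeds are toy values.

def propagate(edges, seeds, n_iter=10):
    nodes = {u for e in edges for u in e}
    scores = {u: seeds.get(u, 0.5) for u in nodes}
    nbrs = {u: [] for u in nodes}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)
    for _ in range(n_iter):
        new = {}
        for u in nodes:
            if u in seeds:
                new[u] = seeds[u]          # labelled seeds stay fixed
            else:
                new[u] = sum(scores[v] for v in nbrs[u]) / len(nbrs[u])
        scores = new
    return scores

edges = [("a", "b"), ("b", "c"), ("c", "d")]
scores = propagate(edges, {"a": 1.0, "d": 0.0})
print(scores["b"] > scores["c"])  # b sits closer to the hateful seed
```

On this chain graph the scores converge toward a linear interpolation between the two seeds, which illustrates why proximity to known hate mongers raises a user's score.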

Recent research suggests that predictions made by machine-learning models can amplify biases present in the training data. When a model amplifies bias, it makes certain predictions at a higher rate for some groups than expected based on training-data statistics. Mitigating such bias amplification requires a deep understanding of the mechanisms in modern machine learning that give rise to that amplification. We perform the first systematic, controlled study into when and how bias amplification occurs. To enable this study, we design a simple image-classification problem in which we can tightly control (synthetic) biases. Our study of this problem reveals that the strength of bias amplification is correlated with measures such as model accuracy, model capacity, model overconfidence, and amount of training data. We also find that bias amplification can vary greatly during training. Finally, we find that bias amplification may depend on the difficulty of the classification task relative to the difficulty of recognizing group membership: bias amplification appears to occur primarily when it is easier to recognize group membership than class membership. Our results suggest best practices for training machine-learning models that we hope will help pave the way for the development of better mitigation strategies.
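The notion of "predicting a class at a higher rate for a group than the training data warrants" can be made concrete with a simple rate comparison. The counts below are synthetic, and the paper's exact amplification metric may differ from this sketch.

```python
# Hedged sketch of a bias-amplification measure: compare how often a model
# predicts a class for a group against the rate of that class for the same
# group in the training labels. All counts are synthetic toy data.

def prediction_rate(labels, groups, target_class, target_group):
    group_labels = [l for l, g in zip(labels, groups) if g == target_group]
    return sum(1 for l in group_labels if l == target_class) / len(group_labels)

train_y = [1, 1, 1, 0, 1, 0, 0, 0]
groups  = ["A", "A", "A", "A", "B", "B", "B", "B"]
pred_y  = [1, 1, 1, 1, 0, 0, 0, 0]   # model over-predicts class 1 for group A

train_rate = prediction_rate(train_y, groups, 1, "A")   # 0.75 in the data
pred_rate  = prediction_rate(pred_y, groups, 1, "A")    # 1.00 predicted
amplification = pred_rate - train_rate
print(amplification)  # 0.25
```

A positive value means the model exaggerates the group-class association beyond the training statistics; zero means it merely reproduces them.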

Efficient usage of in-device storage and computation capabilities is a key solution for supporting data-intensive applications such as immersive digital experiences. This paper proposes a location-dependent multi-antenna coded-caching-based content delivery scheme tailored specifically for wireless immersive viewing applications. First, a memory assignment phase is performed in which content relevant to the identified wireless bottleneck areas is incentivized. As a result, unequal fractions of location-dependent multimedia content are cached at each user. Then, a novel packet generation process is carried out given the asymmetric cache placement. During the subsequent delivery phase, the number of packets transmitted to each user is the same, while the sizes of the packets are proportional to the corresponding location-dependent cache ratios. Finally, each user is served with location-specific content using a joint multicast beamforming and multi-rate modulation scheme that simultaneously benefits from global caching and spatial multiplexing gains. Numerical experiments and mathematical analysis demonstrate significant performance gains compared to the state of the art.

Biometric authentication prospered during the 2010s, but vulnerability to spoofing attacks remains an inherent problem with traditional biometrics. Recently, unobservable physiological signals (e.g., Electroencephalography, Photoplethysmography, Electrocardiography) have been considered as biometrics that could potentially solve this problem. In particular, Photoplethysmography (PPG) measures changes in the blood flow of the human body by an optical method. Clinically, researchers commonly use PPG signals to obtain patients' blood oxygen saturation, heart rate, and other information to assist in diagnosing heart-related diseases. Since PPG signals are easy to obtain and contain a wealth of individual cardiac information, researchers have begun to explore their potential applications in information security. The unique advantages of the PPG signal (simple acquisition, difficulty of theft, and liveness detection) allow it to improve the security and usability of authentication in various respects. However, research on PPG-based authentication is still in its infancy, and the lack of systematization hinders new research in this field. We conduct a comprehensive study of PPG-based authentication and discuss these applications' limitations before pointing out future research directions.

`Tracking' is the collection of data about an individual's activity across multiple distinct contexts and the retention, use, or sharing of data derived from that activity outside the context in which it occurred. This paper aims to introduce tracking on the web, smartphones, and the Internet of Things to an audience with little or no previous knowledge. It covers these topics primarily from the perspective of computer science and human-computer interaction, but also includes relevant law and policy aspects. Rather than a systematic literature review, it aims to provide an over-arching narrative spanning this large research space. Section 1 introduces the concept of tracking. Section 2 provides a short history of the major developments of tracking on the web. Section 3 presents research covering the detection, measurement, and analysis of web tracking technologies. Section 4 delves into countermeasures against web tracking and mechanisms that have been proposed to allow users to control and limit tracking, as well as studies into end-user perspectives on tracking. Section 5 focuses on tracking on `smart' devices, including smartphones and the Internet of Things. Section 6 covers emerging issues affecting the future of tracking across these different platforms.

This paper describes a system that generates speaker-annotated transcripts of meetings by using a microphone array and a 360-degree camera. The hallmark of the system is its ability to handle overlapped speech, which has been an unsolved problem in realistic settings for over a decade. We show that this problem can be addressed by using a continuous speech separation approach. In addition, we describe an online audio-visual speaker diarization method that leverages face tracking and identification, sound source localization, speaker identification, and, if available, prior speaker information for robustness to various real-world challenges. All components are integrated in a meeting transcription framework called SRD, which stands for "separate, recognize, and diarize". Experimental results using recordings of natural meetings involving up to 11 attendees are reported. The continuous speech separation reduces the word error rate (WER) by 16.1% compared with a highly tuned beamformer. When a complete list of meeting attendees is available, the discrepancy between WER and speaker-attributed WER is only 1.0%, indicating accurate word-to-speaker association. This increases marginally to 1.6% when 50% of the attendees are unknown to the system.
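The WER figures above are based on the standard word-level edit distance; a minimal sketch of that metric is below. The example strings are illustrative only, not data from the evaluation.

```python
# Minimal WER sketch: minimum edit distance (substitutions, insertions,
# deletions) between reference and hypothesis word sequences, divided by
# the reference length. Dynamic programming over a (|ref|+1)x(|hyp|+1) table.

def wer(ref, hyp):
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                      # all deletions
    for j in range(len(h) + 1):
        d[0][j] = j                      # all insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)

# One substitution plus one deletion against a 4-word reference.
print(wer("separate recognize and diarize", "separate recognise diarize"))  # → 0.5
```

Speaker-attributed WER applies the same computation after first assigning each hypothesized word to a speaker, so the gap between the two metrics isolates attribution errors.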

Recently, Attention-Gated Convolutional Neural Networks (AGCNNs) have performed well on several essential sentence classification tasks and shown robust performance in practical applications. However, AGCNNs require many hyperparameters to be set, and it is not known how sensitively the model's performance changes with them. In this paper, we conduct a sensitivity analysis of the effect of different hyperparameters of AGCNNs, e.g., the kernel window size and the number of feature maps. We also investigate the effect of different combinations of hyperparameter settings on the model's performance, to analyze to what extent different settings contribute to AGCNNs' performance, and we draw practical advice from a wide range of empirical results. Through the sensitivity analysis experiments, we improve the hyperparameter settings of AGCNNs. Experiments show that our proposals achieve an average of 0.81% and 0.67% improvement on AGCNN-NLReLU-rand and AGCNN-SELU-rand, respectively, and an average of 0.47% and 0.45% improvement on AGCNN-NLReLU-static and AGCNN-SELU-static, respectively.
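The sensitivity analysis described above amounts to evaluating the model over a grid of hyperparameter settings and examining the spread of results. The sketch below uses a toy `toy_eval` stand-in for training an AGCNN; the grid values and scoring function are illustrative assumptions.

```python
# Hedged sketch of a hyperparameter sensitivity sweep: evaluate every
# combination in a grid, report the best setting and the accuracy spread
# (a large spread means performance is sensitive to these hyperparameters).
from itertools import product

def sensitivity(evaluate, grid):
    """Return (best setting dict, max-min accuracy spread) over the grid."""
    results = {s: evaluate(dict(zip(grid, s))) for s in product(*grid.values())}
    best = max(results, key=results.get)
    spread = max(results.values()) - min(results.values())
    return dict(zip(grid, best)), spread

grid = {"kernel_size": [3, 5, 7], "n_feature_maps": [100, 200]}
# Toy stand-in for "train and score an AGCNN with this configuration".
toy_eval = lambda cfg: 0.80 + 0.01 * (cfg["kernel_size"] == 5) \
                            + 0.005 * (cfg["n_feature_maps"] == 200)
best, spread = sensitivity(toy_eval, grid)
print(best, spread)  # kernel_size 5 with 200 feature maps wins here
```

In practice each evaluation would be a full training run, so the grid is usually kept coarse and refined around promising regions.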

Objects are made of parts, each with distinct geometry, physics, functionality, and affordances. Developing such a distributed, physical, interpretable representation of objects will help intelligent agents better explore and interact with the world. In this paper, we study physical primitive decomposition, i.e., understanding an object through its components, each with physical and geometric attributes. As annotated data for object parts and physics are rare, we propose a novel formulation that learns physical primitives by explaining both an object's appearance and its behavior in physical events. Our model performs well on block towers and tools in both synthetic and real scenarios; we also demonstrate that visual and physical observations often provide complementary signals. We further present ablation and behavioral studies to better understand our model and contrast it with human performance.

We introduce MilkQA, a question answering dataset from the dairy domain dedicated to the study of consumer questions. The dataset contains 2,657 pairs of questions and answers, written in Portuguese and originally collected by the Brazilian Agricultural Research Corporation (Embrapa). All questions were motivated by real situations and written by thousands of authors with very different backgrounds and levels of literacy, while answers were elaborated by specialists from Embrapa's customer service. Our dataset was filtered and anonymized by three human annotators. Consumer questions are a challenging type of question, usually posed as a way of seeking information. Although several question answering datasets are available, most of these resources are not suitable for research on answer selection models for consumer questions. We aim to fill this gap by making MilkQA publicly available. We study the behavior of four answer selection models on MilkQA: two baseline models and two convolutional neural network architectures. Our results show that MilkQA poses real challenges to computational models, particularly due to the linguistic characteristics of its questions and their unusually long length. Only one of the models we experimented with gives reasonable results, at the cost of high computational requirements.
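A trivial answer-selection baseline, far simpler than the CNN models evaluated on MilkQA, is to rank candidate answers by lexical overlap with the question; the sketch below illustrates the task format with toy English examples (the actual data is in Portuguese).

```python
# Hedged baseline sketch for answer selection: score each candidate answer
# by Jaccard word overlap with the question and pick the highest. This is
# an illustrative stand-in, not one of the models studied in the paper.

def overlap_score(question, answer):
    q, a = set(question.lower().split()), set(answer.lower().split())
    return len(q & a) / len(q | a)

question = "how do I store raw milk safely"
candidates = [
    "store raw milk refrigerated below four degrees",
    "cheese ripening takes several weeks",
]
best = max(candidates, key=lambda c: overlap_score(question, c))
print(best)  # the milk-storage answer wins on word overlap
```

Long, informally written consumer questions defeat such lexical baselines quickly, which is part of why the dataset is reported to be challenging for stronger neural models as well.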
