Research article

MULTI-DIMENSIONAL MEANING ANNOTATION IN SYNTHESIS OF LISTENER VOCALIZATIONS

Viswanatha Reddy Allugunti, Dr. Biplab Kumar Sarkar, Dr. Raman Dugyala

Online First: November 29, 2022


With the ever-increasing role of computers in many areas of today’s society, human-machine interaction has become an increasingly prominent part of our daily life. Machines and the ways people interact with them have changed dramatically in the past few decades. Traditionally, human-machine interaction has often been regarded as a purely rational activity, in which emotions and social aspects are secondary. This view has been changing since the mid-1990s, when several studies (Langer 1992; Nass and Moon 2000) demonstrated that individuals mindlessly apply social rules and expectations to computers. People tend to interact with computers as if they were human-like, unconsciously applying social rules even when they believe that such an attribution is not appropriate. Generation of listener vocalizations is one of the main goals of emotionally colored conversational speech synthesis. Success in this task depends on the answers to three questions: What kinds of meaning are expressed through listener vocalizations? What form is suitable for a given meaning? And in what context should which listener vocalizations be produced? In this paper, we address the first of these questions. We present a method to record natural and expressive listener vocalizations for synthesis, and describe our approach to identifying a suitable categorical description of the meaning conveyed in the vocalizations. In our data, one actor produces a total of 967 listener vocalizations, in his natural speaking style and in three acted emotion-specific personalities. In an open categorization scheme, we find that eleven categories each occur in at least 5% of the vocalizations.

Keywords

Multi-Dimensional, Meaning, Annotation, Synthesis, Listener, Vocalizations.