In the laboratory we are developing a new concept of conversational language understanding (CLU), built on the idea that verbal and non-verbal conversational cues are complementary and equally important in conversation. Key research activities focus on understanding and modeling conversational language in human-to-human interaction. In natural language processing and spoken language understanding, gestures and other non-verbal communication, that is, the ability to convey information through more than words alone, are beginning to play an important role in human-machine interaction. It has become clear that integrating such signals is key to capturing the more personal aspects of user input and to shaping how device responses are presented to people. In face-to-face interaction, non-verbal signals transmitted along with spoken content (or even in its absence) are crucial for establishing cohesion in the discourse. We are therefore developing new fusion-based models and artificial intelligence (AI) algorithms, based on deep learning techniques, that aim at an in-depth understanding of the cognitive interplay in human interaction, with communicative intent at its core.
In the field of conversational agents, straightforwardness and the ability to interact with the interlocutor are among the driving factors for user experience in IoT environments. Advanced emotional interaction plays an increasingly important role in human-machine interaction, partly because the systems in the user's environment are becoming increasingly complex. Laboratory research thus encompasses the synthesis and emulation of natural human-machine interaction. It covers all components involved in human interaction (or conversation), with the goal of developing conversational behavior that mimics the complete life cycle of human interaction. This life cycle generally involves receiving multimodal input, understanding the information and planning a response, and delivering information in the form of multimodal output. Spoken language is most often accompanied by facial expressions, gestures, and prosody, that is, it is multimodal. We pay particular attention to this multimodal dimension and accordingly develop the necessary language resources and the linguistic and technological tools for the Slovenian language.
In the field of human-machine interaction, research in the laboratory extends to new mobile technologies and applications that sense the broader context of the user, based on heterogeneous sensor systems and data fusion within the Internet of Things concept. Research in this area focuses on new concepts, methods, and approaches for heterogeneous wearable sensor systems. These are built on smart embedded systems that, through low-energy wireless communication technologies, integrate seamlessly with mobile devices and cloud services, drawing on the Internet of Things concept and on data fusion based on analytics over large amounts of data. The Digital Signal Processing Laboratory has expertise in natural language and speech processing and generation (NLP, NLG, SLU, ECA, TTS, ASR), audio-visual signal processing and classification (gender classification, emotion and mood detection), user experience and quality of experience, machine learning and artificial intelligence (including deep learning), intelligent IoT environments, and supportive living environments.