ICDM Conference 2019

Multilingual Cognitive Services Workshop 2019

November 8th, Beijing, China

Technical Description of Workshop

Many aspects of artificial intelligence, including search, question answering, and Internet of Things automation in home assistants, rely on robust cognitive services such as natural language understanding from speech and text. One technical challenge that remains an open research area, and is coming to the forefront of this field, is that of adapting cognitive services across languages to serve a worldwide community of multilingual and multicultural users. This workshop will address research problems in cognitive service design and development that center on multilingual translation, speech recognition, and, to a degree, text understanding. These problems include:

  • Code mixing: mixed languages in speech and text (conversation, queries, commands)
  • Language recognition: identifying languages in small units of mixed natural language
  • Accents: identifying and adapting to regional accents and second-language speakers
  • Dialogue agents: generating responses and handling language switching in conversational contexts
  • Standardization/transcription: translating mixed texts and transcripts to one language

Nearly 20% of people in the United States, and 56% in Europe, consider themselves to be multilingual. As of 2018, self-described bilingual speakers account for 43% of people worldwide and trilingual speakers for 13%; only 40% of people across the world are monolingual. In recent years, there has been extensive research on cognitive services, language detection and monolingual translation; however, as globalization adds increasing numbers of multilingual users, the topic of multilingual cognitive services is becoming more prominent, with its own technical challenges, methodologies, and user needs. This workshop aims to gather data science and machine learning researchers from many related areas to discuss how to meet these challenges and needs with new data mining approaches.

For example, there are many different brands of home assistants in different countries. However, when they are used by multilingual speakers, failures of natural language recognition by cognitive services can greatly diminish their accessibility and usability, to the point that they become less practical in their primary purpose (speech-based functions) than mobile devices and applications. When multilingual speakers ask for music by their favorite creative artists or search for information on notable people, places, and things, they are often unable to use native personal and place names, or local terms, because these embedded named entities may be treated as foreign phrases by a regionalized cognitive service. The crucial issue is that most cognitive services are regionalized to be intrinsically monolingual, an assumption that is part of the inherent problem for the large and growing body of multilingual users.

Therefore, we seek to bring together researchers from different fields of data mining, including transdisciplinary and interdisciplinary data scientists, to discuss their innovations, views, and visions regarding cutting-edge cognitive services technology.

Active research areas that are related to cognitive services include:

  • Data mining and computational linguistics in multilingual domains
  • Multimodal data science, especially video (dialogues, speechreading)
  • Machine learning using multilingual natural language data, including text/transcripts
  • Multilingual speech recognition/prediction with deep learning/artificial neural nets
  • Human-centered computing, including cognitive models and user modeling
  • Home assistants and other dialogue agents
  • Machine translation
  • Human-robot interaction (HRI) and human-computer interaction (HCI)
  • Usability of interactive services: how to respond to multilingual queries and dialogue
  • User adaptation and personalization
  • Understanding emotions in user context: home/work, friends/strangers, online/in person

The emphasis of this workshop shall be on approaches based on the above methodologies.

Intended Audience and Impact

We welcome paper submissions from researchers in all areas of domain adaptation in cognitive services, particularly:

  • data mining for cognitive user modeling, adaptation, and personalization
  • higher-level tasks: question answering (QA) and knowledge-base population
  • speech recognition
  • machine translation and language recognition
  • natural language processing

We also hope to attract ICDM participants from industrial R&D with interesting current applications that showcase multilingual aspects of social media.



8:00 AM


9:30 AM


10:30 AM


10:45 AM


11:30 AM


12:00 NOON