ICMI 2026 – 28th International Conference on Multimodal Interaction: Everything You Need to Know

KEY FACTS
Date: 2026-10-06 to 2026-10-08
Location: Napoli, Italy
Type: Conference
Website: icmi.acm.org/2026

What Is ICMI 2026?

The 28th International Conference on Multimodal Interaction (ICMI 2026) is an annual conference hosted by the Association for Computing Machinery (ACM) and dedicated to the science of multimodal interaction. Scheduled for October 6–8, 2026, in Napoli, Italy, it brings together researchers, engineers, and practitioners from around the world to explore how machines can understand and respond to human communication through multiple channels, including vision, speech, language, touch, and gesture.

Organized under the auspices of ACM’s Special Interest Group on Computer-Human Interaction (SIGCHI) and the Special Interest Group on Artificial Intelligence (SIGAI), ICMI 2026 is a meeting point for advances in multimodal AI and human-computer interaction (HCI). The conference is an established venue for work that connects individual modalities, such as computer vision and natural language processing, with the integrated systems that combine them. For AI professionals, ICMI 2026 is an opportunity to engage with current research on how machines perceive and interact with the world across several modalities at once.

Why It Matters for AI Professionals

As AI systems move beyond single-task benchmarks toward real-world deployment, the ability to fuse multiple input streams (audio, visual, textual, and sensor data) has become a core competency. ICMI 2026 addresses this shift directly, covering the architectures, datasets, and evaluation methods that make multimodal systems robust and context-aware. Attendees can expect to learn how research labs are tackling challenges such as cross-modal alignment, temporal synchronization, and the handling of ambiguous or incomplete input.

For professionals working in conversational AI, autonomous systems, assistive technology, or user experience design, the conference offers practical knowledge on building interfaces that feel more natural and intuitive. The integration of vision, speech, and language is not just an academic pursuit; it is increasingly central to products ranging from smart assistants to augmented reality platforms. ICMI 2026 offers a concentrated look at where the field is heading and which technical hurdles remain.

What to Expect

ICMI 2026 will feature a comprehensive program covering the full spectrum of multimodal interaction research. While the final schedule and speaker list are yet to be released, the conference traditionally includes the following components:

  • Keynote Presentations: Invited talks from leading figures in multimodal AI and HCI, addressing both foundational theories and emerging trends.
  • Technical Paper Sessions: Peer-reviewed presentations of the latest research on topics such as multimodal fusion, emotion recognition, gesture and gaze tracking, and speech-driven interfaces.
  • Workshops and Tutorials: Half-day and full-day sessions focused on specialized areas, including dataset creation, evaluation methodologies, and practical toolkits for building multimodal systems.
  • Demo and Poster Sessions: Interactive opportunities to see live systems and discuss work-in-progress with authors and practitioners.
  • Grand Challenge Competitions: Benchmarking events that push the state of the art in specific multimodal tasks, such as social signal processing or human-robot interaction.

Key themes for ICMI 2026 are expected to include advances in large multimodal models, real-time interaction systems, ethical considerations in multimodal data collection, and applications in healthcare, education, and accessibility.

Who Should Attend

ICMI 2026 is designed for a diverse audience spanning academia and industry. The primary attendees include:

  • AI and Machine Learning Researchers: Those specializing in multimodal learning, representation learning, and cross-modal transfer will find the technical program directly relevant to their work.
  • Human-Computer Interaction Specialists: Professionals focused on user experience, interface design, and usability testing will benefit from insights into how multimodal input can improve interaction quality.
  • Software Engineers and Developers: Practitioners building conversational agents, robotic systems, or AR/VR applications will gain exposure to state-of-the-art integration techniques.
  • Product Managers and Industry Leaders: Decision-makers will gain a clearer view of the practical implications of multimodal AI for their product roadmaps and strategic investments.
  • Graduate Students and Academics: Those pursuing research in computer vision, speech processing, natural language understanding, or affective computing will find networking and collaboration opportunities.

How to Register

Registration for ICMI 2026 will open closer to the conference date. Pricing details, including early-bird rates and student discounts, have not yet been announced. All registration and logistical information will be published on the official conference website. For updates and to register once registration opens, visit icmi.acm.org/2026.