Analysis Study Research Paper | Computer Science & Engineering | India | Volume 12 Issue 9, September 2023
Multi-Modal Fusion Techniques in Deep Learning
Radhika Shetty D S
Abstract: Multi-modal fusion techniques in deep learning have gained significant attention because they leverage information from diverse sources to improve the performance of machine learning applications. This paper surveys the key approaches for fusing data from multiple modalities, such as images, text, and audio. We cover the spectrum of fusion techniques, from early fusion, which combines raw features at the input level, to late fusion, which aggregates predictions at the output level, along with mid-level fusion techniques that merge representations at intermediate layers of a neural network [1]. Attention mechanisms, such as self-attention and cross-modal attention, dynamically weigh the contributions of different modalities during processing. Cross-modal embeddings map data from disparate modalities into a shared embedding space so that they can be integrated directly. Graph-based fusion models capture inter-modal relationships in a structured manner, while co-attention and co-guidance mechanisms improve the modeling of interactions between modalities [1]. Hybrid models, which combine elements of early and late fusion, are presented as versatile solutions adaptable to a variety of multi-modal tasks. Memory-augmented neural networks are also examined for their capacity to store and retrieve information from different modalities as needed. By surveying these techniques, the paper aims to give researchers and practitioners insight into recent advances and open possibilities in the field. The techniques apply across domains such as natural language processing, computer vision, and audio analysis, which makes them a valuable area of study in contemporary deep learning research.
Keywords: Multi-modal fusion techniques, Deep learning, Data fusion, Cross-modal attention, Hybrid models
Edition: Volume 12 Issue 9, September 2023
Pages: 526 - 532
DOI: https://www.doi.org/10.21275/SR23905100554
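As a rough illustration of three ideas named in the abstract (early fusion, late fusion, and cross-modal attention), the following minimal sketch uses plain Python lists in place of real feature extractors or trained networks. It is not the paper's implementation: the function names, the toy feature values, and the equal late-fusion weights are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def early_fusion(image_feats, text_feats):
    """Early fusion: concatenate per-modality features into one input vector."""
    return list(image_feats) + list(text_feats)


def late_fusion(prob_lists, weights=None):
    """Late fusion: weighted average of per-modality class probabilities."""
    n = len(prob_lists)
    weights = weights or [1.0 / n] * n  # assumption: equal weights by default
    num_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists))
        for c in range(num_classes)
    ]


def cross_modal_attention(query, keys, values):
    """Scaled dot-product attention: one query vector from modality A
    attends over key/value vectors from modality B."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    attended = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return attended, weights


# Toy inputs (hypothetical values, not taken from the paper).
image_feats = [0.2, 0.9]
text_feats = [0.5, 0.1, 0.4]
fused_input = early_fusion(image_feats, text_feats)

image_probs = [0.7, 0.3]  # class probabilities from an image-only model
text_probs = [0.4, 0.6]   # class probabilities from a text-only model
fused_probs = late_fusion([image_probs, text_probs])

attended, attn = cross_modal_attention(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    values=[[1.0, 2.0], [3.0, 4.0]],
)
```

In this sketch, `fused_input` would feed a single downstream model, `fused_probs` combines the outputs of two separately trained models, and `attn` shows how strongly the query from one modality weighs each element of the other. Mid-level fusion would apply the same concatenation or attention step to hidden-layer representations rather than to raw inputs or final predictions.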