Advanced Information Retrieval System: Theoretical and Experimental Perspective

Image-Audio Based Recommendations System for Information Retrieval

Author(s): Urmila Pilania*, Manoj Kumar* and Sanjay Singh*

Pp: 71-83 (13)

DOI: 10.2174/9798898813666126010009


Abstract

This chapter presents the classification and analysis of a fashion dataset of 90 images belonging to a single class, using deep learning techniques. The dataset is pre-processed with data augmentation. Features are extracted using Convolutional Neural Networks (CNNs), VGG16, and ResNet50. These models are trained on the styles and patterns of the images so that they can be recognized. For styles and subtitles, a second dataset of 144 audio files is used. Voice is converted into text using Machine Learning (ML) and Natural Language Processing (NLP) techniques. The audio files are pre-processed using Mel-Frequency Cepstral Coefficients (MFCC) together with normalization to reduce noise. A Recurrent Neural Network (RNN) then converts each audio file into a text file. The proposed work is evaluated on accuracy, reliability, and adaptability.
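The abstract mentions normalizing MFCC features but does not say which scheme is used. As a minimal sketch, assuming per-coefficient mean and variance normalization (a common choice for MFCCs, not confirmed as the chapter's method), the step could look like the following. The function name `cmvn` and the toy input `mfcc_frames` are hypothetical and used only for illustration; real MFCC frames would come from an audio feature extractor.

```python
from statistics import fmean, pstdev


def cmvn(frames, eps=1e-8):
    """Scale each MFCC coefficient to zero mean and unit variance across frames.

    `frames` is a list of frames, each a list of MFCC coefficients.
    This is an assumed normalization scheme, not the chapter's stated one.
    """
    if not frames:
        return []
    # Transpose so that each tuple holds one coefficient over all frames.
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    # eps avoids division by zero when a coefficient is constant.
    return [
        [(value - means[j]) / (stds[j] + eps) for j, value in enumerate(frame)]
        for frame in frames
    ]


# Hypothetical toy input: 3 frames with 2 coefficients each.
mfcc_frames = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = cmvn(mfcc_frames)
```

After this step every coefficient has roughly the same scale across an utterance, which tends to make a downstream sequence model less sensitive to recording level and channel differences.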


Keywords: Convolutional neural network, Image recommendation system, Subtitles recommendation system, VGG16, ResNet50, Mel-Frequency Cepstral Coefficients (MFCC), Recurrent neural networks.