Information Retrieval (IR) techniques have evolved steadily from keyword-based
systems to advanced search. Contemporary IR techniques use Machine
Learning (ML), Deep Learning (DL), and Natural Language Processing (NLP) to
provide more accurate and personalized results. In this research work, the
IR techniques are analysed for their merits and demerits. The work also
examines how contemporary research has transformed query-document
matching. This work integrates Term Frequency-Inverse Document Frequency (TF-IDF) weighting with two similarity measures: cosine similarity and dot product similarity.
The integration aims to improve retrieval results. Cosine similarity captures vector
orientation, while dot product similarity also reflects vector magnitude. The two
measures are combined into a single similarity score, weighted by a parameter α, to
enhance retrieval capacity. Simulation results indicate that the combined method
performed well. In future work, the authors will incorporate machine learning or deep
learning methods to enhance the performance of these IR techniques.
Keywords: Information retrieval, Term frequency-inverse document frequency, Cosine similarity, Dot product similarity, Retrieval Augmented Generation (RAG).
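
The combined similarity described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact form of the combination, so the code below assumes a convex mix, combined = α · cosine + (1 − α) · dot, over TF-IDF vectors. The function names (`vectorize`, `build_tfidf`, `combined_similarity`, `rank`), the smoothed IDF variant, and the default α = 0.5 are all hypothetical choices made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def vectorize(tokens, vocab, idf):
    """Map a token list to a TF-IDF vector over a fixed vocabulary."""
    tf = Counter(tokens)
    length = max(len(tokens), 1)  # guard against empty input
    return [tf[t] / length * idf[t] for t in vocab]


def build_tfidf(docs):
    """Return the vocabulary, IDF weights, and TF-IDF vectors for tokenized docs."""
    n = len(docs)
    vocab = sorted({term for doc in docs for term in doc})
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF; this particular variant is an assumption, not from the paper.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    return vocab, idf, [vectorize(doc, vocab, idf) for doc in docs]


def dot(u, v):
    """Dot product: sensitive to both direction and magnitude."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity: depends only on the angle between the vectors."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm > 0 else 0.0


def combined_similarity(u, v, alpha=0.5):
    """Assumed combination: alpha * cosine + (1 - alpha) * dot product."""
    return alpha * cosine(u, v) + (1.0 - alpha) * dot(u, v)


def rank(query, docs, alpha=0.5):
    """Return document indices sorted by combined similarity, best first."""
    vocab, idf, doc_vecs = build_tfidf(docs)
    q = vectorize(query, vocab, idf)  # query terms outside the vocabulary are ignored
    scores = [combined_similarity(q, d, alpha) for d in doc_vecs]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Under this assumed form, α = 1 reduces to pure cosine similarity and α = 0 to the pure dot product. Because the dot product is unbounded while cosine similarity lies in [−1, 1], the two terms may need rescaling before mixing; whether the original work does so is not stated in the abstract.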