Movie Recommendation System

I built a content‑based movie recommendation engine in Python using the TMDB 5000 Movies and Credits datasets. I first merged each film’s metadata with its cast and crew details in pandas, then combined the key textual features (overview, genres, keywords, cast, crew) into a unified “tags” field. After lowercasing the tags and stemming them with NLTK’s PorterStemmer, I vectorized them using scikit‑learn’s CountVectorizer (capped at 5,000 terms) and computed a cosine‑similarity matrix over the resulting feature vectors. I wrapped the core recommend(movie_title) function, which looks up a movie’s index, sorts its similarity scores, and returns the top five matches, inside a Streamlit app. In the UI, users can type or select a title, click “Recommend,” and instantly see the poster images, titles, and overviews of the five most similar films.
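The vectorize, compare, and rank steps can be illustrated with a minimal, dependency‑free sketch. It is not the project's code: it swaps scikit‑learn's CountVectorizer and cosine‑similarity matrix for hand‑rolled standard‑library equivalents, skips the stemming step, and excludes the queried title by name rather than by index. The MOVIES catalogue and the helper names tokenize, build_vectors, and cosine are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy catalogue: title -> combined "tags" string.
# (Assumption: the real project builds these from the TMDB datasets.)
MOVIES = {
    "Space Quest": "astronaut crew explores distant planet science fiction adventure",
    "Star Voyage": "astronaut pilot explores distant galaxy science fiction adventure",
    "Love in Paris": "romance couple paris love story drama",
    "Paris Nights": "drama couple paris love betrayal",
    "Kitchen Wars": "chef cooking competition comedy restaurant",
    "Robot Dawn": "robot uprising science fiction action future",
}


def tokenize(tags):
    """Lowercase the tags and split them into alphanumeric tokens (no stemming)."""
    return re.findall(r"[a-z0-9]+", tags.lower())


def build_vectors(movies, max_features=5000):
    """Build one bag-of-words count vector per title, keeping only the
    max_features most frequent terms across the whole catalogue."""
    counts = {title: Counter(tokenize(tags)) for title, tags in movies.items()}
    totals = Counter()
    for term_counts in counts.values():
        totals.update(term_counts)
    vocab = {term for term, _ in totals.most_common(max_features)}
    return {
        title: Counter({t: n for t, n in term_counts.items() if t in vocab})
        for title, term_counts in counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(n * b[t] for t, n in a.items())
    norm = math.sqrt(sum(n * n for n in a.values())) * math.sqrt(
        sum(n * n for n in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(movie_title, vectors, top_n=5):
    """Return up to top_n titles most similar to movie_title, best first.
    Unknown titles yield an empty list."""
    if movie_title not in vectors:
        return []
    target = vectors[movie_title]
    scored = [
        (cosine(target, vec), title)
        for title, vec in vectors.items()
        if title != movie_title
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:top_n]]


VECTORS = build_vectors(MOVIES)
print(recommend("Space Quest", VECTORS))
```

In this toy catalogue, "Star Voyage" ranks first for "Space Quest" because the two share six of their eight tag terms. Computing similarities on demand keeps the sketch short; precomputing the full matrix once, as described above, avoids repeating that work on every request.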

Stack

Python, Pandas, NumPy, NLTK, Scikit‑Learn, Streamlit
View Source on GitHub
© 2025 Santosh Luitel