The IMDb website recommends movies based on the movies you have already watched.
You are expected to write a simple console application that recommends movies to clients, using data that shows which movies each client has already watched.
You need to load your client preference and movie information from files. These files record which movies each client has watched.
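As one possible starting point for reading the movie information, here is a minimal sketch that assumes the movie_idmap layout shown in the Data File section: a count, followed by pairs of a movie id and a name in straight double quotes. The function name `loadMovieNames` and the use of `std::map` are illustrative choices, not part of the assignment. `std::quoted` needs C++14 or later.

```cpp
#include <cassert>
#include <iomanip>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: reads "<count>" followed by <count> entries of the
// form  <id> "<name>". Tokens are whitespace-separated, so it does not matter
// whether entries sit on one line or on separate lines.
std::map<std::string, std::string> loadMovieNames(std::istream& in) {
    std::map<std::string, std::string> names;
    int count = 0;
    in >> count;
    std::string id, name;
    // std::quoted extracts a double-quoted string, spaces included.
    for (int i = 0; i < count && (in >> id >> std::quoted(name)); ++i) {
        names[id] = name;
    }
    return names;
}

// Usage check on two entries taken from the sample file.
void exampleLoadMovies() {
    std::istringstream sample("2\nm5 \"Return of the Jedi\" m9 \"Maya The Bee\"");
    std::map<std::string, std::string> names = loadMovieNames(sample);
    assert(names.size() == 2);
    assert(names["m5"] == "Return of the Jedi");
}
```

For a real file you would pass a `std::ifstream` instead of the string stream. If the file uses typographic quotes rather than plain `"`, this sketch will not parse the names correctly.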
Minimum Requirements:
Each client's data (the movies that the client watched) will be kept in a linked list.
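One way to satisfy the linked-list requirement could look like the sketch below. It assumes the client-preference layout shown in the Data File section (a count, then client id and movie id pairs) and uses `std::list` as the linked list, which fits the rule that STL containers are allowed. The names `ClientTable` and `loadClientPreferences` are made up for illustration.

```cpp
#include <cassert>
#include <istream>
#include <list>
#include <map>
#include <sstream>
#include <string>

// Hypothetical type: one linked list of watched movie ids per client id.
using ClientTable = std::map<std::string, std::list<std::string>>;

// Reads "<count>" followed by <count> "<client> <movie>" pairs.
// Tokens are whitespace-separated, so line breaks in the file do not matter.
ClientTable loadClientPreferences(std::istream& in) {
    ClientTable clients;
    int count = 0;
    in >> count;
    std::string client, movie;
    for (int i = 0; i < count && (in >> client >> movie); ++i) {
        clients[client].push_back(movie);  // append to this client's list
    }
    return clients;
}

// Usage check on the sample data from the Data File section.
void exampleLoadClients() {
    std::istringstream sample("5 c10 m5 c10 m6 c11 m6 c11 m8 c15 m9");
    ClientTable clients = loadClientPreferences(sample);
    assert(clients.size() == 3);         // c10, c11, c15
    assert(clients["c10"].size() == 2);  // m5 and m6
}
```

If your instructor expects a hand-written linked list instead of `std::list`, the same loader shape applies; only the container type changes.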
From a console menu item, you must be able to find the clients most similar to a given client and sort them by similarity. Show the sorted clients and their similarity values.
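Because sorting has to be written by hand, a simple insertion sort is one option. The sketch below assumes the similarity of each other client to the given client has already been computed and stored in a vector. The struct `ScoredClient` and the function name are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: another client's id and its similarity to the target.
struct ScoredClient {
    std::string id;
    double similarity;
};

// Hand-written insertion sort, highest similarity first (no std::sort).
// Clients with equal similarity keep their original relative order.
void sortBySimilarityDescending(std::vector<ScoredClient>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        ScoredClient key = v[i];
        std::size_t j = i;
        // Shift smaller entries one slot to the right.
        while (j > 0 && v[j - 1].similarity < key.similarity) {
            v[j] = v[j - 1];
            --j;
        }
        v[j] = key;
    }
}

// Usage check: similarities of c15 and c11 to client c10 in the sample data.
void exampleSort() {
    std::vector<ScoredClient> scores = {{"c15", 0.0}, {"c11", 1.0 / 3.0}};
    sortBySimilarityDescending(scores);
    assert(scores[0].id == "c11");
    assert(scores[1].id == "c15");
}
```

Insertion sort is quadratic in the number of clients, which is usually fine for a small assignment data set; merge sort would be a reasonable swap if the client count grows.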
From a console menu item, given a client, recommend movies according to similar users. Invent and explain your mechanisms! E.g., find the client most similar to the given client id, and recommend the movies from that client which our original client has not watched yet.
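The example mechanism above (take the most similar client and suggest what the original client has not seen) can be sketched as follows. It assumes the most similar client has already been chosen and that both clients' movies are held in `std::list` containers. The function name `recommendFromNeighbor` is made up for illustration.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <string>

// Returns the movies the neighbor watched that the target has not,
// in the neighbor's order and without duplicates.
std::list<std::string> recommendFromNeighbor(const std::list<std::string>& target,
                                             const std::list<std::string>& neighbor) {
    std::set<std::string> seen(target.begin(), target.end());
    std::list<std::string> picks;
    for (const std::string& movie : neighbor) {
        // insert() succeeds only if the movie is neither watched nor already picked.
        if (seen.insert(movie).second) {
            picks.push_back(movie);
        }
    }
    return picks;
}

// Usage check with c10 as the target and c11 as its most similar client.
void exampleRecommend() {
    std::list<std::string> c10 = {"m5", "m6"};
    std::list<std::string> c11 = {"m6", "m8"};
    std::list<std::string> picks = recommendFromNeighbor(c10, c11);
    assert(picks.size() == 1);
    assert(picks.front() == "m8");
}
```

This is only one mechanism. Other designs, such as collecting unseen movies from the top few similar clients and ranking them by how often they appear, would also fit the assignment as long as you explain them.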
You can use STL classes as containers. However, you must implement sorting yourself!
Data File:
There are 2 data files. One is called the client-preference file. Its content is similar to
5
c10 m5
c10 m6
c11 m6
c11 m8
c15 m9
The client-preference sample means there are 5 entries. Client c10 watched the movies with ids m5 and m6, c11 watched m6 and m8, and c15 watched m9.
The other data file is called movie_idmap. Its content is similar to
7
m5 "Return of the Jedi"
m6 "When Harry met Sally"
m7 "The Exorcist"
m8 "Hunger Games"
m9 "Maya The Bee"
m10 "Dark Knight"
m11 "Truman Show"
The movie_idmap sample means there are 7 movies, and each entry maps a movie id to its name.
How to compute similarity between 2 clients:
The similarity between two clients depends on which movies they have in common and which movies only one of them watched. A very simple similarity measure is the Jaccard similarity. Given 2 clients, find the movies that both clients watched (the intersection), and find all the movies that at least one of the two clients watched (the union). Then divide the first count by the second. With A and B as the sets of movies watched by the two clients, the formula is:
J(A, B) = |A ∩ B| / |A ∪ B|
E.g., for the client-preference sample above, suppose you want to find the Jaccard similarity between c10 and c11. The list of movies they both watched is [m6], which is only one movie. The list of all movies watched by at least one of them is [m5, m6, m8], which has 3 movies. The Jaccard similarity is 1/3.
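The computation above can be sketched in C++ as follows, assuming each client's movies are stored in a `std::list` of movie ids. The function name `jaccard` and the choice to return 0 for two empty lists are assumptions made for illustration; the assignment does not say what to do in that case.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <set>
#include <string>

// Jaccard similarity: (movies watched by both) / (movies watched by either).
double jaccard(const std::list<std::string>& a, const std::list<std::string>& b) {
    // Copy into sets so duplicate entries in a list are counted once.
    std::set<std::string> sa(a.begin(), a.end());
    std::set<std::string> sb(b.begin(), b.end());
    std::size_t common = 0;
    for (const std::string& movie : sa) {
        common += sb.count(movie);  // count() is 1 if b also has the movie, else 0
    }
    std::size_t total = sa.size() + sb.size() - common;  // size of the union
    if (total == 0) {
        return 0.0;  // assumption: two clients with no movies score 0
    }
    return static_cast<double>(common) / static_cast<double>(total);
}

// Usage check on the worked example: c10 and c11 share 1 of 3 movies.
void exampleJaccard() {
    std::list<std::string> c10 = {"m5", "m6"};
    std::list<std::string> c11 = {"m6", "m8"};
    double s = jaccard(c10, c11);
    assert(s > 0.333 && s < 0.334);  // 1/3
}
```

The result is always between 0 (no movies in common) and 1 (identical movie sets), which makes it easy to sort clients by it.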