Most recommender engines operate by trying to do just that: estimating a rating for some or all of the items the user has not yet rated. One possible way of evaluating a recommender's suggestions is to evaluate the quality of its estimated preference values, that is, to measure how closely the estimated preferences match the actual preferences of the user.
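One common way to score that match (a minimal sketch, not necessarily the exact method used in the experiments) is to compute the mean absolute error and root mean squared error between estimated and held-out actual ratings. The rating values below are hypothetical:

```python
import numpy as np

def mae_rmse(estimated, actual):
    """Compare estimated preference values against the user's actual ratings."""
    estimated = np.asarray(estimated, dtype=float)
    actual = np.asarray(actual, dtype=float)
    errors = estimated - actual
    mae = np.abs(errors).mean()           # mean absolute error
    rmse = np.sqrt((errors ** 2).mean())  # root mean squared error
    return mae, rmse

# Hypothetical example: predicted vs. held-out ratings for one user
mae, rmse = mae_rmse([4.1, 2.7, 5.0], [4.0, 3.0, 4.0])
print(f"MAE={mae:.3f}  RMSE={rmse:.3f}")
```

The lower both numbers are, the closer the engine's estimated preferences are to what the user actually felt.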
|Precision and Recall in the context of Search Engines|
"The precision is the proportion of recommendations that are good recommendations, and recall is the proportion of good recommendations that appear in top recommendations."
|234 UserIDs, 1599 PlaceIDs, 3463 Ratings for the Apontador Data Set|
After pre-processing the database for the experiments, I arrived at the rating distribution below. The plot is a heat map illustrating the distribution of the ratings given by the users (rows) to places (columns). The black areas represent the absence of a rating for that particular place from that user.
|Ratings Matrix - Users x Items|
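A sketch of how such a heat map could be produced, assuming a simple `user_id,place_id,rating` line layout and a hypothetical file name (the actual Apontador file format may differ):

```python
import numpy as np
import matplotlib.pyplot as plt

def ratings_matrix(path):
    """Build a dense users x places matrix; zeros mark missing ratings."""
    users, places, triples = {}, {}, []
    with open(path) as f:
        for line in f:
            u, p, r = line.strip().split(',')
            ui = users.setdefault(u, len(users))    # assign row index per user
            pi = places.setdefault(p, len(places))  # assign column index per place
            triples.append((ui, pi, float(r)))
    m = np.zeros((len(users), len(places)))
    for ui, pi, r in triples:
        m[ui, pi] = r
    return m

m = ratings_matrix('apontador_ratings.csv')  # hypothetical file name
plt.imshow(m, cmap='hot', aspect='auto', interpolation='nearest')  # zeros show as black
plt.xlabel('Places')
plt.ylabel('Users')
plt.title('Ratings Matrix - Users x Items')
plt.colorbar(label='Rating')
plt.show()
```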
|F-Score Formula (Image from Wikipedia)|
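Since the formula image is not reproduced here: the F1 score is the harmonic mean of precision and recall, which a small helper makes explicit:

```python
def f1_score(precision, recall):
    """F1 measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(f1_score(0.66, 0.66))  # ~0.66 when precision and recall are balanced
```

The harmonic mean punishes imbalance: a recommender with perfect precision but near-zero recall still gets an F1 near zero.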
|F1 Score Graph; the best results have F-scores near 1.0 (top right of the graph)|
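A sketch of how such a graph could be drawn with matplotlib, using hypothetical (precision, recall) pairs rather than the actual experimental results:

```python
import matplotlib.pyplot as plt

# Hypothetical (precision, recall) pairs, e.g. one per evaluated configuration
points = [(0.2, 0.3), (0.5, 0.4), (0.7, 0.8), (0.9, 0.95)]

for precision, recall in points:
    f1 = 2 * precision * recall / (precision + recall)  # F1 for this point
    plt.scatter(recall, precision, s=80)
    plt.annotate(f"F1={f1:.2f}", (recall, precision))

plt.xlabel('Recall')
plt.ylabel('Precision')
plt.xlim(0, 1.05)
plt.ylim(0, 1.05)
plt.title('F1 Score Graph')
plt.show()
```

Points in the top right corner have both precision and recall near 1.0, and therefore F1 scores near 1.0.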
In this post I presented some metrics used to evaluate the quality of the recommendations produced by a recommender engine and illustrated them with examples using a real data set from the Apontador social network. All the code used here is available in my personal repository on GitHub and is written in Python.