Hi all,
Let's begin another series of articles. This time I will talk about recommendation systems and how we can implement some simple recommendation algorithms using information filtering, with working examples. You have probably already come across recommendation systems without realizing it.
Some examples are Amazon and Netflix. These systems generally record a user's preferences and use the information in that profile to suggest new products the user may like. It is a powerful approach: you can use it to build systems that find people who share your preferences, or that make automatic suggestions based on the preferences and tastes of other people.
To start, let's introduce what a recommendation system is.
1 - Recommender Systems
A recommender system can be defined as "any system that produces individualized recommendations as output or has the effect of guiding the user in a personalized way to interesting or useful objects in a large space of possible options" (Burke 2002).
The problem being addressed is that of "too much information": those searching for something must find what is interesting within the vast array available.
Wikipedia [1] gives the following definition:
"Recommender systems form a specific type of information filtering (IF) technique that attempts to present information items (movies, music, books, news, images, web pages, etc.) that are likely of interest to the user. Typically, a recommender system compares the user's profile to some reference characteristics, and seeks to predict the 'rating' that a user would give to an item they had not yet considered. These characteristics may be from the information item (the content-based approach) or the user's social environment (the collaborative filtering approach)."
Recommender systems have been around since the early nineties and have evolved to meet the needs of e-commerce, research, museums and collections, digital libraries and entertainment. The volume of literature on recommender systems seems to be becoming as large as the problem itself, so a high-level model was proposed to find a path through this array of information and to guide the selection of information about recommender systems. Figure 1 shows the main elements of a recommender system and describes the nature of the links between the elements.
Figure 01. Recommendation Systems Schema
Recommendation is carried out by a recommendation engine, which employs a set of algorithms to compare a user profile with a set of reference characteristics. There are three types of source for the reference characteristics: information about the items themselves (content), information about the social environment (collaborative) and information about web usage (web analytics). In practice, many tools use a mix of these techniques, all of which belong to the family of information filtering systems.
An Information Filtering System is a system that removes redundant or unwanted information from an information stream using (semi-)automated methods before presenting it to a human user. In the context of recommendation systems, this is done by comparing the user's profile to some reference characteristics, taken either from the items (the content-based approach) or from the user's social environment (the collaborative filtering approach). In the next section we will look at these types of Information Filtering (IF).
2 - Information Filtering (IF)
2.1 - Content-based Filtering
Content-based filtering uses information about the items to make recommendations. It will recommend items to a user if the items are similar in content to items that the user liked in the past. This approach allows recommendation of previously unrated items to users with unique interests and can provide explanations for its recommendations. As long as the system has some information about an item, recommendations can be made even if the system has received a small number of ratings, or none at all. The disadvantage of this mechanism is that each item must be characterized with respect to the features that appear in the user's profile, which in turn requires modelling each user's profile.
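To make the idea concrete, here is a minimal sketch of content-based filtering in Python. It is only an illustration under some assumptions of mine: the item names, the feature weights, and the helper functions (cosine, build_profile, recommend_by_content) are all hypothetical, and cosine similarity is just one possible way to compare an item with a user profile.

```python
from math import sqrt

# Hypothetical item descriptions (feature -> weight), invented for illustration.
ITEM_FEATURES = {
    "Movie A": {"action": 1.0, "sci-fi": 1.0},
    "Movie B": {"action": 1.0, "comedy": 1.0},
    "Movie C": {"romance": 1.0, "comedy": 1.0},
    "Movie D": {"sci-fi": 1.0, "action": 0.5},
}


def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_profile(liked_items, item_features):
    """Build a user profile by summing the features of the liked items."""
    profile = {}
    for item in liked_items:
        for feature, weight in item_features[item].items():
            profile[feature] = profile.get(feature, 0.0) + weight
    return profile


def recommend_by_content(liked_items, item_features, top_n=2):
    """Rank the items the user has not liked yet by similarity to the profile."""
    profile = build_profile(liked_items, item_features)
    scores = [
        (cosine(profile, features), item)
        for item, features in item_features.items()
        if item not in liked_items
    ]
    scores.sort(reverse=True)
    return [item for _, item in scores[:top_n]]


print(recommend_by_content(["Movie A"], ITEM_FEATURES))
```

Note that no ratings from other users are involved: the only inputs are the item descriptions and the items this user liked, which is why this approach can still work when ratings are scarce.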
2.2 - Collaborative Filtering
Collaborative filtering makes predictions about the interests of a user by collecting the choices or expressions of taste from many users. It finds areas of agreement between people and bases recommendations on the assumption that people who agreed in the past are likely to do so in the future. It looks for users who share the same rating patterns with the active user (a neighbourhood of similar users) and uses their ratings to create a prediction. Unlike content-based filtering, it doesn't need to know anything about the items themselves, only people's opinions about the items.
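The neighbourhood idea can be sketched in a few lines of Python. This is a simplified illustration, not a reference implementation: the ratings, the user names, and the functions similarity and predict are hypothetical, and I am assuming cosine similarity over co-rated items with a similarity-weighted average as the prediction rule (other measures, such as Pearson correlation, are also common).

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale, invented for illustration.
RATINGS = {
    "alice": {"Movie A": 5, "Movie B": 3, "Movie C": 4},
    "bob": {"Movie A": 5, "Movie B": 3, "Movie C": 4, "Movie D": 5},
    "carol": {"Movie A": 1, "Movie B": 5, "Movie C": 2, "Movie D": 1},
}


def similarity(ratings, u, v):
    """Cosine similarity between two users over the items both have rated."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    dot = sum(ratings[u][i] * ratings[v][i] for i in common)
    norm_u = sqrt(sum(ratings[u][i] ** 2 for i in common))
    norm_v = sqrt(sum(ratings[v][i] ** 2 for i in common))
    return dot / (norm_u * norm_v)


def predict(ratings, user, item, k=2):
    """Predict a rating as the similarity-weighted average of the ratings
    given to `item` by the k users most similar to `user`."""
    neighbours = sorted(
        (
            (similarity(ratings, user, other), other)
            for other in ratings
            if other != user and item in ratings[other]
        ),
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in neighbours)
    if total == 0:
        return None  # nobody comparable has rated this item
    return sum(sim * ratings[other][item] for sim, other in neighbours) / total


print(predict(RATINGS, "alice", "Movie D"))
```

Here bob rates exactly like alice, so his opinion of "Movie D" weighs more than carol's. Notice that the code never looks at what the movies are about, only at who rated what.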
Collaborative filtering may be based on the explicit ratings of users or on implicit observation of user behaviour. User behaviour is observed and compared to the behaviour of other users, for example, items purchased, queries made, items printed, or music listened to. Predictions can then be made about a user's future behaviour assuming like-mindedness in the past as a predictor for future patterns of behaviour.
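When only implicit behaviour is available, the same reasoning can be applied to sets of observed actions instead of ratings. The sketch below assumes hypothetical purchase histories and uses the Jaccard index (shared items divided by all items) as the measure of like-mindedness; both the data and the function names are my own illustration.

```python
# Hypothetical purchase histories, invented for illustration.
PURCHASES = {
    "alice": {"book", "lamp", "pen"},
    "bob": {"book", "lamp", "desk"},
    "carol": {"guitar", "amp"},
}


def jaccard(a, b):
    """Jaccard index: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_from_behaviour(purchases, user):
    """Suggest what the most similar other user bought that `user` has not."""
    others = [
        (jaccard(purchases[user], items), name)
        for name, items in purchases.items()
        if name != user
    ]
    if not others:
        return set()
    best_similarity, best_user = max(others)
    if best_similarity == 0:
        return set()  # no overlapping behaviour to learn from
    return purchases[best_user] - purchases[user]


print(suggest_from_behaviour(PURCHASES, "alice"))
```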
There are two problems in this system of users and ratings: the 'first-rater problem' and 'cold-start problem'. The first-rater problem occurs when a new item goes into the system and has not yet received any ratings, preventing it from being recommended. The cold-start problem occurs for new users, about whom there is insufficient information from their active ratings or observed behaviour with which to predict their preferences.
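Both problems are easy to spot in the data. The sketch below, again with hypothetical ratings, a hypothetical catalogue and function names of my own choosing, lists the items that nobody has rated yet (first-rater problem) and the users with too few ratings to build a profile from (cold-start problem). The min_ratings threshold is an assumption; a real system would tune it.

```python
# Hypothetical data, invented for illustration.
SPARSE_RATINGS = {
    "alice": {"Movie A": 5, "Movie B": 3},
    "bob": {"Movie A": 4},
    "dave": {},  # a new user with no ratings yet
}
CATALOGUE = ["Movie A", "Movie B", "Movie E"]


def unrated_items(ratings, catalogue):
    """Items with no ratings at all: they cannot be recommended collaboratively."""
    rated = {item for user_ratings in ratings.values() for item in user_ratings}
    return [item for item in catalogue if item not in rated]


def cold_start_users(ratings, min_ratings=1):
    """Users with fewer than `min_ratings` ratings: too little to predict from."""
    return [
        user
        for user, user_ratings in ratings.items()
        if len(user_ratings) < min_ratings
    ]


print(unrated_items(SPARSE_RATINGS, CATALOGUE))
print(cold_start_users(SPARSE_RATINGS))
```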
Now that we have presented some popular IF techniques, let's go further into those methods and see them in action. In the next article I will present some filtering techniques along with their implementation in the Python programming language.
Stay tuned!
Marcel Pinheiro Caraciolo
References
[1] http://en.wikipedia.org/wiki/Recommender_systems
[2] http://en.wikipedia.org/wiki/Information_filtering
[3] http://en.wikipedia.org/wiki/Collaborative_filtering