
Non-Personalized Recommender systems with Pandas and Python

Tuesday, October 22, 2013


Hi all,

At the last PythonBrasil I gave a tutorial on Python and data analysis focused on recommender systems, the main topic I have been studying for the last few years. Pandas is a Python package that is popular among statisticians and data scientists. I had watched several talks and keynotes about it, but I had never tried it myself. The tutorial gave me that chance, and afterwards the audience and I were quite excited about the potential and power of this library.

This post starts a series of articles about recommender systems, which will also introduce the refreshed version of a library I am working on: Crab, a Python library for building recommender systems. :)

The first topic in the series is non-personalized recommender systems, illustrated with several examples using the Python package Pandas. In the future I will also post an alternative version of this article that uses Crab and shows how it handles the same problems.

But first let's introduce what Pandas is.

Introduction to Pandas


Pandas is a data analysis library for Python that is great for data preparation, joining and ultimately generating well-formed, tabular data that is easy to use in a variety of visualization tools or (as we will see here) machine learning applications. For a further introduction to Pandas, check this website or this notebook.

Non-personalized Recommenders


Non-personalized recommenders recommend items to consumers based on what other consumers have said about the items on average. That is, the recommendations are independent of the customer, so every customer gets the same recommendations. For example, if you visit amazon.com as an anonymous user, it shows items that other members are currently viewing.

Generally the output comes in two flavours: predictions or recommendations. Predictions are simple statements expressed as scores, stars or counts. Recommendations, on the other hand, are generally just a list of items shown without any number attached.

Let's go through an example:

Simple Prediction using Average

The book Programming Collective Intelligence was rated 4.5 stars on a scale of 1 to 5.
This is an example of a simple prediction. It displays a simple average of other customers' reviews of the book.
The math behind it is quite simple:

Score = (65 * 5 + 18 * 4 + 7 * 3 + 4 * 2 + 2 * 1) / (65 + 18 + 7 + 4 + 2)
Score = 428 / 96
Score ≈ 4.46, displayed as 4.5 out of 5 stars
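
As a quick check of the arithmetic, here is a minimal Python sketch of the same weighted average. The review counts are the ones from the calculation above; the function names and the half-star rounding rule are my own assumptions about how such a widget might display the value.

```python
# Star-rating histogram from the example: {stars: number of reviews}
rating_counts = {5: 65, 4: 18, 3: 7, 2: 4, 1: 2}


def average_rating(counts):
    """Mean star value, weighted by how many reviews gave each value."""
    total_reviews = sum(counts.values())
    total_stars = sum(stars * n for stars, n in counts.items())
    return total_stars / total_reviews


def round_to_half_star(score):
    """Round to the nearest half star (an assumed display rule)."""
    return round(score * 2) / 2


score = average_rating(rating_counts)
print(f"{score:.2f}")             # 4.46
print(round_to_half_star(score))  # 4.5
```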

The same page also lists other books that customers bought after buying Programming Collective Intelligence. This list of recommended books is presented to anyone who visits the product's page. It is an example of a recommendation.




But how did Amazon come up with those recommendations? Several techniques could be applied. One is association rule mining, a data mining technique that generates a set of rules and combinations of items that were bought together. Another is a simple measure based on the proportion of customers who bought both x and y among those who bought x. Let's explain with some maths:




Let X be the book Programming Collective Intelligence and let Y be any other book that its buyers purchased. Compute the ratio given below for each book Y and sort the books in descending order. Finally, pick the top K books and show them as related. :D

Score(X, Y) =  Total Customers who purchased X and Y / Total Customers who purchased X


Applying this simple score function to all the books, you get:


Python for Data Analysis        100%
Startup Playbook                100%
MongoDB Definitive Guide        0%
Machine Learning for Hackers    0%
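
The ratio can be sketched with Pandas as follows. The purchase log here is hypothetical: the customers A to E and their baskets are made up only so that the ratios match the percentages quoted in the text, and `simple_score` is an illustrative name, not something from a library.

```python
import pandas as pd

X = "Programming Collective Intelligence"

# Hypothetical baskets, chosen only to reproduce the figures in the text.
baskets = {
    "A": [X, "Python for Data Analysis", "Startup Playbook"],
    "B": [X, "Python for Data Analysis", "Startup Playbook"],
    "C": ["Python for Data Analysis", "Startup Playbook", "MongoDB Definitive Guide"],
    "D": ["Startup Playbook", "Machine Learning for Hackers"],
    "E": ["Startup Playbook", "MongoDB Definitive Guide"],
}
purchases = pd.DataFrame(
    [(customer, book) for customer, books in baskets.items() for book in books],
    columns=["customer", "book"],
)

# Customer x book matrix: True where the customer bought the book.
bought = pd.crosstab(purchases["customer"], purchases["book"]) > 0


def simple_score(bought, x):
    """Fraction of the buyers of x who also bought each other book."""
    x_buyers = bought[bought[x]]  # keep only customers who bought x
    return x_buyers.drop(columns=x).mean().sort_values(ascending=False)


scores = simple_score(bought, X)
print(scores)
```

Taking the column mean of a boolean matrix gives the share of True values, which is exactly "customers who purchased X and Y" divided by "customers who purchased X".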


As we expected, the book Python for Data Analysis makes perfect sense. But why did Startup Playbook come out on top when it has also been purchased by customers who have not purchased Programming Collective Intelligence? This is a well-known pitfall in e-commerce applications called the banana trap. Let's explain: in a grocery store most customers buy bananas. If someone buys a razor and a banana, you cannot conclude that the purchase of the razor influenced the purchase of the banana. Hence we need to adjust the math to handle this case as well. The modified version is:

Score(X, Y) =  (Total Customers who purchased X and Y / Total Customers who purchased X) / 
         (Total Customers who did not purchase X but got Y / Total Customers who did not purchase X)

Substituting the numbers we get:

Python for Data Analysis = (2 / 2) / (1 / 3) = 1 / (1/3) = 3

Startup Playbook = (2 / 2) / (3 / 3) = 1

The denominator acts as a normalizer, and you can see that Python for Data Analysis clearly stands out. Interesting, isn't it?
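
Here is a minimal Pandas sketch of the adjusted score. As before, the baskets are hypothetical and chosen only to match the counts used in the text (2 customers bought X, 3 did not), and `adjusted_score` is an illustrative name.

```python
import pandas as pd

X = "Programming Collective Intelligence"

# Hypothetical baskets: A and B bought X; C, D and E did not.
baskets = {
    "A": [X, "Python for Data Analysis", "Startup Playbook"],
    "B": [X, "Python for Data Analysis", "Startup Playbook"],
    "C": ["Python for Data Analysis", "Startup Playbook", "MongoDB Definitive Guide"],
    "D": ["Startup Playbook", "Machine Learning for Hackers"],
    "E": ["Startup Playbook", "MongoDB Definitive Guide"],
}
purchases = pd.DataFrame(
    [(customer, book) for customer, books in baskets.items() for book in books],
    columns=["customer", "book"],
)
bought = pd.crosstab(purchases["customer"], purchases["book"]) > 0


def adjusted_score(bought, x):
    """Ratio of P(Y | bought x) to P(Y | did not buy x) for every other book Y."""
    others = bought.drop(columns=x)
    with_x = others[bought[x]].mean()      # share of x buyers who bought Y
    without_x = others[~bought[x]].mean()  # share of non-buyers who bought Y
    # If nobody outside the buyers of x bought Y, the denominator is zero and
    # the division yields inf (or NaN for 0/0); real data needs a rule for that.
    return (with_x / without_x).sort_values(ascending=False)


scores = adjusted_score(bought, X)
print(scores)
```

Startup Playbook drops to 1 because it is bought just as often by customers who never bought X, which is the banana trap in action.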

In the next article I will work more with non-personalized recommenders, presenting some ranking algorithms that I developed for Atepassar.com for ranking professors. :)

Examples with a real dataset (let's play with the CourseTalk dataset)

To demonstrate non-personalized recommenders, let's play with some data. I decided to crawl the data from CourseTalk, a popular ranking site for MOOCs. It is an aggregator of several MOOC platforms where people can rate courses and write reviews. The dataset is a snapshot taken on 10/11/2013 and is used here only for study purposes.



Let's use Pandas to read all the data, show what we can do with Python, and present a list of top courses ranked by some non-personalized metrics :)
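
To give a flavour of what such a ranking could look like, here is a minimal sketch. It does not use the CourseTalk data: the tiny table, the column names (`course`, `user`, `rating`) and the `top_courses` helper are all hypothetical placeholders for whatever schema the crawled dataset actually has.

```python
import pandas as pd

# Hypothetical stand-in for the crawled reviews (not the real dataset).
ratings = pd.DataFrame(
    {
        "course": ["course_a", "course_a", "course_a", "course_b", "course_b", "course_c"],
        "user": ["u1", "u2", "u3", "u1", "u4", "u2"],
        "rating": [5.0, 4.0, 4.5, 3.0, 4.0, 5.0],
    }
)


def top_courses(ratings, k=10, min_ratings=1):
    """Rank courses by mean rating, ignoring courses with too few ratings."""
    stats = ratings.groupby("course")["rating"].agg(
        mean_rating="mean", n_ratings="count"
    )
    stats = stats[stats["n_ratings"] >= min_ratings]
    # Ties on the mean are broken by the number of ratings.
    return stats.sort_values(["mean_rating", "n_ratings"], ascending=False).head(k)


result = top_courses(ratings, k=2, min_ratings=2)
print(result)
```

The `min_ratings` threshold is one simple way to keep a course with a single 5-star review from topping the list; it is a design choice for this sketch, not necessarily what the notebook does.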

Update: For better analysis, I hosted all the code in an IPython Notebook, available at the following link via nbviewer.

The full dataset and source code will be provided on Crab's GitHub. The idea is to build on those notebooks towards a future book about recommender systems :)

I hope you enjoyed this article. Stay tuned for the next one, about another type of non-personalized recommender: ranking algorithms for vote up/vote down systems!

Special thanks to Diego Manillof for his tutorial :)

Cheers,

Marcel Caraciolo

101 comments:

  1. Great article, mate. Can't wait for next part!
    Good luck

  2. Great post, Marcel.

    I've been using pandas for a while now, it's really great for data management. The only downside is that pandas has limited out-of-core capabilities. My dataset is ~200GB big and I have to use a high-performance cluster to be able to use it with pandas. But apparently Wes McKinney is working on that (see his last post: http://wesmckinney.com/blog/?p=697).
