In this post, I will share with you some of my scientific contributions at the startup Favoritoz, a previous business that I was involved as Co-Founder and Scientist Chief. The main contribution was an algorithm to rank on-line deals in coupon systems. This algorithm was implemented and worked for ten months when the startup was running on-line. I believe that all this implementation could serve as baseline for anyone who is looking after some work related to this field and also a main contribution to the recommender system community.
Favoritoz's Coupon System |
Favoritoz was on-line coupon system where retailers could publish offers with special discounts or their products to segmented customers interested at their products or by a specific brand. One of the critical features was the personalization, where people could say wha they would like to receive daily at their deals wall. For instance, If l liked sushis, videogames and junk-food , the system could detect those interests and using our algorithm it would rank higher offers and deals related by those topics.
My goal was to develop this baseline for ranking deals based on their interests, but not only using the user profile, also we could use particular aspects from the on-line deals such as: the initial date of the deal, the popularity of the deal or the number of coupons left for that deal. All those aspects could be measured in order to compute a overall score that would be used to rank those recommendable items.
Ranking Model
Recommender systems aim to present desirable items for a person to choose from. This task is accomplished by sorting some itens in ther order of utility or interest for the person. Generaly those items are organized in some form of list, such as, in our scenario, the various deals organized as wallboard on Favoritoz. But the question is how to order those deals in a way that users could interest at or even convert into transactions ? That's my work! I came with a simple ranking algorithm that could use various type of information available in order to come with a useful ranking system that could recommend the best deals for each of our customers at Favoritoz.
By looking to specific features in our domain (coupons market) we could use them to build a ranking function that could give some personalized recommendations. For instance, we look for "hot" deals, which is represented by most bought deals (popularity) . But it can lead to deals that are common to everyone and it's the opposite of personalization. So let's use another information like the publishing date and expiration date of the deal which could give us the newest deals and the deals that are almost close to end. Furthermore, let's look into the number of coupons available, in this case, a deal that are almost with no coupons left, could atract more users to purchase the last ones. Another important feature is the user taste. In Favoritoz people could follow retailers (their favorites) and categories (sports, cuisine, automobile, etc.) We use also this information in the ranking equation. In the end, in order to produce rankings that balance all of these aspects we came wih a prediction model using all those features.
I liked the NetFlix approach on his recommendation problem at their company, where they came up with a simple scoring approach by choosing the ranking function to be a linear combination of several features. We also came with this approach but using different features exclusive at our domain.
It's like this equation:
SD = Publishing Date of the deal d,
ED = Expiration Date of the deal d,
P = Deal Popularity
CA = Coupons Available of the deal d,
UP = User Profile (user tastes)
DV = Diversity (based on stores)
It's simple and of course there are many improvements that could be made. But it was our baseline and at our tests in production was very satisfactory. In 4 months running we achieved an increase of almot 300% in click-through-rates per user thant without any ranking methods. Of course several external factors could influence this result, but showed us promising results at our first attempt. I implemented all the code with Python and a fork of my current framework Crab, a python framework for building recommender systems and variations.
All the algorithm explained can be accessed at this link.
My goal was to develop this baseline for ranking deals based on their interests, but not only using the user profile, also we could use particular aspects from the on-line deals such as: the initial date of the deal, the popularity of the deal or the number of coupons left for that deal. All those aspects could be measured in order to compute a overall score that would be used to rank those recommendable items.
Ranking Model
Recommender systems aim to present desirable items for a person to choose from. This task is accomplished by sorting some itens in ther order of utility or interest for the person. Generaly those items are organized in some form of list, such as, in our scenario, the various deals organized as wallboard on Favoritoz. But the question is how to order those deals in a way that users could interest at or even convert into transactions ? That's my work! I came with a simple ranking algorithm that could use various type of information available in order to come with a useful ranking system that could recommend the best deals for each of our customers at Favoritoz.
By looking to specific features in our domain (coupons market) we could use them to build a ranking function that could give some personalized recommendations. For instance, we look for "hot" deals, which is represented by most bought deals (popularity) . But it can lead to deals that are common to everyone and it's the opposite of personalization. So let's use another information like the publishing date and expiration date of the deal which could give us the newest deals and the deals that are almost close to end. Furthermore, let's look into the number of coupons available, in this case, a deal that are almost with no coupons left, could atract more users to purchase the last ones. Another important feature is the user taste. In Favoritoz people could follow retailers (their favorites) and categories (sports, cuisine, automobile, etc.) We use also this information in the ranking equation. In the end, in order to produce rankings that balance all of these aspects we came wih a prediction model using all those features.
I liked the NetFlix approach on his recommendation problem at their company, where they came up with a simple scoring approach by choosing the ranking function to be a linear combination of several features. We also came with this approach but using different features exclusive at our domain.
It's like this equation:
frank(u,d) = w1 * SD + w2 * ED + w3*P + w4*CA + w5*UP + w6*DV + bwhere,
SD = Publishing Date of the deal d,
ED = Expiration Date of the deal d,
P = Deal Popularity
CA = Coupons Available of the deal d,
UP = User Profile (user tastes)
DV = Diversity (based on stores)
It's simple and of course there are many improvements that could be made. But it was our baseline and at our tests in production was very satisfactory. In 4 months running we achieved an increase of almot 300% in click-through-rates per user thant without any ranking methods. Of course several external factors could influence this result, but showed us promising results at our first attempt. I implemented all the code with Python and a fork of my current framework Crab, a python framework for building recommender systems and variations.
All the algorithm explained can be accessed at this link.
Conclusion
As I presented, this ranking algorithm was my first attempt to solve this problem identified at our on-line website. Since the startup didn't go on (no more coupons), I couldn't reformulate the algorithm in terms of optimization and more brute A/B testes. With more data availability we could achieve even better results. It was a rapid experimentation and I believe this work could be useful for someone that is working in this research field. I also invite anyone to qualify my recommendation approach to improve it or pointing out some flaws. (-- under approval -- This article is indexed at arxiv.org.)
Feel free to analyze it and if you want to reference this article or extend this work, let me know. Favoritoz was a great research lab for me!
That's all,
Regards,
Marcel Caraciolo
As I presented, this ranking algorithm was my first attempt to solve this problem identified at our on-line website. Since the startup didn't go on (no more coupons), I couldn't reformulate the algorithm in terms of optimization and more brute A/B testes. With more data availability we could achieve even better results. It was a rapid experimentation and I believe this work could be useful for someone that is working in this research field. I also invite anyone to qualify my recommendation approach to improve it or pointing out some flaws. (-- under approval -- This article is indexed at arxiv.org.)
Feel free to analyze it and if you want to reference this article or extend this work, let me know. Favoritoz was a great research lab for me!
That's all,
Regards,
Marcel Caraciolo