Recommendations and how to measure their ROI with some metrics

Sunday, July 8, 2012

Hi all,

We have talked a lot about recommender systems, especially the techniques and algorithms used to build and evaluate them. Now let's discuss how a social network or an online store can measure, in quantitative terms, the return on investment (ROI) of a given recommendation.

The metrics used in recommender systems

We talk a lot about F1-measure, accuracy, precision, recall and AUC, buzzwords widely known by machine learning researchers and data mining specialists. But do you know what CTR, LOC, CER or TPR are? Let's look at those metrics and how they can evaluate the quantitative benefits of a given recommendation.

First, it is important to understand what a metric is. A metric is a measurement system that quantifies a trend, a dynamic or a certain characteristic. It is commonly used to explain phenomena, identify causes, share discoveries or project results onto future events. Defining and monitoring metrics is important for evaluating the return on investment (ROI) of specific actions and demands, and for testing hypotheses.

For recommender systems, we can use metrics to evaluate their performance in terms of impact, interaction or conversion. Figure 1 shows those groups and how the metrics are distributed among them:

Figure 1: Metric groups for evaluating recommender systems

The impact measures include the places where recommendations are presented (for example, the e-commerce home page, the product list page or the shopping cart page) and the number of recommendation lists, which is the total number of recommendation lists shown in the store in a specified period of time. They give a signal of the coverage, or reach, of the recommendation service on the website.

The second group, and one of the most important, is interaction. The CTR (click-through rate) is one of the most widely used metrics in this group today for evaluating how engaged users are with the recommendations. It quantifies the level of interest in the recommended products. It is calculated by dividing the number of clicks on recommended items by the total number of recommendations presented.

The third group, and the most relevant one, contains the metrics that measure the conversion of the recommendation service. Among those metrics, the most popular are: 1) the rate of orders with recommendations, that is, the number of orders with recommendations divided by the total number of orders; 2) the rate of recommended items per order with recommendations, that is, within those orders, the number of recommended items divided by the total number of items in the order; 3) the increase in the average ticket, which is the average ticket of the orders that contain recommended items minus the average ticket of the store, divided by the average ticket of the store; and 4) the revenue increase rate, which is the revenue generated by recommendations divided by the difference between the total revenue and the revenue generated by recommendations.
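The conversion metrics described above can be derived directly from order-level records. The following is only a minimal sketch under stated assumptions: the `Order` record, its field names and the sample orders are hypothetical and were made up for illustration, and TIR is computed here within the orders that contain recommendations, as in the definition above.

```python
from dataclasses import dataclass


@dataclass
class Order:
    # Hypothetical order record; field names are assumptions for this sketch.
    total: float                # order value (the "ticket")
    items: int                  # number of items in the order
    recommended_items: int      # items that came from a recommendation
    recommended_revenue: float  # part of `total` due to recommended items


def conversion_metrics(orders):
    """Return TPR, TIR, IAT and IR (all in %) for a list of orders."""
    with_rec = [o for o in orders if o.recommended_items > 0]
    if not orders or not with_rec:
        raise ValueError("need at least one order with a recommended item")

    total_revenue = sum(o.total for o in orders)
    rec_revenue = sum(o.recommended_revenue for o in orders)
    avg_ticket = total_revenue / len(orders)
    avg_ticket_rec = sum(o.total for o in with_rec) / len(with_rec)

    rec_items = sum(o.recommended_items for o in with_rec)
    all_items = sum(o.items for o in with_rec)

    return {
        "TPR": 100 * len(with_rec) / len(orders),
        "TIR": 100 * rec_items / all_items,
        "IAT": 100 * (avg_ticket_rec - avg_ticket) / avg_ticket,
        "IR": 100 * rec_revenue / (total_revenue - rec_revenue),
    }


# Made-up sample data, only to exercise the function.
sample_orders = [
    Order(total=100, items=2, recommended_items=0, recommended_revenue=0),
    Order(total=200, items=4, recommended_items=1, recommended_revenue=50),
    Order(total=300, items=4, recommended_items=2, recommended_revenue=150),
    Order(total=200, items=2, recommended_items=0, recommended_revenue=0),
]
sample_metrics = conversion_metrics(sample_orders)
print(sample_metrics)
```

In a real store, the hard part is usually deciding when an item counts as "recommended" (for example, clicked in a recommendation list before being added to the cart); that attribution rule is outside the scope of this sketch.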

It is important to notice that those metrics are percentage values measured over a specified time period, so the divisions described above must be multiplied by 100% to obtain the proposed rates. So let's review the metrics presented and their respective abbreviations:

- REC:  number of recommendations presented in the lists.
- LOC:  places where the recommendation lists are shown.
- CER:  total number of clicks on the recommendations.
- CTR (%):  rate of clicks on the recommendations.
- TPR (%):  proportion of orders with recommendations.
- TIR (%):  proportion of recommended items per order with recommendations.
- IAT (%):  increase in the average ticket.
- IR (%):  increase in the revenue.

Understanding the metrics

To better understand the metrics listed above, let's use a realistic scenario and show how to calculate each of them. Let's consider the artificial data presented in Table 1, where ORDERS is the total number of orders in the store, ORDERS_REC is the total number of orders with recommendations, NIP is the average number of items per order, NIRP is the average number of recommended items per order with recommendations, AT is the average ticket and ATR is the average ticket of orders with recommendations.

Table 1 shows that the store presented 150,000 recommended items on 01/06/2012 and closed 1,400 orders.

Table 1:  Historical sales data for the e-commerce website

Using this data, we can calculate the following metrics:

CTR = (CER / REC) * 100% = (18,000 / 150,000) * 100% = 12%
TPR = (ORDERS_REC / ORDERS) * 100% = (250 / 1,400) * 100% = 17.9%
TIR = (NIRP / NIP) * 100% = (1.7 / 4.5) * 100% = 37.8%
IAT = ((ATR - AT) / AT) * 100% = ((315 - 268) / 268) * 100% = 17.5%

The last metric proposed here is the percentage increase in revenue. According to the data in Table 1, the store above had a total revenue of R$ 375,200.00 on 01/06/2012. With total sales from recommendations of R$ 67,000.00, the increase in revenue is:

IR = 67,000 / (375,200 - 67,000) * 100% = 21.74%

The results show that 12% of the recommendations presented were clicked; of all the orders placed at the store, 17.9% had at least one recommended item; of the items in the shopping cart, 37.8% were recommended; and the recommendations increased the value of the average ticket in the store by 17.5%. As for revenue, the recommendations resulted in an increase of 21.7%.
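The arithmetic above can be checked with a few lines of Python. This is just a sketch that plugs in the aggregate values quoted in this article for 01/06/2012; the variable names follow the abbreviations used here, while `REVENUE`, `REVENUE_REC` and the `pct` helper are names introduced only for this example.

```python
# Aggregate values quoted in the article (Table 1, 01/06/2012).
REC = 150_000            # recommendations presented
CER = 18_000             # clicks on the recommendations
ORDERS = 1_400           # total orders
ORDERS_REC = 250         # orders with recommendations
NIP = 4.5                # average items per order
NIRP = 1.7               # average recommended items per order with recommendations
AT = 268.0               # average ticket (R$)
ATR = 315.0              # average ticket of orders with recommendations (R$)
REVENUE = 375_200.0      # total revenue (R$); name assumed for this sketch
REVENUE_REC = 67_000.0   # revenue from recommendations (R$); name assumed


def pct(numerator, denominator):
    """Ratio expressed as a percentage."""
    return 100.0 * numerator / denominator


metrics = {
    "CTR": pct(CER, REC),
    "TPR": pct(ORDERS_REC, ORDERS),
    "TIR": pct(NIRP, NIP),
    "IAT": pct(ATR - AT, AT),
    "IR": pct(REVENUE_REC, REVENUE - REVENUE_REC),
}

for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

Rounded to one decimal place, this prints 12.0%, 17.9%, 37.8%, 17.5% and 21.7%, which matches the figures discussed above.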

So far we have presented the metrics and some numbers showing how to calculate them. Let's go further and see how we can use them for comparison. Suppose there are three recommendation approaches that we want to test at our web store using a kind of A/B test over a specified period. (Don't know what an A/B test is? Read about it here.)

Let's consider three approaches, for example:

- Technique 1:  Content Based Filtering

- Technique 2:  Most Popular Ranking only

- Technique 3:  Collaborative Filtering
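To compare approaches like these fairly, each visitor should keep seeing the same technique for the whole test period. One common way to do that is to hash a stable user identifier into a bucket. The sketch below is an assumption about how such a split could be done, not a description of how the test in this article was run; the technique labels and the `assign_technique` function are hypothetical names.

```python
import hashlib

# Hypothetical labels for the three approaches listed above.
TECHNIQUES = ["content_based", "most_popular", "collaborative_filtering"]


def assign_technique(user_id, techniques=TECHNIQUES):
    """Deterministically map a user id to one of the techniques."""
    # Hash the id so the split does not depend on how ids are generated.
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % len(techniques)
    return techniques[bucket]


# The same user always lands in the same group.
print(assign_technique("user-42") == assign_technique("user-42"))
```

Because the assignment depends only on the user id, no lookup table is needed, and the metrics (CTR, TPR, TIR, IAT, IR) can later be computed separately for each group.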

Table 2 presents the performance of the recommendation system under the three approaches, with the average value of the following recommendation metrics: CTR, TPR, TIR, IAT and IR over one period.

Table 2: Performance at our e-commerce store with each recommendation approach

In Table 2 we can see that the average interaction is between 6.9% and 14.8% for the recommendation approaches implemented at the store. It means that at least 6.9% of the recommended items presented were clicked. As for the conversion metrics, we can observe that the recommendations are promoting new sales, which would not exist if there were no recommendations. The conversion rates are between 4.5% and 13.5% at those stores. The best recommendation approach tested was the third one, with a 14.4% increase in revenue. It is important to notice that this rate is an average, so during the period analyzed there were peaks and dips in the increase, for instance 30% at some times and 4% at others.

At the same time, the numbers indicate that the recommendations drive a significant up-sell in the orders with recommendations, since close to half of the items in the shopping carts came from recommendations (the minimum average TIR was 48.6%).

One result that may catch our attention is the difference in sales performance between the stores with technique 1 and technique 3 and the store with technique 2. The stores with Collaborative Filtering and Content Based Filtering have a better recommendation impact than technique 2, since in those stores there are personalized recommendations in several places on the website, whereas in approach 2 fewer pages have recommendations. So if the recommendations are accurate and there are several impact places where people can see them, the expected result is even better, as the numbers in the table above show.

What we can do with those metrics

Evaluating the conversion results of the recommender system on your website is critical. We generally don't focus on those metrics and give more importance to accuracy and better coverage, but what really matters is the improvement in sales or in user acceptance measured in clicks, etc. The personalization of a social network or an e-commerce site using recommendation systems must be evaluated periodically. Some tips for those who plan to do this:

- Define a metrics plan:  which metrics are the most important to measure on your website?  CTR?  TPR?  TIR?

- Establish the goals or reference target values:  for instance, we want a 10% increase in revenue (IR = 10%) and a 15% increase in the average ticket with recommendations (IAT = 15%).

- Monitor the metrics using the correct tools:  it is important to have a web analytics dashboard to analyze the results and to obtain the metrics described above, along with other relevant parts of your business.
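The second and third tips can be combined in a small monitoring check. This is only a sketch: the targets are the 10% revenue and 15% average-ticket goals from the example in the tips, the measured values are the ones computed earlier in this article, and the `check_goals` function is a hypothetical helper, not part of any analytics tool.

```python
def check_goals(measured, targets):
    """Map each targeted metric to (measured value, target, goal reached?)."""
    report = {}
    for name, target in targets.items():
        value = measured.get(name)  # None if the metric was not collected
        reached = value is not None and value >= target
        report[name] = (value, target, reached)
    return report


# Goals from the example in the tips: revenue +10%, average ticket +15%.
targets = {"IR": 10.0, "IAT": 15.0}

# Values computed earlier in this article for 01/06/2012 (in %).
measured = {"CTR": 12.0, "TPR": 17.9, "TIR": 37.8, "IAT": 17.5, "IR": 21.7}

report = check_goals(measured, targets)
for name, (value, target, reached) in report.items():
    status = "reached" if reached else "below target"
    print(f"{name}: measured {value}% vs target {target}% -> {status}")
```

With these numbers both goals are reported as reached. In practice the measured values would come from your web analytics dashboard for each period you monitor.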

Recommender systems are more than just algorithms: it is important to understand how to apply them and to measure them closely, to see how effective they are or whether they need to be redesigned or improved. With all those steps and metrics you will be able to find the best configuration for your website and an effective recommendation strategy to present to your clients.

I hope you enjoyed this article,

Best regards,

Marcel Caraciolo

PS: This article is based on the Brazilian article in E-commerceBrazil Magazine, June/2012 edition. If you are Brazilian, I highly recommend reading it too.