
Atepassar Social Network Friendship Connections Visualizations using GeoLocalization!

Friday, March 11, 2011

Hi all


I've been looking for visualization tools for social networks in order to build a visual representation of the AtePassar social network, one that helps me see how the users are connected and visualize the friendships between them. Some time ago I found this post about a visualization created by the Facebook team, which explored a new kind of plot. It showed how geography and political borders affect where people live relative to their friends, and it focused on which pairs of cities around the world have many friendships between them.

The result is shown here:

Facebook Friendship Visualization




If you want to know more about how they managed to create this map, you can check Facebook's blog. Inspired by this work, I decided to create one of my own by analyzing the AtePassar network. AtePassar is a famous Brazilian social network where I work as a data mining analyst, bringing collective intelligence to improve the features of the website.

AtePassar Social Network


So I created a Python script that exports the data from the AtePassar user profiles, converts it to a structured file with each user's current city, and sums the number of friendships between each pair of cities. Then I merged the data with the longitude and latitude of each city. The coordinates of the Brazilian cities were obtained from the Datasus website, a Brazilian government data repository with statistics and data files about Brazil's population, health, geography, etc. You can download the database with the information of the cities here. To open and read it you can use a third-party library called dbfpy, which handles .dbf data files. The script is available for download at my personal repository on GitHub. You can use and modify it for your needs.
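The aggregation step described above can be sketched roughly as follows. This is only an illustration, not the actual script from the repository: the function names, the input structures (a list of friendship pairs, a user-to-city mapping and a city-to-coordinates mapping) and the sample values are all assumptions made for the example.

```python
from collections import Counter


def count_city_pairs(friendships, user_city):
    """Count friendships between each unordered pair of cities.

    friendships: iterable of (user_a, user_b) tuples (hypothetical input).
    user_city: dict mapping a user to his or her current city.
    """
    pair_counts = Counter()
    for user_a, user_b in friendships:
        city_a = user_city.get(user_a)
        city_b = user_city.get(user_b)
        # Skip friendships where one of the users has no known city.
        if city_a is None or city_b is None:
            continue
        # Sort so that (A, B) and (B, A) fall into the same bucket.
        pair = tuple(sorted((city_a, city_b)))
        pair_counts[pair] += 1
    return pair_counts


def attach_coordinates(pair_counts, city_coords):
    """Merge the pair counts with (latitude, longitude) of each city."""
    rows = []
    for (city_a, city_b), total in pair_counts.items():
        # Cities missing from the coordinates table cannot be drawn.
        if city_a in city_coords and city_b in city_coords:
            rows.append((city_coords[city_a], city_coords[city_b], total))
    return rows


# Made-up sample data; the coordinates are approximate.
friendships = [("ana", "bia"), ("bia", "caio"), ("ana", "caio")]
user_city = {"ana": "Recife", "bia": "São Paulo", "caio": "Recife"}
city_coords = {"Recife": (-8.05, -34.88), "São Paulo": (-23.55, -46.63)}

counts = count_city_pairs(friendships, user_city)
for start, end, total in attach_coordinates(counts, city_coords):
    print(start, end, total)
```

Each resulting row holds the two end points of a line and the number of friendships it stands for, which is the information needed to draw a line between two cities and weight it by the number of friendships.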

The result of the whole experiment is shown in the figure below. There are some interesting insights in it:


Atepassar SocialNetwork until Feb 2011 - Friendship Visualization


  • There are several black areas on the map. Brazil is a huge country, and several places, especially in the North region where the Amazon Forest is, have a very low population density, so we don't have many users there. The North is also the region with the lowest number of users at AtePassar. We see users only in the state capitals, so we believe internet access is still a problem in that region, or maybe our network has not yet reached it.
  • We have a great number of users in Recife (PE), São Paulo (SP), Rio de Janeiro (RJ) and Brasília (DF), as you can see from the shining white lines that interconnect those cities, in contrast to the other cities on the map. We believe Recife stands out especially because the team behind AtePassar is from Recife, PE, so the marketing of the network is stronger there than in other cities. Another reason is the videos available at AtePassar, whose provider (the course and its teaching staff) is also quite famous around Recife, PE. São Paulo, Rio de Janeiro and Brasília are currently considered the cities with the greatest number of students registering for public exams, according to a survey by CorreioWeb, a popular news site specialized in public exams.


After seeing Perone's post on his blog, which used the visualization tool Gource to create a new visualization for Google Analytics, I realized that project could help me tell the history of the AtePassar social network. After writing some Python code, I decided to represent the users by their states, and I also changed the default Gource user icon to Brazilian state flags (you can download them here).

The social network started in 2009 and launched to the public in the middle of 2010; today the network has more than 30 thousand registered users. We modified Gource so that it could represent the history of user registrations across the whole social network, showing the users and their hometowns. Unfortunately, Gource does not work with more than roughly 15,000 nodes, so I decided to show only a period of the social network, from its launch until April 2010.
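One way to feed registrations to Gource is its pipe-delimited custom log format (timestamp|username|type|file, where type A means "added"). The sketch below is a guess at how that could look, not the code actually used for the video: the function name, the input tuples and the state/city/user path layout are assumptions. Using the state code as the Gource "username" is what would let a flag image with the same name be picked up as the user icon.

```python
import calendar
from datetime import datetime


def gource_log_lines(registrations):
    """Turn user registrations into Gource custom log lines.

    registrations: iterable of (joined_at, state, city, username) tuples,
    where joined_at is a datetime treated as UTC (hypothetical input).
    """
    lines = []
    # Gource expects the entries in chronological order.
    for joined_at, state, city, username in sorted(registrations):
        timestamp = calendar.timegm(joined_at.timetuple())
        # The state acts as the Gource "user"; each new member is a
        # "file" added (A) under state/city, so cities become branches.
        path = "%s/%s/%s" % (state, city, username)
        lines.append("%d|%s|A|%s" % (timestamp, state, path))
    return lines


# Made-up sample data for illustration only.
sample = [
    (datetime(2010, 1, 2), "SP", "Sao Paulo", "user_b"),
    (datetime(2010, 1, 1), "PE", "Recife", "user_a"),
]

for line in gource_log_lines(sample):
    print(line)
```

The lines can be saved to a file and opened with Gource using its `--log-format custom` option; the `--user-image-dir` option points Gource at a folder of images named after the users, here the state codes.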












I've also tried the 3D visualization tool Ubigraph; however, since there were thousands of nodes, it didn't work for long periods. This time I tried to present the network from a different angle, by looking at the friendships between the users. It is clear in the video below that the network centers around two users, who are, by the way, the founders of the social network: rjcf and marcoscampello. Another thing to notice is that there are many users with a low degree of friendship. This happens because the timeline of the social network is the same for all users. Unlike Twitter, a user in AtePassar can see what everyone posts in the timeline. In the beginning of the social network, we decided to show the posts of all users at AtePassar in order to stimulate the interaction between users. The team is watching carefully to see whether the timeline stream becomes overloaded. The video is presented below.








I was so excited with the results that I decided to use Gource to present the history of all the users who joined our local Python community here in Pernambuco, Brazil, and to show it in a lecture at one of our meetings. You can read more about it in this post.


I'd like to thank Andreas Kaltenbrunner for supporting me in this work and giving me some insights on how to draw the Brazilian map using coordinates. He did similar work on a Spanish social network called Tuenti. You can see his post about it here.

I hope you like it,

Cheers, 

Marcel Caraciolo
