High Performance Computation with Python - Part 03

Saturday, September 24, 2011

Hi all,


This article is the third in the series on High Performance Computation with Python. Anyone who missed the first and second parts can check this link and this one. The goal is to present approaches that make CPU-demanding tasks in Python run much faster.


The techniques covered in the series:


  1.  Python Profiling - How to find bottlenecks
  2.  Cython - Annotate your code and compile to C
  3.  Numpy Vectors - Fast vector operations using numpy arrays
  4.  Numpy integration with Cython - Fast numerical Python library wrapped by Cython
  5.  PyPy - Python's new Just-in-Time compiler

In this post I will talk about Numpy vectors and how you can combine them with Cython!


The Problem

In this series we analyze how to optimize Spearman's rank correlation coefficient, a measure used to compute the similarity between items in recommender systems. It assesses how well the relationship between two variables can be described by a monotonic function. The source code for this metric can be found in the first post.


Numpy


Numpy is a powerful extension to Python that adds support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on them. To install it, type this command at your terminal:

$ sudo easy_install numpy

Or

$ pip install numpy

In our example we will change spearman.py. Import the numpy library and change spearman_correlation to look like the one below. If you run and test it you will get the same output as before.



The strength of numpy is that it simplifies many operations on vectors or matrices of numbers, since they work on the whole array at once rather than on one element at a time. Where before we had nested for loops over the individual terms of a list, with numpy the same job can be done in a faster and simpler way.
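To make the contrast concrete, here is a minimal sketch that computes a sum of squared differences twice, once with an explicit loop and once with whole-array operations. The two lists are made-up values chosen for illustration, not data from this series.

```python
import numpy as np

# Hypothetical sample values, for illustration only.
a = [3.0, 1.0, 4.0, 1.0, 5.0]
b = [2.0, 7.0, 1.0, 8.0, 2.0]

# Plain Python: visit the elements one pair at a time.
total_loop = 0.0
for x, y in zip(a, b):
    d = x - y
    total_loop += d * d

# numpy: subtract and multiply the whole arrays at once.
diffs = np.array(a) - np.array(b)
total_numpy = (diffs * diffs).sum()

print(total_loop, total_numpy)
```

Both approaches give the same number; the numpy version simply moves the loop into compiled code.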
Some notes:

  • You define an array with the numpy.array statement, in our case from a list of tuples whose fields are indexed by the labels keys and ranks (lines 29 and 30).
  • Lots of operations are already implemented in numpy, such as numpy.in1d, which tests whether each element of the first vector is present in the second and returns an array of bools.
  • We have numpy.sort, which sorts the elements based on a key, in this example ranks (lines 16 and 17).
  • diffs * diffs does an element-wise multiplication; think of it as diffs[0] = diffs[0] * diffs[0]; diffs[1] = diffs[1] * diffs[1]; ...; diffs[n-1] = diffs[n-1] * diffs[n-1] (line 36).
  • size is an attribute of a numpy array that gives the total number of elements (m*n for an m-by-n array).
If it is still unclear, I suggest trying it at the command line, step by step, and looking over the results. Put a small number of elements in the array and see it in action.
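A small session along those lines might look like the sketch below. It is not the spearman.py listing from this series: the toy rankings, the variable names and the dtype are assumptions made for illustration, and it uses numpy.isin, the newer spelling of numpy.in1d, which is deprecated in recent numpy releases. The final line assumes the usual Spearman formula for rankings without ties, 1 - 6 * sum(d^2) / (n * (n^2 - 1)).

```python
import numpy as np

# Hypothetical toy data: (label, rank) pairs for two rankings.
dtype = [("keys", "U10"), ("ranks", float)]
ranks1 = np.array([("a", 1.0), ("b", 2.0), ("c", 3.0), ("d", 4.0)], dtype=dtype)
ranks2 = np.array([("a", 2.0), ("b", 1.0), ("c", 3.0), ("e", 4.0)], dtype=dtype)

# Boolean masks marking the labels that occur in both arrays.
mask1 = np.isin(ranks1["keys"], ranks2["keys"])
mask2 = np.isin(ranks2["keys"], ranks1["keys"])

# Keep the shared items and sort both by label so the entries line up.
common1 = np.sort(ranks1[mask1], order="keys")
common2 = np.sort(ranks2[mask2], order="keys")

# Element-wise difference and square, with no explicit loop.
diffs = common1["ranks"] - common2["ranks"]
squared = diffs * diffs

# size is the number of elements in the array.
n = diffs.size
rho = 1.0 - 6.0 * squared.sum() / (n * (n * n - 1))

print(mask1, n, rho)
```

Printing each intermediate array (mask1, common1, diffs, squared) is a quick way to see what every step does.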

Numpy with Cython


Numpy is a powerful library and uses very fast, C-optimized math libraries to perform these calculations quickly. You can also compile your Python code with Cython. The main difference is the annotation of the numpy arrays. You can see the tutorial for further details. The differences are in how we import (cimport numpy as np) and in the signature of the function _rank_dists.




Special Notes - Meeting Scipy

Another powerful library is Scipy, a package for Python that brings several algebra techniques for dealing with matrices and vectors. One special module is scipy.stats, which comes with the spearmanr function. It receives two arrays with the observations and returns the Spearman coefficient (together with a p-value). Amazing! Let's see our code below:
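A minimal sketch of such a call, using made-up observations rather than the data from this series, might look like this:

```python
import numpy as np
from scipy import stats

# Hypothetical observations, for illustration only.
x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 6, 7, 8, 7])

# spearmanr returns the correlation coefficient and a p-value.
rho, p_value = stats.spearmanr(x, y)

print(rho, p_value)
```

Note that spearmanr takes the raw observations and ranks them internally (averaging the ranks of tied values), so there is no need to build the (label, rank) pairs by hand.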


In the next post we will study PyPy, a JIT compiler which can speed up your code with minimal changes!

I hope you enjoyed this article,

Marcel Caraciolo
