Performing runtime benchmarks with Python Monitoring Tool Benchy

Friday, March 22, 2013

Hi all,

Over the last few weeks I've been working on a little project of mine called benchy. The goal of benchy is to answer some simple questions: which code is faster? Which algorithm consumes more memory? I know there are several tools suitable for this task, but I wanted to create some performance reports by myself using Python.

Why did I create it? At the beginning of the year I decided to rewrite all the code in Crab, a Python framework for building recommender systems. One of the main components that required some refactoring was the pairwise metrics, such as cosine, Pearson, Euclidean, etc. I needed to test the performance of several versions of the code for those functions. But doing this manually? It's boring. That's what benchy is for!

What can benchy do?

Benchy is a lightweight Python library for running performance benchmarks over alternative versions of code. How can we use it?

Let's look at the cosine function, a popular pairwise function for measuring the similarity between two vectors or matrices in recommender systems.
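As a quick reference, here is a minimal pure-Python sketch of cosine similarity between two vectors. It is only an illustration and not one of the implementations benchmarked in this post; the function name cosine_similarity and the sample vectors are assumptions made for the example.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


# Parallel vectors point in the same direction, so the similarity is close to 1.
same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
# Perpendicular vectors share no direction, so the similarity is 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

The cosine distance is simply one minus this value.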

Let's define the benchmarks to test:

With all the benchmarks created, we can test a single benchmark by calling its run method:

The dict associated with the key 'memory' holds the memory performance results. It gives you the number of times the statement was repeated (repeat) and the average memory consumption (usage) in the given units. In addition, the key 'runtime' holds the runtime performance results. It presents the number of repetitions (repeat) followed by the average time taken to execute the statement (timing) in the given units.
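To make the runtime fields more concrete, here is a rough sketch of how such a result could be put together with the standard library's timeit module. This is not benchy's implementation; the setup, the statement and the shape of the runtime dict are assumptions made only for illustration.

```python
import timeit

setup = "data = list(range(1000))"   # hypothetical setup code (not timed)
statement = "sum(data)"              # hypothetical statement to time

repeat, loops = 3, 1000
# timeit.repeat returns one total time (in seconds) per repetition,
# each covering `loops` executions of the statement.
totals = timeit.repeat(stmt=statement, setup=setup, repeat=repeat, number=loops)

runtime = {
    "repeat": repeat,
    "loops": loops,
    # average time of a single call, converted from seconds to milliseconds
    "timing": sum(totals) / (repeat * loops) * 1000.0,
    "units": "ms",
}
print(runtime)
```

Many people prefer to report the minimum of the repetitions instead of the average, since it is less affected by other processes running on the machine.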

Do you want to see a more presentable output? You can get one by calling the method to_rst with the results as a parameter:

Benchmark setup

import numpy
X = numpy.random.uniform(1,5,(1000,))

import scipy.spatial.distance as ssd
X = X.reshape(-1,1)
def cosine_distances(X, Y):
    return 1. - ssd.cdist(X, Y, 'cosine')

Benchmark statement

cosine_distances(X, X)

name	repeat	timing	loops	units
scipy.spatial 0.8.0	3	18.36	10	ms

Now let's check which one is faster and which one consumes less memory. Let's create a BenchmarkSuite, which is a container for benchmarks:

Finally, let's run all the benchmarks together with the BenchmarkRunner. This class loads all the benchmarks from the suite, runs each individual analysis and prints out interesting reports:
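The idea of a container of benchmarks plus a runner can be sketched with nothing but the standard library. The code below is not benchy's BenchmarkSuite or BenchmarkRunner; the suite layout, the run_suite helper and the two toy statements are hypothetical and serve only to show the pattern.

```python
import timeit

# A hypothetical "suite": (name, statement, setup) triples.
suite = [
    ("builtin sum", "sum(data)", "data = list(range(1000))"),
    ("manual loop", "t = 0\nfor v in data:\n    t += v", "data = list(range(1000))"),
]


def run_suite(benchmarks, repeat=3, loops=200):
    """Time every benchmark and return {name: best per-call time in ms}."""
    results = {}
    for name, statement, setup in benchmarks:
        totals = timeit.repeat(stmt=statement, setup=setup,
                               repeat=repeat, number=loops)
        results[name] = min(totals) / loops * 1000.0
    return results


results = run_suite(suite)
# Print the benchmarks from fastest to slowest.
for name, timing in sorted(results.items(), key=lambda item: item[1]):
    print("%-12s %.6f ms" % (name, timing))
```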

Next, we will plot the relative timings. It is important to measure how much faster the other benchmarks are compared to the reference one. This is done by calling the method plot_relative:

As you can see in the graph above, the scipy.spatial.distance function is 2129x slower and the sklearn approach is 19x slower. The best one is the numpy approach. Let's see the absolute timings. Just call the method plot_absolute:

Besides the bars representing the timings, you may notice the line plot representing the memory consumption of each statement. The one that consumes the least memory is the nltk.cluster approach!
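If you want to get a feeling for memory measurements without benchy, one possible approach is the standard library's tracemalloc module (Python 3.4 or later, so newer than the Python used in this post). This is an alternative technique and not necessarily how benchy measures memory; the helper peak_memory_kib and the toy function build_list are hypothetical names.

```python
import tracemalloc


def peak_memory_kib(func, *args):
    """Return the peak memory (KiB) allocated by Python while func runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / 1024.0


def build_list(n):
    """Toy workload: allocate a list of n floats."""
    return [float(i) for i in range(n)]


small = peak_memory_kib(build_list, 1000)
large = peak_memory_kib(build_list, 100000)
print("small: %.1f KiB, large: %.1f KiB" % (small, large))
```

Keep in mind that tracemalloc only tracks allocations made through Python's memory allocator, so memory used internally by C extensions may not be fully counted.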

Finally, benchy also provides a full report for all benchmarks by calling the method to_rst:

Performance Benchmarks

These historical benchmark graphs were produced with benchy.

Produced on a machine with

Intel Core i5 950 processor

Mac OS X 10.6

Python 2.6.5 64-bit

NumPy 1.6.1

scipy.spatial 0.8.0

Benchmark setup

import numpy
X = numpy.random.uniform(1,5,(1000,))

import scipy.spatial.distance as ssd
X = X.reshape(-1,1)
def cosine_distances(X, Y):
    return 1. - ssd.cdist(X, Y, 'cosine')

Benchmark statement

cosine_distances(X, X)

name	repeat	timing	loops	units
scipy.spatial 0.8.0	3	19.19	10	ms

sklearn 0.13.1

Benchmark setup

import numpy
X = numpy.random.uniform(1,5,(1000,))

from sklearn.metrics.pairwise import cosine_similarity as cosine_distances

Benchmark statement

cosine_distances(X, X)

name	repeat	timing	loops	units
sklearn 0.13.1	3	0.1812	1000	ms

nltk.cluster

Benchmark setup

import numpy
X = numpy.random.uniform(1,5,(1000,))

from nltk import cluster
def cosine_distances(X, Y):
    return 1. - cluster.util.cosine_distance(X, Y)

Benchmark statement

cosine_distances(X, X)

name	repeat	timing	loops	units
nltk.cluster	3	0.01024	1e+04	ms

numpy

Benchmark setup

import numpy
X = numpy.random.uniform(1,5,(1000,))

import numpy, math
def cosine_distances(X, Y):
    return 1. -  numpy.dot(X, Y) / (math.sqrt(numpy.dot(X, X)) *
                                     math.sqrt(numpy.dot(Y, Y)))

Benchmark statement

cosine_distances(X, X)

name	repeat	timing	loops	units
numpy	3	0.009339	1e+05	ms

Final Results

name	repeat	timing	loops	units	timeBaselines
scipy.spatial 0.8.0	3	19.19	10	ms	2055
sklearn 0.13.1	3	0.1812	1000	ms	19.41
nltk.cluster	3	0.01024	1e+04	ms	1.097
numpy	3	0.009339	1e+05	ms	1
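The timeBaselines column can be reproduced by hand, assuming it is each timing divided by the fastest timing. The short sketch below uses the timings copied from the Final Results table; the variable names are just for illustration.

```python
# Timings (ms) copied from the Final Results table.
timings_ms = {
    "scipy.spatial 0.8.0": 19.19,
    "sklearn 0.13.1": 0.1812,
    "nltk.cluster": 0.01024,
    "numpy": 0.009339,
}

baseline = min(timings_ms.values())  # the fastest benchmark is the reference
ratios = {name: t / baseline for name, t in timings_ms.items()}

# Print the benchmarks from slowest to fastest, relative to the baseline.
for name, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print("%-20s %.4g" % (name, ratio))
```

The ratios may differ slightly in the last digit from the table, since the timings shown there are already rounded.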

Final code!

I should say that this micro-project is still a prototype; however, I tried to build it to be easily extensible. I have several ideas for extending it, but feel free to fork it and send suggestions and bug fixes. This project was inspired by the open-source project vbench, a framework for performance benchmarks over your source repository's history. I recommend it!

For me, benchy will help test several alternative pairwise functions in Crab. :) Soon I will publish the performance results that we got with the pairwise functions that we built for Crab :)

I hope you enjoyed it,

Regards,

Marcel Caraciolo
