Hi all,
Last week I was at the SciPy 2011 Conference in Austin, TX. It was my first international conference and also my first international talk! The SciPy Conference is an annual meeting for scientific computing developers and researchers who use Python scientific packages in their research or work. It was a great opportunity to meet new Python developers, find out what is happening in scientific Python nowadays, and learn about SciPy, NumPy and Matplotlib, considered the standard libraries for developers who want to start working in the scientific world.
On the first day of the conference, I had the opportunity to learn more about NumPy, a widely used library for numerical computation in Python, and about the scikit-learn framework, a great open-source machine learning toolkit written in Python on top of NumPy and SciPy.
Both tutorials are available at the SciPy Conference Tutorials web page. NumPy is an amazing library, and I have already started applying what I learned to the library I am currently working on, Crab, for building recommender systems. Scikit-learn is also an interesting framework, built on SciPy, NumPy and Matplotlib, that offers several machine learning techniques. One of its main features is an easy-to-use interface with lots of examples and tutorials for beginners in machine learning. It works so smoothly that I decided to use it as a dependency of the Crab framework.
Another tutorial was an introduction to Traits, Matplotlib and Chaco, great tools for creating nice user interfaces and plotting charts. One of the best parts of this tutorial was seeing how easy it is to create nice interfaces and animated plots with a few lines of code. Take a look at what you can do here, or even see a real-time animated plot with Matplotlib.
Traits and Chaco are part of the EPD (Enthought Python Distribution) package developed by the company Enthought, one of whose co-founders is also one of the main developers and founders of NumPy! Yeah :D These frameworks make it easy to create nice interfaces using only model concepts. If you want to learn more, please check out the tutorials at the official website on how to download, install and use them.
Another interesting talk was about IPython, the enhanced interactive shell for scientific Python developers. What amazed me was when the speaker showed Matplotlib plots embedded in the shell instead of opening in a new window! The work on IPython has been fantastic, with several features for Python developers. I highly recommend it!
The rest of the conference was dedicated to keynotes and talks about current work in data science, core technologies and data mining with Python, SciPy, NumPy and related libraries. I had the opportunity to give the talk "Crab - A Python Framework for Building Recommender Systems", written by me, Bruno Melo and Ricardo Caspirro, currently the main contributors to this work. The idea is to give Python developers a recommender toolkit so they can easily create, test and deploy recommender engines through simple interfaces built on scientific Python packages such as NumPy, SciPy and Matplotlib.
You can check out my slides from the SciPy Conference here.
The project is currently being developed by the non-profit organization Muriçoca, which we created to manage and develop the Crab framework.
One of the best keynotes was the presentation by Hilary Mason, data scientist at bit.ly. She gave a fun talk about her work and the current challenges of handling large data sets and the huge volume of URL shortening happening in the bit.ly backend. The amount of data is quite amazing, and so is the useful information you can extract from it.
Finally, I also decided to give a lightning talk, "Mining the Scipy Lectures". It was a simple talk to show what you can do with the data from the SciPy Conference schedule. I used some NLP techniques and clustered the lectures by their most frequent topics, to see how the talks at SciPy were distributed based on the keywords in their titles. To visualize the result I used the graph visualization tool Ubigraph, which shows the generated clusters in 3D (by the way, I used the K-means algorithm for the clustering).
The slides are also available here and the source code here.
Figure: 3D Lectures Clusters
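For readers curious about what clustering talks by title keywords can look like, here is a minimal sketch. It is not the code from the lightning talk: the titles are invented for illustration, the bag-of-words vectors and the stopword list are my assumptions about a reasonable setup, and the K-means is a plain standard-library implementation with fixed seeds so the result is reproducible.

```python
import math
import re
from collections import Counter

# Hypothetical talk titles, invented for illustration (not the real schedule).
TITLES = [
    "Parallel computing with NumPy arrays",
    "Fast NumPy arrays for parallel computing",
    "Machine learning with scikit-learn",
    "Text mining and machine learning in Python",
    "Plotting data with Matplotlib",
    "Interactive plotting with Matplotlib and Chaco",
]
# Small hand-picked stopword list (an assumption, not a standard one).
STOPWORDS = {"with", "and", "for", "in", "the", "a", "of"}


def tokenize(title):
    """Lowercase a title and keep its alphabetic, non-stopword tokens."""
    words = re.findall(r"[a-z]+", title.lower())
    return [w for w in words if w not in STOPWORDS]


def vectorize(titles):
    """Build unit-length bag-of-words vectors over a shared vocabulary."""
    vocab = sorted({w for t in titles for w in tokenize(t)})
    vectors = []
    for t in titles:
        counts = Counter(tokenize(t))
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, seeds, iterations=20):
    """Lloyd's K-means, initialised with the vectors at the `seeds` indices."""
    centroids = [list(vectors[i]) for i in seeds]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(vocab, centroid, n=3):
    """Return up to n highest-weighted words of a centroid (ties by name)."""
    ranked = sorted(zip(centroid, vocab), key=lambda p: (-p[0], p[1]))
    return [word for weight, word in ranked[:n] if weight > 0]


vocab, vectors = vectorize(TITLES)
labels, centroids = kmeans(vectors, seeds=[0, 2, 4])
for cluster, centroid in enumerate(centroids):
    print(cluster, top_terms(vocab, centroid))
```

Each cluster is then summarised by the heaviest words in its centroid, which gives a rough picture of the topics. On a real schedule you would probably want random restarts and a proper stopword list instead of the fixed seeds used here.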
Soon I will release the PDF of the submitted article as well as the videos of both talks that I presented. It was an amazing conference in Austin, where I made new friends and lots of new partners! :D I definitely expect to be there next year! One of my goals this year is also to prepare a scientific computing course using Python, so wait for more information soon here at the blog (it will include Matplotlib, SciPy and NumPy)!
Cheers,
Marcel Caraciolo