Python for Data Analysis: First Steps

I’ve been looking at Data Scientist roles, and more broadly at the skills needed for Analytics roles. A frequent requirement is experience with R or other statistical analysis software. I don’t have any experience with these kinds of tools. I used Mathematica in my Physics master’s research project back in 2010, but not for much statistical work. I have used Python, however, and I knew that O’Reilly sold a book entitled “Python for Data Analysis”. Hopefully I can gain some skills and experience from this book, which I’ll summarise here.

The first job was setting up Python properly. I tried to install all of the required packages directly into the Python that ships with OS X (I’m running 10.11, El Capitan), but something went very wrong with the installation of pandas. (There were lots of reports of unused functions.) While googling for answers, I found this blog post, which explains how to set up and install the relevant packages within virtual environments.

The steps laid out in the blog post were all correct, with one exception: I had no existing .bashrc, so I pasted the following lines into a newly created .bash_profile instead and then ran the command source .bash_profile.

export WORKON_HOME=$HOME/.virtualenvs
source /usr/local/bin/virtualenvwrapper.sh

A second difference was in checking the installed version of pandas. Rather than pandas.version.version, as given in the blog post, I had to use pandas.__version__.
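For anyone wanting to check their own setup, here is a minimal sketch of that version check. The helper name package_version is my own invention for illustration, and including numpy in the list is an assumption on my part (only pandas is discussed above). Run it from inside the virtual environment, so that it reports on the packages installed there rather than on the system Python.

```python
import importlib


def package_version(name):
    """Return the package's __version__ string, or None if it cannot be found.

    Hypothetical helper for illustration; it only relies on the common
    convention that a package exposes a top-level __version__ attribute.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        # The package is not installed in the active environment.
        return None
    # Some modules do not define __version__; report None for those too.
    return getattr(module, "__version__", None)


# "numpy" is an assumed extra here; swap in whichever packages you installed.
for name in ("numpy", "pandas"):
    print(name, package_version(name) or "not found")
```

If pandas shows up as "not found", the most likely cause is that the virtual environment is not active in the current shell.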
