I don't remember exactly how I came across Mining the Social Web; it was probably while looking for interesting Python books in O'Reilly's catalog. Anyhow, the title immediately caught my attention, as it promised information on using web services in Python and analyzing social network data.
Another thing that got me hooked is that the author, Matthew A. Russell, created a GitHub repository with the code from the book. It's not unusual for the sample code of a programming book to be freely downloadable, but in this case the author actively invited readers and the open source community to participate.
There is also a dedicated Twitter account and a Facebook page with updates on the book and other information on relevant topics. Clearly Mr. Russell cannot be accused of not knowing what he's talking about.
The book is divided into 10 chapters and packed with practical examples of how to retrieve, explore, analyze, and visualize data from social networks like Twitter, Facebook, and LinkedIn, as well as from mailboxes, blogs, and the since-abandoned Google Buzz.
Throughout the book, a wide variety of concepts and techniques are introduced, including OAuth authentication, statistical analysis, clustering, text mining, natural language processing, data visualization, and data processing with MapReduce.
The book runs through these topics at a fast pace and an advanced level, accompanied by many code examples. To get the most out of this book, readers need a good knowledge of Python and should also be familiar with text and language processing concepts, or be ready to read up on them.
I found Mining the Social Web to be a demanding read due to its high density of information. Although I certainly did not grasp every detail of the book and am less interested in processing data from email conversations, LinkedIn, or Google Buzz, I found it to be a rich source of inspiration.
Reading Mining the Social Web sparked many ideas in me and I started working on projects, the most recent being Coderstats, that gather, analyze, and visualize data from the Web. I have fun doing this stuff and will go on doing so.
If you are looking for inspiration for what you can do with the huge amounts of data available on the Web and are familiar with Python and text processing, check out the first chapter, which you can download as a PDF, to see whether this is also a good book for you.
This post was written by Ramiro Gómez (@yaph). Ramiro is a developer who likes open source, data mining, visualization and writing. To be informed about new posts you can subscribe to the Geeksta RSS feed.