I studied communication science and linguistics and first learned about natural language processing (NLP) at university. NLP is essential in fields like corpus linguistics, machine translation, speech synthesis and recognition, and many others.
Since graduating, I have mainly worked as a developer building Web applications. Though most of my programming tasks involve some kind of text processing, linguistic aspects have only occasionally played an important role.
Nonetheless, I never lost interest in NLP and came into contact with the Natural Language Toolkit (NLTK) a few times, so I decided to dive deeper into this very capable, feature-rich and mature Python library.
There is plenty of documentation, as well as a free online version of the book Natural Language Processing with Python. But I was looking for a more lightweight and practical approach and had the chance to get a free review copy of the Python Text Processing with NLTK 2.0 Cookbook.
The book is divided into 9 chapters and contains 80 recipes for doing specific NLP tasks using Python, NLTK, MongoDB, Redis, WordNet and other software and services. Each recipe starts with a brief introduction, followed by an example implementation of the problem, an explanation of how it works, and an outlook on what else can be done.
The author, Jacob Perkins, expects readers to be familiar with basic text processing concepts and targets Python programmers who want to quickly get started with the NLTK for natural language processing. I'm not sure whether basic familiarity is enough, though, since the introductions to the underlying concepts at the beginning of the chapters are rather short.
As with other (programming) cookbooks, this is not a book that you'd read from cover to cover, but one that you pick up as a reference when you look for a solution to a particular problem in the field of NLP.
That being said, there is one particularly noticeable weakness in the PDF version of the ebook: references to other recipes within the book and to external Web pages are not clickable links. In my view, this really should be addressed in future editions.
Apart from that, the book is well written and comprehensible, provided you have at least basic knowledge of the covered topics. To judge for yourself, you can read the 3rd chapter online in PacktLib and take a look at Jacob Perkins' Web site.
Summing up, the NLTK cookbook provides a good, concise overview of what you can do with this powerful library and serves as a useful companion for Python programmers who work with the NLTK or intend to do so.
This post was written by Ramiro Gómez (@yaph). Ramiro is a developer who likes open source, data mining, visualization and writing.