(Ab)using Pandas to Migrate Disqus Threads
I recently converted an old site of mine from Drupal to a static Web site created with Logya to save some kittens' lives. I intend to write a more detailed post about the process, but will focus on a URL issue here.
Logya is flexible regarding URLs, accepting common file extensions like .html and even .php, but the most straightforward way is to end them with a forward slash. On the old Drupal site I had a mix of URLs ending with .html or no file extension but without an ending slash, e.g. www.ramiro.org/blog/umstellung-von-joomla-auf-drupal.
In theory I could have kept all URLs as they were, because Apache redirects a request to the slash-terminated URL when the path corresponds to a directory on the server, which it does here. In practice that was not enough, since I use Disqus for comments and the redirected URLs differed from the ones Disqus knew about.
To resolve this issue I took advantage of the Migrate Threads tool Disqus offers. You find it at your-site-id.disqus.com/admin/tools/migrate/. For cases like this you can download a file containing the URLs Disqus knows about on your site and upload a CSV file that maps old URLs to new ones, hence the name URL mapper.
To create this mapping I wrote the following short Python script, using the pandas library, which is actually meant to facilitate more sophisticated tasks like doing data analysis, but also takes the pain out of dealing with CSV files in Python.
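The original script is not reproduced here, so what follows is only a minimal sketch of how such a mapping could be built with pandas. It assumes the Disqus export is a headerless CSV with one URL per line and that the upload expects two columns, old URL then new URL. The function names and the file names in the usage note are hypothetical.

```python
import pandas as pd


def new_url(url: str) -> str:
    """Return the URL without the www subdomain and with a trailing slash.

    URLs ending in .html are left without a trailing slash.
    """
    url = url.replace('://www.', '://', 1)
    if not url.endswith('.html') and not url.endswith('/'):
        url += '/'
    return url


def build_mapping(urls: pd.Series) -> pd.DataFrame:
    """Build a two-column frame (old, new), keeping only URLs that change."""
    old = urls.str.strip()
    mapping = pd.DataFrame({'old': old, 'new': old.map(new_url)})
    return mapping[mapping['old'] != mapping['new']]


def convert(in_path: str, out_path: str) -> None:
    """Read the exported URL list and write the old-to-new mapping as CSV."""
    # Assumption: the export has no header row and a single URL column.
    df = pd.read_csv(in_path, header=None, names=['old'])
    build_mapping(df['old']).to_csv(out_path, header=False, index=False)
```

Calling convert('disqus-links.csv', 'url-map.csv') would then produce the file to upload. The sketch ignores query strings and fragments, which would need extra handling if the exported URLs contain them.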
In addition to appending a slash to URLs that don't end with .html, the www subdomain is removed, because short URLs are sooo en vogue. To have Apache redirect from www to non-www I added the following generic rewrite rule to the server configuration.
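The rule itself did not survive in this copy of the post. A commonly used generic form, given here as an assumption rather than the exact rule from the original, looks like this; it requires mod_rewrite and assumes the site is served over plain http:

```
# Assumed sketch, not the original rule: redirect any www host to its bare domain.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^ http://%1%{REQUEST_URI} [R=301,L]
```

Because the host name is captured from the request rather than hard-coded, the same rule works unchanged on any domain.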
Usually, I use the Python standard library for reading and writing CSV files, but pandas came in quite handy here. I'm curious to learn about other somewhat deviant use cases, so feel free to share yours in the comments.
This post was written by Ramiro Gómez (@yaph).
Tags: migration pandas python tutorial