The Migration Headache that was Tumblr to WordPress
While working on We Don’t Lie to Google, it became apparent that I would need a better way to keep track of things I’d already posted, particularly as time passed, in order to prevent posting duplicate entries.
Enough’s been said about how remarkable Tumblr is, and how easy it’s made blogging. But it’s also been plagued by service outages. Then again, you get what you pay for, and the service is free.
At some point, I made the decision to move away from Tumblr for a few reasons:
- If the project was going to continue in a serious manner, I needed hosting that wouldn’t suffer from outages about which I could do absolutely nothing
- I wanted more control over and access to my data, both entries and uploaded images
- When you make a living making websites, Tumblr starts to feel a little janky with respect to how things, like static files, are managed
When I made the decision to move to WordPress, I started to look for ways to export my data from Tumblr. Here’s what I found:
Tumblr has a backup utility that doesn’t work
On Tumblr’s “Goodies” page, there is a link to download a beta of their Mac backup utility. Sadly, by the time I discovered it, it didn’t function at all. I sent a support request asking if there was a solution for the issue I was having, along with detailed info (portions of my console.log). A few days later I got this reply:
Hi Douglas,
I can’t tell you how much we appreciate your support and feedback. I’ve passed this along to our engineering team to investigate. Thanks!
Great, someone cares! Escalated to the Engineering team! Color me impressed. Except that I heard nothing more for a full month, at which time I decided to write in again. A few days later, another response (from a different support rep), this time less great:
Sorry development of that app has been discontinued and the developers are working on a more robust backup solution. However, we don’t have a rollout date at this time. My apologies.
That came back on May 10, 2011 — as of this writing, the app is still listed on the Goodies page, with “Windows version coming soon.” But the link still goes to the same beta version I initially downloaded.
Ah, but there is an API!
So there is, but in the case of We Don’t Lie to Google, all of my posts were images, and using the API would require scripting a solution to download all of the images. Not impossible, but We Don’t Lie to Google is a content-based project, and losing focus on the content could have made it too easy to abandon the project as a whole.
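For the curious, here is a rough idea of what that scripting might have looked like. It is a minimal sketch in Python, not what I actually ran, and it assumes the API's XML output wraps each entry in a post element with a type attribute and lists image variants as photo-url elements carrying a max-width attribute. The sample document, the example.com URLs, and the function name are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the element and attribute names are assumptions
# about the API's XML output, and the URLs are placeholders.
SAMPLE = """<tumblr>
  <posts>
    <post id="101" type="photo" slug="first-caption">
      <photo-caption>First caption</photo-caption>
      <photo-url max-width="500">http://example.com/tumblr_aaa1_500.png</photo-url>
      <photo-url max-width="100">http://example.com/tumblr_aaa1_100.png</photo-url>
    </post>
    <post id="102" type="regular" slug="text-post"/>
  </posts>
</tumblr>"""


def largest_photo_urls(xml_text):
    """Return {post id: URL of the widest image} for every photo post."""
    root = ET.fromstring(xml_text)
    result = {}
    for post in root.iter("post"):
        if post.get("type") != "photo":
            continue  # skip text, link, and other post types
        urls = post.findall("photo-url")
        if not urls:
            continue  # photo post with no image variants listed
        # Pick the variant with the largest declared width.
        widest = max(urls, key=lambda u: int(u.get("max-width", "0")))
        result[post.get("id")] = widest.text.strip()
    return result
```

Each URL in the result could then be saved locally with something like urllib.request.urlretrieve, which is the part that would have needed retries given how flaky the API was at the time.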
Also, when I was working on this migration, the API was less reliable than the actual service.
Third-party backup solutions
There are a few people who have written solutions that tie into the Tumblr API, and I used a combination of them to get all of my posts out of Tumblr and into WordPress.
- Tumblelog Backup Tool: This utility let me save a Safari-style .webarchive of all of my posts, which meant that I was able to bulk download all of the images I’d uploaded and sort them out on my local machine.
- tumblr2wp: A utility that will reformat the XML output of the Tumblr API into a WordPress WXR (WordPress eXtended RSS) file.
So, what about all those images?
When you upload an image to Tumblr, it renames the file with a hash and the image size, to something like “tumblr_llfz9dzXbY1qcdvqyo1_500.png”.
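If you wanted to pick those names apart in a script, a regular expression gets you most of the way. This is a minimal sketch that assumes every file follows the same shape as the example above (a tumblr_ prefix, an alphanumeric token, an underscore, a pixel size, and an extension). The pattern and the function name are my own guesses, not anything Tumblr documents.

```python
import re

# Assumed shape: tumblr_<alphanumeric token>_<size>.<extension>
TUMBLR_NAME = re.compile(
    r"^tumblr_(?P<token>[A-Za-z0-9]+)_(?P<size>\d+)\.(?P<ext>[A-Za-z0-9]+)$"
)


def parse_tumblr_filename(name):
    """Split a Tumblr-style file name into token, size, and extension.

    Returns None when the name does not match the assumed shape.
    """
    match = TUMBLR_NAME.match(name)
    if match is None:
        return None
    return {
        "token": match.group("token"),
        "size": int(match.group("size")),
        "ext": match.group("ext"),
    }
```

Called with the file name above, this splits it into the token llfz9dzXbY1qcdvqyo1, the size 500, and the extension png. Note that nothing in the name says which post the image belongs to, which is exactly the problem.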
And because I’d been less than organized when creating the entries initially, it was time to reap what I’d sown. I had to manually rename every image. Fortunately, there were only about 75 entries at the time, but it still took a few hours to get everything straightened out.
Getting the posts into WordPress
Importing the entries from the WXR file was simple: just a few clicks and all of the entries were in WordPress.
The next part was a bitch: I had to manually associate each image with an entry. It wasn’t hard work, just tedious, and it couldn’t be scripted for a few reasons:
- I wanted the file names to contain the search terms rather than the caption
- Tumblr’s image posts don’t allow for metadata or anything other than a caption and tags
- I hadn’t considered the possibility that I would want to leave Tumblr in the future
At any rate, I decided that scripting a solution would likely have taken longer than just renaming the files by hand, since there were fewer than 100 at the time.
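With a much bigger archive, the math might come out differently. The search terms would still have to be typed up by hand, since Tumblr never stored them, but the renaming itself could be planned by a script. Here is a minimal sketch under that assumption; the mapping, the slug rules, and the function names are hypothetical, and this is not something I used for the migration.

```python
import re
from pathlib import PurePosixPath


def slugify(text):
    """Lowercase the text and join runs of letters and digits with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def plan_renames(search_terms_by_file):
    """Map each original file name to a new name built from its search terms.

    search_terms_by_file is a hand-written dict of
    {original file name: search terms}; nothing is renamed on disk here.
    """
    plan = {}
    for old_name, terms in search_terms_by_file.items():
        suffix = PurePosixPath(old_name).suffix  # keeps ".png", ".jpg", ...
        plan[old_name] = slugify(terms) + suffix
    return plan
```

Returning a plan instead of renaming files right away makes it easy to eyeball the result for collisions before touching anything.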
Rewriting URLs
Tumblr post URLs follow this pattern: /post/[post-id]/[post-slug (image caption)]. The post-id is unnecessary, so I planned to leave it out of the URL entirely (though tumblr2wp offers the option to have URLs formatted like [post_id]-[post-slug]).
Apache’s mod_rewrite to the rescue: requests for /post/[post-id]/[post-slug] get a 301 redirect to /post/[post-slug]. Problem solved.
A note: Tumblr will handle requests for /post/[post-id] just the same as /post/[post-id]/[post-slug]. Because the permalinks that Tumblr generates have the slug in there (at least the ones I saw), I made the assumption that requests to WordPress for /post/[post-id] could be 404’d. So far this has not been a problem for me.
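To make the mapping concrete, here is the redirect logic modeled as a small Python function. It is a sketch of the behavior described above, not the actual mod_rewrite rule, and it assumes post IDs are purely numeric. The pattern and the function name are illustrative.

```python
import re

# Old Tumblr-style path: /post/<numeric id>/<slug>, optional trailing slash.
OLD_POST = re.compile(r"^/post/(?P<post_id>\d+)/(?P<slug>[^/]+)/?$")


def redirect_for(path):
    """Return (301, new_path) for an old Tumblr-style URL, else None.

    None means "no redirect": the request falls through to WordPress,
    which serves the post or a 404 (as for a bare /post/<id>).
    """
    match = OLD_POST.match(path)
    if match is None:
        return None
    return 301, "/post/" + match.group("slug")
```

A request for /post/[post-id] with no slug returns None here, which matches the assumption that those can be left to 404.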
Lessons learned
It’s hard to actually be mad at Tumblr: they offer a pretty great service for free that’s spawned hundreds, if not thousands, of pretty successful tumblelogs. But if your content is important to you, and you don’t like putting all of your data-eggs in one basket, the experience can be frustrating. There’s not a great way to move to another service, particularly for users who are less technically adept than I am.
All that said, I’d seen the shortcomings of Tumblr (for this project, at least) only a few weeks into my project — losing track of what I’d already posted because I couldn’t FTP in to see a list of files, or even look at a file manager.
It all comes back to how important you think your content is. If you would like one day to be able to leave Tumblr, well, I’d think twice before you posted another entry. Each new post pushes you a little further down that rabbit hole, and the deeper you are, the harder it is to climb (or dig) your way out.