I’m going to build a single page web application to replace my WordPress site, so the first thing I need to do is work out what I want to build. I find this the hardest part of the process: there are a million things I’d like to build, so narrowing them down to measurable goals can be quite tough. Let’s start off with the requirements. What must the site do?
- Retain all WordPress data – Articles and Comments
- Retain all permalinks to said articles (SEO matters)
- Have live feeds from my social networking tools
- Enable visitors to easily share the pages on my site
- Single page web application
- Parallax scrolling (flavour of the month, I’m a sucker for punishment)
- Google SEO compliance for single page web apps
- Scalable infrastructure with autoscaling
That’s quite a list, and some of those items are going to take a lot of work. First things first: we need to decide on the technologies we are going to use to build this monstrosity. I love Python, so we are definitely going to be using it in the backend and also for our deployment process. But there are so many web frameworks for Python, so which one should we use? I’ve used Django a number of times and feel quite comfortable with it, so therefore we won’t be using it. Instead I’m going to try out Tornado, and we’re going to run it on top of PyPy. PyPy is an alternative implementation of Python whose JIT compilation can give it significant performance gains. Tornado is a popular Python web framework that was originally written for FriendFeed and has since been open sourced by Facebook. It is asynchronous and has numerous async modules that will aid in our development (tornadio2 and asyncdynamo being two that spring to mind).
That’s all well and good but we have some data we need to retain, those first two points in the requirements clearly say that we’re taking WordPress with us! Getting the data out of MySQL isn’t exactly difficult, a simple
mysqldump --user=username --password my_database > /tmp/my_database.sql
will dump all the SQL out to a file. However, we aren’t going to be using a relational database. We’re going to join the NoSQL fan club and store our articles in DynamoDB, seeing as this whole project will be running on AWS (Amazon Web Services). I’ve never used DynamoDB before, but a quick look at the documentation suggests it should do the job just fine for what I have in mind. Does an article need more than a primary key for querying? Guess we’ll find out soon enough.
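To make the primary key question a bit more concrete, here is a minimal sketch of what a single-key article item could look like if the permalink path itself were the hash key. Everything in it is an assumption for illustration: the field names, the `permalink_key` and `to_item` helpers, and the example post are all made up, and a plain dict stands in for the table so nothing here touches AWS.

```python
from urllib.parse import urlparse


def permalink_key(url):
    """Reduce a permalink URL (or bare path) to a normalised path.

    Hypothetical helper: the normalised path doubles as the hash key.
    """
    path = urlparse(url).path.strip("/")
    return "/" + path + "/" if path else "/"


def to_item(post):
    """Map one WordPress post (a made-up dict shape) to a flat item."""
    return {
        "permalink": permalink_key(post["url"]),  # assumed primary (hash) key
        "title": post["title"],
        "body": post["content"],
        "published": post["date_gmt"],
    }


# A plain dict stands in for the table; this is not a DynamoDB client.
table = {}

post = {
    "url": "http://example.com/2013/02/hello-world",
    "title": "Hello World",
    "content": "<p>First post.</p>",
    "date_gmt": "2013-02-01 09:30:00",
}
item = to_item(post)
table[item["permalink"]] = item

# Serving a page is then a single key lookup on the requested path.
print(table[permalink_key("/2013/02/hello-world/")]["title"])
```

If every page view really is just "fetch the article at this path", one key would be enough. Listing articles by date or tag is where a single key would likely start to hurt.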
I’m thinking Disqus is probably the best solution for comments. I don’t particularly want to write my own commenting system when I can plug in a free service like Disqus. After a little reading I discovered that you can export from WordPress to Disqus using a format known as WXR. Looking at the example, it seems pretty straightforward: loop through the existing WordPress DB, push the articles to DynamoDB, and put the new IDs into the WXR for importing into Disqus, so that DynamoDB and Disqus stay in sync. As a bonus, Disqus enables visitors to share pages on my site, so we’ll be killing two birds with one stone!
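As a rough sketch of that loop, the snippet below builds a WXR-style document with the standard library’s `xml.etree.ElementTree`. The element names and namespace URIs are my reading of the Disqus custom XML import example, so check them against the current Disqus documentation before relying on them. The shape of the `articles` list and the `build_wxr` helper are assumptions made up for illustration, and the DynamoDB write is left out entirely.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as I understand them from the Disqus WXR example (assumption).
NS = {
    "content": "http://purl.org/rss/1.0/modules/content/",
    "dsq": "http://www.disqus.com/",
    "wp": "http://wordpress.org/export/1.0/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Return a namespace-qualified tag name in ElementTree's {uri}tag form."""
    return "{%s}%s" % (NS[prefix], tag)


def build_wxr(articles):
    """Build a WXR string from a list of article dicts (hypothetical shape)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for article in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = article["title"]
        ET.SubElement(item, "link").text = article["link"]
        ET.SubElement(item, q("content", "encoded")).text = article["body"]
        # The new article ID goes here so Disqus threads match our store.
        ET.SubElement(item, q("dsq", "thread_identifier")).text = article["id"]
        ET.SubElement(item, q("wp", "comment_status")).text = "open"
        for c in article["comments"]:
            comment = ET.SubElement(item, q("wp", "comment"))
            ET.SubElement(comment, q("wp", "comment_id")).text = str(c["id"])
            ET.SubElement(comment, q("wp", "comment_author")).text = c["author"]
            ET.SubElement(comment, q("wp", "comment_date_gmt")).text = c["date_gmt"]
            ET.SubElement(comment, q("wp", "comment_content")).text = c["content"]
            ET.SubElement(comment, q("wp", "comment_approved")).text = "1"
    return ET.tostring(rss, encoding="unicode")


# Made-up sample data standing in for rows read from the WordPress DB.
articles = [{
    "id": "/2013/02/hello-world/",
    "title": "Hello World",
    "link": "http://example.com/2013/02/hello-world/",
    "body": "<p>First post.</p>",
    "comments": [{
        "id": 1,
        "author": "A Reader",
        "date_gmt": "2013-02-02 10:00:00",
        "content": "Nice post!",
    }],
}]

wxr = build_wxr(articles)
print(wxr)
```

The important bit is `dsq:thread_identifier`: whatever ID the article gets in the new store is the same value Disqus keys the thread on.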
Our final requirement is building this site to scale. I’ve worked with Rackspace and Linode before and have only relatively recently started working with Amazon Web Services, but (and it’s a huge BUT) AWS really is in a league of its own. We’re going to use as much of AWS as my wallet will let me! Seriously though, I’ll only be using t1.micro instances because I really don’t want to sink hundreds of dollars into AWS each month for a technology demo. We’ll be building a CloudFormation template to bring up as much of our environment as humanly possible: an external elastic load balancer, 2 frontend web servers (n+1), an internal load balancer, 2 backend web/application servers (n+1), DynamoDB, Route53, S3, CloudFront, CloudWatch, SES and SQS, all nicely packaged inside a Virtual Private Cloud. We’ll be implementing autoscale groups in case we get smashed with traffic.
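To give a feel for the shape of such a template, here is a heavily trimmed sketch covering only the frontend tier: one load balancer, one launch configuration and one autoscale group. It is built as a Python dict and dumped to JSON. The logical names, the instance port, the `MaxSize` value and the AMI placeholder are all assumptions, and the VPC, the backend tier and every other service are left out, so treat it as an outline rather than something to deploy.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Sketch only: load-balanced, autoscaled frontend tier",
    "Resources": {
        "FrontendELB": {
            "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
            "Properties": {
                "AvailabilityZones": {"Fn::GetAZs": ""},
                "Listeners": [{
                    "LoadBalancerPort": "80",
                    "InstancePort": "8888",  # assumed application port
                    "Protocol": "HTTP",
                }],
            },
        },
        "FrontendLaunchConfig": {
            "Type": "AWS::AutoScaling::LaunchConfiguration",
            "Properties": {
                "ImageId": "ami-PLACEHOLDER",  # placeholder, not a real AMI
                "InstanceType": "t1.micro",
            },
        },
        "FrontendGroup": {
            "Type": "AWS::AutoScaling::AutoScalingGroup",
            "Properties": {
                "AvailabilityZones": {"Fn::GetAZs": ""},
                "LaunchConfigurationName": {"Ref": "FrontendLaunchConfig"},
                "LoadBalancerNames": [{"Ref": "FrontendELB"}],
                "MinSize": "2",  # the two n+1 frontend servers
                "MaxSize": "4",  # assumed ceiling for traffic spikes
            },
        },
    },
}

template_json = json.dumps(template, indent=2)
print(template_json)
```

Generating the JSON from Python keeps the template in the same language as the rest of the deployment tooling, though writing the JSON by hand would work just as well.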
An additional requirement that isn’t on our list is SaltStack. I’ve used Puppet before and I want to get across SaltStack, as it’s written in Python and its configuration is done using YAML, which seems quite appealing compared to Puppet’s Ruby pseudo config language. We’ll be using SaltStack to provision our EC2 instances when they start up through autoscaling.
That’s it for this instalment. The next post will be about setting up DynamoDB and Disqus, and we’ll even write some handy little code to migrate the data over for us!