Planning and Preparation – WTF am I going to build?

I’m going to build a single page web application to replace my WordPress site, so the first thing I need to do is work out exactly what I want to build. I find this the hardest part of the process: there are a million things I’d like to build, so narrowing them down to measurable goals can be quite tough. Let’s start with the requirements. What must the site do?

Requirements:

  • Retain all WordPress data – Articles and Comments
  • Retain all permalinks to said articles (SEO matters)
  • Have live feeds from my social networking tools
  • Enable visitors to easily share the pages on my site
  • Single page web application
  • Parallax scrolling (flavour of the month, I’m a sucker for punishment)
  • Google SEO compliance for single page web apps
  • Scalable infrastructure with autoscaling

That’s quite a list, and some of those items are going to take a lot of work. First things first: we need to decide on the technologies we are going to use to build this monstrosity. I love Python, so we are definitely going to be using it in the backend and also for our deployment process. But there are so many web frameworks for Python, so which one should we use? I’ve used Django a number of times and feel quite comfortable with it, so naturally we won’t be using it. Instead I’m going to try out Tornado, and we’re going to run it on top of PyPy. PyPy is an alternative implementation of Python with JIT compilation, which can give it significant performance gains over the standard interpreter. Tornado is a popular Python web framework that was originally written for FriendFeed and has since been open sourced by Facebook. It is asynchronous and has numerous async modules that will aid our development (tornadio2 and asyncdynamo being two that spring to mind).

That’s all well and good, but we have some data we need to retain. Those first two points in the requirements clearly say that we’re taking our WordPress content with us! Getting the data out of MySQL isn’t exactly difficult; a simple

mysqldump --user=username --password my_database > /tmp/my_database.sql

will dump all the SQL out to a file. However, we aren’t going to be using a relational database. We’re going to join the NoSQL fan club and store our articles in DynamoDB, seeing as this whole project will be running on AWS (Amazon Web Services). I’ve never used DynamoDB before, but from a quick look at the documentation it appears it should do the job just fine for what I have in mind. Does an article need more than a primary key for querying? Guess we’ll find out soon enough.
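To make the "primary key only" idea concrete, here is a minimal sketch of how one WordPress post might be flattened into a key-value item. The column names (`ID`, `post_name`, `post_title`, `post_content`, `post_date`) come from WordPress’s `wp_posts` table; the item layout and the choice of the slug as the key are assumptions for illustration, not a settled schema, and no DynamoDB call is made here.

```python
from datetime import datetime


def post_to_item(row):
    """Map one wp_posts row (as a dict) to a flat item keyed by its slug.

    The item layout is hypothetical: "slug" stands in for the table's
    primary (hash) key so an article can be fetched by its permalink.
    """
    return {
        "slug": row["post_name"],                    # assumed primary key
        "title": row["post_title"],
        "body": row["post_content"],
        "published": row["post_date"].isoformat(),   # dates stored as ISO 8601 strings
        "wp_id": row["ID"],                          # keep the old ID for cross-referencing
    }


# Example row, shaped like a result from a MySQL cursor that returns dicts.
row = {
    "ID": 42,
    "post_name": "hello-world",
    "post_title": "Hello world",
    "post_content": "<p>First post.</p>",
    "post_date": datetime(2013, 5, 1, 9, 30, 0),
}
item = post_to_item(row)
```

Keying on the slug means a permalink lookup is a single get by primary key; anything like "latest ten articles" would need a second access pattern, which is exactly the open question above.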

I’m thinking Disqus is probably the best solution for comments. I don’t particularly want to write my own commenting system when I can plug in a free service like Disqus. After a little reading I discovered that you can export from WordPress to Disqus using a format known as WXR. Looking at the example, it seems pretty straightforward: loop through the existing WordPress DB, push the articles to DynamoDB and put the new IDs into the WXR file for importing into Disqus, so that DynamoDB and Disqus are in sync. As a bonus, Disqus enables visitors to share pages on my site, so we’ll be killing two birds with one stone!
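The export loop described above could look roughly like the sketch below, which builds a WXR-style document with Python’s standard `xml.etree.ElementTree`. The element names and namespace URIs follow the Disqus custom XML import format as I understand it and only a subset of the fields is shown, so check them against the current Disqus documentation before relying on this. The `articles` structure and the `build_wxr` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as used by the Disqus WXR-based import format (assumed; verify).
NS = {
    "dsq": "http://www.disqus.com/",
    "wp": "http://wordpress.org/export/1.0/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, name):
    """Build a namespace-qualified tag name for ElementTree."""
    return "{%s}%s" % (NS[prefix], name)


def build_wxr(articles):
    """Return a WXR-style XML string with one <item> per article."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    for article in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = article["title"]
        ET.SubElement(item, "link").text = article["link"]
        # The new storage key goes here so Disqus threads match our articles.
        ET.SubElement(item, q("dsq", "thread_identifier")).text = article["slug"]
        ET.SubElement(item, q("wp", "comment_status")).text = "open"
        for c in article["comments"]:
            comment = ET.SubElement(item, q("wp", "comment"))
            ET.SubElement(comment, q("wp", "comment_id")).text = str(c["id"])
            ET.SubElement(comment, q("wp", "comment_author")).text = c["author"]
            ET.SubElement(comment, q("wp", "comment_date_gmt")).text = c["date_gmt"]
            ET.SubElement(comment, q("wp", "comment_content")).text = c["content"]
            ET.SubElement(comment, q("wp", "comment_approved")).text = "1"
            ET.SubElement(comment, q("wp", "comment_parent")).text = str(c["parent"])
    return ET.tostring(rss, encoding="unicode")


articles = [{
    "title": "Hello world",
    "link": "http://example.com/2013/05/hello-world/",
    "slug": "hello-world",
    "comments": [{
        "id": 1, "author": "Jane", "date_gmt": "2013-05-02 10:00:00",
        "content": "Nice post!", "parent": 0,
    }],
}]
wxr_text = build_wxr(articles)
```

Using the same slug as both the storage key and the thread identifier is what keeps the two systems in sync; ElementTree also takes care of escaping any markup inside comment bodies.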

The next three items are all frontend questions. To build a single page app you’ll probably want to use some sort of JavaScript MVC framework, and Angular.js is the tool we are going to use. I could use Backbone or Ember, but I like Google, so I’ve decided to go with Angular.js. I like how opinionated it is, and the separation of HTML from JS is really nice (in my humble opinion). I’m going to have to do some research on the SEO compliance front, as I’m not too sure what the implications will be for our project. Some quick googling uncovered that Google has recommendations on how to make a single page app available to its crawler, so you don’t cop a penalty or, worse yet, fail to get listed because the JavaScript isn’t processed by the crawler. On the parallax front, I’m sure there will be JavaScript libraries to help with this part, and it’s more sugar than a requirement, so I won’t dwell on it here.
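As an illustration only, and assuming the recommendations in question are Google’s AJAX crawling scheme (since deprecated by Google): under that scheme a crawler rewrites a `#!/some/route` URL into a `?_escaped_fragment_=/some/route` request, and the server is expected to answer with a pre-rendered HTML snapshot. The hypothetical helper below just detects such a request and recovers the route, using only the standard library.

```python
from urllib.parse import urlsplit, parse_qs


def crawler_route(url):
    """Return the route a crawler is asking for, or None for a normal request.

    Assumes the AJAX crawling scheme, where "#!<route>" is requested as
    "?_escaped_fragment_=<route>" with the route percent-encoded.
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    values = query.get("_escaped_fragment_")
    return values[0] if values else None


# A crawler request maps back to the route the JavaScript app would render.
route = crawler_route("http://example.com/?_escaped_fragment_=%2F2013%2F05%2Fhello-world%2F")
# An ordinary browser request carries no such parameter.
normal = crawler_route("http://example.com/")
```

In a backend handler, a non-`None` result would be the cue to serve static HTML for that route instead of the JavaScript shell.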

Our final requirement is building this site to scale. I’ve worked with Rackspace and Linode before and have only relatively recently started working with Amazon Web Services, but (and it’s a huge BUT) AWS really is in a league of its own. We’re going to use as much of AWS as my wallet will let me! Seriously though, I’ll only be using t1.micro instances because I really don’t want to sink hundreds of dollars into AWS each month for a technology demo. We’ll be building a CloudFormation template to bring up as much of our environment as humanly possible: an external elastic load balancer, 2 frontend web servers (n+1), an internal load balancer, 2 backend web/application servers (n+1), DynamoDB, Route53, S3, CloudFront, CloudWatch, SES and SQS, all nicely packaged inside a Virtual Private Cloud. We’ll be implementing autoscale groups in case we get smashed with traffic.
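To give a feel for what such a template contains, here is a small sketch that builds one load-balanced, autoscaled tier as a Python dict and serialises it to JSON. The resource types and property names are standard CloudFormation ones, but the logical names (`FrontendELB`, `FrontendLaunchConfig`, `FrontendGroup`), the `AmiId` parameter and the sizes are made up for illustration. It also assumes plain availability zones rather than VPC subnets, so it is a simplification of the environment described above.

```python
import json


def build_template(instance_type="t1.micro", min_size=2, max_size=4):
    """Return a CloudFormation template (as a dict) for one autoscaled web tier."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: one load-balanced, autoscaled web tier",
        "Parameters": {
            # Hypothetical parameter: the AMI is supplied at stack creation time.
            "AmiId": {"Type": "String", "Description": "AMI to launch"},
        },
        "Resources": {
            "FrontendELB": {
                "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
                "Properties": {
                    "AvailabilityZones": {"Fn::GetAZs": ""},
                    "Listeners": [{
                        "LoadBalancerPort": "80",
                        "InstancePort": "80",
                        "Protocol": "HTTP",
                    }],
                },
            },
            "FrontendLaunchConfig": {
                "Type": "AWS::AutoScaling::LaunchConfiguration",
                "Properties": {
                    "ImageId": {"Ref": "AmiId"},
                    "InstanceType": instance_type,
                },
            },
            "FrontendGroup": {
                "Type": "AWS::AutoScaling::AutoScalingGroup",
                "Properties": {
                    "AvailabilityZones": {"Fn::GetAZs": ""},
                    "LaunchConfigurationName": {"Ref": "FrontendLaunchConfig"},
                    "LoadBalancerNames": [{"Ref": "FrontendELB"}],
                    # MinSize of 2 gives the n+1 redundancy; MaxSize caps the bill.
                    "MinSize": str(min_size),
                    "MaxSize": str(max_size),
                },
            },
        },
    }


template = build_template()
template_json = json.dumps(template, indent=2)
```

Generating the JSON from code rather than writing it by hand makes it easy to stamp out the frontend and backend tiers from the same function with different names and sizes.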

An additional requirement that isn’t on our list is SaltStack. I’ve used Puppet before and I want to get across SaltStack, as it’s written in Python and its configuration is done in YAML, which seems quite appealing compared to Puppet’s Ruby pseudo config language. We’ll be using SaltStack to provision our EC2 instances when they start up through autoscaling.

That’s it for this instalment. The next post will be about setting up DynamoDB and Disqus, and we’ll even write some handy little code to migrate the data over for us!

Time for a change!

After much deliberating I’ve decided to rebuild nigeldunn.com from scratch. I’m still nutting out the details of what I’m going to use, but at this stage I’m planning on a single page app using HTML5, Angular.js, PyPy and cyclone.io. I want this to be a knowledge sharing exercise as much as it is a redesign so…

The goals of this exercise are:

  1. Build an SEO-friendly single page web application
  2. Retain all of my blog articles and comments from WordPress
  3. Make the site fully responsive (in other words: desktop and mobile friendly)
  4. Add some interesting real time functionality to the site (websockets or similar)
  5. Document the process so others can learn from it
  6. Showcase how to build a high performance, scalable website using the latest techniques
  7. Illustrate how to effectively use AWS for high availability and auto scaling
  8. Maybe learn a few things along the way 🙂

Now to be fair, most of my skillset is around DevOps, so I’ll do my best on the design front but I may have to recruit a graphic designer to help. I also plan on releasing all of the code for my site, including deployment scripts and CloudFormation templates, via my GitHub account.

Because my site doesn’t get very much traffic, I’m going to use t1.micro instances on AWS to keep the costs down. This site will be a proof of concept (it doesn’t generate income, so I don’t want to pay hundreds of dollars a month unnecessarily), but it will have autoscaling set up to allow for traffic spikes.

Continue on to Part 1: Planning and Preparation – WTF am I going to build?