teohm.dev

I enjoy life, and make stuff for people I care about :)

Chef Cookbooks for Busy Ruby Developers

Have you ever set up a Rails production environment from scratch, by hand? If you have, I share your pain every time a new project starts.

The process is often repetitive. To me, doing it manually every time is a waste: it consumes time and attention that I would rather spend on tasks that bring more value to clients.

To minimize such waste, I have written two Chef cookbooks to automate the process:

  • rackbox - provisions a Rack-based web server (Nginx as the front server, Unicorn and Passenger as upstream app servers, rbenv as the Ruby version manager).
  • databox - provisions a database server (supports MySQL and PostgreSQL).

Getting started

In this post, I will show you a step-by-step guide on how to use the cookbooks together with knife-solo to provision a remote server in 4 steps:

  1. set up Chef Solo environment
  2. modify config file
  3. provision remote server
  4. tweak Capistrano deploy.rb

A working example is also available at teohm/kitchen-example.

1. Set up Chef Solo environment

  • Install Chef Solo tools on local machine.
  • Download Chef cookbooks to local machine.
  • Install chef-solo on remote server.

Install Chef Solo tools

Let’s create a new directory,

mkdir chef-kitchen
cd chef-kitchen

and a Gemfile.

source "https://rubygems.org"

gem "knife-solo", ">= 0.3.0pre3"
gem "berkshelf"

I recommend knife-solo >= 0.3 as it includes a few major fixes and improvements.

Now, install the ruby gems.

bundle install

Finally, set up a kitchen directory structure with knife-solo.

bundle exec knife solo init .

Download Chef cookbooks

I use Berkshelf to manage cookbooks. So we need a Berksfile,

site :opscode

cookbook "runit", ">= 1.1.2"  # HACK: force-use this version
cookbook "databox"
cookbook "rackbox"

(I added a hack here to force berkshelf to use runit 1.1.2 required by rackbox. Still looking for a better solution.)

We can now download cookbooks with berks install.

bundle exec berks install --path cookbooks/

Install chef-solo on remote server

bundle exec knife solo prepare testbox

In this example, testbox is a host I set up in my ~/.ssh/config:

Host testbox
  User ubuntu
  Hostname ec2-51-221-13-121.ap-southeast-1.compute.amazonaws.com
  IdentityFile ~/.ssh/testbox_ec2.pem

2. Customize config file

  • Download config example
  • Customize config file

Download config example

curl https://raw.github.com/teohm/kitchen-example/master/nodes/host.json.example --output nodes/testbox.json

Modify config file (JSON)

The config file starts with a run_list. You specify a list of cookbook recipes here. Chef will run them in the same order in this list.

It is followed by cookbook attributes, which you can modify. A full reference of the attributes is available in each cookbook’s README (see appbox, databox, rackbox).

{
  "run_list": [
    "databox",
    "rackbox"
  ],
  "appbox": {
    "deploy_keys": ["ssh-rsa 5bnmu23890fghghjk"],
    "admin_keys": ["ssh-rsa 456789fghjkvbn567"]
  },
  "databox": {
    "db_root_password": "welcome!",
    "databases": {
      "mysql": [
        { "username": "app1",
          "password": "app1",
          "database_name": "app1_production" }
      ],
      "postgresql": [
        { "username": "app2",
          "password": "app2",
          "database_name": "app2_production" }
      ]
    }
  },
  "rackbox": {
    "ruby": {
      "versions": ["1.9.3-p385", "1.9.2-p320"],
      "global_version": "1.9.3-p385"
    },
    "apps": {
      "unicorn": [
        { "appname": "app1",
          "hostname": "app1.test.com" }
      ],
      "passenger": [
        { "appname": "app2",
          "hostname": "app2.test.com" }
      ]
    }
  }
}
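
Since a stray comma in the node file only shows up once the cook run starts, it can help to parse the file locally first. The following is a minimal Ruby sketch, not part of the cookbooks: summarize_node is a hypothetical helper, and it assumes the node file follows the same shape as the example above.

```ruby
require "json"

# Hypothetical helper (not part of rackbox/databox): parse a node file and
# list what it is going to provision. JSON.parse raises JSON::ParserError
# on malformed input, so syntax mistakes surface before anything is uploaded.
def summarize_node(json_text)
  node = JSON.parse(json_text)
  databases = node.fetch("databox", {}).fetch("databases", {})
  apps = node.fetch("rackbox", {}).fetch("apps", {})

  {
    run_list: node.fetch("run_list", []),
    databases: databases.flat_map do |engine, dbs|
      dbs.map { |db| "#{engine}:#{db["database_name"]}" }
    end,
    hostnames: apps.flat_map { |_server, list| list.map { |app| app["hostname"] } }
  }
end

# A trimmed-down copy of the example config, inlined to keep the sketch
# self-contained. In practice you would read nodes/testbox.json instead.
sample = <<-JSON
{
  "run_list": ["databox", "rackbox"],
  "databox": {
    "databases": {
      "mysql": [{ "username": "app1", "database_name": "app1_production" }]
    }
  },
  "rackbox": {
    "apps": {
      "unicorn": [{ "appname": "app1", "hostname": "app1.test.com" }]
    }
  }
}
JSON

summary = summarize_node(sample)
puts summary[:run_list].join(", ")
puts summary[:databases].join(", ")
puts summary[:hostnames].join(", ")
```

To check a real node file, pass File.read("nodes/testbox.json") instead of the inlined sample.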

3. Provision remote server

bundle exec knife solo cook testbox

It uploads the kitchen directory and runs chef-solo on the remote server. Chef-solo will then take over and execute the run list to set everything up.

What do we get at this point?

Basically, it’s done!

We have a full-stack, rack-based server with:

  • 3 user accounts (deploy, devops, apps)
  • rbenv as ruby version manager
  • nginx as front-server
  • unicorn, passenger-standalone as upstream app servers, managed by runit
  • postgresql, mysql installed and databases created
  • all apps will be stored in /home/apps/

4. Tweak Capistrano deploy.rb

Now, it’s ready to deploy a Rack-based app to the remote server!

I have two example Rails apps available on GitHub:

There are a few minor tweaks required in Capistrano deploy.rb, as listed below.

I will explain the tweaks in detail next time. Meanwhile, check out the complete working examples at: app1/config/deploy.rb and app2/config/deploy.rb

Login as deploy user

set :user, "deploy"

Deploy to /home/apps

set :deploy_to, "/home/apps/#{application}"

Load rbenv in Capistrano

default_run_options[:shell] = '/bin/bash --login'

Run bundler with --binstubs

require 'bundler/capistrano'
set :bundle_flags, "--deployment --binstubs"
set :bundle_without, [:test, :development, :deploy]

Restart app with runit sv

namespace :deploy do
  task :start do
    run "sudo sv up app1"
  end
  task :stop do
    run "sudo sv down app1"
  end
  task :restart, :roles => :app, :except => { :no_release => true } do
    run "sudo sv restart app1"
  end
end

Feedback

If you are interested in using the cookbooks, or have an idea, feedback or a question about this topic, feel free to drop me (@teohm) a message. Pull requests and issue reports are definitely welcome!

Reload Required Files in Rails

When a Rails project grows, I often notice the need to refactor domain logic into a separate module, isolated from Rails framework.

The isolated module will still stay in the main Rails app codebase, but can be easily packaged as a Ruby gem, tested separately, and used in other related applications.

Requirements

During the refactoring, we want to:

  1. Use require to load the module like a normal gem. If possible, avoid depending on Rails autoload features such as require_dependency and autoload_paths.

  2. Edit the module without restarting server during development. In other words, we need to find a way to reload the module on every request.

Reload require files

After some research, I found a working solution by Timothy Cardenas:

ActionDispatch::Callbacks.to_prepare do
  if Object.const_defined?("Module1")
    Object.send(:remove_const, "Module1")
  end
  $".delete_if {|s| s.include?('module1') }
  require 'module1'
end

What it does is, before each request,

  1. Unload the top-level module.
  2. Un-require all required files from the module.
  3. Re-require the top-level module.
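
The three steps can also be tried outside Rails. The following is a minimal plain-Ruby sketch under stated assumptions: reload_feature and ReloadDemoMod are hypothetical names made up for this illustration, and the module is written to a temporary directory so the script is self-contained.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper mirroring the three steps above.
def reload_feature(const_name, feature)
  # 1. Unload the top-level module.
  Object.send(:remove_const, const_name) if Object.const_defined?(const_name)
  # 2. Un-require every loaded file whose path mentions the feature name.
  $LOADED_FEATURES.delete_if { |path| path.include?(feature) }
  # 3. Re-require the top-level module.
  require feature
end

dir = Dir.mktmpdir
$LOAD_PATH.unshift(dir)
file = File.join(dir, "reload_demo_mod.rb")

File.write(file, "module ReloadDemoMod; VERSION = 1; end\n")
require "reload_demo_mod"
first = ReloadDemoMod::VERSION

# Edit the file on disk. A plain require is a no-op because the file is
# already listed in $LOADED_FEATURES, so the old code stays in memory.
File.write(file, "module ReloadDemoMod; VERSION = 2; end\n")
require "reload_demo_mod"
stale = ReloadDemoMod::VERSION

reload_feature("ReloadDemoMod", "reload_demo_mod")
fresh = ReloadDemoMod::VERSION

puts [first, stale, fresh].inspect

FileUtils.remove_entry(dir)
```

Note that the substring match on the feature name is deliberately loose, just like the original snippet: it also un-requires any unrelated file whose path happens to contain the same text.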

An extra step in Rails 3.2

In Rails 3.2, ActionDispatch::Callbacks.to_prepare has a slightly different behavior. It will run before a request only if a watchable file is modified.

You need to specify your own watchable files:

# watch all .rb files recursively under modules/module1/ dir
config.watchable_dirs['modules/module1'] = [:rb]

Introduce require_reloader gem

Before bundling the solution into a gem, I did a search on RubyGems.org and found Colin Young’s gem_reloader. It’s based on Timothy’s solution as well. I forked it and started playing around.

In the end, I made some major changes to include Rails 3.2 support, some fixes and new features.

So I decided to release it as a new Ruby gem: require_reloader.

# config/environments/development.rb
YourApp::Application.configure do
  ...
  RequireReloader.watch :module1,
    path: 'modules/module1'
end

Currently it supports Rails 3, including 3.1 and 3.2.

If you are working on something similar, I look forward to your feedback and pull requests. It is not tested on Rails 2 yet, so pull requests are definitely welcome!

Start Using Ruby % (Percent) Notation

If you seldom use Ruby percent (%) notation in daily work, here’s a quick summary of what I picked up recently.

Delimiter allows any non-alphanum

You can use any non-alphanumeric character as the delimiter:

%(any alpha-numeric)
%[char can be]
%%used as%
%!delimiter\!! # escape '!' literal

Bracket pairs no need to escape

You do not need to escape bracket pairs, even when they are nested. You can escape them, but then you need to escape both the opening and the closing bracket. Whitespace inside the delimiters is part of the string.

%( (pa(re(nt)he)sis) ) #=> " (pa(re(nt)he)sis) "
%[ [square bracket] ]  #=> " [square bracket] "
%{ {curly bracket} }   #=> " {curly bracket} "
%< <pointy bracket> >  #=> " <pointy bracket> "
%< \<this works as well\> >  #=> " <this works as well> "

Modifiers for String, Regex, Array, Symbol, Shell command

We often use % notation to create String and Array literals. But it also supports Symbol, Regex and shell command.

%(interpolated string (#{ "default" }))
  #=> "interpolated string (default)"
%Q(interpolated string (#{ "default" }))
  #=> "interpolated string (default)"
%q(non-interpolated string)
  #=> "non-interpolated string"
%r(#{ "interpolated" } regexp)i
  #=> /interpolated regexp/i
%w(non-interpolated\ string  separated\ by\ whitespaces)
  #=> ['non-interpolated string', 'separated by whitespaces']
%W(interpolated\ string #{ "separated by whitespaces" })
  #=> ['interpolated string', 'separated by whitespaces']
%s(non-interpolated symbol)
  #=> :'non-interpolated symbol'
%x(echo #{ "interpolated shell command" })
  #=> "interpolated shell command\n"
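
As a quick illustration of where these modifiers pay off in everyday code, here is a small sketch. The strings, names and route pattern are made up for the example.

```ruby
# %q: double quotes need no escaping, and #{...} is kept literally.
quote = %q(She said "use #{this} literally")

# %w: an array of words without quotes or commas.
envs = %w(development test production)

# %r: forward slashes in path-like patterns need no escaping.
route = %r{\A/users/(\d+)/posts\z}
user_id = "/users/42/posts"[route, 1]

puts quote
puts envs.inspect
puts user_id
```

Picking a delimiter that does not appear in the content (braces for the regexp, parentheses for the quoted sentence) is what keeps the literals free of backslashes.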

Here are some % notation examples written in minitest, in case you are interested.

Ruby Idiom to Ensure Variable Is Array

Stop doing this:

arry = input || []  # handle input == nil
arry = [input] unless arry.kind_of?(Array)  # handle single value object

arry.each do |item|
  # process item
end

In Ruby, you should use Kernel#Array to convert the variable into an array object:

Array(input).each do |item|
  # process item
end

# Array(nil)     # => []
# Array("foo")   # => ["foo"]
# Array([1,2,3]) # => [1,2,3]
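
One thing to keep in mind: Kernel#Array converts anything that responds to to_ary or to_a, so ranges and hashes are expanded rather than wrapped. The sketch below shows this, together with a hypothetical listify helper (not a Ruby built-in) for the case where a Hash should count as a single item.

```ruby
# Objects that respond to to_ary/to_a are converted, not wrapped.
range_items = Array(1..3)      # => [1, 2, 3]
hash_items  = Array({ a: 1 })  # => [[:a, 1]]

# Hypothetical helper: treat a Hash as one item, defer to Array() otherwise.
def listify(input)
  input.is_a?(Hash) ? [input] : Array(input)
end

wrapped_hash = listify({ a: 1 })  # => [{ a: 1 }]
wrapped_nil  = listify(nil)       # => []
wrapped_str  = listify("foo")     # => ["foo"]

p range_items, hash_items, wrapped_hash, wrapped_nil, wrapped_str
```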

More usage examples written in minitest.

Objective-C Collection Operators

Calculate average - a shorter way

When you have a collection of Transaction objects and want to calculate their average amount, instead of looping through the collection like this:

double sum = 0;
for (Transaction *transaction in transactions) {
    sum += [transaction.amount doubleValue];
}
NSNumber *avg = [NSNumber numberWithDouble:(sum / [transactions count])];

you can reduce the loop to 1 line of code, using Objective-C key-value coding:

NSNumber *avg = [transactions valueForKeyPath:@"@avg.amount"];

Currently, there is a fixed set of collection operators:

  • Simple collection operators (@avg, @count, @sum, @max, @min)
  • Object operators (@distinctUnionOfObjects, @unionOfObjects)
  • Array and set operators (@distinctUnionOfArrays, @unionOfArrays)

For details, refer to more usage examples.

Breaking ARC Retain Cycle in Objective-C Blocks

In a recent client project, we noticed that the iOS app often received low memory warnings. The app is developed with ARC (Automatic Reference Counting) enabled.

When we profiled the app using Instruments > Allocations, we found that a lot of unused ViewController objects were not released from memory.

Retain Cycle

The codebase uses a lot of Objective-C blocks as shown below, and self is often being called within the blocks:

@interface DetailPageViewController : UIViewController
@property(nonatomic, strong) UIButton *doneButton;
...
@end

@implementation DetailPageViewController
@synthesize doneButton;

- (void)loadView {
  ...
  [self.doneButton onTouch:^(id sender) {
    [self doSomething];
    self.isDone = YES;
  }];
}
...
@end

In this example code, the controller object holds a strong reference to the doneButton object. But because the block passed to doneButton's onTouch: captures self, the button object now holds a strong reference back to the controller object.

When object A points strongly to object B, and B points strongly to A, a retain cycle is created, and both objects cannot be released from memory.

Use Lifetime Qualifiers

Apple Developer’s guide on ARC transition suggested a few ways to break the retain cycle using different lifetime qualifiers. It is definitely a must read for iOS development.

If your app targets iOS 5 or later, you can use the __weak lifetime qualifier to break the cycle:

@implementation DetailPageViewController
- (void)loadView {
  ...
  __weak DetailPageViewController *controller = self;
  [self.doneButton onTouch:^(id sender) {
    [controller doSomething];
    controller.isDone = YES;
  }];
}
@end

Because we are using the __weak qualifier, the doneButton onTouch: block only holds a weak reference to the controller object. Now, the controller object can be released from memory when its reference count drops to 0.

Mac Tips: Use Caps Lock as Control Key

Since I started using tmux, I use the Control key a lot more often. I decided to move my Control key up to the home row, by assigning it to Caps Lock.

For Mac users, open System Preferences > Keyboard and choose the Keyboard tab. Click the Modifier Keys… button and change Caps Lock Key: to ^ Control.

A screenshot of Key Modifier Key dialog

Mac Tips: Change Default Application for a File Type

To switch the default application for a file type, right-click a file of that type, select Get Info and change the application in the Open With section.

Remember, remember.. click the Change All… button to apply the change to all files of that type. It seems obvious, but it took me a while to figure this out. :)

A screenshot of Get Info dialog

Mac Tips: Turn on Remote Login (SSH Server)

Enabling the SSH server on Mac OSX is surprisingly easy. Open System Preferences > Sharing, and turn on Remote Login. You can further restrict which users can log in via SSH on the same screen.

A screenshot of the Sharing dialog box.

Using Sshuttle in Daily Work

I was first introduced to sshuttle by Sooyoung (@5ooyoung) in Favorite Medium as a workaround to The Great Firewall in China.

Since then, it has become my light-weight network tunneling tool in daily work.

Install sshuttle

The installation is easy now. You can install it through Mac OSX Homebrew, or Ubuntu apt-get.

brew install sshuttle

I use sshuttle to..

1. Tunnel all traffic

This is the first command I learned. It forwards all TCP traffic and DNS requests to a remote SSH server.

sshuttle --dns -vr ssh_server 0/0

Just like ssh, you can use any server specified in ~/.ssh/config. The -v flag means verbose mode.

Besides TCP and DNS, currently sshuttle does not forward other requests such as UDP, ICMP ping etc.

2. Tunnel all traffic, but exclude some

You can exclude certain TCP traffic using -x option.

sshuttle --dns -vr ssh_server -x 121.9.204.0/24 -x 61.135.196.21 0/0

For instance, when I am in China, I don’t want to tunnel Youku.com traffic to a foreign server, because its movie streaming service is only available within China.

In this case, I use -x option to exclude Youku.com IP addresses.

3. Tunnel only certain traffic

To tunnel only certain TCP traffic, specify the IP addresses or IP ranges that need tunneling.

sshuttle -vr ssh_server 121.9.204.0/24 61.135.196.21

This command comes in handy whenever I need to test an app feature (e.g. Netflix movie streaming) which is only available in certain countries, or to bypass faulty ISP caches.

4. VPN to office network

I seldom do VPN, but all you need is the remote SSH server with -NH flags turned on.

sshuttle -NHvr office_ssh_server

-N flag tells sshuttle to figure out by itself the IP subnets to forward, and -H flag to scan for hostnames within remote subnets and store them temporarily in /etc/hosts.

IP addresses.. troublesome?

Well, I try not to deal with IP addresses manually. So I wrote a few sshuttle helpers (tnl, tnlbut, tnlonly, vpnto) that allow me to use domain names instead of IP addresses:

Tunnel all traffic

tnl

Tunnel all traffic, but exclude some

tnlbut youku.com weibo.com

Tunnel only certain traffic

tnlonly netflix.com movies.netflix.com

VPN to office network

vpnto office_ssh_server

The script is available on my GitHub repo. You can load it into your ~/.bashrc. To override the default tunneling SSH server in the script:

TNL_SERVER=user@another_server tnl
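
For readers who prefer Ruby over shell functions, the idea behind the helpers (resolve domain names to IP addresses, then build the sshuttle arguments) can be sketched as below. This is not the script from the repo: sshuttle_command, IPV4_RESOLVER and the stub resolver are hypothetical names for this illustration, the addresses are placeholders from the documentation ranges, and only the sshuttle flags already shown in this post are used.

```ruby
require "resolv"

# Default resolver: IPv4 addresses for a host name (needs network access).
IPV4_RESOLVER = lambda do |name|
  Resolv.getaddresses(name).grep(Resolv::IPv4::Regex)
end

# Hypothetical helper: build the sshuttle argument list from domain names.
#   only:   tunnel just these domains (empty means tunnel everything, 0/0)
#   except: domains to exclude with -x
def sshuttle_command(server, only: [], except: [], resolver: IPV4_RESOLVER)
  excludes = except.flat_map { |name| resolver.call(name) }
                   .flat_map { |ip| ["-x", ip] }
  targets = only.empty? ? ["0/0"] : only.flat_map { |name| resolver.call(name) }
  # Mirror the commands above: forward DNS only when tunneling all traffic.
  dns = only.empty? ? ["--dns"] : []
  ["sshuttle", *dns, "-vr", server, *excludes, *targets]
end

# Stub resolver with placeholder addresses, so the sketch runs offline.
stub = lambda do |name|
  { "youku.com" => ["192.0.2.10"], "netflix.com" => ["198.51.100.7"] }.fetch(name)
end

tunnel_but  = sshuttle_command("ssh_server", except: ["youku.com"], resolver: stub)
tunnel_only = sshuttle_command("ssh_server", only: ["netflix.com"], resolver: stub)

puts tunnel_but.join(" ")
puts tunnel_only.join(" ")
```

To actually start the tunnel, drop the stub resolver and pass the array to system, for example system(*sshuttle_command("ssh_server", except: ["youku.com"])).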