All posts by jeff

Leveraging C++ open source in rails apps

One thing that many software developers neglect is the build/buy decision.  Often it is more cost effective to buy an existing solution rather than build your own.  The same is true at the micro level: inside your custom-built solution, problems may be solved using libraries written by others.  If that library happens to be open source, all the better!

But what if the best solution out there is written in some other language?  I ran into this very scenario when I came across an ideal open source solution, written in C++, that I wanted to use in my Rails application.  The library implements a very complex data structure and supporting code; rebuilding it from scratch would have added zero value to my application, and there is nothing like it in the Rails ecosystem.

To the rescue: the Foreign Function Interface, FFI.  This is, in itself, a very impressive piece of work.  With just a few lines of code to bind the interfaces, you can call natively compiled code directly from Ruby.

You add a gem to your Gemfile:

gem 'ffi'

Then develop a bridge between the Rails and C++ spaces:

  1. Declare all external interfaces in a .h file, inside an extern "C" block, so that C++ names/externalizes them properly
  2. Use FFI’s DSL to declare the bridge specifics

The header file (simplified for presentation):

extern "C" {
  void addBlock(Index *idx, int64_t start_day, int64_t start_hour, int64_t end_day, int64_t end_hour, int64_t id);
  int64_t get_right(return_right_down* rrd);
  int64_t get_down(return_right_down* rrd);
  int64_t get_corner(return_right_down* rrd);
}

The DSL:

module SpecialLibrary
  extend FFI::Library

  ffi_lib 'c'
  ffi_lib_flags :now, :global
  ffi_lib './lib/special/libspecial_schedtool.o'

  attach_function :addBlock, [:pointer, :int64, :int64, :int64, :int64, :int64], :void
  attach_function :get_right, [:pointer], :int64
  attach_function :get_down, [:pointer], :int64
  attach_function :get_corner, [:pointer], :int64
end

The functions declared in the header file are implemented by you in a C++ file and compiled to the .o declared with the ffi_lib syntax in the DSL above.  Those functions in turn call API entries in the leveraged C++ library (libspecial in my case), which is built according to the supplier's instructions and installed on the system as a library.

If that’s not impressive enough, it works on both Ubuntu and OSX.  The commands to compile the C++ interfaces are different, but the code itself needs no changes.  I put this little script together to build on either system:

#! /bin/bash
#
# Different commands on OSX v Linux
case `uname` in
'Darwin')
    g++ -std=c++0x -I/usr/local/include/ -L/usr/local/lib libspecial_schedtool.cpp -lspecialindex_c -lspecialindex -dynamiclib -o libspecial_schedtool.o
    ;;
'Linux')
    g++ -std=c++0x -fPIC -I/usr/local/include/ -L/usr/local/lib libspecial_schedtool.cpp -lspecialindex_c -lspecialindex -shared -o libspecial_schedtool.o
    ;;
*)
    echo "Error: don't know how to compile on this platform"
    exit 1
esac


Honestly, my app would not exist today without this leverage.  

Bundler SSL failures using rails-assets.org

I recently discovered a cool way to use Bower assets in Rails applications, and I want to give a big shout out to the authors, who provide an example application you can clone: a good learning example of how to use Restangular with Rails.

That was the good news.  I did that work on an OSX development machine running 10.10, and it worked great.  My laptop running 10.11, however, would fail to bundle the assets from rails-assets.org, complaining about SSL certificates, and I had no idea how to deal with that.  My configuration is rvm 1.27.0, JRuby 9.0.5.0, Rails 3.2.22.2, JDK 1.8.

I found an easy test to run quickly to compare environments and test for the problem:

ruby -e 'require "net/http"; require "uri"; require "jruby-openssl"; puts "ruby: "+RUBY_VERSION; puts "openssl: "+OpenSSL::OPENSSL_VERSION;Net::HTTP.get(URI("https://rails-assets.org"))'

If that fails, but changing rails-assets to rubygems succeeds, then you probably have the same problem and can try my solution.  What I don’t quite understand is how that test fails on OSX 10.10, yet the bundle install succeeds.

When using JRuby, the certificates are stored in the JDK.  You can use part of a recommended solution to find out where they are:

rvm osx-ssl-certs status all

It will return something like:

Certificates for /Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home/jre/lib/security/cacerts: Up to date.

That cacerts file is a keystore.  You need to add the root certificate used by rails-assets.org to that keystore. There are probably different ways to do it, but Firefox allows you to export the certificate easily:

  1. Browse to rails-assets.org
  2. Click the lock icon, then the right arrow, more information,  view certificate, details
  3. Click the root certificate (DST Root CA X3)
  4. Click export, save to a file ending in .pem, with format X509 certificate.  In my example I saved to ~/Desktop/DSTRootCAX3.pem

Now put it in your keystore:

cd /Library/Java/JavaVirtualMachines/jdk1.8.0_91.jdk/Contents/Home/jre/lib/security/
sudo keytool -import -trustcacerts -alias root -file ~/Desktop/DSTRootCAX3.pem -keystore cacerts

(Note that the default password for the keystore is ‘changeit’)

That import fixes the ruby one line test above for 10.10 and 10.11, for JDK 1.7 and 1.8.
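Conceptually, the keytool import just adds one more trusted root to a certificate store, after which chain verification succeeds. The same mechanics can be sketched in plain Ruby with OpenSSL::X509::Store; the self-signed certificate below is generated purely for illustration and has nothing to do with DST Root CA X3.

```ruby
require 'openssl'

# Generate a throwaway self-signed "root" certificate (illustration only).
key  = OpenSSL::PKey::RSA.new(2048)
cert = OpenSSL::X509::Certificate.new
cert.version    = 2                                 # X509v3
cert.serial     = 1
cert.subject    = OpenSSL::X509::Name.parse('/CN=Demo Root CA')
cert.issuer     = cert.subject                      # self-signed
cert.public_key = key.public_key
cert.not_before = Time.now
cert.not_after  = Time.now + 3600
cert.sign(key, OpenSSL::Digest.new('SHA256'))

store = OpenSSL::X509::Store.new
puts store.verify(cert)   # false: the root is not trusted yet
store.add_cert(cert)      # the moral equivalent of keytool -import -trustcacerts
puts store.verify(cert)   # true: the chain now ends at a trusted root
```

This is also why disabling verification (see the gemrc hack below) is the wrong fix: it skips the store lookup entirely instead of teaching the store about the missing root.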

Things that seemed good but did nothing:
Several solutions here.

  1. Try bypassing SSL by using http in your Gemfile (the site must redirect to SSL, because this does not work).
  2. gem update --system.  No apparent effect.
  3. Put ':ssl_verify_mode: 0' in your ~/.gemrc.  This gets past the first level of certificate error, but fails when installing specific Bower assets.  Further, it isn't a good solution because it disables SSL verification.
  4. Use rvm to install SSL before installing Ruby.  This doesn't work for JRuby: the build is done with Maven, and the option for the SSL directory is not supported.

This looked promising.  Unfortunately, on my laptop the “rvm osx-ssl-certs update” command ends up corrupting the JDK keystore! Instead of a keystore, it puts a text file containing certificates in its place, and subsequent failures start complaining about an invalid keystore. You’ll need to reinstall the JDK or recover the cacerts file from a backup. The other solutions on that page simply don’t work.

Persistence or Stubbornness

I’m attributing the motivation for this post to public radio, where I heard this recently:   http://nonsenseatwork.com/915-when-stubbornness-makes-nonsense-of-persistence/.  Persistence is often touted as a key attribute of a successful entrepreneur, and one of the attributes I  have remarkable evidence for possessing.   But maybe I’m just stubborn?

In an article for HBR, Muriel Maignan Wilkins wrote that you can manage your stubbornness by being open to new ideas and being able to admit you are wrong.   The best entrepreneurs I have seen are willing to consider other choices but also balance that with a need to stay on course — changing plans to chase the “next best idea” is not leadership, yet neither is following  a course assured to fail.  Most adults have likely had an experience where they were convinced something wouldn’t work, and then it did.   That’s likely because someone else believed, beyond the obvious reasons why it would fail, that it could work.

In my startup, RideGrid, we believed that people would ride in a car, one time, with a non-professional driver they don’t know.  People said we were crazy.  You could talk them through the rationale, but safety and trust were their first concerns.  That was 2007.  Uber has now launched Uber Pool, which has some similar characteristics to RideGrid.  Very few people have that safety fear today:

  • My mother has used Uber.
  • People I work with in Tijuana told me this week that Uber drivers are more trustworthy than taxi drivers — if you left your laptop in a Tijuana Taxi – it’s gone.  If you left it in an Uber car, you might get it back.

Good entrepreneurs analyze evidence and motivation, and can pivot when they believe they are wrong, but they don’t change course on intuition alone without doing some analysis.  Jumping at the intuition of others is entrepreneurial suicide.  The analysis doesn’t have to be quantitative, but it must consider all relevant variables and their interactions.  The entrepreneur also has to filter out the distraction of analyzing too much.

Tolerance of ambiguity, self awareness, and the fearless pursuit of what might end up being a waste of time but you don’t think so….. that’s persistence, and there is nothing stubborn about that.

This post is also the subject of a Leadership Minute Video.

Openstack in the garage

Objective

I have a client who wanted me to do some research on OpenStack. They were particularly interested in how difficult it would be to deploy. Along the way, I thought it would be cool to have an OpenStack cluster at home, so that when I need a running machine to test something I can spin one up without messing up one of my own machines.

So I started dreaming about OpenStack. The cost seemed prohibitive, since most documentation suggests you need a lot of machines. After I found an article that said you could run it on two physical machines, I started looking around for inexpensive hardware. I purchased three HP ProLiant DL380s from Amazon. An amazing machine for $160-$180: dual quad-core CPUs, 16GB RAM, embedded RAID, dual disks, dual power supplies with IPMI-controlled power, and dual GbE NICs. The price varies based on supply and disk capacity; mine came from ServerPlus365.

Implementation

MaaS and JuJu looked really simple, and the article I found used them to deploy on the two machines using containers. I started there and, after many trials, swore off Ubuntu and headed toward RDO. Along the way I figured out why the ProLiants are so cheap: they don’t run CentOS 7 (and therefore not RHEL 7 either). The RAID controller isn’t supported, so you can’t even install. Moreover, you can’t really upgrade from CentOS 6, because apparently not all packages get properly upgraded. So after a brief foray into Red Hat’s latest RDO deployment model, I came back to Ubuntu.

The experience was generally really frustrating. I feel like I’ve found the one path through the maze. If you have the patience to read on, I think you’ll see why. Each failure takes a long time to figure out.

If you follow along, you too might have your own cluster.

Install the MaaS Cluster Controller

I borrowed a windows laptop for this with an attached USB 300GB disk.   I could have used one of the three Proliants, but using the laptop would give me more deployable compute capacity.

I downloaded the “Trusty” Ubuntu 14.04 LTS server ISO using BitTorrent; it took about 7 minutes.  I installed using the MaaS multi-node option. This ultimately was the key: using the packages from the disk plus a specific PPA, which gave me access to MAAS 1.7. The 1.5 version that comes with the “Trusty” release is so primitive I found it useless.  The 1.8 version under Trusty with the maas-maintainers/stable PPA didn’t have the Debian installer, which my hardware requires (a consequence of the same RAID controller that causes the CentOS issues). I tried the “Vivid” release with 1.8 because it supposedly supports the fast installer on the RAID, but I couldn’t spin up anything except Vivid instances, which would be fine except there are currently no JuJu charms for OpenStack on Vivid.

Which MAAS do you have?  Check it with:

apt-cache policy maas{,-dns,-dhcp,-cluster-controller,-region-controller} | grep Installed -B1 -A1

I tried various ways of getting 1.7, and in the end I found a PPA with the packages I needed.  To replicate my experience you’ll need to run:

sudo apt-get install software-properties-common
sudo add-apt-repository ppa:andreserl/maas
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install maas maas-dns maas-dhcp
apt-cache policy maas{,-dns,-dhcp,-cluster-controller,-region-controller} | grep Installed -B1 -A1

Make sure you’re on 1.7.5 or newer.

In my deployment, I used two networks: one for the MAAS management and IPMI power control (10.0.1.0/24), and one that goes to the outside world on DHCP managed by my gateway router (192.168.1.0/24). My /etc/network/interfaces file on the MAAS cluster/region controller is:

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
    address 10.0.1.10
    netmask 255.255.255.0
 
# wifi
auto wlan0
iface wlan0 inet manual
    wpa-driver nl80211
    wpa-roam /etc/wpa_supplicant.conf
 
iface default inet dhcp

You can see I am using the wifi for the second network, which is a whole different bag of trouble because the laptop has a broadcom wifi chip that isn’t supported out of the box. I have documented elsewhere how to make that work.

Because the avahi daemon is already installed, if you are working from a Mac you can point your browser at the hostname of the cluster controller you just installed, with a .local domain; otherwise you’ll need to know its IP address. Browse to http://YOURHOST/MAAS.  It will tell you what you need to do, including some command-line work to define the admin credentials.

Install your SSH keys.  Don’t skip this: it is what enables you to ssh into the instances JuJu creates. In the upper right-hand corner of the MaaS GUI, you’ll see “root” and a drop-down. Clicking that gives you access to preferences. After hitting “add ssh key”, paste in the contents of a public key that is paired with a private key on the machine you will log in from.


DNS is going to come off the primary Ethernet rather than your cluster controller.  If you let the cluster controller manage DNS, it adds the zone name to the FQDN when JuJu tries to contact the deployed node.  Maybe there is a way to make that work, but in my case the juju client could not connect to the host. However, when you use host.local as your hostname, mDNS allows the Mac to connect to the new instances correctly.  If your juju controller is Windows, best of luck.  Otherwise, make sure mdns is in your resolver configuration.

Configure the hardware

I connected eth0 of the MaaS hardware to my home network, and eth1 to the MaaS-managed network. The eth0 network doesn’t get brought up correctly under MaaS.

In the BIOS, configure PXE boot from eth1, and order network boot before hard disk boot. Hit F8 during the boot sequence to configure iLO (IPMI). Turn off DHCP in iLO, give it a static address, and set a username and password.  I noticed that during enlistment or commissioning, MaaS somehow made its own entry in the iLO name/password list.  I haven’t tried just using that.

Also make sure “Intel Virtualization Technology” is enabled; this will make the hypervisors perform better.  On this model it’s under Advanced -> Processor Options.

Enlist your nodes

Turn your nodes on and get them enlisted.  They should show up in the MaaS GUI.  I named them so I know which one is which (Hansel.local, Gretel.local and witch.local; the witch is the juju bootstrap node).  Name your nodes yournode.local, and configure the cluster controller to manage DHCP but not DNS.  I thought that perhaps I could get DHCP from the router and PXE from the cluster controller, but that doesn’t seem to be the case.


Select the Debian Installer. The fast installer will get a ways but will fail in “curtin”. Apparently the HP Proliant DL 380 names the SCSI array such that the fast installer doesn’t recognize it.


Bootstrap JuJu and configure services

The inspiration for this comes from this posting, for which I am eternally grateful; it would have taken me a long time to figure that out.  Basically, we can run the JuJu GUI on the bootstrap machine directly, and each of the JuJu services in LXC containers on the bootstrap machine as well. The only real problem I had with that is that each container needs a little fixing as it is brought up.

First, install juju on your client machine. I used my main development machine, a mac. That means I’m using four machines for this test.

juju init

will create the environment file for you to edit.  Edit the resulting ~/.juju/environments.yaml file so that you have the following entries.  For MAASTER, use the hostname or IP address of the MaaS cluster controller you just set up.  For YOURSTRING, look under the same preferences pane for a key under “MAAS keys”, and use that.

default: maas
environments:
  maas:
    type: maas
    maas-server: 'http://MAASTER/MAAS/'
    maas-oauth: 'YOURSTRING'
    default-series: trusty

Then bootstrap your control node. This will allocate one of the MaaS nodes and configure it.

juju bootstrap

Once you see juju waiting to contact the new node, connect to the cluster controller, and ssh from there to the new node with ssh ubuntu@xxx.local. (This assumes, by the way, that your private key on the cluster controller is paired with a public key that you pasted into the keys in MAAS.  The ubuntu account has no known password; you need the key to get in.)  You are going to manually add eth0 and configure the default route to use it.

sudo vi /etc/network/interfaces 
# Add in:
# auto eth0
# iface eth0 inet dhcp
sudo ifup eth0
sudo route add default gw 192.168.1.1
sudo route del default gw 10.0.1.10

As soon as you do that, juju should continue and finalize configuring the node.

Then wait for that to finish. If you have errors, something is amiss. Now you can use the bootstrapped node to provision the services, including OpenStack. We’ll first bring up the JuJu GUI, mainly because it’s cool, but it will also give you feedback when things don’t go well.

juju deploy --to 0 juju-gui
juju expose juju-gui
grep pass ~/.juju/environments/maas.jenv 

Wait for the juju-gui service to start; if you can’t get this working, there is no sense making things worse. Because I’m on a Mac, I can use the name of my MaaS node directly in the browser’s address bar (in my case “witch.local”, which I use in the following discussion; please replace it with your node’s name). Use the password just shown by the grep, with the account ‘admin’.

Now start up the containerized services:

juju deploy --to lxc:0 mysql
juju deploy --to lxc:0 keystone
juju deploy --to lxc:0 nova-cloud-controller
juju deploy --to lxc:0 glance
juju deploy --to lxc:0 rabbitmq-server
juju deploy --to lxc:0 openstack-dashboard
juju deploy --to lxc:0 cinder

The problem I mentioned with this recipe is that the containers can’t find their way to the physical juju bootstrap host. To fix this, verify that each container is ‘started’ with:

juju status|grep '0/lxc/[0-9]*:'  -A1

Then add the address of the host to each container’s hosts file using a script run on the juju bootstrap server:

#!	/bin/sh
# install this as "fix" and chmod 755 it
if test $# -ne 1
then
	echo "usage: fix {lxcnum}"
	exit 99
fi
ipaddr=`ifconfig juju-br0 | grep 'inet ' | sed -e 's/^[^0-9]*://' | awk '{print $1}'`

lxcn=/var/lib/lxc/juju-machine-0-lxc-$1
if test ! -d $lxcn
then
	echo "LXC does not exist: $lxcn"
	exit 98
fi
echo $ipaddr `hostname`.local >> ${lxcn}/rootfs/etc/hosts

So, for example:

for i in 7 8 9 10 11 12 13; do
    sudo ./fix $i
done

The nova-compute setup will take longer so start it now and set up all the relations.

juju deploy nova-compute
juju add-relation mysql keystone
juju add-relation nova-cloud-controller mysql
juju add-relation nova-cloud-controller rabbitmq-server
juju add-relation nova-cloud-controller glance
juju add-relation nova-cloud-controller keystone
juju add-relation nova-compute nova-cloud-controller
juju add-relation nova-compute mysql
juju add-relation nova-compute rabbitmq-server:amqp
juju add-relation nova-compute glance
juju add-relation glance mysql
juju add-relation glance keystone
juju add-relation glance cinder
juju add-relation mysql cinder
juju add-relation cinder rabbitmq-server
juju add-relation cinder nova-cloud-controller
juju add-relation cinder keystone
juju add-relation openstack-dashboard keystone

That will start up your second physical server and make it available for running your work. Once all services are green in the juju-gui, you can log in to OpenStack. Set the password for the OpenStack GUI and figure out its address:

juju set keystone admin-password="helloworld"
juju status openstack-dashboard|grep public

Point your browser at the IP address you just found, followed by “/horizon” (e.g. http://192.168.1.164/horizon), then log in with the “helloworld” password or whatever you used.

Here I found I had to mess with routing on my workstation, the JuJu client: I had to tell it how to get to the 10.0.1.0/24 network, which goes through the laptop, the MaaS cluster controller. For me that’s:

sudo route add 10.0.1.0/24  witch.local

Awesome!  You are Openstacking.

Configure OpenStack

You probably thought you were done. Unfortunately the network is not properly set up yet. I’m not sure why the nova-compute charm doesn’t set up at least some of this. You need to connect the bridge network to your actual network. Ultimately I used juju-br0, already bridged to eth1, as follows:

On the JuJu gui, select the nova-compute charm and then the wrench for settings on the left. Change the “bridge-interface” from br100 to juju-br0. Then hit commit and confirm.

On each nova-compute node, log in (ssh ubuntu@north.local for me), install the nova command line, and set the environment variables needed to run the network-create command:

sudo apt-get install python-novaclient
export OS_TENANT_ID=86f6dad7c72222dbed1683fe8ef34ca
export OS_AUTH_URL=http://192.168.1.171:35357/v2.0
export OS_USERNAME=admin
export OS_PASSWORD=helloworld

The tenant ID can be found in the OpenStack GUI under the “Access and Security” settings when logged in as the admin user, using the “Project” (tenant) called “admin”. Use “juju status keystone” to find the address for OS_AUTH_URL. Then, still on the nova-compute node, execute:

nova network-create vmnet --fixed-range-v4 10.0.0.0/16 --fixed-cidr 10.0.20.0/24 --bridge juju-br0 

Use the OpenStack dashboard to create one or more tenants, also called projects. Then create users. Each user logs in and creates an SSH key for access to their instances. Using names without spaces will make command-line work easier later.

If you launch instances and never see anything under the log tab after clicking the instance name, then you probably messed up the image import. The target of the image URL must be a .img (ISO). It won’t complain if you point it at something else; it will just fail to start the instance. The other symptom I’ve seen is an image that never leaves the “Queued” or “Saving” state. I had this problem with the Windows image described at http://www.cloudbase.it/windows-cloud-images/ so I’ve given up on that for now.

Your instance size should be small or larger. The tiny instance won’t work with the Ubuntu images, and the failure mode is not obvious.

Now boot an instance from the OpenStack dashboard. It should come up, and you can ping it from the nova-compute node at the IP address shown in the Horizon GUI.

Configure the (default) security group to allow whatever traffic you want. I put in three rules to open up all traffic. If you forget this, you won’t be able to access the instances, even from within your network.

Create a floating IP address pool. I created one with a single entry for demo purposes, drawn from my main network’s subnet; that way I can port-forward to it from my router. If you don’t do this, the network being managed by OpenStack (the 10.0.0.0/16 subnet) will be cut off unless you configure the nova-compute node for NATing.

nova-manage floating create --pool nova --ip_range 192.168.1.230

Now attach that to an instance you want to be reachable from outside. I was worried that routing to it wouldn’t work, but it does; traffic goes through the nova-compute node, so make sure forwarding is turned on there. This should print ‘1’:

cat /proc/sys/net/ipv4/ip_forward

I wrote a script to assign floating IP addresses. My DHCP server only allocates up to 192.168.1.199 on a class C, so I allocated 20 addresses as follows:

#!      /bin/sh
ipx=200
while test $ipx -lt 220
do
        ip=192.168.1.$ipx
        echo putting IP address $ip in the pool
        sudo nova-manage floating create --pool nova --ip_range $ip
        ipx=`expr $ipx + 1`
done

Annoyances

If you shut down the cluster from the MaaS GUI (select the nodes and “stop” them, then later “start” them), the juju-gui thinks all the services are happy, but there may be connection errors on the compute node that prevent starting and launching instances.  nova service-list shows that nova-network is not running:

ubuntu@north:/etc/init.d$ nova service-list
OS Password: 
+----------------+----------------------+----------+---------+-------+----------------------------+-----------------+
| Binary         | Host                 | Zone     | Status  | State | Updated_at                 | Disabled Reason |
+----------------+----------------------+----------+---------+-------+----------------------------+-----------------+
| nova-conductor | juju-machine-0-lxc-2 | internal | enabled | up    | 2015-06-14T18:18:35.000000 | -               |
| nova-cert      | juju-machine-0-lxc-2 | internal | enabled | up    | 2015-06-14T18:18:35.000000 | -               |
| nova-scheduler | juju-machine-0-lxc-2 | internal | enabled | up    | 2015-06-14T18:18:35.000000 | -               |
| nova-compute   | north                | nova     | enabled | up    | 2015-06-14T18:18:34.000000 | -               |
| nova-network   | north                | internal | enabled | down  | 2015-06-14T02:13:33.000000 | -               |
+----------------+----------------------+----------+---------+-------+----------------------------+-----------------+

And this document states that

Also, in practice, the nova-compute services on the compute nodes do not always reconnect cleanly to rabbitmq hosted on the controller when it comes back up after a long reboot; a restart on the nova services on the compute nodes is required.

which is great, except that much of the information out there about how to do this doesn’t seem correct for maas/juju/ubuntu.  Apparently this is what they mean (run it on the nova-compute nodes):

sudo service nova-compute restart

Having said that, since I installed from the 14.04 DVD and then used the PPA above, it has booted reliably each time. Moreover, I have had no problems with LXC containers, which I did when using the juju-maintainers/stable PPA.

Conclusion

I hope your experience goes more smoothly than mine, and that you end up with a usable OpenStack in your garage too.