Ganglia Web frontend in Ubuntu 16.04 install issue

    03 May 2016

    Ubuntu 16.04 Xenial ships with the Ganglia web frontend 3.6.1; however, the package doesn’t pull in all of its dependencies. If you get an error like this

    Sorry, you do not have access to this resource. "); } try { $dwoo = new Dwoo($conf['dwoo_compiled_dir'], $conf['dwoo_cache_dir']); } catch (Exception $e) { print "
    

    you are missing mod_php and the PHP 7 XML module. To correct that, execute the following commands

    sudo apt-get install libapache2-mod-php7.0 php7.0-xml ; sudo /etc/init.d/apache2 restart
    

    If you don’t have the Ganglia web frontend enabled, all you need to do is type

    sudo ln -s /etc/ganglia-webfrontend/apache.conf /etc/apache2/sites-enabled/001-ganglia.conf
    sudo /etc/init.d/apache2 restart
    

    Google Compute Engine Load balancer Let's Encrypt integration

    18 April 2016

    Let’s Encrypt (LE) is a new service started by the Internet Security Research Group (ISRG) to offer free SSL certificates. It’s intended to be automated so that you can obtain a certificate quickly and easily. Currently, however, LE requires you to install their client software, which makes a request to their API for the domain you want to secure and then places a random token at a well-known web path on that domain so that the LE backend servers can check it. In a nutshell, to get a certificate for the domain myhost.mydomain.xyz the LE client will require you to serve predetermined text at a URL they provide, e.g.

    http://myhost.mydomain.xyz/.well-known/jdoiewerhwkejhrwehrheuwhruewh

    If that matches, you have proven you control the domain and LE issues you a certificate. More detail on how it works can be found in Let’s Encrypt’s documentation.
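    If you want to check a pending challenge by hand, the comparison the validation servers make is just "response body equals token plus account thumbprint". Here is a minimal shell sketch; the token and thumbprint values are hypothetical placeholders, and the actual fetch is shown as a commented curl call.

```shell
#!/bin/sh
# Sketch: verify a challenge response the way the LE validation servers do.
# The expected body is "<token>.<account thumbprint>". All values below
# are hypothetical placeholders.

# check_challenge BODY TOKEN THUMBPRINT -> exit 0 if BODY matches
check_challenge() {
    [ "$1" = "$2.$3" ]
}

# In practice you would fetch the body the same way LE does, e.g.:
#   BODY=$(curl -fsS "http://myhost.mydomain.xyz/.well-known/acme-challenge/$TOKEN")
if check_challenge "mytoken.mythumbprint" "mytoken" "mythumbprint"; then
    echo "challenge content matches"
fi
```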

    The difficulty is that in order to automate this process you must either

    • allow the LE client to control your web server (currently only Apache) - this may disrupt your traffic if anything goes wrong
    • allow it to drop files into the web root, which may be problematic if your domain is behind a load balancer and you need to copy the validation content to all nodes
    • use the standalone method, where the LE client spins up its own standalone server, but this requires you to shut down your web server
    • devise a different method

    In the following section I will describe how to do this with the Google Compute Engine (GCE) load balancer, since it supports conditional URL path matching. You could also do something very similar with other load balancers such as Varnish or HAProxy.

    Conceptually what we’ll do is

    • Modify the GCE Load balancer URL map to send all traffic intended for LE to a special backend e.g. any URL with /.well-known/ will be sent to a custom backend
    • Spin up a minimal VM with Apache on GCE
    • Use the LE client Docker image to manage the signing process or simply install the LE client

    To make configuration easy I will be using https://www.terraform.io since it greatly simplifies the setup. This also assumes you are already running a GCE load balancer for the domain you are trying to secure.

    First we’ll need to create an instance template. I am using the Google Container Engine images as they already come with Docker installed.

    variable "gce_image_le" {
        description         = "The name of the image for Let's Encrypt."
        default             = "google-containers/container-vm-v20160321"
    }
    
    resource "google_compute_instance_template" "lets-encrypt" {
        name                = "lets-encrypt"
        machine_type        = "f1-micro"
        can_ip_forward      = false
        tags                = [ "letsencrypt", "no-ip" ]
    
        disk {
            source_image    = "${var.gce_image_le}"
            auto_delete     = true
        }
    
        network_interface {
            network         = "${var.gce_network}"
            # No ephemeral IP. Use bastion to log into the instance
        }
    
        metadata {
            startup-script  = "${file("scripts/letsencrypt-init")}"
        }
    
    }
    

    You will notice I am using a startup script (scripts/letsencrypt-init) inside this instance template which looks like this

    apt-get update
    apt-get install -y apache2
    rm -f /var/www/index.html
    touch /var/www/index.html
    docker pull quay.io/letsencrypt/letsencrypt:latest
    
    mkdir /root/ssl-keys
    echo "email = myemail@mydomain.com" > /root/ssl-keys/cli.ini
    

    Basically I’m just preinstalling Apache and pulling the Let’s Encrypt client Docker image.

    The next step is to create an Instance Group Manager (IGM) and an autoscaler. The instance group manager defines which instance template is going to be used and the base instance name, whereas the autoscaler starts up instances in the IGM and makes sure one replica is running. The last step is to define the backend service and attach the IGM to it.

    resource "google_compute_instance_group_manager" "lets-encrypt-instance-group-manager" {
        name                = "lets-encrypt-instance-group-manager"
        instance_template   = "${google_compute_instance_template.lets-encrypt.self_link}"
        base_instance_name  = "letsencrypt"
        zone                = "${var.gce_zone}"
    
        named_port {
            name            = "http"
            port            = 80
        }
    
    }
    
    resource "google_compute_autoscaler" "lets-encrypt-as" {
        name                = "lets-encrypt-as"
        zone                = "${var.gce_zone}"
        target              = "${google_compute_instance_group_manager.lets-encrypt-instance-group-manager.self_link}"
        autoscaling_policy = {
            max_replicas    = 1
            min_replicas    = 1
            cooldown_period = 60
            cpu_utilization = {
                target = 0.5
            }
        }
    }
    
    resource "google_compute_backend_service" "lets-encrypt-backend-service" {
        name                = "lets-encrypt-backend-service"
        port_name           = "http"
        protocol            = "HTTP"
        timeout_sec         = 10
        region              = "us-central1"
    
        backend {
            group           = "${google_compute_instance_group_manager.lets-encrypt-instance-group-manager.instance_group}"
        }
    
        health_checks       = ["${google_compute_http_health_check.fantomtest.self_link}"]    
        
    }
    

    The next thing we’ll need to do is change the URL map for the load balancer. Basically, we’ll send anything matching /.well-known/* to our LE backend service. My URL map is called fantomtest and by default uses the fantomtest backend service. This means any requests that don’t match /.well-known/ will end up on my default backend service (which is what we want)

    resource "google_compute_url_map" "fantomtest" {
        name                = "fantomtest-url-map"
        description         = "Fantomtest URL map"
        default_service     = "${google_compute_backend_service.fantomtest.self_link}"
    
        # Add Letsencrypt
        host_rule {
            hosts           = ["*"]
            path_matcher    = "letsencrypt-paths"
        }
    
        path_matcher {
            default_service = "${google_compute_backend_service.fantomtest.self_link}"
            name            = "letsencrypt-paths"
            path_rule {
                paths       = ["/.well-known/*"]
                service     = "${google_compute_backend_service.lets-encrypt-backend-service.self_link}"
            }
        }
    
    }
    

    Run terraform apply, and if you have been successful you should see the letsencrypt service become healthy.

    Now log into the instance running the LE client and run

    docker run -it -v "$(pwd)/ssl-keys:/etc/letsencrypt" -v "/var/www:/var/www" quay.io/letsencrypt/letsencrypt:latest \
      certonly --webroot -w /var/www -d www.mydomain.xyz
    

    If you get

    - Congratulations! Your certificate and chain have been saved at
       /etc/letsencrypt/live/www.mydomain.xyz/fullchain.pem. Your
       cert will expire on 2016-07-17. To obtain a new version of the
    

    You are done, and your certificate can be found in ssl-keys/live/www.mydomain.xyz/fullchain.pem. By default LE issues certificates with a validity of 90 days and will start nagging you 30 days before expiration to renew them. I will leave it as an exercise to the reader to automate this. Do note that if you are going to automate pushing certificates, make sure you validate the full chain to confirm things look good.
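    As a starting point for that automation, a cron-able expiry check can be sketched with openssl. The certificate path below mirrors the docker volume layout used above and is an assumption about your setup.

```shell
#!/bin/sh
# Sketch: decide whether the certificate needs renewal. The path is an
# assumption based on the docker volume layout above.

# needs_renewal CERT DAYS -> exit 0 if CERT is missing or expires within DAYS
needs_renewal() {
    cert="$1"; days="$2"
    [ -f "$cert" ] || return 0
    # openssl x509 -checkend exits non-zero when expiry falls inside the window
    ! openssl x509 -checkend $(( days * 86400 )) -noout -in "$cert" >/dev/null
}

CERT="/root/ssl-keys/live/www.mydomain.xyz/fullchain.pem"
if needs_renewal "$CERT" 30; then
    echo "certificate needs renewal"
    # re-run the docker certonly command from above here, then inspect the
    # new chain, e.g.: openssl x509 -noout -enddate -in "$CERT"
fi
```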


    Signing AWS Lambda API calls with Varnish

    15 April 2016

    A number of months ago Stephan Seidt @evilhackerdude posed a question on Twitter: was it possible to use Fastly to sign requests going to AWS Lambda? For those who do not know what AWS Lambda is, here is Wikipedia’s succinct explanation

    AWS Lambda is a compute service that runs code in response to events and automatically manages the compute resources required by that code. The purpose of Lambda, as opposed to AWS EC2, is to simplify building smaller, on-demand applications that are responsive to events and new information. AWS targets starting a Lambda instance within milliseconds of an event.

    AWS Lambda was designed for use cases such as image upload, responding to website clicks or reacting to output from a connected device. AWS Lambda can also be used to automatically provision back-end services triggered by custom requests.

    Unlike Amazon EC2, which is priced by the hour, AWS Lambda is metered in increments of 100 milliseconds.

    Initially I thought this was not going to be possible, since I assumed I could only make asynchronous calls; however, Stephan pointed out that there was a way to invoke synchronous calls as well, since that is what AWS API Gateway does to expose Lambda functions.

    In order to send requests to Lambda you need to sign them. AWS has gone through a number of versions of their signing API; however, for most services today you will need to use Signature Version 4 (SIGV4). SIGV4 relies on a number of HMAC and hashing functions that are not in stock Varnish but are available in the libvmod-digest VMOD. If you are deploying your VCL on Fastly, this VMOD is already built in.
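    To give a feel for what those HMAC functions are used for, here is the SIGV4 signing-key derivation (a cascade of HMAC-SHA256 operations) sketched in shell with openssl. The credential, date, region and service values are placeholders; the full VCL does the equivalent with the digest VMOD.

```shell
#!/bin/sh
# Sketch of the SIGV4 signing-key derivation: chained HMAC-SHA256 over
# the date, region, service and the literal "aws4_request".
# All values below are placeholders.
SECRET="CHANGEME"
DATE="20160415"       # YYYYMMDD
REGION="us-east-1"
SERVICE="lambda"

# hmac_hex KEYSPEC DATA -> hex digest; KEYSPEC is "key:<str>" or "hexkey:<hex>"
hmac_hex() {
    printf '%s' "$2" | openssl dgst -sha256 -mac HMAC -macopt "$1" | sed 's/^.* //'
}

k_date=$(hmac_hex "key:AWS4${SECRET}" "$DATE")
k_region=$(hmac_hex "hexkey:${k_date}" "$REGION")
k_service=$(hmac_hex "hexkey:${k_region}" "$SERVICE")
k_signing=$(hmac_hex "hexkey:${k_service}" "aws4_request")
echo "signing key: $k_signing"
```

    The final signing key is then used to HMAC the string-to-sign for each request.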

    Code

    You can find full VCL for signing requests to Lambda here

    https://github.com/vvuksan/misc-stuff/blob/master/lambda/lambda.vcl

    This code has some Fastly-specific macros and functions which you can upload as custom VCL; however, most of the heavy lifting is done inside the aws4_lambda_sign_request subroutine, so if you are using stock Varnish, copy that. Things to change in vcl_recv are

    set req.http.access_key = "CHANGEME";
    set req.http.secret_key = "CHANGEME";
    

    Change those to your AWS credentials that have access to Lambda. You can also change the region where your functions run. In addition you will need to come up with a way to map incoming URLs to Lambda functions. In my sample VCL I am using Fastly’s Edge Dictionaries, e.g.

    table url_mapping {
        "/": "/2015-03-31/functions/homePage/invocations",
        "/test": "/2015-03-31/functions/test/invocations",
    }
    
    # If no match, req.url will be set to /LAMBDA_Not_Found
    set req.url = table.lookup(url_mapping, req.url.path, "/LAMBDA_Not_Found");

    # If the page has not been found we just throw a 404
    if ( req.url == "/LAMBDA_Not_Found" ) {
        error 404 "Page not found";
    }
    

    Pros and Cons

    Pros:

    • You get the power of VCL to route requests to different backends including Lambda
    • You may be able to cache some of the requests coming out of Lambda
    • Lower costs since API Gateway can be pricey

    Cons:

    • Only POST requests with a payload of up to 2 kbytes and GET requests with no query arguments are supported
      • In order to compute the signature we need to calculate a hash of the payload. Unfortunately Varnish exposes only 2 kbytes of the payload inside the VCL. This is tunable if you run your own Varnish; you can adjust it by running
        varnishadm param.set form_post_body 16384 
        
      • Any request other than a POST needs to be rewritten as a POST, hence GET requests can carry no query arguments
    • You can output straight HTML; however, the returned payload will end up with a leading and trailing ’ character. You will also need to fix up the returned Content-Type since it comes back as application/json. You can set the Content-Type in VCL by doing the following in vcl_deliver, e.g.
      set resp.http.Content-Type = "text/html";
      
    • Currently it’s impossible to craft a POST request from scratch

    Future work

    Look into using something like libvmod-curl VMOD to create POST requests on the fly.


    Howto speed up your monitoring system with Varnish

    03 April 2015

    If you use a monitoring system of any kind you are looking at lots of graphs. As your team grows you are looking at more and more graphs, and oftentimes members of your team are looking at the same graphs. In addition, graphs become more complex over time and you may end up with fairly complicated aggregate graphs with hundreds of data sources, which can become quite a burden on your metrics system. This resulted in complaints about the slowness of our monitoring. To speed it up, we figured our best bang for the buck would be to cache page fragments. Since we run a CDN based on Varnish, it was logical what we were going to use :-).

    Assumptions

    • Most metric systems poll on a fixed time interval, e.g. 10-15 seconds. If you make a graph you can safely cache it for 10 seconds or longer since the graph is not going to change
    • There are a number of static dashboard pages we can cache for longer since they don’t change; only the dependent images change
    • Even if we don’t cache, or cache for a really short time e.g. 1-2 seconds, Varnish supports collapsed forwarding, which collapses multiple requests for the same resource into one, i.e. if 5 clients request the same resource /img/graph1.png at the same time, Varnish will send only one request to the backend and then respond to all 5 clients with the same resource. This is a huge win.

    You can find an example Varnish configuration in this repo. It is Ganglia-specific; however, you can adapt it to suit your needs

    Ganglia contrib repository

    The key file you need is default.vcl, which you need to put in /etc/varnish/default.vcl

    Notes

    Your caching rules should be put in the vcl_fetch function. For example

    if (req.url ~ "^/(ganglia2/)?$" ) {
         set beresp.ttl = 1200s;
         unset beresp.http.Cache-Control;
         unset beresp.http.Expires;
         unset beresp.http.Pragma;
         unset beresp.http.Set-Cookie;
       }
    

    This is a regex match that will match /ganglia2/ or / and cache the result for 20 minutes (1200 seconds). The resulting object will also be stripped of any Cache-Control, Expires, Pragma or Set-Cookie headers, since we don’t want to send those to browsers.
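    If you want to sanity-check which paths a rule like this catches, the pattern behaves the same under grep’s extended regex as under Varnish’s PCRE for this simple case; a quick sketch:

```shell
#!/bin/sh
# Quick sanity check of the ^/(ganglia2/)?$ path match outside of Varnish.
matches() { printf '%s\n' "$1" | grep -Eq '^/(ganglia2/)?$'; }

matches "/"          && echo "/ is cached"
matches "/ganglia2/" && echo "/ganglia2/ is cached"
matches "/graph.php" || echo "/graph.php falls through"
```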

       if (req.url ~ "/(ganglia2/)?graph.php") {
         set beresp.ttl = 15s;
         set beresp.http.Cache-Control = "public, max-age=10";
         unset beresp.http.Pragma;
         unset beresp.http.Expires;
         unset beresp.http.Set-Cookie;
       }
    

    Similar to the rule above, we set the cache time to 15 seconds and unset all the headers except Cache-Control, which we set to 10 seconds. This means Varnish will cache the object for 15 seconds while instructing the browser to cache it for 10 seconds.

    You could also get creative and do things based on the content type of the resulting object

      if ( beresp.http.Content-Type ~ "image/png" ) {
         set beresp.ttl = 15s;
      }
    

    Have fun.


    Adventures with Arduino part 2

    04 February 2015

    In my last post, Adventures with Arduino part 1, I discussed some of the options for wiring up and getting metrics out of Arduino. Here is the work in progress

    Arduino Wiring image

    It includes a DHT11 humidity/temperature sensor, a water sensor and a reed switch. The way things are set up, it polls the sensors periodically, e.g.

    • Reed switch (to see if the door is open or closed) every 2-10 seconds
    • Humidity and temperature every minute

    It then sends those values as simple comma-separated text. The data format I’m using is

    device uptime,device name,metric_name=value

    with multiple metric values possibly sent in the same packet. On the receiving side I have a Raspberry Pi that follows this workflow:

    • Uses a modified raspberryfriends daemon from nrf24pihub
    • The daemon receives and parses the payload and ships it off to Statsd using a gauge data type
    • Statsd rolls up the metrics and sends them over to Ganglia, which is used for trending and data collection, e.g. this shows the temperature and humidity in one of my bedrooms. You can notice the effect of a room humidifier on the humidity in the room :-)
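    The receive-and-forward step can be sketched as follows; the payload field layout follows the format above, while the Statsd host/port, the metric naming and the use of netcat are assumptions (the real setup uses the modified raspberryfriends daemon):

```shell
#!/bin/sh
# Sketch: turn one "uptime,device,metric=value" payload into a Statsd
# gauge line. Host/port are placeholders, and only one metric per packet
# is handled here; the real daemon also handles multiple metrics.
STATSD_HOST="127.0.0.1"
STATSD_PORT=8125

# to_gauge "uptime,device,metric=value" -> "device.metric:value|g"
to_gauge() {
    device=$(printf '%s' "$1" | cut -d, -f2)
    kv=$(printf '%s' "$1" | cut -d, -f3)
    printf '%s.%s:%s|g' "$device" "${kv%%=*}" "${kv#*=}"
}

# e.g. forward one reading to Statsd over UDP:
# to_gauge "12345,garage,temperature=21.5" | nc -u -w1 "$STATSD_HOST" "$STATSD_PORT"
to_gauge "12345,garage,temperature=21.5"; echo
```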

    Arduino metrics

    • I can also see if I left my garage door open :-)

    Arduino metrics

    Arduino metrics

    • In this particular instance that alert has a dual meaning, since this Arduino is driven by one of those “lip-stick” USB battery packs and Ganglia will expire a metric if it hasn’t been reported for a defined amount of time (in my case 1 minute). Here the UNKNOWN alert state tells me that most likely the battery is dead and I need to recharge it.