Feb 06, 2014

The Start of the Age of Flynn

For about the past six months, I’ve been working on an open source project called Flynn. It’s gotten a lot of attention, but I haven’t written much about it. I’ve been hoping to start a series discussing the design and development behind Flynn, so it seems appropriate to first introduce the project and provide some context.

What is Flynn?

Before development, I started writing the official Flynn Guide. There I explained Flynn like this:

Flynn has been marketed as an open source Heroku-like Platform as a Service (PaaS), but the reality is more subtle.

Flynn is two things:

1) a “distribution” of components that out-of-the-box gives companies a reasonable starting point for an internal “platform” for running their applications and services,

2) the banner for a collection of independent projects that together make up a toolkit or loose framework for building distributed systems.

Flynn is both a whole and many parts, depending on what is most useful for you. The common goal is to democratize years of experience and best practices in building distributed systems. It is the software layer between operators and developers that makes both their lives easier.

It’s easy right now to describe Flynn as another, better, modern open source PaaS. But it’s really much more than that. I usually need to underline this in discussions because in most people’s minds, a PaaS is a black box system that you deploy and configure, and then you have something like Heroku, much as you can deploy OpenStack Nova and get something like EC2.

Flynn can be that, but it’s designed so it can be used as an open system, or framework, to build a service-oriented application operating environment. The truth is, if you’re building and operating any sort of software as a service, you’re not just building an application, you’re building a system to support your application and the processes around it.

You might be tempted to call Flynn a tool for “devops”. While that might be true, remember that the original ideas around devops were about organization-wide systemic understanding of your application, its lifecycle, and its operation. In reality, Flynn is designed for this type of thinking and should hopefully blur the line between operations and engineering, encouraging both to work together and think about the entire system you’re building.

Why build Flynn, how did it come about?

This is a long story, but it provides context for the vision of Flynn, what problems inspired it, and shows just how long the idea has been stirring. Though keep in mind Flynn is a collaboration and this is just my half of the story.

Falling in love with PaaS

For years I was obsessed with improving the usefulness and programmability of our collective macro distributed system of web services and APIs. Think webhooks. Circa 2006 I was also building lots of little standalone web utilities and APIs for fun. I quickly learned that using technologies like App Engine and Heroku was imperative to sanely operate so many of these free services, and keep costs near zero for what I considered public utilities.

It turns out, for the exact same reasons of cost and automation, these PaaS providers were slowly revolutionizing a subset of commercial application development. The idea of generic managed applications (“NoOps”) and streamlining the delivery pipeline (necessary for Continuous Delivery) has always had huge implications for web apps and their businesses. For me, even though PaaS providers couldn’t have come soon enough, I always seemed to want more than what they could provide. I constantly struggled with the limitations of App Engine, Heroku, and dotCloud. Originally the limitations were around certain languages, certain types of computation, certain types of protocols. In fact, there still isn’t a great PaaS provider that lets you build or deploy a non-HTTP service, like say, an SMTP server or custom SSH server.

The divide between PaaS and host-based infrastructure

For all the systems and design knowledge, best practices, and solutions to important problems that Heroku, dotCloud, App Engine, and others have figured out, if for some reason you cannot use them, you get none of it. If it’s too expensive, or your system is just too complicated, or you need to use a protocol they don’t support, you just get EC2 or bare metal hosts and have to work from there. If you’re lucky or smart, depending on who makes these decisions, you get to use Chef or Puppet.

But I’ll be honest, the Chef and EC2 combo is still a huge step down from what a system like Heroku can offer. What’s more, large-scale organizations like Google and Twitter have it pretty well figured out, but they have it figured out for them. The rest of us are left with a myriad of powerful, if not mystical, solutions for building our distributed systems, like Mesos and Zookeeper. If we’re lucky enough to discover and understand those projects, we often avoid them until absolutely necessary, and only then figure out how to integrate them into our already complex systems, most of which we had to build ourselves because the baseline starting point has always been “a Linux server”.

Twilio and service-oriented architectures

For me, a lot of this was learned at Twilio, which was a perfect place to think about this space. The Twilio system, behind that wonderfully simple API, is a highly complex service-oriented system. When I left a couple years ago, it involved roughly 200 different types of services to operate. Some were off-the-shelf open source services, like databases or caches. Most of them were custom services that spoke to each other. Some were written in Python, some in PHP, some in Java, others in C. Lots of protocols were used, both internally and publicly facing. Primarily HTTP, but also SIP, RTP, custom TCP protocols, even a little ZeroMQ. Most people forget that databases also add protocols to the list.

I can’t tell you how many problems come with a system that complicated. Though it did its job quite well, the system was effectively untestable, had an atrocious delivery pipeline, was incredibly inefficient in terms of EC2 instances, and nobody really understood the whole system. A lot of these are common problems, but they’re compounded by the scale and complexity of the system.

The first half of my time at Twilio was spent building scalable, distributed, highly-available messaging infrastructure. What a great way to learn distributed systems. However, I ran into so many problems and was so frustrated by the rest of the system that I dedicated the second half of my time at Twilio to improving the infrastructure and helped form the Platform team. The idea was that we would provide a service platform for the rest of the engineering organization to use. Unfortunately it never really became more than a glorified operations and systems engineering team, but we did briefly get the chance to really think through the ideal system. If we could build an ideal internal platform, what would it look like? We knew it looked a lot like Heroku, but we needed more. What we ended up with looked a lot like Flynn. But it never got started.

I remember very clearly working out sketches for a primitive that I thought was central to the whole system. It was a “dyno manager” to me, which gave you a utility to manage the equivalent of Heroku dynos on a single host. These were basically fancy Linux containers. Again, though, not important to Twilio at the time. Eventually, I left Twilio and started contracting.

More history with cloud infrastructure

First, I worked with some old friends back from the NASA Nebula days. I forgot to mention, in 2009, before Twilio, I worked at NASA building open source cloud infrastructure. The plan was to implement an EC2, then a sort of App Engine on top of it, and then specifically for purposes at NASA, lots of high level application modules. Turns out the first part was hard enough. We started using Eucalyptus, but realized it was not going to cut it. Eventually the team wrote their own version from what they learned using Eucalyptus, called it Nova, partnered with Rackspace, and that’s how OpenStack was born.

I was on the project pre-OpenStack to actually work on the open source PaaS back then in 2009, but we never got to that before I left. Probably for the best in terms of timing. Another reason I was there was that we also wanted to provide project hosting infrastructure at NASA. This was before Github was popular, and in fact, I had been running a competing startup called DevjaVu since 2006 that productized Trac and SVN. As Github got more popular and I was distracted with other projects, I decided to shut down DevjaVu, admitting that Github was doing it right. But my experience meant I could easily throw together what was originally going to be code.nasa.gov.

Fast forward to my contracting after Twilio: I worked with my friends at Piston Cloud, one of the OpenStack startups that fell out of the NASA Nebula project. My task wasn’t OpenStack related; it was actually to automate the deployment of CloudFoundry on top of OpenStack for a client. CloudFoundry, which popped up in 2011, was one of the first open source PaaS projects. This gave me a taste of CloudFoundry, and boy it was a bad one. Ignoring anything about CloudFoundry itself, just deploying it from scratch, while 100% automated, would take 2 hours to complete. Nevertheless, there are still some aspects of the project I admire.

Docker and Dokku

My next big client turned out to be an old user of DevjaVu, but I never realized it until I started talking with them. It was a company called dotCloud. Quickly hitting it off with Solomon Hykes, we tried to find a project to collaborate on. I mentioned my “dyno manager” concept, and he brought up their next-gen container technology. Soon I was working on a prototype called Docker.

The final project was mostly a product of the mind of Solomon and the team, but while working on the prototype I made sure that it could be used for the systems I was envisioning. In my mind it was just one piece of a larger system, though a very powerful piece with many other uses and applications. Solomon and I knew this, and would often say it was the next big thing, but it’s still a bit crazy to see that it’s turning out to be true.

Not long after Docker was released, Solomon and I went to give a talk at GlueCon to preach this great new truth. The day before the talk, I spent 6 hours hacking together a demo that would show how you could “easily” build a PaaS with Docker. I later released this as Dokku, a Docker powered mini-Heroku.

Dokku intentionally left out anything having to do with distributed systems. It was meant for a single host. I thought maybe it would be good for personal use or for building internal tools. Turns out it was, and it got pretty popular. I had a note in the readme saying that it intentionally does not scale to multiple hosts, and that perhaps this could be done as a separate project that I referred to as “Super Dokku”. That’s what I imagined Flynn as at the time.

Flynn takes shape

Now, back around the time I was working on the Docker prototype in Winter 2012, I was approached by two guys, Daniel Siders and Jonathan Rudenberg, who had been working on the Tent protocol and its reference implementation. They wanted to build a fully distributed, enterprise grade, open source platform service. They said they were going to be working on it soon, and they wanted to work on it with me. The only problem was that they didn’t have the money, and they’d get back to me after a few funding meetings.

Later, I think around the time I released Dokku mid-2013, Daniel and Jonathan approached me again. They were serious about this platform service project and had the idea to crowdfund it with company sponsorships. You’d think I’d be rather dubious about the idea, but given the growing interest in Docker, the great response to Dokku, and basically testimonial after testimonial of companies wanting something like this, I figured it could work.

We decided to call the project Flynn, and got to work comparing notes and remote whiteboarding the project. I was lucky that they were so like-minded; we were already thinking of very similar architectures and generally agreed on approaches. We put together the Flynn Guide, the website copy, and a funding campaign using Selfstarter, then let it loose.

We quickly met our funding goal for the year and then spent the rest of the year working on Flynn. Unfortunately, the budget only covered part-time work, but we had planned to have a working development release by January 2014.

What now?

It’s now February 2014, so let’s take a look at where we are.

Like most software schedules, ours fell behind a little. While the project has been in the open on Github from the beginning, we planned to share a rough but usable developer release last month. We’re so close.

What makes it difficult is that we’re out of our 2013 budget! This affects my contribution more than Jonathan’s. I’ve been putting time into it here and there, but it no longer pays my bills. That could change soon, but until then the project might move a bit slower leading up to our initial release. Only after the release can we go for another sponsorship campaign, so you can see why right now is just a little frustrating.

That said, there’s still more and more interest in the project, we already have a few brave souls who have been contributing to its components, and like I said, hopefully the money situation will sort itself out soon. A few things are in the works.

In the meantime, although I’m not as active on it at this moment, I do feel compelled to use this time to write about it here on my blog. Hopefully I can catch everybody up on the architecture, discuss design decisions, and talk about the future, and a lot of that should be usable for official project documentation.

And hell, if I can’t work on Flynn for some reason, after all this, hopefully the writing will allow somebody to continue my work. :)

Nov 13, 2013

Viewdocs: Hosted Markdown project documentation (finally!)

A huge part of the user experience for open source software is the documentation. When writing new software to be adopted, I’ve learned it’s more important to first write decent docs than tests. And when I forget, Kenneth Reitz is there to remind me.

When I’ve outgrown a README on Github, I only consider two options for providing documentation: Github Pages and Read the Docs. Unfortunately, I have problems with both of them. Chiefly, Read the Docs makes me use reStructuredText, and Github Pages means maintaining a separate orphan branch and using a static page generator.

What I’ve really wanted is something like Gist.io, but for my repository. Nobody has stepped up, so I built it.

I call it Viewdocs. It renders static pages on-demand from Markdown in your project’s docs directory. There’s no setup, just follow the conventions and it works. It may even already be working for you, since Markdown in a docs directory is not that uncommon. And keeping your documentation in the same branch as your code means it’s easier for people to contribute docs with their pull requests.
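
For example, a hypothetical project might keep its docs like this (the project and file names here are just illustrative):

  myproject/
    docs/
      index.md
      getting-started.md

With a layout like that, Viewdocs serves the pages at URLs along the lines of yourname.viewdocs.io/myproject and yourname.viewdocs.io/myproject/getting-started, though check the Viewdocs homepage for the exact conventions around the landing page and URL scheme.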

The default layout is borrowed from Gist.io, giving you a clean, elegant documentation site. All you have to do is write some Markdown. That’s about all there is to it.

You can read more on the homepage for Viewdocs, which is powered by Viewdocs. Or here’s a quick video introduction:

Aug 18, 2013

Hacker Dojo: Community Trading Zone

I recently came out of a Hacker Dojo board meeting, as I do every month, but this time with a renewed sense of excitement for Hacker Dojo. We began with the usual board meeting stuff — finances, staff benefits, etc — but there was one final item to discuss that’s more in tune with the reason I’m even on the board. There has been an increasingly pressing issue around what Hacker Dojo is. We used to know what it was and had a reasonable idea of what we always wanted it to be, but we’ve grown, we’ve learned, and our model has to adapt. This discussion led to a rethinking of the conceptual structure of Hacker Dojo.

One of the reasons this came up is growth. We’ve had consistent membership growth with only a couple of expected downturns due to various setbacks. For example, when we were temporarily limited to a maximum occupancy of 49 people, membership dropped because people couldn’t throw the same events as before. Despite setbacks, we’ve had impressive long-term growth. If you describe us as a “hackerspace,” we are the largest in the United States, and I believe one of the largest in the world. It’s clear that overall we’re doing quite well, but we want to take it further. We want to keep pushing because Hacker Dojo means a lot to all of us and we want to see it, and the culture and ideals that go with it, reach new people and new places.

We’ve always talked about franchising and starting new locations, but we’ve learned that a single, 24/7 location with 400 members and around 2,000 people coming through every month is quite difficult to run, especially as a bootstrapped non-profit with minimal staff. We’ve had sponsorships, but we work hard for those sponsorships and provide services to receive them. Most of our income comes from membership fees. Despite all this we’re trying our best to continue to take Hacker Dojo to the next level, and it should be noted we have been quite successful so far, for many reasons that could go into another long blog post.

Scaling any organization is hard. Scaling an organization like this one can be extra difficult, especially when we’re trying to maintain the grassroots, bottom-up culture that started it. We’ve always tried to support the concept of a democratic organization. For the first few years we were 100% volunteer run, with no paid employees. The growth of Hacker Dojo has generated a lot of extra work that nobody really wants to do. Raising money, working with the city, organizing contractors, dealing with financial issues, having to move to a new location and rent the previous one … all this requires a LOT of leg work and consistent attention that just wasn’t happening with volunteers.

Eventually, you need to start hiring full or part-time people. We now have a small staff of paid employees who tackle these more time-intensive tasks. While this has helped us manage the growth, it has created an interesting tension between the forces of the democratic, bottom-up nature of our organization and the forces of the more traditional, more centralized modes of operation necessary for efficient execution of a vision. That vision is, in the short term, to improve the quality of the experience at Hacker Dojo, and in the long term to bring Hacker Dojo and all that it stands for — what some have called the epitome of true Silicon Valley culture — to more people.

This tension has been healthy and has helped Hacker Dojo often reap the benefits of both worlds. However, as we grow, so does the tension and the discussion around it. This was the catalyst for our discussion at the board meeting. Now, the realization I’m excited about is somewhat of an aside from this issue of tension, but the issue was the catalyst for revisiting what Hacker Dojo is. I quite enjoy the occasional existential crisis, as it often results in a refreshed sense of purpose and meaning.

We revisited many ideas, for example that Hacker Dojo is a platform, and like many platforms it can be hard to describe. We serve purposes in the worlds of education, business, social life, and many others. We talked about how Hacker Dojo has played a part in not just projects and startups, but relationships — partnerships, friendships, and even marriages. We considered different organizations for analogy: universities, incubators, fraternities, and anything else that comes close to an existing framework for all the amazingness that Hacker Dojo produces.

Then it hit us. Communities. Plural.

What we realized is that we’ve effectively been treating Hacker Dojo as one community. We’ve sort of acknowledged that there are sub-groups within Hacker Dojo, but we’ve more or less operated under the assumption that we serve two types of citizens: members and the general public. As we’ve gotten larger, it’s become much more difficult to effectively treat either of those as one group. The reality is that they are all actually part of many communities.

The idea was right under our nose the whole time, just under a different guise. Events have always been a core part of Hacker Dojo because Hacker Dojo started with the idea that it could be a place where people could meet and host events like the event that inspired Hacker Dojo itself, SuperHappyDevHouse. When an event happens on a regular basis it turns into a community. A group of people with a common interest and/or set of values come together, turn into a community, and these communities grow, develop, and spawn really interesting things, just as SuperHappyDevHouse spawned Hacker Dojo.

Hacker Dojo now serves many communities, not only internal ones but external ones as well, and those external communities can also leverage the infrastructure that Hacker Dojo provides. These communities might already include a Hacker Dojo member, or someone in the community becomes a member, usually to throw an event. Once that community gets into Hacker Dojo, they not only see Hacker Dojo, but all the other communities that come together under our roof. Sometimes this inspires even more people to sign up as members, not just to be a part of Hacker Dojo, but to participate in and see what other communities Hacker Dojo offers.

We’re now thinking of Hacker Dojo not just as a community hub, but as a hub of communities. A community trading zone, if you will. While we will always support and listen to individual members, we should begin to think of communities, in the plural, as first-class citizens of Hacker Dojo. This may seem like a subtle change but it is a big difference. It means acknowledging and supporting the communities that operate in and around Hacker Dojo. It means going to those communities and asking how we can better serve them as a community, not just individual members.

By connecting with communities we can provide infrastructure to foster their growth and development. Imagine going to the Hacker Dojo website and seeing a page devoted to the many communities of Hacker Dojo. When a new member signs up they could indicate their interests and we could provide them with a list of communities they might be interested in. Hacker Dojo would in a sense then be improving the communities’ “deal flow.”

Rethinking Hacker Dojo as infrastructure for communities has led to lots of exciting new ideas. By focusing on empowering them with infrastructure that allows them to flourish, we are then supporting our members in a more meaningful way.

I’m hoping this simple change in the way the board and invested members think about Hacker Dojo will lead to lots of positive change. We don’t have these conversations very often on the board, but we need to have them to maintain the vision of Hacker Dojo, and we need to have them in public. Clearly this is a collaborative effort, so we want to know how the general community feels about this idea. So I’m putting this out there, and hopefully it will lead to more exciting discussions.

Jun 19, 2013

Dokku: The smallest PaaS implementation you've ever seen

Dokku is a mini-Heroku powered by Docker, written in less than 100 lines of Bash. Once it’s set up on a host, you can push Heroku-compatible applications to it via Git. They’ll build using Heroku buildpacks and then run in isolated containers. The end result is your own, single-host version of Heroku.
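
To make that concrete, a first deploy looks roughly like this (the host and app name below are placeholders, and the git user is the one gitreceive sets up, more on that below):

$ cd node-js-app
$ git remote add dokku git@dokku.example.com:node-js-app
$ git push dokku master

The push kicks off a buildpack build on the host, and if it succeeds the app comes up in its own container.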

Dokku is under 100 lines because it’s built out of several components that do most of the heavy lifting: Docker, Buildstep, and gitreceive.

  • Docker is a container runtime for Linux. This is a high-level container primitive that gives you a similar technology to what powers Heroku Dynos. It provides the heart of Dokku.
  • Buildstep uses Heroku’s open source buildpacks and is responsible for building the base images that applications are built on. You can think of it as producing the “stack” for Dokku, to borrow a concept from Heroku.
  • Gitreceive is a project that provides you with a git user that you can push repositories to. It also triggers a script to handle that push. This provides the push mechanism that you might be familiar with from Heroku.

There are a few other projects being developed to support Dokku and expand its functionality without inflating its line count. Each project is independently useful, but I’ll share more about these as they’re integrated into Dokku.

For now, here’s a screencast that shows how to set up Dokku, along with a quick walk-through of the code.

Jan 05, 2013

Executable Tweets and Programs in Short URLs

A few weeks ago I was completely consumed for the better part of a day that I would have otherwise spent on more practical work.

Yeah, what? Weird, right? It started from a Twitter conversation earlier that day with my friend Joel:

This wishful brainstorming inspired me to start building exactly that. But first, a digression.

It reminded me of an idea I got from Adam Smith back when I was working on Scriptlets: if you can execute code from a URL, you could “store” a program in a shortened URL. I decided to combine this with the curl-pipe-bash technique that’s been getting popular for bootstrapping installs. If you’re unfamiliar, take this Gist of a Bash script:

Given the “view raw” URL for that Gist, you can curl it and pipe it into Bash to execute it right there in your shell. It would look like this:

$ curl -s https://gist.github.com/raw/4464431/gistfile1.txt | bash
Hello world

Instead of having Gist store the program, how could we make it so the source would just live within the URL? Well, in the case of curl-pipe-bash, we just need that source to come back in the response body when the URL is fetched. So I built a simple app to run on Heroku that takes the query string and outputs it in the body, a sort of echo service.

Letting you do this:

$ curl "http://queryecho.herokuapp.com?Hello+world"
Hello world

Which you could conceal and shorten with a URL shortener, like Bitly. I prefer the j.mp domain Bitly has. And since they’re just redirecting you to the long URL, you’d use the -L option in curl to make it follow redirects:

$ curl -L http://j.mp/RyUN03
Hello world

When you make a short URL from the bitly website, they conveniently make sure the query string is properly URL encoded. So if I just typed queryecho.herokuapp.com/?echo "Hello world" into bitly, it would give me a short URL with a properly URL encoded version of that URL that would return echo "Hello world". This URL we could then curl-pipe into Bash:

$ curl -Ls http://j.mp/VGgI3o | bash
Hello world

See what’s going on there? We wrote a simple Hello world program in Bash that effectively lives in that short URL. And we can run it with the curl-pipe-bash technique.
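
As an aside, you don’t strictly need bitly to handle the encoding; curl can URL-encode the query string itself with -G and --data-urlencode, hitting queryecho directly. Nothing lives in a short URL this way, it just shows the same moving parts:

$ curl -Gs --data-urlencode 'echo "Hello world"' http://queryecho.herokuapp.com/ | bash
Hello world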

Later in our conversation, Joel suggested an example “app tweet” that, if executed in Bash with a URL argument, would tell you where that URL redirects. So if you gave it a short URL, it would tell you the long URL.

Just so you know what it would look like, if you put that program in a shell script and ran it against a short URL that redirected to www.google.com, this is what you would see:

$ ./unshortener.sh http://j.mp/www-google-com
http://j.mp/www-google-com
http://www.google.com/

It prints the URL you gave it and then resolves the URL and prints the long URL. Pretty simple.
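
If you’re curious, here’s a sketch of what unshortener.sh could look like. It’s based on the one-liner that appears later in this post, just taking the URL as a normal argument:

#!/bin/bash
# print the URL we were given, then follow it and print where it ends up
url="$1"
echo "$url"
curl -ILs "$url" | grep Location | grep -o 'http.*'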

So I decided to put this program in a short URL. Here we have j.mp/TaHyRh which will resolve to:

http://queryecho.herokuapp.com/?echo%20%22$url%22;%20curl%20-ILs%20%22$url%22%20|%20grep%20Location%20|%20grep%20-o%20'http.*'

Luckily I didn’t have to do all that URL encoding. I just pasted his code in after queryecho.herokuapp.com/? and bitly took care of it. What’s funny is that this example program is made to run on short URLs, so when I told him about it, my example ran on the short URL that contained the program itself:

$ curl -Ls http://j.mp/TaHyRh | url=http://j.mp/TaHyRh bash
http://j.mp/TaHyRh
http://queryecho.herokuapp.com/?echo "$url"; curl -ILs "$url" | grep Location | grep -o 'http.*'

You may have noticed my version of the program uses $url instead of $1 because we have to use environment variables to provide input to curl-pipe-bash scripts. For reference, to run my URL script against the google.com short URL we made before, it would look like this:

$ curl -Ls http://j.mp/TaHyRh | url=http://j.mp/www-google-com bash
http://j.mp/www-google-com
http://www.google.com/

Okay, so we can now put Bash scripts in short URLs. What happened to installing apps in Tweets? Building an apptweet program like Joel imagined would actually be pretty straightforward. But I wanted to build and install it with these weird programs-in-short-URLs.

The first obstacle was figuring out how to get it to modify your current environment. Normally curl-pipe-bash URLs install a downloaded program into your PATH. But I didn’t want to install a bunch of files on your computer. Instead I just wanted to install a temporary Bash function that would disappear when you leave your shell session. In order to do this, I had to do a variant of the curl-pipe-bash technique using eval:

$ eval $(curl -Ls http://j.mp/setup-fetchtweet)
$ fetchtweet 279072855206031360
@jf you asked for it... Jeff Lindsay (@progrium) December 13, 2012

As you can see by inspecting that URL, it just defines a Bash function that runs a Python script from a Gist. I cheated and used Gist for some reason. That Python script uses the Twitter embed endpoint (same one used for the embedded Tweets in this post) to get the contents of a Tweet without authentication.
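
In other words, the body behind that short URL just has to be a function definition for eval to pull into your current shell. Here’s a hypothetical sketch of what a URL like that could return (the Gist URL and filename are made up):

# defines fetchtweet in the current shell; the real work happens in a
# Gist-hosted Python helper that hits the Twitter embed endpoint
fetchtweet() {
  curl -s https://gist.github.com/raw/XXXXXXX/fetchtweet.py | python - "$@"
}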

The next thing I built installed and used fetchtweet to get a Tweet, parse it, and put it in a Bash function named by the string after an #exectweet hashtag (which happens to also start a comment in Bash). So here we have a Tweet with a program in it:

To install it, we’d run this:

$ id=279087620145958912 eval $(curl -Ls http://j.mp/install-tweet)
Installed helloworld from Tweet 279087620145958912
$ helloworld
Hello world

We just installed a program from a Tweet and ran it! Then I wrapped this up into a command you could install. To install the installer. This time it would let you give it the URL to a Tweet:

$ eval $(curl -Ls http://j.mp/install-exectweet) 
Installed exectweet
$ exectweet https://twitter.com/progrium/status/279087620145958912
Installed helloworld from Tweet 279087620145958912
$ helloworld
Hello world
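
Both installers boil down to the same trick. Roughly, the script behind the install-tweet URL has to do something like this hypothetical sketch, which assumes the Tweet text looks like “<some bash> #exectweet <name>” (the real script may differ):

# hypothetical sketch, not the actual script behind http://j.mp/install-tweet
# (the real one also installs fetchtweet first)
tweet="$(fetchtweet "$id")"
name="$(echo "$tweet" | sed -n 's/.*#exectweet *\([A-Za-z0-9_-]*\).*/\1/p')"
body="${tweet%%#exectweet*}"      # everything before the hashtag is the program
eval "$name() { $body; }"
echo "Installed $name from Tweet $id"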

Where would I go from there? An app that installs and runs itself in a loop, of course!

$ exectweet https://twitter.com/progrium/status/279123541054595074 && recursive-app
Installed recursive-app from Tweet 279123541054595074
Installed recursive-app from Tweet 279123541054595074
Installed recursive-app from Tweet 279123541054595074
Installed recursive-app from Tweet 279123541054595074
...

Obviously, this whole project was just a ridiculous, mind-bending exploration. I shared most of these examples on Twitter as I was making them. Here was my favorite response.

You may have noticed, it just happened to be 12/12/2012 that day.
