In the last two months, we started our journey towards a new microservices architecture. Among other things, we found that our existing CD tools were not ready to scale with the new requirements. So we tried a new approach and defined our pipelines in code using LambdaCD. In combination with a Mesos cluster, we can deploy a new application within a few minutes and see how it fits into our architecture by running tests against existing services.
Part 1: The underlying infrastructure
Part 2: Microservices and continuous integration
Part 3: Current architecture and vision for the future
Virtualization has become very popular over the last ten years and is now part of the data centers of almost all companies, because it makes better use of the available hardware. Of course we use it for our applications, and it is very flexible: if I need a new VM, I just open an operations ticket and wait an hour. But that is exactly the problem. We are talking about virtual machines. If it is virtual, there should be a way to automate the whole process. For me as a developer it is not important to know on which physical or virtual machine my application is running, which is why it would be nice to have a platform that lets me deploy my application via a simple REST API.
This first part of my article will give you the big picture of the infrastructure we use to run microservices.
Each microservice is shipped in its own Docker container and can be deployed on any machine that has the Docker runtime installed. You can think of Docker containers as lightweight VMs that are built from arbitrary layers. You choose a base image, e.g. Ubuntu 14.04, and add the software and files you need to run your application. Each command (copy files into the image, install new packages, set an environment variable, ...) creates a new layer. The advantage: if I want to run my container on another machine, I only have to transfer the layers which are not already there. In the best case this is only the last layer, containing your latest code changes.
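As an illustration, a minimal Dockerfile could look like the following sketch. The base image matches the example above; the package, environment variable and artifact names are placeholders for whatever your service needs:

```dockerfile
# Base image: the first layer
FROM ubuntu:14.04

# Each of the following instructions creates a new layer
RUN apt-get update && apt-get install -y openjdk-7-jre-headless
ENV APP_ENV production
COPY target/my-service.jar /opt/my-service/my-service.jar

CMD ["java", "-jar", "/opt/my-service/my-service.jar"]
```

If only the COPY line changes between builds, only that last layer has to be transferred to a new machine.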
Back to our overview: it is very convenient to ship your application in a container which requires only the Docker runtime, because we don't have to care about the underlying hardware. For a quick test, start the container on your local machine; for the deployment, do the same on the server, without having to prepare the server by installing packages in specific versions. The only thing you need is Docker.
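For example (the image name, port and server are placeholders):

```sh
# Quick test on your local machine ...
docker run --rm -p 8080:8080 registry.example.com/my-service:latest

# ... and the same on the server; the only prerequisite there is Docker
ssh app-server 'docker run -d -p 8080:8080 registry.example.com/my-service:latest'
```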
Mesos
Apache Mesos is an open-source project abstracting all hardware resources in your cluster.
Every slave tells the master how many CPUs, how much RAM and disk space it can offer.
The master collects all this information and acts as the contact point for Mesos frameworks. At OTTO we use Marathon to run applications in the cluster. Mesos sends the collected hardware information to the framework, which then selects a slave that offers the necessary hardware.
The Mesos master forwards the framework's decision and the task (start the Docker container with the specified URL) to the slaves, which run the task and allocate the resources. Finally, the slaves send their updated resource offers back to the master.
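For illustration, a slave could be started roughly like this; the master address and resource sizes are placeholders, and the flags follow the Mesos slave's command line:

```sh
# A slave joins the cluster and advertises the resources it can offer
mesos-slave \
  --master=zk://zk1:2181,zk2:2181,zk3:2181/mesos \
  --resources="cpus:8;mem:16384;disk:102400" \
  --containerizers=docker,mesos
```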
It is important to know that if you only want to run one instance of an application, you can't distribute it over different slaves. You have to choose one which offers the resources you need. But if you run your application in HA mode with more than one instance, you can tell Marathon to use a different slave for every instance. If you don't set this option, all your instances could end up on the same slave, and when this slave goes down your application is offline until Marathon restarts it on another slave.
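In Marathon this option is expressed as a constraint in the application's JSON definition (described in more detail below); a minimal fragment could look like this, with the id as a placeholder:

```json
{
  "id": "/my-service",
  "instances": 3,
  "constraints": [["hostname", "UNIQUE"]]
}
```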
One problem we encountered is that applications running in the Mesos cluster can't persist any data: after a restart, an application can't access the files it wrote before. It is possible to mount a filesystem from the host, but then you have to guarantee that your application is always deployed to the same slave and that the same filesystem is assigned to it. To solve this problem we run our databases outside of the cluster in traditional VMs. This, however, is only a workaround for the next months.
Marathon
If you want to start an application in your Mesos cluster, you can't send your requests directly to the master. You have to use a framework which speaks the Mesos protocol, i.e. understands the hardware offers from the slaves and defines your tasks. A very popular framework for this is Marathon, which provides a user-friendly API to deploy Docker containers together with their hardware requirements.
For every application you create a JSON configuration file which defines a unique id, the required resources, the URL of the Docker image, etc. Marathon stores these configuration files, and you can restart an application with an old one if you notice difficulties after a deployment.
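A sketch of such an app definition, assuming a Dockerized HTTP service (all names, sizes and URLs are placeholders):

```json
{
  "id": "/my-service",
  "cpus": 0.5,
  "mem": 512,
  "instances": 2,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "registry.example.com/my-service:1.0.0",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 8080, "hostPort": 0 }
      ]
    }
  }
}
```

Setting hostPort to 0 lets the slave map the container port to a random free port, which is exactly the port mapping picked up by the load balancer described further below.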
To start a new application or to restart a running one you use the Marathon REST API or the web UI. Both offer the same functionality, so we use the former for automated processes, i.e. deployments in our shell scripts, and the latter for manual processes like scaling, restarting and troubleshooting.
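A deployment from a shell script could then look roughly like this; the Marathon host and application id are placeholders:

```sh
# Create or update the application from the stored app definition
curl -X PUT -H "Content-Type: application/json" \
  http://marathon.example.com:8080/v2/apps/my-service \
  -d @my-service.json

# Scale to four instances
curl -X PUT -H "Content-Type: application/json" \
  http://marathon.example.com:8080/v2/apps/my-service \
  -d '{ "instances": 4 }'
```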
If your application crashes, Marathon restarts it to avoid downtimes caused by manual intervention. To use this feature, every application has to implement a simple service which responds to health checks. This check can return an error code when internal processes go wrong or when the application should be restarted, e.g. to reload its configuration.
Another important feature is the rolling restart. You can configure how many instances should always be running, which is necessary if you want to guarantee zero-downtime deployments.
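Both the health check and the rolling-restart behavior are configured in the app definition. A hedged fragment could look like this; the /health path and the numbers are just examples, and a minimumHealthCapacity of 1.0 keeps all configured instances healthy during a deployment:

```json
{
  "healthChecks": [
    {
      "protocol": "HTTP",
      "path": "/health",
      "portIndex": 0,
      "gracePeriodSeconds": 30,
      "intervalSeconds": 10,
      "maxConsecutiveFailures": 3
    }
  ],
  "upgradeStrategy": { "minimumHealthCapacity": 1.0 }
}
```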
Proxy server
The third component in our infrastructure is a proxy server which also acts as a load balancer.
When Marathon starts or scales an application, we generate a new proxy configuration with URLs derived from the application IDs. On the slaves, every public port of an application is mapped to a random port to avoid clashes.
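For illustration, such a generated configuration could look like the following sketch; we use HAProxy syntax here purely as an example, and hosts, ports and domains are placeholders:

```
frontend http-in
    bind *:80
    acl is_my_service hdr(host) -i my-service.example.com
    use_backend my_service if is_my_service

backend my_service
    # one line per running instance, using the randomly mapped host ports
    server my_service_1 slave1:31001 check
    server my_service_2 slave2:31417 check
```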
Let's have a look at the deployment process to see how all components interact with each other.
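In a simplified form, the whole flow could look like this sketch; all names are placeholders:

```sh
# 1. Build the image and push it to the registry (only new layers are transferred)
docker build -t registry.example.com/my-service:42 .
docker push registry.example.com/my-service:42

# 2. Ask Marathon to deploy the new version
curl -X PUT -H "Content-Type: application/json" \
  http://marathon.example.com:8080/v2/apps/my-service \
  -d @my-service.json

# 3. Marathon selects a suitable slave via Mesos, the slave pulls the missing
#    layers and starts the container, and the load balancer configuration is
#    regenerated so the new instance receives traffic
```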
Our infrastructure is very powerful and gives us the ability to deploy new applications within a few minutes. We can integrate our microservices at an early stage, which gives us a fast feedback loop. This way we avoid developing our services in the wrong direction.
For you as a developer it is very easy to test new implementations, and you don't have to wait for someone else to spend time creating a VM for you. Just deploy your application and destroy it if you don't like it. This infrastructure allows us to create microservices in a very dynamic way.
In the next part we will start digging into the world of microservices. You will learn how we define the term "microservice" and why we prefer this architectural style. Additionally, you will be introduced to LambdaCD, which lets us define CI pipelines in code.