I attended BED-CON in Berlin on April 3 and 4, 2014, where I gave a talk about our experiences in the LHOTSE project from the perspective of Operations. The conference was held on the campus of the Freie Universität Berlin, which gave it a very pleasant campus atmosphere. The talks were heavy on Java and development and were therefore a very good fit for OTTO and our store. Here is my personal conference report.
The talk "Search Driven Applications" by Tobias Kraft and Florian Hopf confirmed our own approach. The speakers recommended moving search systems closer to the center of modern web applications in order to take load off the database systems behind them, especially for complex search queries or faceted searches. At Otto, we take a similar approach and serve parts of the store from our search system instead of querying every product from the database.
I also found "REST: Implementation Approaches on the JVM" by Stefan Tilkov and Martin Eigenbrodt very interesting. It is really exciting to see how different programming languages and frameworks achieve the same thing in the end. Stefan Tilkov's preference for Clojure livened up the talk nicely. Clojure really looks very interesting, in the truest sense of the word.
Eberhard Wolff has dug the grave for Java app servers - and throws servlet containers that host only a single application in right after them. His ideas are sound: a) everything fits into the build artifact of a development team, and b) an app server is always specifically tailored to one application anyway. However, I still see the integration of operations as an open point of debate. In the talk, Eberhard Wolff more or less presented admin tools and dev tools (e.g. WAR vs. RPM artifacts) as complementary. From my point of view, things get exciting, especially for operations, when you gain deeper insights and combine both worlds. Nevertheless, the approach of running web applications leaner and bundling more parts, such as the servlet container, directly into the software artifact is absolutely correct.
In his presentation "Let there be light!", Oliver Fischer reported on how he set up "in-application monitoring". I found the use of metrics and the integration of graylog2 into his environment very exciting. I had actually expected something about Graphite or Nagios, but was pleasantly surprised by something different. I took away some approaches here that I would like to try out at our company.
Michael Plöd drew on a wide range of experience to talk about caching in business applications. He explained various best practices and gave many practical tips for using caching systems. The live example, in which he showed the performance of a Java application with and without caching, was really interesting. The cache was a distributed system based on 5 Raspberry Pis which, assembled in a Lego case, looked like a small data center. For otto.de we use Varnish as a cache server, but the presentation was oriented more towards application caches.
A highlight was the presentation "Continuous Delivery with Docker" by Dr. Halil Cem Gürsoy. Docker is a system for creating, managing, and deleting Linux containers - and it looked promising even before the presentation. The live demo in particular proved to me that Docker is not only extremely fast, but also much more flexible than full virtualization. That Docker, and the use of Linux containers in general, is a hot topic was shown, among other things, by the completely overcrowded room. We have already taken our first steps with Docker, and the talk has encouraged me to turn those first steps into ongoing tests.
We also use Kibana 3 in combination with logstash and elasticsearch, so I did not want to miss Alexander Reelsen's talk. Logstash is developing more and more into a universal tool that can achieve good results even with a lot of data. The flexibility of Kibana 3 is quite impressive - Alexander Reelsen showed how you can create very insightful dashboards in a very short time. He demonstrated all of this with a live example in which he loaded data from the REST interface of meetup.com. Very interesting.
Last but not least, I attended Lennart Koopmann's talk on graylog2. Similar to Kibana, graylog2 lets you do analysis based on log files. While Kibana gives the impression that it merely happens to be able to handle log files, graylog2 feels as if it was made specifically for them. The focus here is not on being as flexible as possible, but on gaining insights from log files as quickly as possible. graylog2 also offers authentication, which is missing in Kibana 3.
My personal highlight was my own talk "From Ops to Platform Engineering - The story of an agile transformation using the example of otto.de", which René Lengwinat and I gave together. We really enjoyed sharing our experiences. In a very well-attended room, it took us just over an hour to present the agile transformation in otto.de's operations. We went into detail about our key learnings.
Based on the many questions and lively discussions after the talk, I was struck by how closely we work as an operations team with software development compared to other companies. Things that we now take for granted obviously triggered an "aha" effect in some listeners. Deploying multiple times a day is not yet "state-of-the-art" everywhere. The provision of independent Continuous Integration servers per development team - provisionable from a central puppet repository - is also apparently rarely found elsewhere.
At otto.de, for example, we correlate metrics with each other. Specifically, every deployment is itself recorded as a metric. In the same system we also record the number of HTTP errors and the CPU utilization. This lets us see exactly when a deployment increases error rates, and it helps us find the cause of peaks in graphs faster. Standard tools rarely come with something like this - so we build it ourselves.
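To make the idea of correlating deployments with error rates more concrete, here is a minimal sketch in Python. It is not our actual tooling: the names `Sample`, `error_rate_change`, and `suspicious_deployments`, as well as the window and threshold values, are assumptions chosen purely for illustration. The sketch compares the average number of HTTP errors in a time window before and after each deployment and flags deployments after which the error rate rose noticeably.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Sample:
    """One metric data point (hypothetical structure for this sketch)."""
    timestamp: int   # seconds since some epoch
    value: float     # e.g. HTTP errors per minute


def error_rate_change(deploy_ts: int, errors: List[Sample],
                      window: int) -> Optional[float]:
    """Mean error value after the deployment minus the mean before it.

    Returns None if either window contains no samples.
    """
    before = [s.value for s in errors
              if deploy_ts - window <= s.timestamp < deploy_ts]
    after = [s.value for s in errors
             if deploy_ts <= s.timestamp < deploy_ts + window]
    if not before or not after:
        return None
    return mean(after) - mean(before)


def suspicious_deployments(deployments: List[int], errors: List[Sample],
                           window: int = 300,
                           threshold: float = 5.0) -> List[Tuple[int, float]]:
    """Return (deployment timestamp, change) for deployments whose
    error rate rose by at least `threshold` (an assumed cut-off)."""
    flagged = []
    for ts in deployments:
        change = error_rate_change(ts, errors, window)
        if change is not None and change >= threshold:
            flagged.append((ts, change))
    return flagged


if __name__ == "__main__":
    # Synthetic data: errors jump from 2 to 10 per minute at t=600.
    errors = [Sample(t, 2.0 if t < 600 else 10.0) for t in range(0, 1200, 60)]
    print(suspicious_deployments([300, 600], errors))
```

In a real setup the deployment timestamps and error samples would come from the same metrics store, which is exactly what makes this kind of comparison cheap once deployments are recorded as events.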
We rely on a lot of open-source tools for www.otto.de and have only a few support contracts. "How did you get that through to management?" was a question that came up very quickly. It wasn't that hard, by the way: in our case it is more important to solve a problem ourselves than to rely on a support vendor who usually needs time to solve it as well.
In summary, I felt confirmed that we do things "right" at Otto - not only with the right technologies, but also with the right ideas and the right amount of fun.
Here are our presentation slides: Shared_Nothing_But_Ops