Are you interested in data and cloud technology? Would you like to understand more about different roles in development teams? Are you looking for a new job? This blog post is about our way of working and our job profiles at Otto Group data.works.
Today, meet Sven Peeters, Data Engineer at Otto Group data.works since February 2021.
In short, as a Data Engineer I am responsible for designing and implementing new features for different data-driven services. Our products range from computing customer profiles at large scale to hosting HTTP services that categorize the content of web pages using machine learning. The data we provide is mainly used to enrich programmatic advertising with behavioral and contextual information.
Because the interpretation of role profiles and titles can vary between companies, we want to give a rough overview of how we define these roles.
I am most proud of the rework of the full-text search backend of our contextual targeting product. Because of the rapidly increasing load on the product, our old PostgreSQL backend was struggling and had reached the limits of its scalability. Although I was a Junior Data Engineer at the time, my team enabled me to take a leading role in the design and development of the new backend. This included choosing Apache Solr as our new full-text search engine, designing the cloud architecture, and migrating or rewriting the affected components of the old backend. The new backend has been online since April and runs like a charm.
In our daily work we use many recent and impressive technologies such as Kubeflow Pipelines, Argo, and Apache Beam on Dataflow, so choosing one that stands out is not easy. If I had to pick a favorite, I would choose Dataflow in conjunction with Java. The serverless approach of Dataflow and the pipeline semantics of Apache Beam, which unifies batch and stream processing, let me write Java code that processes huge amounts of data without spending hours thinking about how to scale the program and distribute the processed data across machines effectively. Jobs that would run for days on a single machine are executed seamlessly in a few minutes on more than 100 machines in parallel. It’s always fun to see how my code gets distributed and executed in parallel across a huge number of machines in the Google Cloud Platform.
In my free time I like to play table tennis with friends in my local table tennis club. Besides playing table tennis, I have a huge passion for motorsport. I watch almost every session of each Formula 1 weekend and love to spend some time on virtual motorsport circuits.
Otto Group data.works offers me the opportunity to pursue my passion for big data processing and machine learning, and to learn a lot about it, while developing smart and cool large-scale data products in a young and dynamic team of data enthusiasts.
This article was originally published on Medium and can be read there as well.
Want to be part of our team?