
What is the article about?

In the inc(AI) team, we regularly try out new tools and methods. Slack time has proven to be a good way to do this, whether we learn on our own or together with others.

The motivation for this slack time comes from the challenges we face at work. Our strategy is to look to the open-source community first for solutions that suit us, because it is very likely that someone else has already solved a similar problem. The solutions also have to meet our criteria: simple, scalable and production-friendly.

A while ago, we used our slack time to examine a new tool called Dagster. Dagster is an orchestration tool that controls and manages tasks in data and ML pipelines. It determines where and when the individual steps of a pipeline are executed, and it stores metadata along the way.

Fig. 1: Data lineage of a use case

The use of Dagster contributes to OTTO's moal (mid-term goal) of "effective and efficient organization". In our area in BI, we focus on "best practices for leveraging technical synergies".

Orchestrating pipelines is a standard task in our day-to-day work. Comparable tools existed before Dagster and are used in BI, each with its own characteristics. However, Dagster does quite a few things differently. For example, it introduces the concept of "software assets", which improves the structure of pipelines and increases reusability. In addition, data lineage can be visualized easily, which is a major advantage when troubleshooting.


Fig. 2: Metadata of an asset
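
To make the "software assets" idea and the resulting lineage more concrete, here is a minimal sketch in plain Python. It is a toy illustration only and deliberately does not use Dagster: the names `ASSETS`, `asset`, `upstream`, `materialize` and `lineage`, as well as the order data, are hypothetical and invented for this example. The assumption behind it is that each asset is a function producing a piece of data, and that its upstream assets can be read from its parameter names, which is similar in spirit to how Dagster's `@asset` decorator infers dependencies.

```python
import inspect
from typing import Any, Callable, Dict, List, Optional

# Hypothetical mini-registry for this sketch; this is NOT Dagster's API.
ASSETS: Dict[str, Callable[..., Any]] = {}


def asset(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function as an asset under its own name."""
    ASSETS[fn.__name__] = fn
    return fn


def upstream(name: str) -> List[str]:
    """Upstream assets are taken from the parameter names of the function."""
    return list(inspect.signature(ASSETS[name]).parameters)


def materialize(name: str, cache: Optional[Dict[str, Any]] = None) -> Any:
    """Compute an asset after computing everything it depends on."""
    cache = {} if cache is None else cache
    if name not in cache:
        inputs = {dep: materialize(dep, cache) for dep in upstream(name)}
        cache[name] = ASSETS[name](**inputs)
    return cache[name]


def lineage(name: str) -> List[str]:
    """Return the asset and all of its upstream assets in execution order."""
    order: List[str] = []

    def visit(current: str) -> None:
        for dep in upstream(current):
            visit(dep)
        if current not in order:
            order.append(current)

    visit(name)
    return order


@asset
def raw_orders():
    # Made-up sample data standing in for a real source table.
    return [
        {"id": 1, "amount": 20.0},
        {"id": 2, "amount": 5.0},
        {"id": 3, "amount": 12.5},
    ]


@asset
def large_orders(raw_orders):
    # Depends on raw_orders because of the parameter name.
    return [order for order in raw_orders if order["amount"] >= 10]


@asset
def revenue_report(large_orders):
    # Depends on large_orders; reusable by any downstream asset.
    return {
        "orders": len(large_orders),
        "revenue": sum(order["amount"] for order in large_orders),
    }


if __name__ == "__main__":
    print(lineage("revenue_report"))
    print(materialize("revenue_report"))
```

Because every asset declares what it needs, the lineage of `revenue_report` falls out of the definitions without a separately maintained pipeline description. A real orchestrator adds scheduling, persistence and a UI on top of this idea.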

Dagster is easy to install and run on a local computer. This is especially useful while developing a pipeline, because it significantly shortens test cycles. A Dagster cloud instance then takes care of production operation. Dagster's web user interface is very user-friendly, so even less technical colleagues can operate the production pipeline.

There is no simple path without obstacles:


Of course, we also had some difficult moments with Dagster. One example lies in the nature of open-source software: very fast development leads to incompatibilities and instability. In such cases, we discuss in detail together which change or fix our project needs, and then share our findings.

Once we had successfully tested Dagster in our internal projects and handed it over to individual teams, we were able to migrate an existing use case from Argo to Dagster together with team Warp.

At the moment, several teams are evaluating Dagster for their use cases, and interest is growing. So is the group of people who exchange views on Dagster and support each other in their day-to-day work. This is a double win for inc(AI): we create a solution for ourselves and for the other teams.



Written by

Christian Kalla
Senior Machine Learning Engineer (OTTO BI)
Jürgen Jäger
Senior Business Owner (OTTO BI)
Konstantinos Stavropoulos
Senior Data Scientist (OTTO BI)
Tobias Krause
Senior Software Developer (OTTO BI)
Tung Dang
Senior Software Engineer (OTTO BI)
