September 23, 2016

Use of a domain-specific language

What is this article about?

The software that handles transformation and data delivery at OTTO is called ProPHET (Product Data Partner Feed Handling & Export Tool). This article shows the advantages of switching to a domain-specific language (DSL) to describe the transformation of this data.

When shop data is exported to price comparison sites, each operator specifies precisely the format in which the product data must be delivered. For example, one operator wants the item price as a number in cents, while another wants it as a string with a decimal comma and a euro sign. The requirements for stating availability, shipping costs or images also differ widely.
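As a rough illustration of these diverging partner requirements, the following Python sketch renders the same price for two hypothetical partners. The function names are assumptions made for this example and are not part of ProPHET, which runs on the JVM.

```python
def price_in_cents(cents: int) -> int:
    """Hypothetical partner A: expects the price as a plain number of cents."""
    return cents


def price_as_german_string(cents: int) -> str:
    """Hypothetical partner B: expects a string with decimal comma and euro sign."""
    euros, rest = divmod(cents, 100)  # integer arithmetic avoids float rounding
    return f"{euros},{rest:02d}€"


partner_a_price = price_in_cents(2999)
partner_b_price = price_as_german_string(2999)
print(partner_a_price, partner_b_price)
```

Both values come from the same source attribute; only the presentation differs per partner.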


Previously, data preparation in ProPHET was organized as chains of functions. To pick up the example above: a first function removed special characters. Its output attribute could then serve as the input attribute of the next function, which turned the amount in euro cents (e.g. 2999) into a decimal number (29.99). Another function converted this into a string in German format ("29,99"). Finally, a further function appended the euro sign ("29,99€").

The original design focused on the reusability of each step of this process. Each function generated an output attribute from an input attribute, which was also available to all other functions.
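The chained design can be sketched as follows. This is a simplified Python model, not the original implementation; the attribute names, the raw input value and the `run_chain` helper are assumptions made for illustration.

```python
def run_chain(product, steps):
    """Apply each step in order and store its result as a new attribute."""
    for func, source, target in steps:
        product[target] = func(product[source])
    return product


# Each step reads one attribute and writes another (all names are hypothetical).
steps = [
    (lambda s: "".join(ch for ch in s if ch.isdigit()), "price_raw", "price_clean"),
    (lambda s: int(s) / 100, "price_clean", "price_decimal"),
    (lambda d: f"{d:.2f}".replace(".", ","), "price_decimal", "price_german"),
    (lambda s: s + "€", "price_german", "price_with_currency"),
]

product = run_chain({"price_raw": "2999*"}, steps)
print(product["price_with_currency"])
print(sorted(product))  # the intermediate attributes stay on the product
```

Only the last attribute is needed for the export, yet in this model every intermediate result is stored alongside it.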

However, the system had a few weaknesses:

  1. It created a vast number of attributes that merely represented intermediate steps and were used in only a single processing chain.
  2. The number of functions kept growing over time. Clarity suffered: more and more functions were created that differed only minimally, and even experienced business users lost the overview.
  3. The functions had to be maintained and chained one after the other in a UI, which made editing very tedious (create function 1, create function 2, then link the functions...).
  4. All intermediate functions had to be calculated one after the other (for the OTTO assortment, each function runs over several hundred thousand items). Because the individual steps are simple, they need little computing time, but persisting the intermediate results costs time and memory. RAM requirements in particular kept growing, because the data should be held entirely in RAM where possible to avoid even slower access to mass storage.
Function execution before the introduction of DSL
Previous ProPHET function calculation result

The last point could certainly have been countered for quite a while with more hardware. But clarity and usability are also fundamental requirements for a system, so the team came up with the idea of offering a different approach to the employees who maintain the exports to the individual partners.

Here is an example of function creation according to the old system. For each individual step in the transformation, a separate function must be created, saved, and then linked to the other functions using input and output attributes. Each of these actions in turn requires roundtrips to the server for saving and editing and picking the created attributes for the next step.

Function creation according to previous logic

As a solution, the team proposed a domain-specific language (DSL) that would let the business department transform the data in a clearly laid-out editor. After some initial considerations, we decided to present the idea to the department at an early stage in order to obtain feedback.

There was initial skepticism about using something that resembles a programming language. We countered it by pointing out that what the department already did in Excel every day went far beyond simple programming, and that we had no doubt they could handle it.

After some preliminary considerations, the decision was made to implement the DSL with Groovy, because custom parsers can be created very easily with it and it integrates very well with the rest of the system, which runs on the JVM.

In the frontend we use CodeMirror, a flexible code editor written in JavaScript that can be easily configured and extended. It offers code completion and syntax highlighting, and syntax errors can be marked in the edited code and given a meaningful error message.

To compute the functions, the scripts are first precompiled. For execution on a given product, each script is bound to a new context that provides all the variables it needs for the computation. Each DSL script returns exactly one result.
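The compile-once, bind-per-product idea can be sketched in Python as follows. ProPHET itself does this with Groovy on the JVM; the script text, the `evaluate` helper and the attribute name below are assumptions for illustration only.

```python
# A hypothetical one-expression script; the real DSL scripts are Groovy.
script_source = "attr['name'].upper() if attr['name'] else ''"

# Compile once and reuse the compiled object for every product.
compiled_script = compile(script_source, "<dsl-script>", "eval")


def evaluate(script, product):
    """Bind the compiled script to a fresh context for one product."""
    # Emptying __builtins__ only mimics hiding unneeded commands;
    # it is not a real security sandbox.
    context = {"__builtins__": {}, "attr": product}
    return eval(script, context)  # each script yields exactly one result


products = [{"name": "Notebook"}, {"name": ""}]
results = [evaluate(compiled_script, p) for p in products]
print(results)
```

Because the context is created per product, no state leaks from one evaluation to the next, while the parsing cost is paid only once.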

The DSL scripts use the control structures that Groovy already provides, such as if and else. Apart from string and arithmetic operations, all commands and libraries that are not required are blacklisted and therefore unavailable from the outset. In addition, the DSL defines its own functions that are needed for transforming the data, such as GREATER_THAN, EXTRACT_ALL, SUBLIST, GET_LIST_ITEM, MATCHES and REPLACE. Their names are modeled on the notation of the Excel macro functions.

The functions written in the DSL are passed an object for processing that holds all relevant information about a product. In the scripts, this information is accessed via ordinary dot notation (e.g. attr.var_description to use the description of the product).

Within a script, each DSL function automatically uses the result of the previous function as its input value. Only where this is not desired is a constant defined or a value taken from the passed object with the product attributes. This improves readability and makes the program flow clearer.

    CONCAT("Hello ", "World")   // "Hello World" will be the current state
    REPLACE("Hello", "Moin")    // "Moin World"
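One way to picture this implicit "current state" is a small object that remembers the last result. The Python class below is only a sketch of the concept under that assumption; it says nothing about how the Groovy implementation is actually structured.

```python
import re


class Script:
    """Keeps the result of the last DSL call as the current state."""

    def __init__(self):
        self.state = None

    def CONCAT(self, *parts):
        # Explicit inputs: the joined string becomes the current state.
        self.state = "".join(parts)
        return self.state

    def REPLACE(self, pattern, replacement):
        # No explicit input: operate on the current state.
        self.state = re.sub(pattern, replacement, self.state)
        return self.state


s = Script()
s.CONCAT("Hello ", "World")
s.REPLACE("Hello", "Moin")
print(s.state)
```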

Within a script all necessary transformations are now performed to create the target attribute. Further intermediate steps are no longer necessary. For efficiency in processing, the Groovy shell object is reused and the functions are kept compiled. Thus, only the context must be passed and the result must be retrieved.

For example, a DSL script could look like this:

    // Does the product have a name?
    if (NOT_EMPTY(attr.var_name)) { // "Testmarke® Notebook, Elektron
        // Replace all characters except word characters, whitespaces, periods and commas
        // The result of the REPLACE will be the current state
        REPLACE(attr.var_name, "[^\w\s,\.]", "") // "Testmarke Notebook, Elektron
        // Split the current state value at whitespaces into a list
        EXTRACT_ALL("[^\s]*") // "Testmarke|Notebook,|Elektron,"
        // Take the first entry from the list
        GET_LIST_ITEM(FIRST) // Testmarke
    }
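To trace the individual steps of this script, the following Python sketch mimics the four DSL functions with the standard `re` module. The function bodies, the `FIRST` constant as index 0 and the shortened input string are assumptions; the real functions are defined in Groovy and may behave differently in detail.

```python
import re

FIRST = 0  # assumed meaning of the FIRST constant


class NameScript:
    """Mimics the DSL functions used in the example, with a current state."""

    def __init__(self):
        self.state = None

    def NOT_EMPTY(self, value):
        return bool(value and value.strip())

    def REPLACE(self, value, pattern, replacement):
        # Explicit input value; the result becomes the current state.
        self.state = re.sub(pattern, replacement, value)
        return self.state

    def EXTRACT_ALL(self, pattern):
        # "[^\s]*" also yields empty matches, which are dropped here.
        self.state = [m for m in re.findall(pattern, self.state) if m]
        return self.state

    def GET_LIST_ITEM(self, index):
        self.state = self.state[index]
        return self.state


name = "Testmarke® Notebook, Elektron"
script = NameScript()
if script.NOT_EMPTY(name):
    script.REPLACE(name, r"[^\w\s,\.]", "")
    script.EXTRACT_ALL(r"[^\s]*")
    script.GET_LIST_ITEM(FIRST)
print(script.state)
```

The list state after EXTRACT_ALL shows that the current state is not limited to strings; later functions such as GET_LIST_ITEM rely on that.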

For both transparency of operations and flexibility, this represents a huge step forward. In addition, the language can grow along with the requirements of the business department on this foundation.

Now the entire transformation is configured in a single script in the editor

This changeover of the system prevents the proliferation of functions that are chained together. Currently, about one fifth of the attributes that were previously generated are still calculated. Correspondingly fewer have to be kept in the main memory and persisted in the database storage. Each transformation is now self-contained, which also virtually eliminates unforeseen dependencies. The employees have a convenient editor in which they can perform their transformation from start to finish and, with a preview function on any article, can immediately see the result, which enables comprehensive functional testability.

The enthusiasm of the specialist department during the presentation and briefing on the new DSL was then proof for us that the project was a success.

Written by

Jan Henning Becker