This post is going to revolve around a domain-specific language (DSL) which is part of one of our services. But instead of a detailed introduction to the language itself, I’m going to talk about the process we established around operating it and touch a bit on how parsing the DSL works. This seems far more interesting, and the concepts I’m going to describe are more generally applicable than the language’s features and syntax. I am part of the tracking team of otto.de: We provide APIs to collect tracking data - think web analytics - for other teams within the company. The data is processed and enriched in our streaming data pipeline before being loaded into Adobe Analytics and other downstream systems.
MerchandisingVars:
  DefaultFunction: replace($mEvar, "|", "~")
  Vars:
    var1: /access_path
    var6: ./topic_id
    var19: concat(findFirst(./assortment, /assortment), ./search_assortment, ";")
    ...
  Events:
    event1: "1" if /breakpoint_change == "true"
    event2: "1" if /first_visit == "true"
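Before such rules can be applied, the right-hand sides have to be turned into a structured representation. A minimal sketch of how an AST for these expressions could be modelled in Scala - note that apart from ReplaceFn (which appears in the parser shown further down), all type and field names here are assumptions for illustration:

```scala
// Hypothetical AST sketch: only ReplaceFn is taken from the post,
// everything else is an assumption for illustration.
sealed trait Expression
// Absolute path into the incoming message, e.g. /access_path
final case class AbsolutePath(path: String) extends Expression
// Relative path, e.g. ./topic_id
final case class RelativePath(path: String) extends Expression
// findFirst(a, b): the first expression that yields a value
final case class FindFirst(candidates: List[Expression]) extends Expression
// concat(a, b, ";"): both values joined by a separator
final case class Concat(left: Expression, right: Expression, separator: String) extends Expression
// replace(expr, substring, replacement)
final case class ReplaceFn(expression: Expression, substring: String, replacement: String) extends Expression
```

Modelling the rules as a sealed algebraic data type keeps parsing and interpretation cleanly separated: the parser only produces these values, and the interpreter can pattern-match over them exhaustively.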
As we are not 100% on par with the official YAML specification, the mapping files are parsed by our own parser. The Scala parser-combinator library allows this to be implemented rather concisely: The actual parser is only about 200 lines of source code. A parser-combinator approach encourages splitting the parser code into small, composable functions. The library provides the “plumbing” to stick those functions back together to build a powerful but comprehensible program. For example, this is how the parser for the “replace” function looks:
/**
 * Parses the replace function: 'replace(./foo, "ü", "ue")'
 */
def replaceFnParser: Parser[ReplaceFn] = {
  "replace(" ~>
    (expressionParser <~ ",") ~
    (stringLiteralParser <~ ",") ~
    (stringLiteralParser <~ ")") ^^ {
      case expression ~ substring ~ replacement => ReplaceFn(expression, substring, replacement)
    }
}
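To show the combinator in context, here is a self-contained sketch that embeds replaceFnParser in a RegexParsers object (the scala-parser-combinators dependency is required). The expressionParser and stringLiteralParser below are simplified stand-ins for the real sub-parsers, and the Path case class is an assumption:

```scala
import scala.util.parsing.combinator.RegexParsers

// Simplified stand-ins: the real AST and sub-parsers are richer than this.
final case class Path(p: String)
final case class ReplaceFn(expression: Path, substring: String, replacement: String)

object MappingParser extends RegexParsers {
  // A path expression like /access_path or ./topic_id
  def expressionParser: Parser[Path] = """\.?/[\w/]+""".r ^^ (s => Path(s))
  // A double-quoted string literal, with the quotes stripped
  def stringLiteralParser: Parser[String] = "\"" ~> """[^"]*""".r <~ "\""

  def replaceFnParser: Parser[ReplaceFn] =
    "replace(" ~>
      (expressionParser <~ ",") ~
      (stringLiteralParser <~ ",") ~
      (stringLiteralParser <~ ")") ^^ {
        case expression ~ substring ~ replacement => ReplaceFn(expression, substring, replacement)
      }
}

// Usage:
// MappingParser.parseAll(MappingParser.replaceFnParser, """replace(./topic_id, "|", "~")""").get
// yields ReplaceFn(Path("./topic_id"), "|", "~")
```

The ~> and <~ operators discard the delimiter on the arrow’s open side, so the ^^ transformation only ever sees the three values that actually matter.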
The example shows how easily multiple small parsers, for expressions and string literals, are composed to form a larger, more complex parser. We chose parser combinators because they are a good fit for our usual programming model: functional, readable and highly composable. Parser combinators are ahead of seemingly simpler solutions like regular expressions, which are neither easily composed nor very readable. We specifically opted for the Scala parser-combinator library because it is a mature implementation, which was even part of the Scala standard library before becoming a community-maintained library. Scala being the team’s primary programming language made it easy to integrate the library; there was no need for any intermediate representation format. The parser reads in the mapping files and parses them directly into Scala objects when the service starts up.
Once the service starts consuming input messages, those rules, now encoded in objects, are applied to the incoming messages by an interpreter. The interpreter encapsulates most of the service’s business logic in a component separate from the DSL parser, which again favours testability and maintainability. Zooming out further, this interpretation is done in a single pipeline step of a Kafka Streams based service.
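The post does not show the interpreter itself, but its core can be sketched as a recursive evaluation over the AST. In this sketch the message model (a flat Map from path to value) and all names are assumptions:

```scala
// Hypothetical interpreter sketch: the real component and message model
// are not shown in the post, so everything here is an assumption.
sealed trait Expr
final case class Path(path: String) extends Expr
final case class ReplaceFn(expression: Expr, substring: String, replacement: String) extends Expr

object Interpreter {
  // Walks the AST recursively; a path missing from the message simply yields no value.
  def eval(expr: Expr, message: Map[String, String]): Option[String] = expr match {
    case Path(p)                 => message.get(p)
    case ReplaceFn(e, sub, repl) => eval(e, message).map(_.replace(sub, repl))
  }
}

// Usage: applying a DefaultFunction-style rule to an incoming message
// Interpreter.eval(ReplaceFn(Path("/access_path"), "|", "~"),
//                  Map("/access_path" -> "shop|search")) == Some("shop~search")
```

Keeping evaluation total over Option means malformed or incomplete messages degrade to an absent value instead of an exception, which fits a streaming pipeline where a single bad message must not stall the step.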
This gave a rough overview of how we integrate a simple DSL into our operations process. I also highlighted some of the advantages of parser combinators and how they fit into our programming model. There are of course other approaches to such problems. For example, I could imagine a graphical interface that lets users configure how data is to be mapped from one format to another, though I doubt this would have been a sustainable solution for our backend-heavy team. We have established a solid foundation with parser combinators and Scala, which we can now build and optimise upon.