Is naming things a hard problem in computer science? Those of you familiar with Betteridge's law of headlines may already have guessed the answer. The others can look it up on Wikipedia. (Spoiler alert!)
As is widely known, there are only two hard problems in computer science, and naming things is one of them.
This post is the humble announcement that my team has solved half of them.
It might even be the most exciting breakthrough since Columbus solved the infamous problem of balancing an egg on its tip.
Columbus' egg. An egg can never be balanced on its tip. Except by breaking it. Or by placing it in a small heap of salt. The problem becomes less pressing after the egg is eaten.
Let's have a look at the problem at hand, naming things, which is widely considered to be not only hard but one of the hardest problems there is.
I think it is undisputed: when an entity (a symbol, a function, a namespace, an artifact, a library) is wrongly named, it is annoying in many ways. Maybe most importantly, it is misleading when reasoning about the entities and their interactions.
To qualify as one of the hardest problems, however, it has to be quite annoying and also there should be no simple way to fix it.
I'm going to distinguish between bad naming within an artifact (read: in code) and the naming of artifacts themselves (like the name of a service), which identifies them to the outside world. I am going to make the case that, while both are certainly annoying, neither is a hard problem: the former not anymore, because it was eventually solved, and the latter because a trivial fix exists.
When thinking about naming and the pain related to it, the first thing that springs to my mind is my Java-heavy past.
Have a look at this code:
https://gist.github.com/stammi/6ba477caca338426660c15f49aa0f3f6
Does it have naming problems? Plenty. Did I make it up? Yes, admittedly. But it's not too far-fetched, either. My past self produced code like this for a living. A lot of it. Having to work with a larger accumulation of such code is a major pain. Reading it alone is exhausting. The sheer mass of repetition makes the brain hurt. It might not even fit most screens. All this is true even when assuming all the names are basically correct.
Ok. We all know that Java has (how do I put this politely?) verbosity issues. But equally ugly code can easily be produced in a more elegant language like Clojure:
https://gist.github.com/stammi/66b1f9da3eddbecb362d457ae60cffd8
Did you notice that I used function pointers instead of instance fields? Can you imagine how much this alone simplifies writing unit tests?
Renaming a code entity like a symbol, function or namespace might once have been a hassle. Given today's IDEs, however, one can hardly call it a hard problem. You just have to select the rename refactoring from the context menu. Most likely, renaming is one of your best-memorized keyboard shortcuts (for me that's Shift-F6 in Cursive).
What? That's an unfair comparison to the Java example above? Ok, that's a fair point. It works just the same way for less elegant languages, too.
A pair of programmers might rename the same entity several times in a few minutes. With every renaming, the pair gains a deeper understanding of the entity's true nature. Even if you prefer programming on your own, you will often close in on an unknown piece of code by refactoring it for understanding and renaming entities in much the same way an exploratory pair would.
You will have to agree: in this case, naming things is hardly a problem at all. Rather, it is an absolutely worthwhile technique for gaining insight into a piece of software.
Once the code is compiled to deployable artifacts, bad naming of code entities becomes invisible, thank goodness. But naming problems don't stop there. Do you give your services names such as 'article-similarity-data-preparation', 'article-similarity-datamart-preprocessing' and 'article-similarity-model-training'? Clumsy as it is, this naming scheme can even be a step up from one where you prefix each name with a team name and postfix it with 'service'. The conversation about the data flow between 'myteam-article-similarity-data-preparation-service', 'myteam-article-similarity-datamart-preprocessing-service' and 'myteam-article-similarity-datamart-preprocessing-service' then quickly becomes painful and tiring. The names are hard to recall, laborious to pronounce and easy to mix up. See this very paragraph as a testimonial on how badly they can clutter a piece of text.
An artifact, say a microservice, leaves its name in many places. Examples are the name of the git repo, the name of the package and the name of the continuous integration pipeline. The URL of the deployed service alone is usually stored in many places, from load balancer configurations through the config and code of consuming systems to users' browser caches. Last but not least, an artifact's name resides in the brains of the people working on that artifact.
When you recognize that an artifact was wrongly named in the first place, or that its nature changed in the course of development, it basically boils down to two options: either you change the name of the artifact, or you continue to live and work with an artifact that has a slightly or severely wrong name.
Choosing the former option, you will likely forget some of the places and produce at least some inconsistencies where the old name is still in use. The more you have automated, the less this hurts. Even with our high degree of automation, though, the problem is still real, and I suspect it may never fully go away. This makes renaming artifacts much harder than renaming entities in code. So just letting the names evolve with the thing does not really seem feasible.
The latter option will continuously mislead and puzzle you, your coworkers, and especially any newcomer trying to figure out what the artifact's nature and purpose might be. Both options are technical debt in essence.
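To make the first option a bit more tangible, here is a minimal Python sketch of the kind of check one might run after renaming an artifact: it walks a directory tree and lists every line that still mentions the old name. The function name `find_stale_references` and the idea of scanning a local checkout are assumptions made for illustration, not something our team's tooling is known to do. It can only see files on disk, so browser caches and colleagues' brains stay out of reach.

```python
from pathlib import Path


def find_stale_references(root, old_name):
    """Yield (path, line_number, line) for every line still mentioning old_name.

    Hypothetical helper: a plain substring search over all readable text files.
    """
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binaries and unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if old_name in line:
                yield path, number, line.strip()
```

Calling `list(find_stale_references("some/checkout", "old-service-name"))` (both arguments are placeholders) gives a to-do list of forgotten places, which is still only the part of the problem that lives in files.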
For your random artifact, here is the simple solution: don't try to find descriptive names in the first place! Give it a random name. In our team we now deploy 50 or so artifacts. Our naming scheme changed quickly, including but not limited to the worlds of Bud Spencer films, scientists, Star Trek, The Hitchhiker's Guide to the Galaxy and politicians. The only rule: the less related to the purpose of the service, the better. Does that work? Yes! It works amazingly well.
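If you would rather not even argue about which random name to take, the choice itself can be left to chance. The following Python sketch assumes a hand-maintained pool of names; `NAME_POOL` and `pick_name` are hypothetical, and every entry except 'zaphod', 'adenauer' and 'berkson' is made up for illustration.

```python
import random

# Hypothetical pool of purpose-free names; extend it with whatever theme you like.
NAME_POOL = ["zaphod", "adenauer", "berkson", "trillian", "marvin", "spock"]


def pick_name(taken, rng=random):
    """Return a random name not yet in use, ignoring what the service does."""
    candidates = [name for name in NAME_POOL if name not in taken]
    if not candidates:
        raise ValueError("all names are taken; time for a new theme")
    return rng.choice(candidates)
```

For example, `pick_name({"zaphod", "adenauer"})` returns one of the four remaining names. Note that the function never looks at the service's purpose, which is exactly the point of the rule above.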
As it happens, it is very easy to talk about data prepared by 'zaphod' for further processing in 'adenauer' and final consumption in 'berkson'. The human brain is well prepared to use names this way.
"Hey, did you know our friend Konrad does not live in Hamburg anymore? He moved to Copenhagen. Also he has a dog now." Thank goodness we memorize him as 'Konrad' and not as 'that-Hamburg-based-bloke-not-having-a-dog'. Otherwise we would now have to rename our friend, causing much of the very same pain we would have renaming evolving artifacts.
It is true. We tend to forget people's names. Especially when they move out of town and become less important in our lives. But we are also perfectly able to pick a forgotten name up again quickly if the need arises. It is the same for artifacts. There is no need to remember the names of 50 or so artifacts. What counts is that we are able to talk about those that matter without difficulty.
Libraries are different. I think you can go with the same random naming scheme and pick names that are not directly linked to function (aero and ... spring to mind). If necessary, release a new library under a new name. It is ok to discontinue an old one. Changing libraries and not renaming them is evil. (Yes, Gnome, I am looking at you now.)
That's it. One of the allegedly hardest problems in our trade is not only solved, it has vanished into thin air. Myth: Busted!
Much as with the egg problem, you can spend a lifetime of futile fiddling trying to find perfect names for the things you are developing. However hard you try, eventually the egg will topple and you will have to start over. Much as with the egg, the solution is to break it: just find the least meaningful name and you are done.
Find here, for your contemplation, a painting by the late Swedish post-impressionist painter Nils Dardel. It shows Christopher Columbus having solved the famous egg problem and his contemporaries going completely bananas over the fact. I guess they had all spent a lot of time balancing eggs. One may well be impressed by Dardel's level of historical accuracy. While Dardel himself was well aware of facepalming, it is not happening in the painting. It was only invented about two centuries after Columbus' death, in the early 1720s, by one Martin Sonneborn.