Image by Sheila Joy via Unsplash (copyright-free)
The main idea behind this blog is to share our experience on this matter: explain our decisions and illustrate the metamorphosis of the problem, along with the intermediate steps that led us to those decisions. I’m not going to provide general guidance on “how to” resolve all possible problems. No - as we all know very well, no solution will meet all your needs; there are always trade-offs. Hence, I’m going to draw out every dark shadow of the consequences we have to deal with in our routine.
We had to build a transfer orchestrator which obviously (you can deduce it from the name) should have a wide range of responsibilities and cover various business needs. Some of them are: creating a transfer and managing its flow; collecting information about an incoming transfer; enabling the cancellation process, etc. Initially we didn’t have a well-polished design, so we had to improvise.
I have to confess that the temptation to go with the well-known monolith design was truly irresistible. Finally, “we are brave,” we told ourselves, and we looked at the so-well-advertised Microservices. Before I proceed, I want to state that Microservice design, to me personally, looks like nuclear science. One can use it to build weapons, and another can build a cheap power source. In the first case, there is only pain and frustration; in the other, there is great benefit to all mankind.
This gave us plenty of new food for our brains. We slipped into a rabbit hole with all sorts of rough theories. Each discussion immersed us deeper and deeper until we agreed on a few points that served as the bottom from which we could push off to start our way up. As Kate Morton said, “The way down is a breeze, but climbing back's a battle”. Don’t worry, you have finally reached the point where the technical part steps in. :)
We all like DDD and have massive experience with it. We had always considered Domain-Driven Design’s Bounded Context as a guideline for defining Microservice boundaries. But then we realised our mistake. In the context of Microservices, this means one simple thing: a Bounded Context is the exact opposite of a Microservice! And here is why: a Bounded Context defines the boundaries of the biggest services possible - services that won’t have any conflicting models inside them. If you cross the boundary, those conflicting models will eventually lead to a big ball of mud. If you follow the Bounded Context directly, you will get monoliths. Those will be “good” monoliths, since there won’t be any conflicting models in them, but still, they are not Microservices. If, however, you decompose the Bounded Context further, you’ll find those sought-after Microservices.
That is what we followed, but we also spiced it up with some practicality. We defined a checklist that we must follow in order to understand whether some part of the functionality deserves its own Microservice (or even several!). Please find the points listed below.
- Isolating changes. By isolating changes, we decrease the risk of breaking something that has not been changed. Usually, one Bounded Context changes independently of the others.
- Isolating static services. Do we isolate services that don’t change frequently? While some requirements change frequently, others may change so infrequently that we can treat the Microservice as static.
- Isolating business-critical services. Business-critical operations must be isolated from non-critical functionality as much as possible. This minimises the likelihood of business-critical operations being affected by problems with non-critical functionality.
- Free choice of technology. Can candidates be implemented more effectively with different technologies? Microservices enable polyglot implementations that can use different programming languages and frameworks.
- Cost of implementing distributed transactions. Do we need to provide strong consistency for some operations? If so, what developer effort would it cost? Strong consistency in distributed systems is a complex topic. Are there any atomic business operations spanning the candidates? Those may not be straightforward to achieve with separate services.
- Security perspective. Is one of the candidates processing highly sensitive data while another is not? The least common mechanism security principle says that it’s safer to keep the processing of highly sensitive data separate from that of other data.
- Cost of separation. All of the above looks fancy, but unfortunately we are bound to a reality full of deadlines and time constraints. Thus, we also had to consider these points.
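To make checklist discussions like this less hand-wavy, they can be turned into a tiny scoring helper. The sketch below is purely illustrative - the criteria mirror the list above, but the names, weights and threshold are hypothetical, not code from our services:

```kotlin
// Hypothetical sketch: scoring a service candidate against the checklist above.
// Weights and the cut-off are arbitrary; tune them to your own team's reality.
data class CandidateAssessment(
    val isolatesChanges: Boolean,
    val isStatic: Boolean,
    val isBusinessCritical: Boolean,
    val benefitsFromDifferentTech: Boolean,
    val needsDistributedTransactions: Boolean, // counts against separation
    val handlesSensitiveData: Boolean,
    val separationCostIsLow: Boolean
)

fun deservesOwnService(a: CandidateAssessment): Boolean {
    var score = 0
    if (a.isolatesChanges) score += 2
    if (a.isStatic) score += 1
    if (a.isBusinessCritical) score += 2
    if (a.benefitsFromDifferentTech) score += 1
    if (a.handlesSensitiveData) score += 2
    if (a.separationCostIsLow) score += 1
    if (a.needsDistributedTransactions) score -= 3 // atomic cross-candidate ops are expensive
    return score >= 4 // arbitrary cut-off for the sketch
}
```

Of course, no formula replaces the team discussion; the value of writing it down is that it forces you to make the trade-offs explicit.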
Now that you know the guidelines we followed, let’s see how we applied them in practice. The entire orchestrator emerged as four separate services, each of which was designed with well-developed functionality and narrow responsibilities. Please refer to the images below:
And below you can find the public topics used by our functional domain in the context of the organisation - topics that are required to feed, or to be referenced by, other domains.
I’m not going to explain all the details of each of the services, because they would be irrelevant here. Instead, I will explain why we accepted this level of granularity; what we gained; what techniques we used; and what the pros and cons of such a decision are.
- First of all, it allowed us to test Kotlin as the language of our services. As a team, we loved it and decided to proceed with it for all of them.
- We were able to split the work and let different members work asynchronously on different parts, reconciling contracts only. This allowed us to avoid merge conflicts and “I owe you a merge” development models. We also noticed greater personal engagement.
- Then we found a part of the functionality that was unlikely to change. That service was the Runner, which we designed, implemented and thoroughly tested early on. Nothing has changed there since, which allowed us to avoid unnecessary regression testing.
- We were able to distinguish small functionalities that are easy to understand separately. Moreover, each service has its own “public” API that can easily be reused by other services without them knowing its internal dependencies.
- In addition to all of the above, we have experienced a huge number of API change requests, expressed as topics and the Avro payloads associated with them. For this reason, we also wanted to split the services in a way that isolates us from changes beyond our control. For instance, the Runner and the Monitor share almost no models with the Initialiser. This significantly reduces the time spent redeploying and applying changes.
- We were able to test different approaches and libraries. Some of the services use Mongo, and some rely on streams’ internal state stores. Moreover, we tried almost all the available Kafka-related frameworks, such as Kafka Streams, the raw Consumer/Producer API, Spring Kafka, Spring Cloud Bindings, Spring Cloud functional bindings, etc. We also had plans to consider RxJava and Camel. This allowed us to choose the most appropriate and preferred approach based on real experience instead of pure theory.
- Again, we were able to introduce the Canceller after we had developed all of the other components, without breaking any of them.
- We also achieved decent flexibility in onboarding new products. Let me explain in more detail. We knew that the only part of the whole new design that might vary depending on the product would be initialisation and validation. This variety can be expressed by different numbers of dependent aggregates, different validation logic, contract changes, or anything else. It was extremely hard to predict how generic we needed to design class relationships and interfaces to satisfy all possible changes. This is why it was a predominant argument for moving the Initialiser to a separate service. Now we can easily introduce a new Initialiser as a separate service, specific to each product, and benefit from all the other services, which have proved to work correctly. And of course, this reduces the testing effort.
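The product-specific initialisation idea from the last point can be sketched with a plain interface. Everything below (the `Initialiser` contract, the request/result types, the two products) is hypothetical and only illustrates the shape of the design, not our actual contracts:

```kotlin
// Hypothetical sketch: one Initialiser contract, one implementation per product.
// Downstream services (runner, monitor, canceller) depend only on the contract,
// so onboarding a new product means deploying a new Initialiser and nothing else.
data class TransferRequest(val product: String, val amount: Long)
data class InitialisedTransfer(val transferId: String, val valid: Boolean)

interface Initialiser {
    fun supports(product: String): Boolean
    fun initialise(request: TransferRequest): InitialisedTransfer
}

class DomesticInitialiser : Initialiser {
    override fun supports(product: String) = product == "domestic"
    override fun initialise(request: TransferRequest) =
        InitialisedTransfer("dom-${request.amount}", valid = request.amount > 0)
}

class InternationalInitialiser : Initialiser {
    override fun supports(product: String) = product == "international"
    // International transfers carry extra validation in this sketch.
    override fun initialise(request: TransferRequest) =
        InitialisedTransfer("intl-${request.amount}", valid = request.amount in 1..1_000_000)
}

// Pick the initialiser for a product from whatever implementations are deployed.
fun initialiserFor(product: String, all: List<Initialiser>): Initialiser =
    all.first { it.supports(product) }
```

In a Microservice setting each implementation lives in its own deployable, and the “selection” happens through topic routing rather than an in-process lookup, but the contract stays the same.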
- We have to modify a huge number of services if some library version update comes along. For example, we introduced a library for enforcing tracing and logging standards and had to apply it in all the repos. It can be painful.
- We have to deal with the new-service onboarding process: security, access tickets, etc.
- We need to document these components’ contracts more precisely. In the case of a single service, according to DDD, the code should be self-explanatory and reflect domain relationships. Here, however, it’s not entirely obvious which service is listening to which topic, etc. And who loves to write documentation? Definitely not me!
- In some places, we had to implement compensation logic for distributed transactions.
- This can slightly increase your maintenance costs, as each service requires its own data source. Of course, we tried to reduce the amount of duplicated data, but in some cases it was impossible.
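The compensation logic mentioned among the cons follows the usual pattern: each step of a distributed operation registers an undo action, and if a later step fails, the completed steps are compensated in reverse order. The sketch below is a minimal in-process illustration of that idea under assumed names (`Saga`, the debit/reserve steps); it is not our production code:

```kotlin
// Hypothetical sketch of the compensation pattern: register an undo action for
// every step that succeeds; on failure, run the undo actions newest-first.
class Saga {
    private val compensations = ArrayDeque<() -> Unit>()

    fun step(action: () -> Unit, compensation: () -> Unit) {
        action()                              // may throw
        compensations.addFirst(compensation)  // registered only if the action succeeded
    }

    fun compensateAll() {
        compensations.forEach { it() }        // reverse order of registration
        compensations.clear()
    }
}

// Illustrative two-step transfer: debit the account, then reserve the funds.
fun runTransfer(failAtReserve: Boolean, log: MutableList<String>): Boolean {
    val saga = Saga()
    return try {
        saga.step({ log += "debit" }, { log += "undo-debit" })
        saga.step(
            { if (failAtReserve) error("reserve failed") else log += "reserve" },
            { log += "undo-reserve" }
        )
        true
    } catch (e: Exception) {
        saga.compensateAll() // roll back the steps that did complete
        false
    }
}
```

Across real services the steps and compensations travel as messages on topics rather than lambdas in one process, which is exactly why this logic is more effort to build and test than a local transaction.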
To sum it up: as a team, we are currently satisfied with the approach we chose and the way we implemented it. Maybe that’s because we liked the advantages more than we disliked the disadvantages. We also value knowledge and experience, and we all gained a lot from this. Although there is still a huge window for improvement or redesign, we hope to have the opportunity to address it.
There is no definite answer to whether Microservices are a pain or a relief. I would say that the correct answer is “it depends”. This is a complex problem, and it requires complex answers. Personally, I believe that if you want to find the benefits that a specific solution can offer, you will definitely succeed.
Thank you for reading :)