How do you prioritize the work to break a monolith into microservices against other work (features, bug fixes, etc)?
Wednesday, November 7, 2018
We discussed the problem of reducing single points of failure (SPOFs) in a container platform.
Any system has at least one SPOF, whether concrete (like a machine) or abstract (like a cluster). In a good system, the SPOF (A) causes minimal degradation of functionality when it fails, and (B) is easily replaceable.
Corollary 1: most work should be owned "below" the SPOF.
Corollary 2: subunits should be able to approximate as many of the SPOF's functions as possible. For example, subunits should be able to receive traffic directly if traffic is directed at them, rather than depending on the central load balancer being available.
Corollary 3: state in SPOFs should be avoided. State works against portability, and therefore against replaceability.
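Corollary 2 can be illustrated with a small sketch. This is only an illustration under assumed names: `send`, `broken_balancer`, and `subunit_a` are hypothetical, and endpoints are modeled as plain callables rather than real network calls. The point is that a caller can still reach a subunit directly when the central load balancer is down.

```python
def send(request, central, subunits):
    """Try the central load balancer first, then each subunit directly."""
    for endpoint in [central, *subunits]:
        try:
            return endpoint(request)
        except ConnectionError:
            continue  # this endpoint is down; try the next one
    raise ConnectionError("no endpoint reachable")


# Hypothetical endpoints, modeled as callables for illustration only.
def broken_balancer(request):
    raise ConnectionError("central load balancer is down")


def subunit_a(request):
    return f"subunit-a handled {request}"


result = send("GET /", broken_balancer, [subunit_a])
```

This only works if the subunits accept traffic without the balancer in the path, which is exactly what the corollary asks for.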
We validated the federation model (distinct functional clusters behind a central control plane abstraction). This model has been used in many platforms (Borg, Kubernetes, Titus, etc.) with mixed usefulness.
We identified that a key weakness of federation usability is the lack of federation-wide awareness. For example, we want some workloads spread out, and others tightly knit or close to something else. An abstract concept of topology could be used to express workload affinity or anti-affinity. This could replace much of the deployment logic that administrators handle by hand. Some use cases:
* We have a database that we wish to spread between clusters (cluster anti-affinity)
* We have an app that we want to serve to global users (geographic anti-affinity)
* We have a job that we want to schedule close to a specific service (workload affinity)
* We have a mapreduce job that we want to keep close together (geographic affinity)
In all of these cases, the typical solution is to have an administrator choose a specific cluster (or set of clusters) to deploy to, despite having no strict reason to deploy explicitly (and only) there.
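To make the topology idea concrete, here is a minimal sketch of how the four use cases above could be expressed as affinity or anti-affinity over a topology key, instead of a hand-picked cluster. Everything in it is an assumption for illustration: the `Cluster` type, the `place` function, the `near` parameter, and the cluster names are hypothetical and do not come from any existing platform. A real scheduler would also need to consider capacity, health, and cost.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str    # topology key "name": the cluster itself
    region: str  # topology key "region": where the cluster lives


def place(clusters, replicas, key, mode, near=()):
    """Choose one cluster per replica.

    key  -- topology attribute to reason about ("name" or "region")
    mode -- "spread" (anti-affinity) or "pack" (affinity)
    near -- clusters already hosting whatever we want to be close to
            (or, with "spread", away from)
    """
    if mode not in ("spread", "pack"):
        raise ValueError(f"unknown mode: {mode!r}")
    sign = 1 if mode == "spread" else -1
    counts = Counter(getattr(c, key) for c in near)
    chosen = []
    for _ in range(replicas):
        # spread: least-used topology value wins; pack: most-used wins.
        # Ties go to the earliest cluster in the input list.
        best = min(clusters, key=lambda c: sign * counts[getattr(c, key)])
        chosen.append(best)
        counts[getattr(best, key)] += 1
    return chosen


# Hypothetical federation of three clusters in two regions.
CLUSTERS = [Cluster("us-a", "us"), Cluster("us-b", "us"), Cluster("eu-a", "eu")]

# Database: cluster anti-affinity.
database = place(CLUSTERS, 3, key="name", mode="spread")
# Global app: geographic anti-affinity.
web_app = place(CLUSTERS, 2, key="region", mode="spread")
# Job close to a service that runs on eu-a: workload affinity.
job = place(CLUSTERS, 1, key="name", mode="pack", near=[CLUSTERS[2]])
# Mapreduce job kept together: geographic affinity.
mapreduce = place(CLUSTERS, 3, key="region", mode="pack")
```

In this sketch the administrator states intent (a key and a mode) and the placement follows from the topology, so the same request keeps working if clusters are added or replaced.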
More technical and use case research is needed.
PS - I'd be happy to put contact information out for anyone working on functionality like this.
Tuesday, November 6, 2018
Monday, November 5, 2018
I had a question about a puppet master who was holding my department back. It was suggested that this is a business issue, and that the person should have to leave. I could start an open and transparent discussion about important technical issues, possibly with voting. The visibility of the decisions would help with accountability. It would also help build support for my issues, so that the process could bootstrap itself.