A Journey from Component Teams and a Release Train to LeSS
(by Gordon Weir)
It all started in 2005. After I joined UBS, I was at a bit of a loss. I had been working on very large system integration programs at IBM, most of which would be described as challenged or failures. Rather than making a career out of it, I wanted to leave, and I was convinced there must be a better way to build software systems. Little did I know then just how this one thought would transform my life and the lives of so many people around me.
Whilst at my last project for IBM I had heard of this thing called RUP, and I had also stumbled over a thing called DSDM. Honestly, I had never heard of this Agile thing. Another thing that happened to me was that I had met a more enlightened person on one of my projects, who introduced me to the wide world of actually learning how to do your job. I was swallowing books such as The Mythical Man-Month, Death March, Peopleware, Waltzing with Bears, Disciplined Software Delivery and many, many more. It was like a lightning bolt for me. People have done deep research on this topic and, what is more, nothing has changed in so many years.
I was asked to run a project to support a thing called Asset Services; I had no idea what that actually was. Having no idea how budgets and finance worked in this place, I gathered a few of the team together, more people than I had money for. We were all determined to do it right: proper analysis and design modelling driven from Use Cases, and deep engagement with users. We had a very strong stakeholder, who knew the solution he wanted, just not what it was supposed to do for him. Interesting times. I was of the mindset, “this software will take as long as it needs to, we are all about doing it RIGHT”.
Unfortunately the real world got me first. Dividends processing hits the busy season from March. If the application was not implemented before then, it would have to wait until July or possibly August. That would not do. We thought hard, and one of the guys in the team said, “why don’t we just implement the first Use Case, it gives them a lot of what they need; they no longer have to print out SWIFTs and put them in dated filing cabinets, it does most of the calculations for them, just not the posting and automation that will really make a difference”. This got us thinking: hey, we could implement this and then build more function out over time, implementing the second use case and then the third later. We time-boxed, and we had what we thought was an achievable scope.
We got it in, and post implementation we had meetings with the key users every day and started implementing enhancements every week. This we did as well as the larger scope of the two additional Use Cases. It was our first, extremely successful foray into iterative delivery, and it was the most fun any of us ex-consultants had ever had delivering a software system.
Around the corner was a very large program of work: the replacement of the existing IBM Mainframe based processing system, which ran COBOL and Java. The motivation was to remove the considerable cost, inflexibility and knowledge risk (it was originally an ICL mainframe and relied on developers skilled in ICL’s Application Master language). The success of the Asset Services delivery had repaired confidence in our ability to deliver. I wanted to do what we had done and started doing some research. Imagine my surprise when I discovered this stuff had a name: iterative and agile development.
And what’s more, the iterative, incremental, and use-case-driven approach we had used was well documented in a book by Craig Larman, “Applying UML and Patterns”. This book became our new bible. We also talked to Ivar Jacobson Consulting, who were just setting out after splitting with Rational after the IBM merger. They were developing a lightweight RUP method, EssUP - the Essential Unified Process. We liked it; it was a structure around what we had done.
Combining the iterative, incremental, use-case-driven approach from Larman’s book and the component-centric and “mini-waterfall” ideas from these other systems was a good place to start. But looking back, it maintained our “culture” of separation. We had Business Analysts producing Use Cases and Domain Models; these were being translated to architecture models by the “design team” and implemented by my team at the time - the developers, followed by QA and cycles of testing. It still makes me laugh: at one time there were 50 people on the project and, of that number, a grand total of 6 developers building the solution!
After two years the new Java solution was well under way and a good number of the business flows were migrated, but there was still a long, long way to go. The group had grown significantly. We had been using the Unified Process and we were finding that it no longer fitted the context of the work we were doing. Many parts of the new system had resolved their technical risk and had a pretty solid architecture. We needed to scale and deliver, not go through Inception and Elaboration phases.
We adopted a model which we invented ourselves, based on some scaling ideas we had read about. It was a component-team-based model, with core components acting like a service delivery to the business applications. We included the existing mainframe as a core component as it was still performing the majority of the processing and control functions for the business. Every new flow or feature in the new system required changes in the mainframe. We implemented “release trains” which were 4 two-week iterations in length, where the final iteration was what we called a “stabilisation” iteration. We had daily cross-component meetings to try and co-ordinate the work that was being done. We still had separate analysis and testing functions, although we were getting them closer and more integrated with the programming teams. Nonetheless, conflict and frustration were common: “it’s the developers’ fault, they have not fully implemented my use case”, which was normally responded to by “the use case is no use to me, it cannot be implemented” or, more often, “this analysis is rubbish”. At this time we also started to experience something we had never seen before: serious production issues, memory leaks, corruption … it started to get quite scary. Something was wrong, but we could not see what it was, probably because I was very proud of the delivery model I had created; it seemed so logical. So proud that I talked about it at a QCon In Finance conference in London.
Then, one of my team members told me Craig Larman would be in London at the start of 2008…
Because of his book “Applying UML and Patterns”, I was keen to know what Craig thought of the delivery model we had created. I wanted to show the guy who started it all for us that we could really do this stuff. Imagine my surprise and slight disappointment when we talked to him on a conference call and, after we had explained in detail how we were working and were waiting for his response and maybe a few ideas on how to “tweak it”, he pointed out all of the failings in it. He knew without digging too deep that we had plateaued: we were going a lot faster than before but could not get more out of the organizational system. Additionally, he asked if the quality was starting to drop and the organizational system failing. Some people were working incredible hours; in some cases during the “stabilisation iteration” they were working through the night. HOW DID HE KNOW?
He then explained the problem with component-team-based delivery models: hand-off complexity, knowledge scatter, local optimisations, complex planning with dependency and coordination management problems, function going into the wrong place, duplication, and parts of the system which were causing bottlenecks that we could not see (with our limited measuring capability, at least).
Well, frankly, this was not what I expected. Initially I was not happy. I was at that stage of my Agile and Lean learning, 3 years in, where I knew everything. I’d read dozens of books and had industry experts in, such as Ivar Jacobson, Dean Leffingwell, Scott Ambler, and many others. I had talked to people across the bank about how good we were at this. I mean, how dare he?! I was the expert… err, wasn’t I?
The problem was, the more I thought about it, the more he made sense to me. It was not working; it was hard work. Our cross-component stand-ups were turning into one- to two-hour “sit downs”, as there was so much inter-component and application complexity to deal with. And no matter how detailed our planning, people just wouldn’t stick to it. Our idea of integration testing at the end of each two-week iteration never, and I mean NEVER, worked. The “release train” wrecked. And there was always one component, or sometimes two, that just weren’t ready. And the .NET GUI… well, that was never in line and was always being finished off (actually it felt like it was started) during the “stabilisation”; the view was that it could not do anything until the underlying data was ready.
After digesting this for a while, I decided that change was definitely necessary. Tweaks - kaizen - had reached their limit in this model; a radical overhaul was necessary. And what Craig had been saying did actually make sense: to have cross-component feature teams that were also cross-functional, with analysts and testers integrated with the teams. Crazy. We asked him to come and do a week of training: his 3-day CSM+ course on Scrum and Large-Scale Scrum (LeSS), and 2 days of specific leadership coaching and organizational design consulting and “showing us how to do it” in real workshops. I strongly believe that training on its own is nearly useless; learning by doing and practical application is essential.
After the 3 days it started to make sense. I cannot imagine doing that training without having gone through 3 years of this like we had. The core concepts we understood; the deeper lean thinking and practices we had never been exposed to. A new cult of Lean reading and learning happened: “The Toyota Way” and “The Machine that Changed the World” appeared on desks…
We did a new cross-component cross-functional team-forming session to create feature teams, aligned product ownership, and started. We did not make a big deal of it, just did it. We knew the concepts would be a hard sell outside of our immediate technology family.
Moving from component teams to cross-component teams, the change was dramatic. Literally within one month our delivery speed was up beyond what we could ever have hoped. We moved from one release every 8 weeks (within the older release train model) to releasing every two weeks. No more train; just create and ship a completely done product every 2 weeks.
I was reminded of something I had said to my 6-person development team nearly 3 years prior, “I want to get to a release every 2 weeks”, at a time when we were doing a release every 6 months, with 6-week iterations. They laughed so hard they could have hurt themselves…
And as Craig predicted, all this exposed serious organisational weaknesses: we did not have enough capacity in the GUI team, and the different technology type was hurting us. Our initial solution was to put Visual Studio on everyone’s desktop and allow the GUI to be evolved along with features. This was a big mind shift for people, and have you ever run Eclipse and Visual Studio at the same time? Try our remote teams doing it via a Citrix VM at 130% capacity… What we ended up doing was moving to a lightweight GUI in GWT. We also did not have enough skills in Business Analysis, so we ended up evolving robust and easy analysis techniques: we started using Gojko Adzic’s excellent specification-by-example model, and we bought Smart Boards so we could jointly create the specifications with users and remote teams. This allowed our limited BAs to work more on a consulting and support basis than as the only producers.
We loved the feature-team concept, where everything lives in a team and they own delivery from inception to production (and support). At a team level this works exceptionally well. What we did find is that teams become tribal; they start to believe they are the only team who is any good, and all the issues are some other team’s fault. They develop a personality - which is both good and bad.
For a long time we didn’t apply Craig’s advice in LeSS to focus on agile modeling for design and architecture in design workshops, and to seriously pay attention to overall architectural design with a hands-on architecture community of practice providing real architectural leadership. Instead, teams started to work in a way that I call “hand to mouth”: a naive belief that the architecture will evolve as you go, especially if you are using Acceptance Test-Driven Development. At a scale like this, up-front thinking about architecture and design is essential, which is something Craig had emphasized in his “Applying…” book and his coaching on adopting LeSS. Otherwise you start to find the same function built by different teams in different parts of the architecture, nicely tested, but with different test cases deployed in different ways. Heroes start to emerge in the teams who spend most of their waking lives trying to keep everything clean and aligned; they become villains, as they are not aligned with the team-based culture of “team is first, not individual”. It has taken me a long time to understand why this happened and why it was important for the success of the program that it did.
I now believe that good structured design thinking at an enterprise level at this scale is more than just essential; going without it is fatal. To this end, I also believe that a team similar to those described in “The New New Product Development Game”, one that is truly cross-functional and multi-skilled, is needed to drive the product and the “delivery factory”. You need a team with the organizational navigation skills, deep architecture and analysis, and development know-how that shapes the product and vision that is adopted and implemented by all the teams. This team is hands-on, and will code, and lead from behind.
LeSS is a framework, one that helps us implement Scrum at a scale it was never intended for. The deep thinking that goes with it is what is most important, not blind adoption of the practices of Scrum (though doing that is a good way to learn). It does come with a warning that is often missed, a term my very good friend Dan North coined in a talk we did together recently: “Caution: May Contain Thinking”. And that’s it. The context that you have (the business, the technology, the people, the routines and the leadership) will drive you to use different practices. The one I suggest in this story is for very large scale enterprise solution delivery. That may not be what you are doing, and it isn’t for me in my current context. But the thinking, the principles, the experiments: they are what you get from LeSS. Take them, and think.