Automated testing

Development of unit tests before the code following the RED/GREEN/REFACTOR phases

What is TDD?

Test-Driven Development (TDD) is an extension of plain old Unit Testing (UT) and one of the many shift-left testing strategies. It mainly consists of writing unit tests first, as described in Kent Beck’s Extreme Programming (XP) framework [Beck 2004]. This “Test First” (TF) approach starts with a failing automated test, written before any code is changed. Once the new UT fails, the Developer provides the minimal piece of code that makes the test pass. The code may then be refactored before a new UT is added. This cycle is named “Red/Green/Refactor” [Beck 2002]:

Kent Beck’s Red/Green/Refactor cycle for TDD [Beck 2002]

This Red/Green/Refactor cycle helps in maintaining the TDD practice [Tarlinder 2016].
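One pass through the cycle could look like the following minimal sketch. It assumes Python and pytest-style test functions (the source names neither a language nor a framework), and the `leap_year` function is a hypothetical example chosen only for illustration.

```python
# RED: the tests are written first; they fail while leap_year does not exist.
def test_year_divisible_by_4_is_leap():
    assert leap_year(2024) is True


def test_century_is_leap_only_if_divisible_by_400():
    assert leap_year(1900) is False
    assert leap_year(2000) is True


# GREEN: a first, deliberately naive version made the tests pass:
#     def leap_year(year):
#         if year % 400 == 0: return True
#         if year % 100 == 0: return False
#         return year % 4 == 0
#
# REFACTOR: same behaviour, expressed more compactly; the tests stay green.
def leap_year(year: int) -> bool:
    """Hypothetical example function: Gregorian leap-year rule."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
```

The tests are the safety net of the refactoring step: the compact version is only kept because they still pass.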

Usually, Developers dislike testing because they picture it as a mindless visual check. UT is:

  • Mainly a development activity
  • AND a testing process that delivers both validation and regression testing.

But doing all those things at the same time is not enough: development is a compelling activity and Devs easily get lost in coding [Beck 2004], which reduces the testing effort, whereas the TF approach more or less guarantees that testing occurs.

In a refactoring or code change context, testing first is a healthy way to make sure that regressions do not occur. If tests are written once the change is done, test results may be biased by the new code. Moreover, TF also avoids [Beck 2004]:

  • Scope creep - if a piece of code is introduced “just in case”, a new test has to be written for it first
  • Coupling - strongly coupled units are hard to test, so testing first discourages this issue
  • Losing confidence in the existing code
  • Getting lost in coding

Along with the TDD cycle, Uncle Bob, the Pope of Software Craftsmanship, provides the “3 laws of TDD” [Martin 2014]:

  1. You are not allowed to write any production code unless it is to make a failing unit test pass.
  2. You are not allowed to write any more of a unit test than is sufficient to fail; and compilation failures are failures.
  3. You are not allowed to write any more production code than is sufficient to pass the one failing unit test.

To write a UT that fails, the Developer may use any of the Red Bar Patterns, notably [Beck 2002]:

  • “One Step Test”: a single idea that would teach something not obvious about the code
  • “Starter Test”: a UT that introduces a quick Red/Green/Refactor cycle; to understand what to test, a simple UT based on trivial inputs/outputs of an API is a good start and provides feedback within a few minutes
  • “Explanation Tests”: UTs that explain behaviors or give an example of how to use an API - this helps spread the use of automated tests
  • “Learning Tests”: UTs that experiment with a new 3rd party component by exercising it with trickier and trickier usages
  • “Another Test”: whenever an idea comes up on the side of a discussion, just add a test to a list and get back to the conversation instead of going off topic; since coding it would break law #2, it should not be coded yet
  • “Regression Test”: the smallest UT that would fail because of a reported defect, in a Defect-Driven Development approach
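As an illustration of the “Learning Tests” pattern above, here is a minimal sketch that explores Python's standard `json` module with increasingly tricky usages. The choice of `json` as the component under study and the pytest-style test functions are assumptions made for the example; they do not come from the source.

```python
import json


# Simple usage first: a round trip should give back the same data.
def test_round_trip_of_a_plain_dict():
    data = {"name": "kata", "steps": [1, 2, 3]}
    assert json.loads(json.dumps(data)) == data


# Trickier: non-ASCII characters are escaped by default ...
def test_dumps_escapes_non_ascii_by_default():
    assert json.dumps("é") == '"\\u00e9"'


# ... unless ensure_ascii is switched off.
def test_dumps_can_keep_non_ascii():
    assert json.dumps("é", ensure_ascii=False) == '"é"'


# Trickier still: integer keys do not survive a round trip as integers.
def test_object_keys_come_back_as_strings():
    assert json.loads(json.dumps({1: "a"})) == {"1": "a"}
```

Such tests document what was learned about the component and fail loudly if a later version behaves differently.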

Once the unit tests fail, so-called Green Bar Patterns can be involved [Beck 2002]:

  • “Fake It (’Til You Make It)”: simply return the expected value
  • “Obvious Implementation”: simply code what is obvious, while applying some critical thought to “how obvious” the required code actually is
  • “Triangulate”: when unsure about the correct abstraction of a calculation, adding an extra UT on the same code provides a path to gradually replace the constant with variables and code, with the UTs ensuring there is no regression
  • “One to Many”: when a collection of objects has to be handled, handle a single element first and, once it works, adapt the code to make it work with the whole collection (and with an empty list as well)
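The Green Bar Patterns above can be chained, as in this minimal sketch. It assumes Python with pytest-style tests, and the `total` function is a hypothetical example, not something taken from the source.

```python
# Step 1, "Fake It": with only this first test, returning the constant was enough:
#     def total(prices): return 5
def test_total_of_a_single_price():
    assert total([5]) == 5


# Step 2, "Triangulate": a second example makes the constant untenable,
# so it has to be replaced by a value derived from the input.
def test_total_of_another_single_price():
    assert total([7]) == 7


# Step 3, "One to Many": once one element works, move on to the whole
# collection, without forgetting the empty list.
def test_total_of_several_prices():
    assert total([5, 7, 1]) == 13


def test_total_of_an_empty_list_is_zero():
    assert total([]) == 0


def total(prices):
    """Hypothetical function: sum of a list of prices."""
    result = 0
    for price in prices:
        result += price
    return result
```

Each step adds one failing test and the smallest change that turns it green, so the final loop is reached without ever writing untested code.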

At Refactoring time, several patterns can be involved, notably [Beck 2002]:

  • “Reconcile Differences”: unifying two similar-looking pieces of code
  • “Isolate Change”: isolating the part that has to change
  • “Migrate Data”: temporarily duplicating data to transition from the old format to the new one, then removing the old data format
  • “Extract Method”: turning a small part of a method into a separate method and calling the new method
  • “Inline Method”: moving a method’s contents to where it is invoked, to gather the code in one place
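As a minimal sketch of “Extract Method” under the protection of an existing test, assuming Python and hypothetical `invoice` and `order_total` functions introduced only for this example:

```python
# The test exists before the refactoring and must stay green afterwards.
def test_invoice_lists_items_and_total():
    lines = [("pen", 2, 1.5), ("book", 1, 10.0)]
    assert invoice(lines) == "pen x2\nbook x1\nTotal: 13.00"


# Before the refactoring, invoice() computed the total inline:
#     total = 0
#     for _, qty, unit_price in lines:
#         total += qty * unit_price
# "Extract Method" moves that fragment into its own function ...
def order_total(lines):
    """Sum of quantity * unit price over all (name, qty, unit_price) lines."""
    return sum(qty * unit_price for _, qty, unit_price in lines)


# ... and the original function now simply calls it.
def invoice(lines):
    body = "\n".join(f"{name} x{qty}" for name, qty, _ in lines)
    return f"{body}\nTotal: {order_total(lines):.2f}"
```

“Inline Method” is the reverse move: the body of `order_total` would be put back where it is called.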

There are some more refactoring techniques shared by Martin Fowler that can be looked at [Fowler 2019]. Refactoring prevents code smells and aims to reduce the technical debt [Huumo 2017] on a regular basis, little by little, as per the “boy scout rule” [Martin 2011]. 

As a shift-left strategy, TDD should be adopted very early and should generate more test cases than integration or system tests, to avoid building an “ice cream cone” (an inverted test pyramid). Although efficient, this testing technique provides only a micro view of the solution; it needs to be complemented with higher testing levels [Beck 2004] and with a variety of test types in the feedback loop [Kohl 2006].

Extreme Programming (XP) requires planning and feedback at many levels and many frequencies [DonWells 2013]

TDD is sometimes confused with ATDD (Acceptance Test-Driven Development), mostly because the names sound alike but also because both are structured the same way and are intertwined in a double loop [Argyris 1977] [Smith 2001] framework [Ambler 2006].

The ATDD/TDD double loop [Ambler 2006]

The double loop formed by the combination of ATDD and TDD benefits TDD because it guides coding towards expectations, which avoids the “getting lost in code” syndrome and keeps the focus on the Customers’ needs.
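The double loop can be sketched as one acceptance test wrapped around several unit tests. The example below assumes Python with pytest-style tests; the free-shipping rule, the amounts, and the `checkout` and `shipping_fee` names are all hypothetical.

```python
# Outer loop (ATDD): one acceptance test phrased in the Customer's terms.
# It stays red while the inner loop is running.
def test_customer_gets_free_shipping_above_50():
    assert checkout(items=[30.0, 25.0]) == {
        "subtotal": 55.0,
        "shipping": 0.0,
        "total": 55.0,
    }


# Inner loop (TDD): small unit tests drive each piece the acceptance test needs.
def test_shipping_is_free_above_the_threshold():
    assert shipping_fee(55.0) == 0.0


def test_shipping_is_charged_at_or_below_the_threshold():
    assert shipping_fee(50.0) == 4.9


def shipping_fee(subtotal):
    """Hypothetical rule: flat fee of 4.9 unless the subtotal exceeds 50."""
    return 0.0 if subtotal > 50.0 else 4.9


def checkout(items):
    subtotal = sum(items)
    shipping = shipping_fee(subtotal)
    return {"subtotal": subtotal, "shipping": shipping, "total": subtotal + shipping}
```

The acceptance test says when to stop coding; the unit tests say how to get there one small step at a time.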

Impact of TDD on the testing maturity

Although it is well known, TDD appears to be one of the hardest Software Craftsmanship practices; yet, when properly understood, it is the simplest way to achieve high test coverage and better code quality [Stemmler 2021].

It takes time to tame TDD. Before applying TDD to production code, with the time pressure that goes along with it, it is extremely important to practice deliberately, which means spending time on a regular basis learning through practice sessions [Dan North 2012]. This training is a mix of

  • “Katas”, exercises done and repeated alone, with full concentration on the goal, over multiple sessions
  • and “Dojos”, training rooms where practitioners share through katas done as a group or through a focus on a specific topic to extend their design abilities

Katas are usually small coding problems, from very easy to mind-blowing, to be solved by making unit tests pass. The problem is stated and a framework is made available (a set of functions described by existing UTs, possibly within an environment in which the code can be run and tracked to share with a community). There are tons of katas available on multiple websites.

When practicing katas, the goal is actually not to solve the problem but to train in writing good code, basically in Red/Green/Blue mode (blue standing for the refactoring step), to start absorbing the 3 laws of TDD. More constraints can then be added as learning turns into habits and then reflexes, so that any coding convention can be adopted, even under constraints such as “Object Calisthenics” [Pissarra 2021] [Moustier 2019-1].

When dojoing, katas may take place:

  • individually with a retrospective at the end of the session
  • in Pair Programming [Williams 2003],
  • in Ping Pong mode [Bulgu 2020] - Dev A codes a UT, Dev B makes it green, refactors, writes a new UT and hands the keyboard back to Dev A,
  • in Mob Programming
  • or in anything else that would involve sharing between participants.

TDD comes in different styles:

  • Detroit school: the classic TDD approach, based on digging deep [Beck 2002]; this is a bottom-up approach [Henke 2018]
  • London school: a mocking-based approach [Freeman 2009]; this is a top-down approach
  • Uncle Bob’s Transformation Priority Premise (TPP): TDD is done in baby steps through a series of 12 progressive transformation types that make the code more generic [Martin 2013]
  • Kent Beck’s Test-Commit or Revert (TCR): along with the Red/Green/Refactor cycle and the 3 laws of TDD, any time the UTs pass, the code is committed to the code repository; if they do not, the code is reverted to the last committed version [Beck 2018]
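To give an idea of the London school style listed above, here is a minimal sketch using `unittest.mock` from the Python standard library. The `OrderService` class and its payment gateway collaborator are hypothetical names introduced for the example; the tests check the interaction with the collaborator rather than its real behaviour.

```python
from unittest.mock import Mock


class OrderService:
    """Hypothetical unit under test; it collaborates with a payment gateway."""

    def __init__(self, gateway):
        self._gateway = gateway

    def place(self, order_id, amount):
        # Orders with a non-positive amount never reach the gateway.
        if amount <= 0:
            return False
        return self._gateway.charge(order_id, amount)


def test_place_charges_the_gateway_once():
    gateway = Mock()  # stands in for a gateway that does not exist yet
    gateway.charge.return_value = True
    assert OrderService(gateway).place("A-1", 20.0) is True
    gateway.charge.assert_called_once_with("A-1", 20.0)


def test_place_rejects_non_positive_amounts_without_charging():
    gateway = Mock()
    assert OrderService(gateway).place("A-2", 0.0) is False
    gateway.charge.assert_not_called()
```

Working top-down this way, the gateway's interface is discovered from the needs of `OrderService` before any real gateway is written; a Detroit-style test would instead use a real or simple in-memory collaborator and assert on the resulting state.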

From this point on, it should be clear that TDD takes time to master, from apprentice to journeyman, with the help of some learning patterns [Hoover 2009]. However, TDD is good for agile since it demonstrates what works [Pettichord 2007], but UTs should be FIRST [Ottinger 2009]:

  • Fast: hundreds of UTs must run per second
  • Isolated: the origin of an issue must be obvious
  • Repeatable: UTs must be runnable independently, in any order, at any time
  • Self-validating: UT results should not require any interpretation
  • Timely: UTs must be written at the right time, the sooner the better, ideally before the code

This FIRST quality is especially valuable in a DevOps context since the automated delivery pipeline can be launched tens of times per day, possibly at each code commit.
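As a small illustration of the “Repeatable” and “Self-validating” criteria, the sketch below injects the current date instead of reading the system clock, so the tests give the same verdict at any time and on any pipeline run. It assumes Python; the `is_expired` function is hypothetical.

```python
from datetime import date


def is_expired(expiry, today):
    """Hypothetical function; 'today' is passed in rather than read from the clock."""
    return today > expiry


# Repeatable: fixed dates, no dependency on when or where the test runs.
# Self-validating: a plain pass/fail assertion, nothing to read or interpret.
def test_not_expired_on_the_expiry_day():
    assert is_expired(date(2024, 1, 31), today=date(2024, 1, 31)) is False


def test_expired_the_day_after():
    assert is_expired(date(2024, 1, 31), today=date(2024, 2, 1)) is True
```

A version calling `date.today()` inside `is_expired` would pass or fail depending on the day the pipeline runs, which is exactly what the criteria rule out.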

Even if TDD is hard, it has some advantages [pulkitagarwal03pulkit 2020]:

  • You only write code that’s needed
  • Your design is more modular
  • Your code is easier to maintain and easier to refactor
  • Your code is documented by UT
  • You debug less
  • Your code coverage is higher - although setting 100% code coverage as a goal is a pitfall [Solnica 2014]

However, TDD is not a silver bullet: it is a slow process, all Team members should adopt the approach, and UTs must be maintained when specifications change [pulkitagarwal03pulkit 2020].

TDD has some good practices that help in mastering it [Cigniti 2022]:

  • Avoid functional complexity
  • Focus on what you need to achieve
  • Maintain code austerity
  • Test repeatedly
  • Maintain code integrity
  • Improve application knowledge
  • Know when to use TDD - use plain old UT when the risk is low [Moustier 2019-1], and TDD when there is low confidence in the code

TDD imposes certain challenges [Kohl 2006]:

  • the testing strategy should not rely only on TDD and automated tests
  • the testing strategy should not rely on mocks (London school)
  • excessive TDD takes us away from good design
  • UTs have to be maintained
  • test suites can become cumbersome over time
  • writing GUI code with TDD is difficult

Agilitest’s standpoint on this practice

Testers should help TDD.

From a non-T-shaped point of view, often met among purely Functional Testers, Testers may wonder why they should sit beside a Developer to have a look at UTs… The thing is:

  • testing should be done as early as possible in the Software Development Life Cycle, which implies that someone, why not Testers, should promote this practice. TF clearly addresses this testing principle [ISTQB 2018]
  • UTs generate transparency: old-fashioned Testers are then able to assess the quality level of a release and know where to search, as per the “Defect Clustering” testing principle [ISTQB 2018]
  • practicing Gemba walks on the Developers’ shop floor provides tons of information
  • testing techniques known by experienced Testers can be shared with Developers to enhance their UTs and testing skills; in return, Testers get to know the code better; in the end, this contributes to the T-shape mindset

Most of the time, old-fashioned Testers will also object:

  • “coding is not my job”: this silo is clearly not an agile mindset; it also introduces bottlenecks in the delivery flow
  • “I can’t understand code”: such Testers may not be able to do a code review, but they can ask “Tell me, what does this UT do?” to spot flaws in the testing logic, or propose to apply testing techniques such as “Equivalence Partitioning” [Beizer 1990] [Otsuki 2012] [ISTQB 2015]
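As an example of what a Tester could bring to a Developer's UTs, here is a minimal sketch of Equivalence Partitioning applied to a unit test. It assumes Python; the `ticket_price` function, its age bands and its prices are hypothetical.

```python
def ticket_price(age):
    """Hypothetical pricing rule used only for illustration."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 12:
        return 5.0
    if age < 65:
        return 10.0
    return 7.0


# Values taken from each partition, including the edges between partitions.
PARTITIONS = [
    (0, 5.0), (11, 5.0),     # children: 0 to 11
    (12, 10.0), (64, 10.0),  # adults: 12 to 64
    (65, 7.0), (90, 7.0),    # seniors: 65 and above
]


def test_each_partition_has_its_price():
    for age, expected in PARTITIONS:
        assert ticket_price(age) == expected, f"age={age}"


# The invalid partition deserves its own test.
def test_negative_age_is_rejected():
    try:
        ticket_price(-1)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative age")
```

A Tester who cannot read the implementation can still review the `PARTITIONS` table and ask which class of inputs is missing.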

Knowing is not mastering: knowing TDD may not give Testers any legitimacy to teach this practice, but it gives them the chance to innovate and to help the development team skill up.

To go further

© Christophe Moustier - 2021