Scripting tests to automate testing is not enough because
- scripting is only about test cases while test cases are not testing [Bach 2014]
- if your scripts introduce extra complexity to your solution, they will be put aside and the quality of the solution will decrease
- testing is inevitable and the Agile/DevOps approach advocates massive automation to compensate for manual testing
- to reduce the complexity of test scripts, their engineering must be “FIRST”.
FIRST is an acronym which stands for [Martin 2008]:
- “Fast”: tests should run quickly so that feedback is fast and running the test scripts easily becomes part of the development activity; moreover, when integrated into a pipeline, keep in mind that the whole CI/CD process should last less than 10 minutes
- “Independent”: Tests should not depend on each other. You should be able to run each test independently and run the tests in any order. This is probably why some authors name this characteristic “Isolated” [Ottinger 2009] or even “Hermetic test pattern” or “Test is an Island Pattern” [Kovalenko 2014]
- “Repeatable”: Tests should be repeatable any time, in any environment
- “Self-Validating”: The tests should have a boolean output. No interpretation should be required, either they pass or fail.
- “Timely”: The tests need to be written in a timely fashion, the sooner the better, even before production code
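As an illustration, here is a minimal Python sketch of what FIRST-style unit tests could look like. The functions `is_adult` and `pick_winner` are hypothetical and exist only for this example; they are not taken from any cited source. "Timely" cannot be shown in code: it only means these tests would be written before, or together with, the production functions.

```python
import random
from datetime import date


def is_adult(birth_date: date, today: date) -> bool:
    """Hypothetical production function: True when the person is at least 18."""
    years = today.year - birth_date.year
    # Subtract one year if the birthday has not occurred yet this year.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years >= 18


def pick_winner(candidates: list[str], rng: random.Random) -> str:
    """Hypothetical production function: draws one candidate with the given generator."""
    return rng.choice(candidates)


# Fast: pure in-memory logic, no network, database or sleep.
# Independent: each test builds its own data and shares no state with the others.
# Self-validating: a bare assert passes or fails, nothing has to be interpreted.
def test_adult_on_18th_birthday():
    assert is_adult(date(2000, 5, 17), today=date(2018, 5, 17))


def test_minor_the_day_before_18th_birthday():
    assert not is_adult(date(2000, 5, 17), today=date(2018, 5, 16))


# Repeatable: the current date and the random generator are injected,
# so the outcome does not depend on when or where the test runs.
def test_draw_is_reproducible_with_same_seed():
    candidates = ["ann", "bob", "cid"]
    first = pick_winner(candidates, random.Random(42))
    second = pick_winner(candidates, random.Random(42))
    assert first == second
```

Passing `today` and `rng` as parameters is the design choice that matters here: a test that reads the system clock or an unseeded generator may pass today and fail tomorrow.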
FIRST differs from “Test First” which is an XP practice [Beck 2004]. In fact, since “Timely” is part of FIRST, Developers should practice “Test First” and then TDD on a regular basis.
FIRST is not the only way to think about improving your testing approach.
For instance, Andrew Hunt proposed another acronym a few years before FIRST: “Good tests are A TRIP” [Hunt 2003]:
- Automatic: tests should be run systematically and (possibly) automated
- Thorough: not only major test cases should be run, but also corner cases
- Repeatable: if run several times, the result should remain unchanged
- Independent: tests do not rely on others
- Professional: the quality of the tests must be as high as possible for the team, given the state of the art - Community of practice and Software Craftsmanship practices should leverage professionalism inside the Team
There is also “FAST” [Ferguson 2017] [Moustier 2019-1]:
- Fast: to deliver fast, tests must be fast
- Actionable: failing tests can be isolated and handled for fixing
- Scalable: the more tests you have, the more work you would have, unless they are easy to maintain, notably when the application changes and false positives arise. Changes should be as smooth as possible.
- Trustworthy: Above all, you should have good trust in your tests; otherwise, the perceived added value of test scripts will drop and test automation will be abandoned
Those proposals address the speed of the delivery process, including testing, and the quality of the underlying test scripts at scale.
Impact on the testing maturity
Even in simple contexts, handling test automation can become tricky when there is a huge number of tests; the thing is, software development implies complex contexts [Snowden 2007]. When a massive number of test scripts is combined with a complex environment, FIRST is not enough: you have to “Keep your tests clean” [Martin 2008].
The problem is that tests must change as the production code evolves. The dirtier the tests, the harder they are to change. As a consequence, they become less and less usable and ultimately generate complaints [Martin 2008]. Mature Teams care about this because, without a reliable test suite, a Team cannot ensure that changes to one part of the system will not break other parts; immature Teams let the scripts rot, then a “Broken window” effect settles in and technical debt increases. If you let the tests rot, then your code will rot too. The so-called “Clean Test” approach from Robert C. Martin advocates that test scripts be kept flexible, maintainable, and reusable so that they can adapt to production code changes [Martin 2008].
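One common way to keep tests easy to change is to centralize the construction of test objects in a small helper, so that a change in the production code is absorbed in one place instead of in every test. The sketch below assumes a hypothetical `Order` class and a hypothetical 10% VIP discount rule; neither comes from [Martin 2008].

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    """Hypothetical production class used only for this illustration."""
    customer: str
    items: list[tuple[str, float]] = field(default_factory=list)
    vip: bool = False

    def total(self) -> float:
        subtotal = sum(price for _, price in self.items)
        # Assumed business rule: VIP customers get a 10% discount.
        return round(subtotal * 0.9, 2) if self.vip else round(subtotal, 2)


def an_order(**overrides) -> Order:
    """Test helper: builds a valid default order; a test overrides only what it cares about."""
    defaults = {"customer": "any customer", "items": [("book", 20.0)], "vip": False}
    defaults.update(overrides)
    return Order(**defaults)


def test_regular_customer_pays_full_price():
    assert an_order().total() == 20.0


def test_vip_customer_gets_ten_percent_discount():
    assert an_order(vip=True).total() == 18.0
```

If `Order` later gains a mandatory field, only `an_order` has to change; the tests keep stating their intent and nothing else.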
The theoretical difference between scripting and coding is that scripting languages do not require a compilation step and are interpreted instead [GeeksforGeeks 2022] [softwaretestinghelp 2022]. Actually, coding is a genre and scripting is a subgenre [Morris 2022]. This implies treating scripts as code, making them understandable, and then adopting the Domain-Specific Language (DSL) that fits the context. Gherkin-based tools bridge the gap between the DSL at User Story (US) level and a development language, while Unit Tests at code level “speak” the language of Developers, as per Domain-Driven Design. The leap between those levels of abstraction and domains can be addressed, for instance, by combining ATDD and TDD. This link between macro and micro has been addressed under the term “double loop learning” [Argyris 1977] [Smith 2001]. It enables focussing the organization on the customer and uses testing as a glue. Having FIRST tests eases double loop learning thanks to fast and deterministic feedback.
However, when tests become more macroscopic, they are harder to handle, notably with respect to test data, which usually impedes repeatability. Available tools automate many aspects of testing, such as test management, load testing and test scripting, and contribute to test automation, but the intrinsic limits of tools act as a glass ceiling for FIRST testing at a certain level.
Because automation involves some code, we may then see how the Object-Oriented Design SOLID [Martin 2008] [Metz 2015] principles could be adapted to tests:
- Single-Responsibility Principle (SRP): "There should never be more than one reason for a class to change" - the design should help you to have one sole reason to update your asset; therefore you should have one single assert per Test and one single concept per test [Martin 2008]
- Open-Closed Principle (OCP): "Software entities ... should be open for extension, but closed for modification": applied to test scenarios, it would be better to add tests instead of updating them, introducing then “immutable tests”, especially when your software solution also implies immutable components. This paradigm will lead you to let tests live as long as a feature is in production and remove them from test suites when the feature is decommissioned
- Liskov Substitution Principle (LSP): "Functions that use pointers or references to base classes must be able to use objects of derived classes without knowing it" - when possible, the test must be independent from the implementation details and environment parameters such as data or infrastructure - they should work regardless of the context, as much as possible
- Interface Segregation Principle (ISP): "Many client-specific interfaces are better than one general-purpose interface" - tests should rely on interfaces with 1 assert per test; this paradigm will provide many interface-specific tests, but then there is a need to introduce a feedback loop at a higher level as per the “double loop learning” principle that would fit levels of abstraction with appropriate DSL changes
- Dependency Inversion Principle (DIP): "Depend upon abstractions, [not] concretions." - tests should involve the appropriate levels of abstraction with appropriate tools to ease handling (e.g. Gherkin-based tools for US Acceptance Criteria and an xUnit framework at code level)
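Two of these ideas can be sketched in a few lines of Python: one concept per test (SRP) and depending on an abstraction so that the test is independent from the environment (LSP/DIP). The names `RateProvider`, `PriceConverter` and `FixedRates` are hypothetical and chosen only for this illustration.

```python
from typing import Protocol


class RateProvider(Protocol):
    """Abstraction the production code depends on (hypothetical)."""
    def rate(self, currency: str) -> float: ...


class PriceConverter:
    """Hypothetical production class: converts euro prices using any RateProvider."""
    def __init__(self, rates: RateProvider) -> None:
        self._rates = rates

    def from_eur(self, amount: float, currency: str) -> float:
        if amount < 0:
            raise ValueError("amount must not be negative")
        return round(amount * self._rates.rate(currency), 2)


class FixedRates:
    """Test double: fixed in-memory rates, so no network or environment is involved."""
    def __init__(self, table: dict[str, float]) -> None:
        self._table = table

    def rate(self, currency: str) -> float:
        return self._table[currency]


# One concept per test: conversion and input validation are checked separately,
# so each test has a single reason to fail.
def test_converts_using_provided_rate():
    converter = PriceConverter(FixedRates({"USD": 2.0}))
    assert converter.from_eur(10.0, "USD") == 20.0


def test_rejects_negative_amount():
    converter = PriceConverter(FixedRates({"USD": 2.0}))
    try:
        converter.from_eur(-1.0, "USD")
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative amount")
```

Because `PriceConverter` only knows the `RateProvider` abstraction, the same tests keep working whether the real provider is a web service, a file or a database.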
Last but not least, tests must take great care of False Positives, especially when quality is based on an automation strategy. For this, the Lean pillar named “Jidoka” must be part of the test automation strategy [Moustier 2022].
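One possible reading of Jidoka applied to test automation is to detect abnormal tests automatically and surface them for a human decision instead of silently retrying them. The sketch below is an assumption about how this could be done, not the method described in [Moustier 2022]: a hypothetical `classify` helper reruns a test several times and flags it as flaky when the outcomes differ.

```python
from typing import Callable


def classify(test: Callable[[], None], runs: int = 5) -> str:
    """Runs a test several times and reports 'pass', 'fail' or 'flaky'."""
    outcomes = set()
    for _ in range(runs):
        try:
            test()
            outcomes.add(True)
        except AssertionError:
            outcomes.add(False)
    if outcomes == {True}:
        return "pass"
    if outcomes == {False}:
        return "fail"
    # Mixed outcomes: the test itself cannot be trusted and needs human attention.
    return "flaky"


def make_intermittent_test() -> Callable[[], None]:
    """Builds a deliberately unstable test that fails on every second call."""
    calls = {"count": 0}

    def test() -> None:
        calls["count"] += 1
        assert calls["count"] % 2 == 1, "fails on every second call"

    return test


def stable_test() -> None:
    assert 1 + 1 == 2


if classify(make_intermittent_test()) == "flaky":
    # In a pipeline, this is where the line would stop or the test be quarantined.
    print("flaky test detected: investigate before trusting the suite")
```

The intermittent test is simulated with a counter so that the example stays deterministic; a real flaky test would typically depend on timing, shared data or the environment.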
Therefore, your (automated) tests must be
However, such an approach can be really tricky when
- the amount of tests becomes huge
- it is technically difficult to handle because of non-form-based tests with complex Graphical User Interfaces or when there is no technical interface to rely on
- changes on the solution occur faster than test engineering does
Agilitest’s standpoint on this practice
Agilitest recommends the FIRST practice to all its Users. Since the tool facilitates robust test scripts, some Customers tend to generate very long test cases (up to 3000+ steps!) with multiple validation points. But when it comes to traceability and requirement coverage, producing KPIs becomes difficult, and repeatability then relies on strong test data management, not to mention the self-validating and timely characteristics.
Agilitest provides a folder-based structure with subscripts. It is therefore possible to group scripts by level of abstraction, such as business level (process) and implementation detail level (screens), to make the scripts more readable, provided that the sub-scripts are named consistently with their level of abstraction to ensure self-explanatory test scripts.
The immutability question is also addressed with test data sets that are actionable even at subscript level to ensure independence. Since those data can be in JSON or CSV format, they can possibly be generated by third parties to enable the data migration management required by application version changes. To reinforce immutability, Agilitest generates scripts in text format, which easily enables configuration management and baseline management to match production code versioning in your DevOps pipeline.
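The general idea of driving an unchanged script with external CSV or JSON data sets can be sketched in plain Python. This is not Agilitest's script or data format: the data, the `sign_in` function and the helper names below are all hypothetical and only illustrate the principle.

```python
import csv
import io
import json

# Hypothetical data sets; in practice they would live in files versioned with the scripts.
CSV_DATA = """login,password,expected
alice,correct-horse,accepted
alice,wrong,rejected
"""
JSON_DATA = '[{"login": "bob", "password": "s3cret", "expected": "accepted"}]'

KNOWN_USERS = {"alice": "correct-horse", "bob": "s3cret"}


def sign_in(login: str, password: str) -> str:
    """Hypothetical system under test."""
    return "accepted" if KNOWN_USERS.get(login) == password else "rejected"


def load_rows() -> list[dict[str, str]]:
    """Merges the CSV and JSON data sets into one list of test rows."""
    rows = list(csv.DictReader(io.StringIO(CSV_DATA)))
    rows.extend(json.loads(JSON_DATA))
    return rows


def run_sign_in_checks(rows: list[dict[str, str]]) -> list[str]:
    """Returns a description of each failing row; an empty list means every row passed."""
    failures = []
    for row in rows:
        actual = sign_in(row["login"], row["password"])
        if actual != row["expected"]:
            failures.append(f"{row['login']}: expected {row['expected']}, got {actual}")
    return failures


assert run_sign_in_checks(load_rows()) == []
```

The checking logic never changes when a new case is needed; only a row is added to the data set, which is what makes it possible for third parties to generate the data.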
To go further
- [Argyris 1977] : Chris Argyris - SEP 1977 - “Double Loop Learning in Organizations” - Harvard Business Review - https://hbr.org/1977/09/double-loop-learning-in-organizations
- [Bach 2014]: James Bach & Aaron Hodder - APR 2014 - “Test Cases Are Not Testing: Towards a Culture of Test Performance” - https://www.satisfice.com/download/test-cases-are-not-testing
- [Beck 2004] : Kent Beck & Cynthia Andres - FEB 2004 - “Extreme Programming Explained: Embrace Change” - isbn:9780321278654
- [Ferguson 2017] : John Ferguson - “Behaviour Driven Development at the heart of any DevOps transformation story” - http://johnfergusonsmart.com/wp-content/uploads/2017/07/bdd-at-the-heart-of-devops.pdf
- [GeeksforGeeks 2022] : GeeksforGeeks - APR 2022 - https://www.geeksforgeeks.org/whats-the-difference-between-scripting-and-programming-languages/
- [Hunt 2003] : Andrew Hunt & David Thomas - 2003 - “Pragmatic Unit Testing: In Java With JUnit” - isbn:9780974514017
- [Kovalenko 2014] : Dima Kovalenko - 2014 - “Selenium Design Patterns and Best Practices” - isbn:9781783982714
- [Martin 2008] : Robert C. Martin - 2008 - “Clean Code a Handbook of Agile Software Craftsmanship” - isbn:9780359582266
- [Metz 2015] : Sandi Metz - MAR 2015 - “GORUCO 2009 - SOLID Object-Oriented Design” - https://www.youtube.com/watch?v=v-2yFMzxqwU
- [Morris 2022] : Scott Morris - c2022 - “What’s The Difference Between Scripting And Coding?” - https://skillcrush.com/blog/coding-vs-scripting/
- [Moustier 2019-1] : Christophe Moustier - JUN 2019 - “Le test en mode agile” - isbn:978-2-409-01943-2
- [Moustier 2022] : Christophe Moustier - APR 2022 - “My Flaky Test is Agile” - https://youtu.be/Ptg5NICosNY?t=2701
- [Ottinger 2009] : Tim Ottinger & Jeff Langr - c2009 - “F.I.R.S.T” - http://agileinaflash.blogspot.com/2009/02/first.html
- [Smith 2001] : Mark K. Smith - 2001 (updated in 2005) - “Chris Argyris: theories of action, double-loop learning and organizational learning” - www.infed.org/thinkers/argyris.htm
- [Snowden 2007] : David J. Snowden & Mary E. Boone - NOV 2007 - “A Leader’s Framework for Decision Making” - https://hbr.org/2007/11/a-leaders-framework-for-decision-mak
- [softwaretestinghelp 2022] : softwaretestinghelp - MAY 2022 - “Scripting Vs Programming: What Are The Key Differences” - https://www.softwaretestinghelp.com/scripting-vs-programming/#Benefits_Of_Scripting_Language