In assembly plants, manual assembly lines handle incidents thanks to “Andon”, an alert mechanism that calls for immediate action from people. In this context, the Line Manager supports incident management, or the issue is escalated to the Support Team, first to solve it as fast as possible and then to fix the root cause in a second step. When the line is automated, the Line Operator watches for failing robots and fixes incidents.
Instead of fixing units, robots are upgraded to handle new situations without alerting the Line Operator.
This automation that fixes issues by itself is named “Jidoka”, a Japanese word (自働化) built from the ideograms:
- 自: oneself
- 働: work
- 化: change
which can be translated as “Autonomation”, a portmanteau of “Autonomy” and “Automation”. Jidoka is used to improve robots by detecting each situation and adapting robot behavior with the appropriate handling.
Jidoka matters most in a CI/CD context because False Positive (FP) issues from the test scripts stop the integration pipeline and require the Developer to fix the issue, just like the Line Operator in an automated plant.
In fact, FPs account for 72% of the issues raised, mostly due to script issues (46%) [Philipp 2019].
Suppose the overall test script reliability reaches 99.7%; if the pipeline embeds 400 scripts that fail independently, its reliability would be only 30% (99.7%^400); therefore, the pipeline will fail 70% (100%-30%) of the time, and 50% of all runs (70%*72%) fail because of FPs.
With 700 scripts, the overall reliability drops to 12%, with 63% of runs failing on FPs.
But if script reliability is improved to 99.9%, the reliability of 700 tests rises to 49.6%, with 36% of FPs; this shows how much script reliability matters. Jidoka is a way to improve it because it aims to make scripts more resilient to every situation.
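The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: it assumes that scripts fail independently of each other and that the 72% FP share from [Philipp 2019] applies uniformly; the function names are illustrative.

```python
def pipeline_reliability(script_reliability: float, script_count: int) -> float:
    """Probability that every script passes, assuming independent scripts."""
    return script_reliability ** script_count


def fp_failure_rate(script_reliability: float, script_count: int,
                    fp_share: float = 0.72) -> float:
    """Share of pipeline runs that fail because of a false positive.

    fp_share is the 72% figure quoted from [Philipp 2019].
    """
    failure_rate = 1.0 - pipeline_reliability(script_reliability, script_count)
    return failure_rate * fp_share


# The three scenarios discussed in the text
for reliability, count in [(0.997, 400), (0.997, 700), (0.999, 700)]:
    passes = pipeline_reliability(reliability, count)
    fails_on_fp = fp_failure_rate(reliability, count)
    print(f"{count} scripts at {reliability:.1%}: "
          f"pipeline passes {passes:.1%}, fails on FP {fails_on_fp:.1%}")
```

Changing `script_reliability` by a fraction of a percent moves the pipeline figure by tens of points, which is the effect the paragraph describes.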
Impact on the testing maturity
There are different types of Jidoka techniques adapted to test scripts automation that can be explored:
- Retry-based Jidoka
- Testability-based Jidoka
- Poka Yoke-based Jidoka
- KPI-based Jidoka
The Retry-based Jidoka commonly seen is the dumbest testing approach: imagine you face an unexpected behavior; would you only retry your scenario, or would you rather run some deeper tests and try workarounds?
Dumb-retrying a scenario is also time and resource consuming, which slows down the pipeline and actually only protects against system lag. It would be smarter to test internal parts to adapt to known situations, as per Jidoka. In fact, the dumb-retry technique reflects how little we know about the testability of the invoked parts.
Finally, dumb-retry based scripts are not reliable and provide non-deterministic results [Axelrod 2018].
While some testing schools [Pettichord 2007] would prefer providing a fail status instead of doing further investigation, it only leads to the FP issue seen above. It is then better to try to achieve the test objective, which is the basic idea of Jidoka.
Therefore, the “Retry” technique should be turned into “try workarounds from detected situations” instead of the dumb-retry, and eventually raise the number of alternative paths to learn where your weak points are regarding the need for Jidoka.
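As an illustration of “try workarounds from detected situations”, here is a minimal Python sketch. Everything in it is hypothetical: the `StepFailed` exception, the situation labels and the `run_with_workarounds` helper are made up for the example and do not come from any testing tool.

```python
from typing import Callable, Dict, List


class StepFailed(Exception):
    """Raised by a test step; carries a (hypothetical) situation label."""

    def __init__(self, situation: str):
        super().__init__(situation)
        self.situation = situation


def run_with_workarounds(step: Callable[[], None],
                         workarounds: Dict[str, Callable[[], None]],
                         max_attempts: int = 3) -> List[str]:
    """Run `step`; on a known situation, apply its workaround and try again.

    Returns the situations that were worked around so they can be tracked.
    An unknown situation is re-raised at once instead of being retried.
    """
    handled: List[str] = []
    for _ in range(max_attempts):
        try:
            step()
            return handled
        except StepFailed as failure:
            workaround = workarounds.get(failure.situation)
            if workaround is None:
                raise  # unknown situation: fail and let a human look at it
            workaround()
            handled.append(failure.situation)
    raise StepFailed("workarounds exhausted")


# Usage: a step blocked by a popup that a workaround knows how to close
state = {"popup_open": True}


def click_submit() -> None:
    if state["popup_open"]:
        raise StepFailed("cookie_popup")


def close_popup() -> None:
    state["popup_open"] = False


print(run_with_workarounds(click_submit, {"cookie_popup": close_popup}))
```

Unlike a dumb-retry, the second attempt only happens after something has changed, and the returned list tells you which alternative paths were needed.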
Testability-based Jidoka: As seen with retry techniques, the more you can dig into testing subparts, the more you will be able to know about their configurations and statuses to spot situation changes and define the appropriate behavior to avoid FP.
Testability covers different aspects:
- extrinsic testability: covers obvious and visible testing means
- intrinsic testability: covers available testing means from subparts
- technical testability: available technical interfaces from which tests can be done
- social testability: the sharing of “how to test this” is also part of testability, since means may exist but be unknown to the Tester
If the testability is poor, it will be difficult to interpret results since feedback will not show whether there is something wrong with the system or whether the situation was simply misjudged. This results in a need to take actions to counteract ineffective control, which adds workload [Bainbridge 1983]. This is why you should look for testability means to automate, in order to detect situations and guess the appropriate workaround to improve your Jidoka.
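One way to picture this is a script that probes subparts before deciding what a failure means. The sketch below is an assumption about how such probes could be wired together; the probe names and the `classify_failure` helper are hypothetical.

```python
from typing import Callable, Dict


def classify_failure(probes: Dict[str, Callable[[], bool]]) -> str:
    """Interpret a failed test from the state of the probed subparts.

    Each probe returns True when its subpart is in the expected state.
    """
    unhealthy = [name for name, probe in probes.items() if not probe()]
    if unhealthy:
        return "environment issue (likely FP): " + ", ".join(sorted(unhealthy))
    return "subparts healthy: suspect the feature under test"


# Usage with stubbed probes; real ones would query the subparts themselves
probes = {
    "database": lambda: True,
    "auth_service": lambda: False,
}
print(classify_failure(probes))
```

The more subparts can be probed, the fewer failures stay in the “cannot tell” zone that forces a Developer to investigate by hand.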
Jidoka could also be based on Poka-Yokes. Poka Yoke-based Jidoka, as an extension of the retry-based Jidoka, may be progressively introduced, from the simple mitigation of an issue to its elimination.
From these components, KPIs may be computed from retrievable information to accurately spot unreliable scripts and enable Jidoka improvements.
You may then:
- count the number of failures (FPs and bugs) to guess, in a Bayesian-like approach [3Blue1Brown 2019], whether an FP is highly probable
- highlight script stability based on FPs and the age of the script - old and untouched features are less prone to FPs
- count the basic scenarios and workarounds involved to see how stable the situations are
These figures will
- help decision making from the test results to focus the monitoring of the most relevant tests
- provide a health status on the script robustness - each workaround and FP found can be tracked to tell how dodgy each script is to start measuring your overall reliability.
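The Bayesian-like count mentioned above could look like the following sketch. It uses a Beta prior centered on the 72% figure from [Philipp 2019]; the prior weight of 10 past failures is an arbitrary assumption, and the function name is illustrative.

```python
def fp_probability(fp_count: int, bug_count: int,
                   prior_fp: float = 0.72, prior_weight: float = 10.0) -> float:
    """Estimate the probability that the next failure of a script is an FP.

    This is the posterior mean of a Beta prior updated with the script's
    history. The prior behaves like `prior_weight` past failures, of which
    a share `prior_fp` were false positives (both values are assumptions).
    """
    alpha = prior_fp * prior_weight + fp_count
    beta = (1.0 - prior_fp) * prior_weight + bug_count
    return alpha / (alpha + beta)


# A script with no history falls back to the prior
print(f"{fp_probability(0, 0):.2f}")
# A script whose failures were mostly real bugs is trusted more
print(f"{fp_probability(0, 10):.2f}")
```

Scripts with a high estimate are candidates for Jidoka improvements, while failures of scripts with a low estimate deserve immediate attention.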
This approach aims to prevent issues in testing robots:
- at ideation time:
  - Introducing Andons in the product (Logs, issues in the Backlog, Alerts for immediate actions)
  - Upgrading the Design to disable issues
  - Providing Test Harnesses to enable fast and reliable feedback
- at run time: detecting/preventing issues in production (within the CI/CD pipeline)
- at any time:
  - Source inspections & retrospectives
  - Automate everything you can, including monitoring
As a start, only sanity tests are launched in the CI. The sanity suite should cover most features, but just a bit of each of them [Axelrod 2018] because it’s easier to maintain. Permutations and edge cases are eventually run in a different pipeline. However, only the combination of all those components will help reach the 100% reliability bar provided that Retrospectives are involved.
This layered loop structure was described by Chris Argyris under the name “Double Loop Learning” [Argyris 1977] [Smith 2001]. This component of the PanTesting avoids narrowing an outcome (a product, a robot, or whatever) to its simple compliance with provided goals by introducing directions and feedback at a higher level, eventually with longer feedback cycles as with Lean Startup, OKRs or ATDD/TDD. Double loop learning provides a mix of short and middle to long term objectives.
In a nutshell, Flaky Tests should lead you to build solid automated test scripts in the course of the Sprint. This challenge is actually the one Teams face:
« Tell me how you test, I will tell you how agile you are! »
Agilitest’s standpoint on this practice
Agilitest is mainly a dumb-retry based solution. However, since it is a #nocode scripting approach, the number of FPs due to scripts is lower than with code-based scripting. Moreover, Agilitest embeds some heuristics to ease widget finding and reduce the number of FPs in case of GUI changes.
Agilitest also features REST API interfaces to probe subparts, provided that Scripters promote testability or have access to testability means.
The counterpart of #nocode is that Poka Yoke is easier on the production code, since there is no branching instruction except on the browser type. This reinforces the need to hold retrospectives on FPs to interact closely with Developers and eventually merge the development / test automation ecocycles.
To go further
- [3Blue1Brown 2019] : 3Blue1Brown - DEC 2019 - “Bayes theorem, the geometry of changing beliefs” - https://www.youtube.com/watch?v=HZGCoVF3YvM
- [Argyris 1977] : Chris Argyris - SEP 1977 - “Double Loop Learning in Organizations” - Harvard Business Review - https://hbr.org/1977/09/double-loop-learning-in-organizations
- [Axelrod 2018] : Arnon Axelrod - 2018 - “Complete Guide to Test Automation: Techniques, Practices, and Patterns for Building and Maintaining Effective Software Projects” - ISBN: 9781484238318
- [Bainbridge 1983] : Lisanne Bainbridge - 1983 - “Ironies of Automation” - https://www.sciencedirect.com/science/article/abs/pii/0005109883900468?via%3Dihub
- [Moustier 2020] : Christophe Moustier - OCT 2020 - “Conduite de tests agiles pour SAFe et LeSS” - ISBN: 978-2-409-02727-7
- [Pettichord 2007] : Bret Pettichord - MAR 2007 - “Schools of Software Testing” - https://www.prismnet.com/~wazmo/papers/four_schools.pdf
- [Philipp 2019] : Ingo Philipp - “How to Reduce False Positives in Software Testing” - https://www.tricentis.com/wp-content/uploads/2019/01/How-to-Reduce-False-Positives-in-Software-Testing-white-paper.pdf
- [Smith 2001] : Mark K. Smith - 2001 (updated in 2005) - “Chris Argyris: theories of action, double-loop learning and organizational learning” - www.infed.org/thinkers/argyris.htm