Maintainability testing

Active testing

Reusability, modularity, stability, testability, ...

What is maintainability testing?

There are different types of maintenance [Deyin 2021]:

  • Responsive Maintenance or Corrective Maintenance
    - Planned or Scheduled Corrective Maintenance
    - Unplanned or Unscheduled Corrective Maintenance
  • Proactive Maintenance
    - Time-Based Maintenance (TBM) - based on a schedule
    - Predictive Maintenance (PDM) - based on probabilities and history
    - Failure Finding Maintenance (FFM) - based on regular testing on the product
    - Condition-Based Maintenance (CBM) - depends on the situation of the product
    - Risk-Based Maintenance (RBM) - based on the highest risks met

Maintainability aims to ease maintenance. According to ISO 25010, Maintainability Testing (MT) aims to raise the effectiveness and efficiency with which a product can be improved, adapted, updated or fixed by experienced maintainers, whether through dedicated maintenance activities or while the product is in use [ISO 25010]. Maintenance also includes preventive activities [Gao 2003].

To achieve this, maintainability testing usually addresses [ISO 25010]:

  • Modularity [IEEE 24765:2010], to limit the impact of maintenance and facilitate upgrades without disrupting the entire system
  • Reusability [IEEE 1517-2004], to avoid duplication and thus the repetitive maintenance operations it would cause
  • Analysability, to ease diagnoses
  • Modifiability, to ease changes. Delivering parts independently introduces some potential conflicts, but the flexibility brought by modularity proves more efficient because it avoids a monolithic process and eases scalability. At the same time, with component-based software, modularity turns maintainability into a critical asset [Gao 2003]
  • Testability [IEEE 24765:2010], to check hypotheses and thus enable analyses

Figure: Testability aspects to ease analysis when code is not available [Moustier 2020]

The components introduced by modularity expose interfaces. Provided that good testability has been built in, these interfaces are used for unit tests and also for integration tests that check how well the dependencies work together. Test Harnesses make these integration tests easier.
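As an illustration, here is a minimal sketch of a test harness that drives a component through its interface only. The names `PaymentGateway`, `StubGateway`, `Checkout` and `run_harness` are hypothetical and do not come from the cited sources; the stub stands in for a dependency whose real implementation (or source code) may not be available.

```python
from __future__ import annotations

from typing import Protocol


class PaymentGateway(Protocol):
    """Interface exposed by a (hypothetical) dependency."""

    def charge(self, amount_cents: int) -> bool: ...


class StubGateway:
    """Harness-side stand-in: records calls instead of reaching a real service."""

    def __init__(self, accept: bool = True) -> None:
        self.accept = accept
        self.calls: list[int] = []

    def charge(self, amount_cents: int) -> bool:
        self.calls.append(amount_cents)
        return self.accept


class Checkout:
    """Component under test: it only knows the interface, not the implementation."""

    def __init__(self, gateway: PaymentGateway) -> None:
        self.gateway = gateway

    def pay(self, amount_cents: int) -> str:
        if amount_cents <= 0:
            return "rejected"
        return "paid" if self.gateway.charge(amount_cents) else "declined"


def run_harness() -> dict[str, str]:
    """Drive the component through its interface and collect the outcomes."""
    return {
        "accepted": Checkout(StubGateway(accept=True)).pay(1200),
        "declined": Checkout(StubGateway(accept=False)).pay(1200),
        "invalid": Checkout(StubGateway()).pay(0),
    }
```

Because `Checkout` depends only on the interface, the same harness can later be pointed at the real dependency to turn these unit-level checks into integration tests.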

To build MT campaigns, it is important to quickly locate the test cases that deal with the components impacted by maintenance. Another drawback of modularity is that the source code may not be available for analysis; hence a strong need for testability, so that maintainers can use the available technical means and are at least aware that such means exist.
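Locating the test cases tied to impacted components can be sketched with a simple traceability catalogue. This is only an assumption about how such a mapping might be stored; `CaseRecord`, `CATALOGUE` and `select_tests` are hypothetical names, and the component names are made up for the example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CaseRecord:
    """One test case and the components it exercises (hypothetical structure)."""

    name: str
    components: frozenset[str]


# Illustrative catalogue; a real one would come from a test management tool.
CATALOGUE = [
    CaseRecord("test_login_ok", frozenset({"auth"})),
    CaseRecord("test_checkout_total", frozenset({"cart", "pricing"})),
    CaseRecord("test_invoice_pdf", frozenset({"billing", "pricing"})),
]


def select_tests(catalogue, impacted):
    """Return the names of the test cases touching at least one impacted component."""
    impacted_set = set(impacted)
    return [case.name for case in catalogue if case.components & impacted_set]
```

For instance, `select_tests(CATALOGUE, ["pricing"])` keeps the checkout and invoice tests and leaves the login test out of the campaign.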

The inevitable modularity paradigm provides three types of modules [Gao 2003]:

  • Commercial Off-The-Shelf (COTS) components
  • In-house-built components
  • Newly generated components

The category into which a component falls will help to define the MT Strategy regarding the context in which maintenance happens:

  • when maintenance happens in an unplanned context, due to a component upgrade or failure, the selected test cases should aim at a simple recovery of the product to keep an acceptable Mean Time To Recovery (MTTR)
  • when preparing the newest release of a component before it is deployed, “exhaustiveness” may be targeted to ensure, as far as possible, a “painless” deployment
  • to proactively spot issues and perform maintenance before Users do, regular maintenance may be planned to monitor the health of the system

Another factor that helps to define the appropriate strategy is the available testability means, which may include the source code, some documentation, tools and some skills on the component under maintenance.

From these three items (the component category, the maintenance context and the available testability means), appropriate testing techniques can be selected to build an MT Strategy that aims for a low Error Budget for the system and supports its reliability.
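To make MTTR and Error Budget concrete, here is a minimal sketch based on common definitions that are assumed here rather than taken from the cited sources: MTTR as the average recovery duration over a set of incidents, and the Error Budget as the downtime allowed by an availability objective over a period. The function names and the figures in the usage note are illustrative.

```python
def mean_time_to_recovery(recovery_minutes):
    """Average recovery duration, in minutes, over the recorded incidents."""
    if not recovery_minutes:
        raise ValueError("at least one incident is needed to compute an MTTR")
    return sum(recovery_minutes) / len(recovery_minutes)


def error_budget_minutes(availability_objective, period_minutes):
    """Downtime allowed over the period by the availability objective (assumed definition)."""
    if not 0.0 <= availability_objective <= 1.0:
        raise ValueError("the availability objective must be between 0 and 1")
    return (1.0 - availability_objective) * period_minutes


def remaining_budget_minutes(availability_objective, period_minutes, recovery_minutes):
    """Error budget left once the recorded downtime is subtracted (may be negative)."""
    budget = error_budget_minutes(availability_objective, period_minutes)
    return budget - sum(recovery_minutes)
```

With a 99.9% objective over 30 days (43,200 minutes), the budget is about 43.2 minutes; two incidents recovered in 10 and 20 minutes give an MTTR of 15 minutes and leave about 13.2 minutes of budget.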

Figure: A Few Factors for a Maintenance Test Strategy

Impact on the testing maturity

Some products also include hardware parts. Maintaining those parts requires a material workflow, from the field to the maintenance lab. In Consumer Electronics, the procedure is usually named “Return Merchandise Authorization” (RMA); it avoids returning devices that can easily be fixed locally, which reduces the impact of returns.

This RMA process hinges on different support levels: very simple instructions that any End User can follow, then experienced or Power Users who may perform some fixes locally, until the return escalates to your R&D labs. This level-based maintenance approach requires built-in quality [SAFe 2021-27]. The testability that goes along with built-in quality enables diagnosis at every level.

The issues detected in the field should be collected to analyze those incidents and learn from the End Users. Therefore, when an RMA process is engaged, a Root Cause Analysis (RCA) should be done to see whether some preventive actions are needed, up to a product recall, in order to avoid general dissatisfaction, severe damage and lawsuits.

This proactive MT can be rather costly, in both money and reputation; therefore, good component traceability helps to limit the number of returned units or component parts to analyze. This standard example shows some aspects of MT; there are as many aspects as there are contexts, in line with the testing principle that testing is context dependent [ISTQB 2018].

There is, though, a common ground to all the contexts that products will face, because any product has a backlog, some code, documentation and test cases. All of those should be taken care of as parts of a supply chain, with some logistics organized around them to ease the flow of the delivery pipeline. This general maintenance approach is named “5S” and consists of:

  • Sorting things to find them quickly
  • Setting things to get them ready for immediate use
  • Sweeping and cleaning things to remove any malfunctioning part, so that no time is lost fixing a thing when it should be used
  • Standardizing things with conventions, processes and responsibilities
  • Sustaining the system with self-discipline

As a heuristic, the 5S scope could be limited to the parts that are most critical, most subject to change or most error prone. Either way, such actions should be included in the course of the Sprint to maintain the assets and keep them ready for intensive use, especially whenever there is an issue in production.

This “be ready” mindset is something you need to institutionalize, especially for systems that are launched only once in a while. In such situations, rehearsals are necessary to keep the knowledge alive. The knowledge may seem obvious when Engineers have just released the solution, but after a couple of years it may no longer be there - keep in mind that some COBOL stacks are still running on legacy servers and are still maintained decades later, while people move on…

In some business models, such as the automobile industry, maintenance needs to be profitable. The manufacturer sells cars to End Users and licenses to Garage Owners; the license promises reliable profitability. Under those circumstances, the system testability is industrialized so it can be taught to Licensees to keep the End Users’ satisfaction high.

Now it is clear that Maintenance, Reliability, MTTR and Error Budget are so intertwined that each one facilitates the others. This situation leads:

  • to have everything ready for immediate corrective actions, which is facilitated by principles such as the SOLID principles, the 12 Factors and plenty of architectural principles that help to know where things should be located, as promoted by 5S
  • to build a system that is resilient to unexpected situations, possibly by deliberately disrupting the system, to move towards Jidoka, limit human interactions and apply the “Pets vs Cattle” paradigm [Bias 2016]
  • to take care of any technical debt

This approach leads to the “Reconfigurable Manufacturing System” (RMS) [Finn Miller 2017], which is based on six core characteristics:

  • Modularity
  • Integrability
  • Customized flexibility
  • Scalability
  • Convertibility
  • Diagnosability

A manufacturing system that possesses these characteristics responds faster to unpredictable events, such as sudden market demand changes or unexpected machine failures. RMS relies on five principles [Finn Miller 2017]:

  1. The RMS functionality is rapidly adaptable to handle new needs
  2. Scalability built by design
  3. Parts modularization by design
  4. Configurability
  5. Means are cost-effective to respond to unexpected events

All of these practices should be introduced bit by bit, through small enablers [SAFe 2021-23], as part of each Sprint planning. Again, the Developers are responsible for the MT, but the Product Owners are accountable for it. This “baby steps” approach is a classic way of growing an environment that enables MT, and Jidoka in particular, by gradually and deliberately disrupting the system.

Agilitest’s standpoint on this practice

Test scripts are assets that deserve maintainability. Applying RMS characteristics and principles to test scripts guarantees:

  1. Valid test means with flaky tests
  2. Large test campaigns
  3. Tests ranging from parts to assemblies, to enable progressive End-to-end testing as the product is assembled, and to facilitate Shift Left Testing and easy-to-maintain scenarios
  4. Script reuse, notably with test data
  5. Test script engineering and management that reduce the holding costs [SAFe 2021-06] and keep test maintenance cost-effective
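Script reuse with test data can be sketched as a single script replayed over several data rows. This is a generic illustration and not Agilitest's own mechanism or API; `login`, `load_rows`, `run_script` and the `LOGIN_DATA` table are hypothetical.

```python
import csv
import io

# Illustrative data table; in practice it would live in a separate CSV file.
LOGIN_DATA = """username,password,expected
alice,correct-horse,ok
alice,wrong,denied
,anything,denied
"""


def login(username, password):
    """Hypothetical system under test."""
    return "ok" if (username, password) == ("alice", "correct-horse") else "denied"


def load_rows(text):
    """Parse the CSV text into one dictionary per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def run_script(rows):
    """Replay the same script on every row; return the 1-based indexes of failing rows."""
    failures = []
    for index, row in enumerate(rows, start=1):
        if login(row["username"], row["password"]) != row["expected"]:
            failures.append(index)
    return failures
```

Adding a new case then means adding a data row, while the script itself, the part that costs the most to maintain, stays untouched.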

© Christophe Moustier - 2021