Living Documentation

Automated testing

Documentation is kept as close as possible to the code so that it stays up to date

Description

Documentation is an attempt to meet certain sharing needs depending on the context of the project. Among the factors that influence this need, the size of the project is major, as are regulatory needs or customer requirements. This need can also be diminished by the experience of the teams and the tools that allow knowledge to be transferred across space and time.

The Waterfall and Agility models have different attitudes toward documentation.

With the so-called “Rigorous Development Methodologies” [Clear 2016], generally promoted by the Waterfall model, documents act as "thought experiments" from which inconsistencies in configuration management emerge; unfortunately, this approach culturally leads to a pile of assumptions that are hastily verified near the end of the project. In fact, the underlying principle of a document-driven approach should be to produce documentation along the way, from something tangible in the hands of end-users, so that assumptions are tested as soon as possible [Clear 2016] [Moustier 2019-1].

Agility promotes interactions over documentation [Beck 2001], with as much rigor as needed [Rüping 2005] [Martraire 2019] and as much as the Team can cope with [Moustier 2019-2]. Agility is notably based on Peter Naur’s view of programming as theory building (see https://pages.cs.wisc.edu/~remzi/Naur.pdf): the programming team develops a jointly owned “theory of the world” that becomes frozen into software. Therefore, the code is the main artifact and primary source of documentation for the project [Clear 2016] [Beck 2004], along with an implicit way of sharing based on what has already been done: stigmergy [Martraire 2019], the scientific term for mimicking previous work.

While the term “documentation” arose at the end of the 19th century (see https://www.cnrtl.fr/etymologie/documentation), the concept was formalized in the book “Traité de Documentation - Le Livre sur le Livre - Théorie et Pratique”, published by Paul Otlet in 1934, which described this emerging discipline, basically as a way to handle bibliographical references. From this first formalization, cultural discrepancies appeared across countries, and the discipline eventually came to be named “Information Science” in English-speaking culture rather than remaining a simple list of documents [Ortega 2009]. This lack of consensus on the concept itself reflects its versatility and complexity.

Regarding the link between documentation and code, Knuth launched an initiative called "Literate Programming" (LP) out of the TeX project in the 1980s [Guravage 2000]. LP consists of a “way to expose and elucidate a program’s design by presenting its parts in an order and at a level of abstraction that places a premium on understanding”.

LP relies on three properties [Guravage 2000]:

  • a single literate program source should, when processed, produce both a runnable program and a nicely typeset document
  • a literate program must exhibit a flexible order of presentation
  • a literate program, and the tools that process it, should facilitate the automatic generation of cross-references, indices, and a table of contents
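
The first two properties can be illustrated with a minimal sketch. It assumes a toy chunk syntax (`<<name>>=` to open a chunk, `@` to close it) loosely inspired by noweb-style tools; the functions `tangle` and `weave` and the sample source are hypothetical and are not Knuth's implementation. One source yields both the runnable program and a readable document, and the chunks are presented in an order chosen for the reader rather than for the interpreter.

```python
import re

# Hypothetical literate source: prose interleaved with named code chunks.
# The "<<name>>=" ... "@" syntax is an assumption made for this sketch.
LITERATE_SOURCE = """\
We first explain the goal: print a greeting.

<<main>>=
<<define greet>>
print(greet("world"))
@

The helper is presented last, although the program needs it first.

<<define greet>>=
def greet(name):
    return "Hello, " + name
@
"""

CHUNK = re.compile(r"^<<(.+?)>>=\n(.*?)^@\n", re.DOTALL | re.MULTILINE)
REFERENCE = re.compile(r"^<<(.+?)>>$", re.MULTILINE)


def tangle(source, root="main"):
    """Build the runnable program by expanding chunk references from `root`."""
    chunks = dict(CHUNK.findall(source))

    def expand(name):
        # Only references that sit alone at the start of a line are expanded.
        return REFERENCE.sub(
            lambda match: expand(match.group(1)).rstrip("\n"), chunks[name]
        )

    return expand(root)


def weave(source):
    """Build a plain-text document: prose is kept, chunks become indented code."""

    def render(match):
        name, body = match.group(1), match.group(2)
        code = "".join("    " + line for line in body.splitlines(keepends=True))
        return f"Chunk '{name}':\n\n{code}"

    return CHUNK.sub(render, source)


if __name__ == "__main__":
    print(weave(LITERATE_SOURCE))   # the document
    exec(tangle(LITERATE_SOURCE))   # the program
```

A real LP tool would also produce the cross-references, indices, and table of contents mentioned in the third property; the sketch leaves them out.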

LP makes the code look like a book. Nowadays, this line of thought is mostly associated with the “Clean Code” approach [Martin 2008].

“Living Documentation” (LD) is a concept which has been widely promoted by Cyrille Martraire (https://www.youtube.com/watch?v=Tw-wcps7WqU), who traces its origins to the book “Specification by Example” by Gojko Adzic [Adzic 2011]. It is an answer to Gerald Weinberg’s quote:

Documentation is the castor oil of programming.
Managers think it is good for programmers and programmers hate it!

The core principles of LD are [Martraire 2019]:

  • Reliable by ensuring all documentation is accurate and aligned with the software being delivered, at any time.
  • Low-Effort by minimizing the amount of work to be done on documentation without any repetition of the knowledge
  • Collaborative by promoting conversations and knowledge sharing between involved people
  • Insightful by drawing attention to relevant parts and encouraging feedback and deeper thinking to aim at better decisions

Those principles imply several characteristics such as [Martraire 2019]:

  • A single source for the documents (e.g. in the code)
  • A reconciliation mechanism to extract data from many places (e.g. tests generated from BDD)
  • Conversations over Documentation
  • A collective ownership, which is leveraged by Product ideation in group
  • Embedded Learning: each document, even the code, is so good that newcomers can learn by reading it and by running related tests
  • and of course an automatically testable documentation.
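
One simple form of automatically testable documentation is an example embedded in the documentation and executed as a test. The sketch below uses Python's standard `doctest` module; the function `apply_discount` is a hypothetical example invented for the illustration.

```python
import doctest


def apply_discount(price, percent):
    """Return `price` reduced by `percent` percent.

    The examples below are documentation and tests at the same time:

    >>> apply_discount(200, 10)
    180.0
    >>> apply_discount(50, 0)
    50.0
    """
    return price * (100 - percent) / 100


if __name__ == "__main__":
    # Reports a failure as soon as a documented example stops matching the code.
    print(doctest.testmod())
```

If the implementation changes and the documented examples no longer hold, the run fails, which keeps the text and the code aligned.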

The “Conversations over Documentation” principle is obviously aligned with the Agile Manifesto [Beck 2001]. It is particularly relevant when the rate of change of knowledge is high [Martraire 2019]; moreover, it appears that after 1 year, less than 1% of documents are still used [Ortega 2009]. Discussions are also preferable when the topic is complex, and document stubs can be generated to convey the intended final content [Jan 2013].

Knowledge transfer is always more efficient in face-to-face sharing sessions [Moustier 2019-1].

This stub-based approach is easier when the people concerned are involved within the Team. The more asynchronous this involvement is with the Team, the more formal the documents need to be [Jan 2013].

Automation and tools are extensively integrated in LD. They help to generate documents from documentation shards in order to [Martraire 2019] [Hargis 2004] [Pelclová 2014] [Cockburn 2006]:

  • target intended audiences
  • make them accessible 
  • maintain relationships with other documents 

Automated documentation is part of the LD cycle along with [Martraire 2019]:

  • “No documentation” management - tacit knowledge sharing practices such as Pair Programming
  • Stable documentation - the “Solution Intent” from SAFe [SAFe 2021-47] illustrates this concern
  • Refactoring-Friendly documentation - Clean code, colocation, LP are helpers on this part of the cycle
  • Runtime documentation - notably with code annotations and DSL, visible calculations and tests
  • “Beyond documentation” - deals notably with learning and hygienic transparency

Automation inevitably requires tools. For instance, the toolchain may include:

  • A Cucumber-like tool, which BDD implies
  • Languages such as Java that support annotations, which can be used to build a glossary [Fauvel 2020]; such languages also offer a reflection mechanism to inspect the code from within the code itself
  • Graph renderers to visually acquire / check some information - Graphviz being often used to generate graphics
  • Document generators such as QDox, CukeDoctor, YeST, markup-based tools (DocBook, DITA, HTMLBook, AsciiDoc, Wiki, Markdown, …) or proprietary systems (Adobe, MadCap Flare, Confluence, …)
  • Orchestrators such as Jenkins, GitLab CI, Bamboo, Azure DevOps, which can be used to script the documentation generation automatically
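
The annotation-based glossary mentioned in the list above can be sketched in Python, where a class decorator plays the role of a Java annotation and introspection plays the role of reflection. This is not the code from [Fauvel 2020]: the decorator `glossary_term`, the registry `GLOSSARY_TERMS`, and the sample classes are assumptions made for the illustration.

```python
import inspect

# Hypothetical registry and decorator: illustrative names only,
# not part of any existing glossary tool.
GLOSSARY_TERMS = []


def glossary_term(cls):
    """Mark a domain class so that it appears in the generated glossary."""
    GLOSSARY_TERMS.append(cls)
    return cls


@glossary_term
class Invoice:
    """A request for payment sent to a customer after delivery."""


@glossary_term
class CreditNote:
    """A document that cancels all or part of a previously issued invoice."""


class InvoiceRepository:
    """Technical class: not marked, so it stays out of the glossary."""


def render_glossary(terms):
    """Build a plain-text glossary, sorted by term, from the class docstrings."""
    lines = []
    for cls in sorted(terms, key=lambda c: c.__name__):
        definition = inspect.getdoc(cls) or "(no definition yet)"
        lines.append(f"{cls.__name__}: {definition}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_glossary(GLOSSARY_TERMS))
```

Because the definitions live next to the classes they describe, renaming or removing a domain concept updates the glossary on the next generation, for example as a step run by the orchestrator.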

Now, the name of the game is “Find the proper tools and glue them to build documentation”.

This game involves many stakeholders, points of view, and levels of maturity. It cannot be the effort of a single person, notably because LD implies conventions imposed by the tools involved.

Impact on the testing maturity

Several authors state that each project decides what documentation is required [Ambler 2002] [Hoda 2012]. By extension, agility usually lets people manage documentation as self-organized teams, which is consistent with the decentralized decision-making principle.

Unfortunately, this approach introduces a cultural bias, as people get bored with completing their requirements specifications. Therefore, Teams highly depend on some senior management support as a guide in agile methods [Hoda 2012], which can be provided by a good Scrum Master and/or Product Owner. The management acts notably as a facilitator of interactions with third parties such as Ops Teams or Suppliers and Customers (both internal and external). This awareness of the need to care for the Team’s environment usually leads the Team to improve its Definition of Done (DoD).

To document the project efficiently, models and documents are usually produced [Ambler 2002]:

  • to communicate (processes, hand-offs, or handovers)
  • to understand, which may cover processes or any ideation part (marketing, specifications, architecture, or code)
  • or to support the thinking associated with each stage during the project

To provide some consistency between all those documents, aiming for a Ubiquitous Language is then key, so that the same vocabulary is found wherever the same context exists.

To gain some efficiency, it is possible to experiment with some patterns [Martraire 2019].

Engineering patterns:

  • “Creating a concrete example together, right now, during this meeting”: building documents in workshops avoids ping-pong mails, and every participant knows and agrees on their content - examples: 3 Amigos, Pair Programming, Mob Programming, Event Storming / Model Storming [Brandolini 2015] [Moustier 2020]
  • “Generate from static code”: code and build scripts bear some documentation, notably through a naming process [Belshee 2019] that increases the code readability as per the LP paradigm. On this basis, documents can be aggregated at build time, thus handling documents as code (see https://www.writethedocs.org/guide/docs-as-code/)
  • “Generate from executed code”: the source could be some workflow states [Fauvel 2020] or executed tests that generate some readable logs as a test report (see Sébastien Fauvel’s code on LD: https://sfauvel.github.io/documentationtesting/ and https://github.com/sfauvel/livingdocumentation)
  • “Provide valuable knowledge”: flexible and goal-oriented contents [Fauvel 2020] such as tutorials, how-to's, explanations, or a reference catalog, with parts that can be skipped to allow the "reader" to quickly reach unknown parts; highlighted differences from previous versions or related parts also address this pattern
  • “Low-Fidelity”: inviting people to add their inputs to consolidate the knowledge in a “just enough documentation” state of mind [Hoda 2012] [Jan 2013]
  • “Low-Tech”: fast media such as whiteboards, whose drawings can be captured with a photo, are more flexible than tool-based drawings, which eases changes - at some tipping point, tools become inevitable along with stable information
  • “Recording changes”: documenting change decisions on wikis with time-stamps and reconciliation mechanisms [Hoda 2012] to rebuild the final picture like the Event Sourcing mechanism [Fowler 2005] [Moustier 2020] or Architecture Decision Records (see https://github.com/joelparkerhenderson/architecture-decision-record)
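
The “Recording changes” pattern can be sketched as a replay of time-stamped decisions, in the spirit of Event Sourcing. The `Decision` record, its fields, and the sample log below are hypothetical; a real setup would read the records from a wiki or from Architecture Decision Record files.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Decision:
    """One time-stamped decision record; the fields are illustrative assumptions."""
    when: date
    topic: str
    choice: str


def current_picture(decisions):
    """Replay decisions in chronological order; the latest choice per topic wins."""
    picture = {}
    for decision in sorted(decisions, key=lambda d: d.when):
        picture[decision.topic] = decision.choice
    return picture


# Hypothetical log, deliberately stored out of chronological order.
LOG = [
    Decision(date(2021, 3, 1), "database", "PostgreSQL"),
    Decision(date(2021, 1, 10), "database", "MySQL"),
    Decision(date(2021, 2, 5), "message format", "JSON"),
]

if __name__ == "__main__":
    for topic, choice in current_picture(LOG).items():
        print(f"{topic}: {choice}")
```

The log keeps the full history of why things changed, while the replay gives the final picture without anyone having to maintain a separate summary by hand.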

Refactoring patterns:

  • “Bubble Context”: the Decentralize decision-making principle is applied to isolate some knowledge and protect some change
  • “Applying 5S”: keeps the documentation clean
  • “Applying DDD-like approach”: structures the documentation from both business and technical perspectives
  • “Archeology”: practicing archeology must be facilitated by maps of the ground made possible with annotations, tags, summaries, models
  • “Biodegradable Transformation”: once the transformation process is over, the document is removed
  • “Enforced Legacy Rules”: automating rules with deprecation mechanisms supports people’s efforts
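
As an illustration of “Enforced Legacy Rules”, the rule "do not use the old function any more" can be carried by the code itself through a deprecation warning. The decorator `deprecated` and the two sample functions are hypothetical; only the standard `warnings` and `functools` modules are used.

```python
import functools
import warnings


def deprecated(replacement):
    """Hypothetical decorator: warn callers that a legacy function is superseded."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated, use {replacement} instead",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorate


def compute_total_v2(prices):
    """Current implementation."""
    return sum(prices)


@deprecated("compute_total_v2")
def compute_total(prices):
    """Legacy entry point kept for existing callers."""
    return compute_total_v2(prices)
```

Turning such warnings into errors in the test suite is one way to make the rule enforced rather than merely written down.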

Sharing patterns:

  • “Visual facilitation”: drawings are faster to grasp than texts
  • “Yokoten”: transverse sharing practice; notably included in some retrospectives
  • “Maxims to spread principles”: guidelines worded as easily memorized punchlines to be naturally spread and repeated (e.g. “Don’t feed the monster”, “In Rome, do like the Romans do”, …)
  • “Highlights”: FAQs, digests from knowledge, promoting news, posters, information radiators, animations & events
  • “Spaced Learning”: repeated reminders help people build reflexes


Whatever your implementation of those patterns is, you will need to find opportunities to mix activities and do several things at the same time, such as:

  • coding along with documentation
  • providing (mostly with some code) testable documentation
  • testing both code and documentation
  • agreeing on the content with stakeholders
  • sharing the content with stakeholders

For instance, Sébastien Fauvel’s approach to LD is an attempt to merge these activities by using test scripts to generate some readable documentation; the documentation is then used as a reference to highlight discrepancies with newer versions [Fauvel 2020] [Fauvel 2022].
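
The general idea of comparing generated documentation with an approved reference can be sketched as follows. This is not Sébastien Fauvel's tooling: the function under test (`slugify`), the document layout, and the helper names are assumptions made for the illustration, and the comparison relies only on the standard `difflib` module.

```python
import difflib


def slugify(title):
    """Code under test: a hypothetical helper turning a title into a URL slug."""
    return "-".join(title.lower().split())


def generate_documentation():
    """Run the code on chosen examples and render the observed behaviour as text."""
    examples = ["Living Documentation", "  Clean   Code  "]
    lines = ["slugify examples", ""]
    for title in examples:
        lines.append(f"{title!r} -> {slugify(title)!r}")
    return "\n".join(lines) + "\n"


def check_against_reference(generated, approved):
    """Return a unified diff; an empty list means the behaviour is unchanged."""
    return list(
        difflib.unified_diff(
            approved.splitlines(),
            generated.splitlines(),
            fromfile="approved",
            tofile="generated",
            lineterm="",
        )
    )
```

In such a setup, the approved text would typically be reviewed once and stored in version control; a test then fails whenever `check_against_reference` returns a non-empty diff, and the diff itself tells readers what changed in the behaviour.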

Along with those patterns, you should also pay attention to some antipatterns such as:

  • “Human dedication”: dedicating specific people to documentation will lead Developers to get rid of this task
  • “Polished diagrams”: tools supporting standard notations such as UML or BPMN lead people to dig their diagrams deeper as they discover the features of the tool
  • “Redundant knowledge”: this leads to the obsolescence issue with deprecated content

Many other patterns and antipatterns can be found in [Martraire 2019]; knowing them is the first step toward avoiding pitfalls and trying some practices to start innovating in LD.

To go further

  • [Martraire 2019]: Cyrille Martraire - 2019 - “Living Documentation: Continuous Knowledge Sharing by Design” - isbn:9780134689418

Special thanks to Sébastien FAUVEL
who provided me with his view on LD
and some precious comments on this article.
© Christophe Moustier - 2021