Master the Art of Test Writing: A Guide to Crafting Tests like Shakespeare

Keep your automated tests readable with the Screenplay pattern

What is the problem? Why should I read this?

Complex automated tests – especially UI tests – tend to lack a certain code quality. At least that’s my personal experience over the last 10 years.

A lot of automation code I read was very detailed, leaving me guessing at the overall intention. Other examples were very abstracted, forcing me to dig deep into the code to find the root cause of a failure.

Overall, the code was quite hard to change, to refactor and sometimes even to understand, because so many things were either implicitly connected or duplicated.

I’m not talking about code other people wrote. I’m talking about my own code here. For a long time I struggled with the right abstraction for my test code.

PageObjects are not necessarily a solution here.

First, they only apply to page-like applications. It would be awkward (though not impossible) to test an API or a CLI application using that pattern. Even modern single-page applications require some creativity to apply PageObjects.

Secondly, the abstraction is typically bound to page boundaries, and that’s not necessarily the right cut. Sometimes an application puts us through a multi-page workflow. Sometimes a single page contains several sub-applications supporting very different workflows.

Many BDD frameworks allow us to write tests at any abstraction level, but the underlying code might still lack organization and debuggability.

UserObjects

So for some time I experimented with tools, frameworks and abstraction levels. One thing I found very helpful was to stick to the users’ language: automating workflows in a way that product owners could understand, at least at the top level. I learned that this made test code very stable.

My own approach for some time was to create a “UserObject”, a representation of a user, rather than a representation of a page. That UserObject provided methods to do certain tasks a real user could do. Tools required to do these tasks – e.g. a Selenium WebDriver or an HttpClient object – could simply be kept in fields. Some context – like credentials, email addresses etc – could also be fields.

Within the task methods I still used PageObjects for letting my users navigate the application.

So I had my code in three logical layers:

the tests were written in a use-case-like language,
the UserObject’s methods were in the language of the application and
the PageObjects were in the language of the UI/HTML.

This allowed me to avoid code duplication and made tests quite easy to read and write.
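The three layers can be sketched roughly as follows. This is a minimal sketch and not my original code: ShopUser, LoginPage, LoginUseCase and FakeBrowser are hypothetical names, and FakeBrowser is an assumed stand-in for a real tool such as a Selenium WebDriver, so the example runs without a browser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a real tool such as a Selenium WebDriver.
// It only records what was done, so the sketch runs without a browser.
class FakeBrowser {
    final List<String> log = new ArrayList<>();

    void type(String field, String text) {
        log.add("type " + field + "=" + text);
    }

    void click(String element) {
        log.add("click " + element);
    }
}

// Layer 3: a PageObject in the language of the UI/HTML.
class LoginPage {
    private final FakeBrowser browser;

    LoginPage(FakeBrowser browser) {
        this.browser = browser;
    }

    void enterUsername(String username) { browser.type("username", username); }
    void enterPassword(String password) { browser.type("password", password); }
    void submit() { browser.click("login"); }
}

// Layer 2: the UserObject in the language of the application.
// Tools and context are simply kept in fields.
class ShopUser {
    private final FakeBrowser browser;
    private final String username;
    private final String password;

    ShopUser(FakeBrowser browser, String username, String password) {
        this.browser = browser;
        this.username = username;
        this.password = password;
    }

    void logIn() {
        LoginPage page = new LoginPage(browser);
        page.enterUsername(username);
        page.enterPassword(password);
        page.submit();
    }
}

// Layer 1: the test in use-case language.
class LoginUseCase {
    static List<String> run() {
        FakeBrowser browser = new FakeBrowser();
        ShopUser alex = new ShopUser(browser, "alex", "p4ssw0rd");
        alex.logIn();
        return browser.log;
    }
}
```

The test layer only knows that a user logs in. Field names and buttons stay inside the PageObject.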

Huge UserObjects

This all worked quite nicely, but the UserObject tended to become huge. Every use case of the application eventually became a method in the object, making it a very long class in which it was easy to lose track of what was already implemented.

Another problem was the granularity of tasks. Sometimes I wanted to do step A, then B, then C, so I created a method doABC. But then I needed a variation in that flow (e.g. A → B’ → C). So I ended up with doA, doB, doC and doB’ plus doABC and doAB’C – all methods of the UserObject and hence hard to keep track of, as they all sat on the same code level.

Finding the right place

While the model looks quite logical in theory, in practice it was quite hard to decide where to put certain logic. For example: login could be a method on the LoginPageObject, but also one on the UserObject.

OK, we should not call PageObjects from the test. But should the searchForItem method on the UserObject return a list of WebElements?

When you think about it, every semantic functionality a single page provides could ultimately be described as a task method on the UserObject.

Analyzing failures

Another problem with the approach was analyzing failures.

Apart from actual regressions, where the system under test is to blame, the root cause of a failing test could be on any layer:

Sometimes the UI will have slight changes: a button has been renamed, a div has a new ID, an input’s name has changed… All of this should be found and fixed in the PageObjects.
In other cases the application’s details have changed: there’s a confirmation popover at the end of the checkout, the search bar has a new auto-completion feature… that stuff needs to be handled in the UserObject. We need to adjust task methods to handle the additional behavior.
It is also possible that the actual use cases have been altered in a more fundamental way: registering a new user requires entering a date of birth, deleting an item no longer requires a confirmation, but can be undone… things like that should most likely be reflected on the test level.

The one additional layer – compared to PageObjects only – might be worth the effort, but in my experience it is a costly investment.

Sharing code

Eventually sharing code and opening the code base to contributions from different teams became a requirement for the test framework.

Sadly, UserObjects are very hard to extend in a shared code base. Of course, a team could extend the UserObject and add the functionality it was missing for its own code base. However, letting these extensions flow back into the shared code base was problematic, as teams often added a lot of special cases that were of no value to other teams, while bloating the code base and adding code duplication.

On the other hand, keeping extensions local would probably cause equivalent solutions to be implemented by different teams.

Screenplay?

After several iterations of UserObjects I became aware of the Screenplay pattern and eventually checked out the documentation and the original article “Page Objects Refactored” to understand how this is different from UserObjects and how it solves the problems I found.

Actors with Abilities

The central entity in Screenplay is the Actor. Similar to a UserObject, an Actor represents a user of the application. In contrast to a UserObject, an Actor does not directly hold tools as fields, but a list of Ability objects, which basically wrap the actual tools.

For example, there might be a BrowseTheWeb Ability which simply provides a Selenium WebDriver object. There could also be a CallRestApis Ability which holds an OkHttpClient.

The Actor implements methods to access Abilities by their type.

    Actor alex = new Actor("Alex")
        .can(new BrowseTheWeb(BrowserType.CHROME));

    WebDriver webDriver = alex
        .uses(BrowseTheWeb.class)
        .getWebDriver();
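How an Actor might store and look up Abilities by type can be sketched like this. It is a minimal sketch under assumptions, not the code of any particular Screenplay library: the map-based lookup is one possible design, and TakeNotes is a hypothetical Ability that replaces BrowseTheWeb so the example needs no Selenium.

```java
import java.util.HashMap;
import java.util.Map;

// Marker interface for everything an Actor can be equipped with.
interface Ability {}

// Hypothetical Ability; a real one would wrap a tool such as a WebDriver.
class TakeNotes implements Ability {
    private final StringBuilder notebook = new StringBuilder();

    StringBuilder getNotebook() {
        return notebook;
    }
}

class Actor {
    private final String name;
    // Assumption: at most one Ability instance per concrete Ability type.
    private final Map<Class<? extends Ability>, Ability> abilities = new HashMap<>();

    Actor(String name) {
        this.name = name;
    }

    // Registers the Ability under its concrete class; returns this for chaining.
    Actor can(Ability ability) {
        abilities.put(ability.getClass(), ability);
        return this;
    }

    // Looks the Ability up by type and fails loudly if the Actor lacks it.
    <A extends Ability> A uses(Class<A> type) {
        Ability ability = abilities.get(type);
        if (ability == null) {
            throw new IllegalStateException(
                name + " does not have the ability " + type.getSimpleName());
        }
        return type.cast(ability);
    }
}
```

With this in place, new Actor("Alex").can(new TakeNotes()).uses(TakeNotes.class) hands back the same TakeNotes instance, while asking an Actor for an Ability it was never given fails with a readable message instead of a NullPointerException.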

Performing Tasks

Actors also don’t implement any task methods, but provide one method, which takes a Task object instead. A Task only implements one method, “performAs”, which takes an Actor as argument.

The Actor simply calls the performAs method of the given Task, passing itself as argument. Within the Task, the Actor instance can be used to get its Abilities to perform the interactions with the application.
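The hand-over between Actor and Task is small enough to sketch in full. This is an illustration under assumptions: the journal field is a hypothetical stand-in for real Abilities, and WriteNote is a made-up Task, both chosen so the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// A Task knows how to perform itself when handed an Actor.
interface Task {
    void performAs(Actor actor);
}

class Actor {
    private final String name;
    // Hypothetical journal standing in for real Abilities.
    private final List<String> journal = new ArrayList<>();

    Actor(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    List<String> getJournal() {
        return journal;
    }

    // The Actor has no task methods of its own; it only hands itself to the Task.
    void does(Task task) {
        task.performAs(this);
    }
}

// Hypothetical Task used only for illustration.
class WriteNote implements Task {
    private final String text;

    WriteNote(String text) {
        this.text = text;
    }

    @Override
    public void performAs(Actor actor) {
        actor.getJournal().add(actor.getName() + ": " + text);
    }
}
```

Because the Actor never needs to know what a Task does, new Tasks can be added without touching the Actor class at all.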

    alex.does(new Login("alex", "p4ssw0rd"));

    class Login implements Task {
        String username;
        String password;

        // …

        @Override
        public void performAs(Actor actor) {
            // Fetch the browser from the Actor's BrowseTheWeb Ability
            WebDriver webDriver = actor
                .uses(BrowseTheWeb.class)
                .getWebDriver();

            // Open the application, fill in the credentials and submit
            webDriver.get("http://parabank.parasoft.com/");
            webDriver.findElement(By.name("username"))
                .sendKeys(username);
            webDriver.findElement(By.name("password"))
                .sendKeys(password);
            webDriver.findElement(By.name("login"))
                .click();
        }
    }

At first I was very tempted to use PageObjects on this level again to structure my code further. This is strongly discouraged by the inventors of Screenplay, who advocate putting technical details – like waiting logic and element selectors – right there in the Task.

Obeying this rule ultimately solved two problems:

There was no question where to put interactions. They could only go into a Task.
An interaction was always immediately tied to intent and context. Analyzing failures became much easier due to that.

The simple rule also allows optimization for the exact purpose of one task. E.g. when searching items, it is necessary to wait for the search field to be clickable, but when logging in, we can ignore it completely.

Composite Tasks

In some cases, Tasks may become quite complex. Consider the checkout in a shop application. It might include the following Tasks:

Entering a delivery address,
providing valid payment details,
confirming the checkout.

While there are certainly tests that require alterations to that process – like using a special delivery address, or different payment types – there might also be tests that only want a happy-path checkout to happen in one step.

In such cases, we can create a Checkout task, which calls other Tasks. Such macro/composite Tasks should ideally not contain any additional interactions, but only call other tasks, so they don’t become another failure analysis layer.
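A composite Checkout Task along these lines could look like the sketch below. The three sub-Task class names are hypothetical, and the Actor's performed list is an assumed stand-in for real interactions so that the order of execution becomes visible.

```java
import java.util.ArrayList;
import java.util.List;

interface Task {
    void performAs(Actor actor);
}

class Actor {
    // Hypothetical log standing in for real interactions with the application.
    final List<String> performed = new ArrayList<>();

    void does(Task task) {
        task.performAs(this);
    }
}

class EnterDeliveryAddress implements Task {
    @Override
    public void performAs(Actor actor) {
        actor.performed.add("enter delivery address");
    }
}

class ProvidePaymentDetails implements Task {
    @Override
    public void performAs(Actor actor) {
        actor.performed.add("provide payment details");
    }
}

class ConfirmCheckout implements Task {
    @Override
    public void performAs(Actor actor) {
        actor.performed.add("confirm checkout");
    }
}

// Composite Task: no interactions of its own, only delegation to other Tasks.
class Checkout implements Task {
    @Override
    public void performAs(Actor actor) {
        actor.does(new EnterDeliveryAddress());
        actor.does(new ProvidePaymentDetails());
        actor.does(new ConfirmCheckout());
    }
}
```

A test that needs a different payment type can skip Checkout and call the three Tasks itself, swapping only the one that differs.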

Asking Questions

Quite similar to Tasks are Questions. The only real difference between the two concepts is that a Question returns a value, so we can find out about the state of the system under test.

    class LoggedInState implements Question<Boolean> {
        @Override
        public Boolean answerAs(Actor actor) {
            final var webDriver = actor.uses(BrowseTheWeb.class)
                .getWebDriver();

            // Only reads the page; the state of the application is not changed
            return webDriver.findElement(By.linkText("Log Out"))
                .isDisplayed();
        }
    }

In theory, a Question can do the exact same things as a Task. However, even if there’s no way to enforce it, answerAs should limit its actions to the minimum necessary to find the needed information. Reading some information on the current page is perfect. Navigating to another page is probably also fine, as Tasks and Questions should usually not expect to be on a specific page. Changing the inner state of the system under test is definitely nothing that belongs in a Question!
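How an Actor might ask a Question can be sketched in the same way as for Tasks. The method name checks is an assumption made for this example, and FakeSession is a hypothetical stand-in for the system under test, so no browser is needed.

```java
// Like a Task, but answering returns a value of type T.
interface Question<T> {
    T answerAs(Actor actor);
}

// Hypothetical stand-in for the system under test.
class FakeSession {
    private boolean loggedIn;

    void logIn() {
        loggedIn = true;
    }

    boolean isLoggedIn() {
        return loggedIn;
    }
}

class Actor {
    private final FakeSession session = new FakeSession();

    FakeSession getSession() {
        return session;
    }

    // Hypothetical method name: the Actor hands itself to the Question
    // and passes the answer on to the caller.
    <T> T checks(Question<T> question) {
        return question.answerAs(this);
    }
}

// Only reads state; it never changes the system under test.
class LoggedInState implements Question<Boolean> {
    @Override
    public Boolean answerAs(Actor actor) {
        return actor.getSession().isLoggedIn();
    }
}
```

In a test the answer then feeds directly into an assertion, for example by checking that alex.checks(new LoggedInState()) is true after the login Task.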

Remembering Facts

One extension to the original concept of Screenplay that I found extremely useful is Facts. Facts are very simple data objects that Actors keep in their individual memory and that can be accessed by Tasks and Questions – quite similar to Abilities.

For example, payment details or postal addresses are rather complex data that need to fulfill certain criteria. Putting such data as fields into a Task or a Question can be quite cumbersome.

Actor ryan = new Actor("Ryan")

.learns(PostalAddress.DEFAULT_NEW_YORK);

PostalAddress address = ryan

.remembers(PostalAddress.class);

That way Tasks and Questions can implicitly rely on an Actor’s memory, instead of having all that data stuffed into their own fields.
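The Actor's memory can be sketched much like the Ability lookup. This is a minimal sketch under assumptions: the Fact marker interface and the map keyed by type are one possible design, and the address values are made-up sample data.

```java
import java.util.HashMap;
import java.util.Map;

// Marker interface for data an Actor can keep in memory.
interface Fact {}

// Simple immutable data object; the address values are made-up sample data.
final class PostalAddress implements Fact {
    static final PostalAddress DEFAULT_NEW_YORK =
        new PostalAddress("1 Example Street", "New York", "10001");

    final String street;
    final String city;
    final String zipCode;

    PostalAddress(String street, String city, String zipCode) {
        this.street = street;
        this.city = city;
        this.zipCode = zipCode;
    }
}

class Actor {
    private final String name;
    // Assumption: at most one Fact per concrete Fact type.
    private final Map<Class<? extends Fact>, Fact> memory = new HashMap<>();

    Actor(String name) {
        this.name = name;
    }

    // Stores the Fact under its concrete class; returns this for chaining.
    Actor learns(Fact fact) {
        memory.put(fact.getClass(), fact);
        return this;
    }

    // Retrieves a Fact by type and fails loudly if it was never learned.
    <F extends Fact> F remembers(Class<F> type) {
        Fact fact = memory.get(type);
        if (fact == null) {
            throw new IllegalStateException(
                name + " does not remember a " + type.getSimpleName());
        }
        return type.cast(fact);
    }
}
```

A Task that enters a delivery address then needs no constructor arguments. It simply calls actor.remembers(PostalAddress.class) inside performAs.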

Conclusion

After migrating a huge test library from pure PageObjects via UserObjects to Screenplay, I’d definitely say that Screenplay is by far the most appropriate way of organizing such test code.

Finding the right place for logic has become much easier, but still isn’t trivial. We still need to think a lot about where one Task ends and another begins. But at least changes are much better isolated.

While even the simplest version of the UserObject became quite large, every Task and Question in the new version fits on a single screen.

Analyzing failures is still not a fun job, but as Tasks and Questions always provide us with some degree of context and intention, a lot of previously strange effects became quite understandable. Stack traces now actually provide us with semantic information and are no longer full of technical details.

Also contributing to the test suite became quite easy for the teams, as they now can simply provide their own Tasks, which can – but don’t need to – rely on other Tasks.