The Beleaguered Page Object Model

Note: I originally wrote this at the beginning of 2023, before I knew where my Xebia assignments would take me. Those assignments weren’t related to testing web applications, but they did keep me busy enough to forget to publish this post. It contains some good insight on how to architect web tests, so I decided to publish it here.

When I first joined Xebia, where many of my colleagues are involved in test automation with Cypress and, more recently, Playwright, I heard a lot of shade being thrown on the Page Object Model (confusingly for Java programmers, also referred to as POM).

Now, I vaguely recall the POM being a significant advance in writing Selenium tests 10 years ago. Most of my own browser automation experience is split equally between Cypress and TestCafe, the latter of which has explicit support for page object modelling. What happened? Why were POMs suddenly (to me) falling out of fashion?

When I brought this up with one of my esteemed colleagues, Robbert van Markus, he summarized his own reservations about applying POMs in these main points:

Testing Library Selectors are Clear(er)

Previous browser automation frameworks required the use of arcane CSS selectors to interact with the page. Hiding such selectors away in a POM behind a readable name made tests clearer.

Today, Testing Library style selectors are widely supported and are closer to the actual user experience and flow. They are inherently clearer when used directly in tests, which lowers the friction of introducing new tests for people not already familiar with the larger code base.

A more complete example follows, but for now compare the following (CSS style selector) which you would probably hide in a method on a POM:

cy.get("input[name='user']").type(username);

and a similar function you could write directly in the test itself:

cy.findByRole("textbox", { name: "username" }).type("user");

While the two may look comparable at first read, the first actually requires knowledge of how the page is structured, the tag that is used for the username field, etc. The second one uses a Testing Library style selector that can be written with little knowledge of the actual page.

Having “flat”, transparent tests where behavior isn’t abstracted away - particularly early in development - may also allow the tests to keep pace with a rapidly changing Application Under Test (AUT).

Testing Library Selectors are Robust

Testing Library selectors are already an abstraction over the implementation details of HTML and CSS. By selecting by “role”, label or other accessibility features, your tests are going to be significantly more robust by default, which lowers the need to abstract them into a common POM.

As a bonus, by using selectors which use accessibility features like roles, your tests will help teams write more ‘accessible’ code in the codebase as a whole - not just the tests! As a consequence, you are going to be able to provide early UX/accessibility feedback during development from TA/QA/Dev without additional accessibility experts. This particular type of early feedback is quite often underrated.

POMs can be Surprising Effort Sinks

Many an engineer in test has been tempted into over-engineering their page object models. This can take many forms, but one of the most common in practice is the pain of managing “state” between the actual screen and the POM classes. It gets even more complex as you try to integrate with asynchronous code and behaviors, such as those in Cypress where the actual execution of actions is deferred.
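
To make the deferred-execution problem concrete, here is a minimal sketch. `CartPage` and the `cart-count` test id are made up for illustration and do not come from any real application. The point is only that a page object method cannot hand back a plain value, because Cypress commands are queued when called and run later.

```javascript
// Hypothetical page object; the selector is an assumption for illustration.
class CartPage {
  // Tempting but wrong: cy.get() only enqueues a command and returns a
  // chainable, so the callback below has not run when we return.
  itemCountNaive() {
    let count = 0;
    cy.get("[data-testid='cart-count']")
      .invoke("text")
      .then((text) => {
        count = Number(text); // assigned later, when the queue runs
      });
    return count; // still 0 at this point
  }

  // Works with the queue: return the chainable and let the test assert on it.
  itemCount() {
    return cy.get("[data-testid='cart-count']").invoke("text").then(Number);
  }
}
```

A test would then write something like `new CartPage().itemCount().should("equal", 2)`. Every method that reads from the screen ends up returning a chainable in this way, which is where much of the extra modelling effort tends to go.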

Add to this that the concept of a page is much more fluid in modern applications than in the past - you’re going to find yourself moving behaviors between various “pages” much more often. Modern applications are more easily modeled around concepts of components and flows.
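
As a rough illustration of modelling a component rather than a page, the sketch below scopes Testing Library queries to a login form wherever it happens to be rendered. It assumes `@testing-library/cypress` is loaded and that the form has an accessible name such as "Log in". `withinLoginForm` and `fillCredentials` are hypothetical helpers, not part of any library.

```javascript
// Hypothetical helper: runs `actions` with all queries scoped to the login
// form, on whichever page or dialog that form is currently rendered.
// Assumes the <form> has an accessible name, e.g. aria-label="Log in".
function withinLoginForm(actions) {
  return cy.findByRole("form", { name: /log in/i }).within(actions);
}

// Hypothetical flow step built on top of the component helper.
function fillCredentials(username, password) {
  return withinLoginForm(() => {
    cy.findByRole("textbox", { name: /username/i }).type(username);
    cy.findByLabelText(/password/i).type(password);
  });
}
```

Because nothing here refers to a URL or a page class, the helper keeps working if the form moves from a dedicated login page into a modal.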

The More Complete Example

The POM way:

// LoginPage.js (a separate file)
class LoginPage {
  navigateTo() {
    cy.visit("/login");
  }

  enterUsername(username) {
    // binds to the input's name attribute; renaming 'user' to 'username'
    // in the markup breaks this test even though nothing changes
    // functionally for the user
    cy.get("input[name='user']").type(username);
  }

  enterPassword(password) {
    // probably the most used way; if the code is refactored and the
    // data-testid attribute no longer maps to the same element, the
    // test will fail again
    cy.get("[data-testid='password']").type(password);
  }

  submit() {
    // unclear selector; hard to refactor
    cy.get(".submit").click();
  }
}

// exported so the test file can require() it
module.exports = LoginPage;

// test
const LoginPage = require("./LoginPage");

describe("Login", () => {
  let loginPage;

  beforeEach(() => {
    loginPage = new LoginPage();
  });

  it("should allow a user to log in", () => {
    loginPage.navigateTo();
    loginPage.enterUsername("user1");
    loginPage.enterPassword("password1");
    loginPage.submit();
  });
});

The Testing Library Way:

describe("Login", () => {
  it("should allow a user to log in", () => {
    cy.visit("/login");

    // the following will use label element text, referenced node in
    // aria-labelledby attribute, aria-label attribute, etc or any
    // other 'accessible name'
    // see: https://www.w3.org/TR/accname-1.1/
    cy.findByRole("textbox", { name: "username" }).type("user");

    // could be more readable for form pages, the next will use label
    // text to find input; there's an implied assertion on the
    // existence/visibility of the label, plus the ability to use regex
    // can make it robust against text changes.
    cy.findByLabelText(/password/i).type("password");

    // Finally we're also asserting that there's a button like element
    // that contains the text "Submit" before clicking it.
    cy.findByRole("button", { name: /Submit/i }).click();
  });
});

My Take-Aways

What I have come to understand through this discussion is that while the Page Object Model is not necessarily bad, per se, it shouldn’t be the default architectural pattern for your tests either. At best it’s not providing much over “flat” tests; at worst you could end up spending more time getting the model to work in an asynchronous environment than writing something more valuable - the tests.

That said, there’s always a place for encapsulating behaviors into repeatable actions, such as the page actions model, or custom Cypress commands. Robbert also mentioned that there is a new feature in Cypress 12 to express application state through Custom Queries (see here for examples: [https://docs.cypress.io/api/cypress-api/custom-queries]) that may suggest other ways to architect tests.
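
A page action can be as small as a plain function that wraps the flat steps from the example above. The sketch below is one possible shape, assuming `@testing-library/cypress` is loaded. The name `logIn` and its default path are hypothetical.

```javascript
// Hypothetical page action: the login flow from the example above as one
// reusable step. No class and no stored state, only enqueued Cypress commands.
function logIn(username, password, path = "/login") {
  cy.visit(path);
  cy.findByRole("textbox", { name: "username" }).type(username);
  cy.findByLabelText(/password/i).type(password);
  // Return the last chainable so callers can keep chaining assertions.
  return cy.findByRole("button", { name: /submit/i }).click();
}
```

A test would call `logIn("user1", "password1")` directly. The same function could also be registered in a support file with `Cypress.Commands.add("logIn", logIn)`, which makes it available as `cy.logIn(...)`.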

My thanks go out to Robbert (and also Joël Grimberg) for the insights on modern browser test automation architectures and approaches - and most of the material for this blog post!