Test helpers for React Testing Library.
rtl-utils has four value-adds on top of standard RTL:
- Automatic waiting to avoid tedious/boilerplate waitFors
- Quick & easy element lookup by data-testids
- Wrapping N number of context providers around the component-under-test
- Slightly smarter, more ergonomic actions like click, type, etc.
In standard RTL, tests must use waitFor after every async action (e.g. a GraphQL request on page load, or an API request fired by a button click) to ensure that any async behavior, like useEffects or React 18's async rendering, has finished before asserting against the updated DOM.
With rtl-utils, we wanted tests that are both as robust and as succinct as possible, and so we "auto wait" in two places:
- The initial render, e.g. await render will auto-wait for async behavior to settle, and
- After specific actions that produce async behavior, e.g. await clickAndWait(button)
Unlike waitFor, which requires knowing a test-specific condition to keep polling for, our "automatic waiting" infrastructure does not poll and is 100% generic across tests. Instead, our wait (and the various ...AndWait action methods) watches for async behavior to "stop happening", then assumes the DOM must be stable and lets the test continue.
To determine when async behavior has stopped, rtl-utils requires teaching any "async causing" behavior in your tests, e.g. GraphQL operations, to pass its promises to rtl-utils's addToWaitQueue. An example is a MockLink from our rtl-apollo-utils project:
import { FetchResult, Observable, Operation } from "@apollo/client";
import { MockLink } from "@apollo/client/testing";
import { addToWaitQueue } from "@homebound/rtl-utils";

/** Subclass the MockLink so we can hook up `request` to the `rtl-utils` wait queue. */
class RtlMockLink extends MockLink {
  public request(operation: Operation): Observable<FetchResult> {
    const observer = super.request(operation);
    let resolve: any;
    // Tell `render` to wait until this operation is done
    addToWaitQueue(operation.operationName, new Promise((_resolve) => (resolve = _resolve)));
    // Pass resolve twice b/c we want to resume on either success or error
    observer.subscribe(resolve, resolve);
    return observer;
  }
}
This setup allows rtl-utils to detect multiple, cascading GraphQL requests (or other follow-on async behavior), so every test can reliably, "for free", wait for async behavior to settle before moving on.
Granted, instrumenting your async behavior to call addToWaitQueue requires an up-front investment, but the upshot is that, once you do that, all your tests benefit and can be completely agnostic about how/when async behavior has settled (aside from knowing when to use click vs. clickAndWait).
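To make the "wait until async behavior stops happening" idea concrete, here is a minimal sketch of how such a wait queue could work. It is not rtl-utils's actual implementation: the `pending` set and `waitForQueueToDrain` are hypothetical names, and only the `addToWaitQueue(name, promise)` call shape is taken from the example above.

```typescript
// Hypothetical sketch of a wait queue; not rtl-utils's actual implementation.
const pending = new Set<Promise<unknown>>();

/** Registers in-flight async work; `name` is only a debugging label here. */
function addToWaitQueue(name: string, promise: Promise<unknown>): void {
  pending.add(promise);
  // Remove the entry on either success or failure.
  const done = (): void => {
    pending.delete(promise);
  };
  promise.then(done, done);
}

/** Resolves once nothing is pending, including work queued by earlier work. */
async function waitForQueueToDrain(): Promise<void> {
  // Loop instead of awaiting once, so a follow-on request that was
  // registered while we were waiting gets waited on as well.
  while (pending.size > 0) {
    const snapshot = Array.from(pending);
    await Promise.all(snapshot.map((p) => p.catch(() => undefined)));
  }
}
```

The loop is what makes cascading requests work: nothing polls the DOM, the queue is simply re-checked each time the current batch settles. A real implementation would likely also need to let React flush its updates between iterations.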
Note that you can still call the non-waiting version of actions (i.e. click rather than clickAndWait) to test & observe loading states, and "you should", but pragmatically most of our tests assume the loading infra is handled for them; instead of re-testing it in every test, the majority of our tests focus on the happy-path, post-loaded behavior of their specific business case.
While standard RTL provides a wide variety of findBy... lookup methods, our applications at Homebound lean into data-testids as our primary lookup method (see the "Why data-testids?" section for rationale), and so our render result can immediately look up elements by their data-testid:
const r = await render(<Component />);
expect(r.firstName).toHaveTextContent("first");
This is shorthand for r.getByTestId("firstName").
Granted, this is pretty minor, but across thousands of tests, the succinctness of r.firstName vs. r.getByTestId("firstName") adds up.
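One way a shorthand like this can be built is with a JavaScript Proxy that turns unknown property reads into data-testid lookups. The sketch below is an assumption about the general technique, not rtl-utils's source: `FakeElement`, `makeQueries`, and `withTestIdShorthand` are hypothetical names, and a tiny in-memory query object stands in for a real RTL render result so the example runs without a DOM.

```typescript
// Hypothetical stand-ins: a tiny "element" and the one query the sketch relies on.
type FakeElement = { testId: string; text: string };
type Queries = { getByTestId(testId: string): FakeElement };

function makeQueries(elements: FakeElement[]): Queries {
  return {
    getByTestId(testId: string): FakeElement {
      const found = elements.find((e) => e.testId === testId);
      if (!found) throw new Error(`Unable to find data-testid="${testId}"`);
      return found;
    },
  };
}

/** Wraps `queries` so that unknown string properties become data-testid lookups. */
function withTestIdShorthand(queries: Queries): Queries & Record<string, FakeElement> {
  return new Proxy(queries, {
    get(target, prop, receiver) {
      // Real members, symbols, and `then` (which `await` probes) pass through untouched.
      if (typeof prop === "symbol" || prop === "then" || prop in target) {
        return Reflect.get(target, prop, receiver);
      }
      return target.getByTestId(prop);
    },
  }) as Queries & Record<string, FakeElement>;
}

const r = withTestIdShorthand(makeQueries([{ testId: "firstName", text: "first" }]));
console.log(r.firstName.text); // prints "first"
```

The `then` guard matters for any result that is returned from an async function: resolving a promise with an object reads its `then` property, which would otherwise be treated as a lookup for a data-testid named "then".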
Standard RTL provides a sensible recommendation to "use what your user can 'see' with their eyes/screen reader" to find elements, i.e. getByText or getByRole (see this post), e.g.:
screen.getByRole("button", { name: /hello world/i });
This is certainly better than long/esoteric DOM/CSS selectors that are coupled to DOM structure or CSS class names, which make tests brittle as your application changes. (This is very similar to mocks vs. stubs: what mocks actually test is not high-level business cases, but the current structure of your application's codebase.)
That said, we still prefer data-testids because:
- data-testids always "just work".
  While kosher lookups like getByRole work 80-90% of the time, there will inevitably be some aspect of the DOM/UX you want to assert against where they don't work, at which point you have to reach for something else, which is hopefully not brittle DOM/CSS-based selectors.
  When this happens, data-testids are probably your best bet, but now you need the infra for data-testids anyway, and our tests end up with a mix of getBy... lookups depending on the use case.
  By using data-testids across the board, the programmer writing the test "just knows" how to find elements and doesn't have to make a judgment call on a case-by-case basis (nor does the code reviewer have to reason about/second-guess the chosen lookup method for every assertion).
  And, of course, the tests are extremely consistent.
- data-testids are very succinct, e.g.:
    // Standard RTL
    click(screen.getByRole("button", { name: /hello world/i }));
    // rtl-utils
    click(r.helloWorld);
  This is a small difference in isolation, but when repeated across 100s/1000s of tests in a large codebase, the brevity adds up.
- Because our Beam component library usually auto-assigns data-testids to match the component's label anyway, we're effectively still achieving the "resiliency to change" and "use what the user sees" goals of the core RTL assertions, just more succinctly.
When creating data-testids, we have three recommendations:
- Use camel-casing, i.e. fooBar instead of foo-bar or foo_bar.
  Camel case names "look like a method", i.e. are valid JavaScript identifiers, and so work best with our r.firstName shorthand.
- Don't worry about data-testids being unique, i.e. if producing elements in a loop/table, prefer outputting multiple fooBars instead of ensuring unique data-testids by including a row id/index identifier like fooBar_${row.id} or fooBar_${i}.
  Our rationale is:
  - Producing unique data-testids can become tedious boilerplate, especially if you're using a custom component like <TextField /> within a loop and also want it to internally, recursively generate data-testids that are themselves unique to "this row".
  - A stable data-testid means data-testids are potentially suitable for analytics use cases like Heap or Fullstory or DataDog, to answer questions like "how many times did users click this row's call-to-action, data-testid=someAction", without the analytics tool (or the PM configuring the tool) having to parse/substring out the row id/row index that would be there if the data-testid was 100% unique.
  Note that our shorthand r.firstName also supports index-based offsets, e.g. r.fooBar_0 will get the 1st data-testid=fooBar element, and r.fooBar_1 will get the 2nd data-testid=fooBar, and this "get me the 1st row's button" / "get me the 2nd row's button" is typically more intuitive for tests anyway.
- Set up your component library to create meaningful data-testids from field names/labels.
  For example, if a page does <TextField label="First Name" />, your TextField should camel-case the label and use it as a default for data-testid=firstName.
  This will make your data-testids intuitive and consistent, while also avoiding the boilerplate of having to set both a label and data-testid that are trivial mappings of each other, i.e. <TextField label="First Name" data-testid="firstName" />. For one form field, it's not too bad; for 100s in a large webapp, it gets tedious.
  Granted, TextField should still accept custom/explicit data-testids if the page needs to explicitly set one.
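Two of the recommendations above are mechanical enough to sketch in code: deriving a default data-testid from a label, and resolving an indexed shorthand like fooBar_1 against several elements that share one data-testid. The helper names below (`defaultTestId`, `parseShorthand`, `lookup`) are hypothetical and are not taken from rtl-utils or Beam; treat this as one plausible way to implement the behavior described, using plain objects in place of DOM elements.

```typescript
/** Derives a default data-testid from a label, e.g. "First Name" -> "firstName". */
function defaultTestId(label: string): string {
  const words = label.split(/[^A-Za-z0-9]+/).filter((w) => w.length > 0);
  return words
    .map((w, i) => {
      const head = i === 0 ? w.charAt(0).toLowerCase() : w.charAt(0).toUpperCase();
      return head + w.slice(1).toLowerCase();
    })
    .join("");
}

/** Splits a shorthand like "fooBar_1" into its data-testid and zero-based index. */
function parseShorthand(prop: string): { testId: string; index?: number } {
  const match = /^(.+)_(\d+)$/.exec(prop);
  return match ? { testId: match[1], index: Number(match[2]) } : { testId: prop };
}

/** Resolves a shorthand against elements that may share a non-unique data-testid. */
function lookup<E extends { testId: string }>(elements: E[], prop: string): E {
  const { testId, index } = parseShorthand(prop);
  const matches = elements.filter((e) => e.testId === testId);
  // Without an index, mirror getByTestId, which throws on multiple matches.
  if (index === undefined && matches.length > 1) {
    throw new Error(`Multiple elements for ${testId}; use ${testId}_0, ${testId}_1, ...`);
  }
  const found = matches[index ?? 0];
  if (!found) throw new Error(`No element for ${prop}`);
  return found;
}

const rows = [
  { testId: "fooBar", text: "row 1" },
  { testId: "fooBar", text: "row 2" },
];
console.log(defaultTestId("First Name")); // prints "firstName"
console.log(lookup(rows, "fooBar_1").text); // prints "row 2"
```

Because the index is parsed from the trailing `_N`, this scheme is another reason to avoid underscores followed by digits in the data-testids themselves.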
The standard RTL render provides a wrapper opt for providing a single wrapper element, like an Apollo mocked provider.
However, you often need a number of wrappers, e.g. an Apollo mocked provider, a component library/theme wrapper, an in-memory router for react-router, etc.
The rtl-utils render accepts a Wrapper[] where the Wrapper.wrap method wraps the component-under-test with its corresponding context provider.
For example:
const r = await render(<Component />, withRouter("/contact"), withApollo(mock1, mock2), withComponentLibrary());
Will result in a component tree that looks like:
<ComponentLibraryProvider>
  <ApolloProvider>
    <MemoryRouter>
      <Component />
    </MemoryRouter>
  </ApolloProvider>
</ComponentLibraryProvider>
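The nesting order above (first wrapper innermost, last wrapper outermost) is what you get from folding the wrappers over the component from left to right. The sketch below illustrates that fold under simplifying assumptions: strings stand in for React elements so it runs without React, and `tag` and `applyWrappers` are hypothetical helpers; only the `wrap` method name comes from the description above.

```typescript
// Hypothetical Wrapper shape; strings stand in for React elements.
type Wrapper = { wrap(child: string): string };

/** Builds a wrapper that surrounds its child with the given tag. */
const tag = (name: string): Wrapper => ({
  wrap: (child) => `<${name}>${child}</${name}>`,
});

/** Applies wrappers left to right, so the first wrapper ends up innermost. */
function applyWrappers(component: string, wrappers: Wrapper[]): string {
  return wrappers.reduce((wrapped, wrapper) => wrapper.wrap(wrapped), component);
}

const tree = applyWrappers("<Component />", [
  tag("MemoryRouter"),
  tag("ApolloProvider"),
  tag("ComponentLibraryProvider"),
]);
console.log(tree);
```

Running this prints the same nesting as the tree above, collapsed onto one line, with ComponentLibraryProvider on the outside and MemoryRouter directly around the component.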
Note that instead of calling with... wrappers directly from your individual tests, our recommended pattern is to make an application-specific src/utils/rtl.tsx file that uses an opts hash for your tests to declaratively request the wrappers they need:
import { render as rtlRender } from "@homebound/rtl-utils";

// Your application-specific providers
type RenderOpts = {
  modal?: true;
  mocks?: MockedResponses[];
  at?: string;
};

export function render(component: ReactElement, opts: RenderOpts = {}) {
  const { modal, mocks, at } = opts;
  return rtlRender(
    component,
    ...[
      // E.g. if testing your modals needs a wrapper to work
      ...(modal ? [{ wrap: (c: any) => <OpenModal>{c}</OpenModal> }] : []),
      // Always wrap component in our theme
      { wrap: (c) => <ComponentLibrary>{c}</ComponentLibrary> },
      // Use withApollo from rtl-apollo-utils
      withApollo(mocks || []),
      withRouter(at || "/"),
    ],
  );
}
There are three benefits of this approach:
- Ordering of providers is often important, e.g. Apollo should be "outside" the ComponentLibrary (or what not; this ordering will depend on your application's specific providers), and having your own render provides the correct order for all tests to use.
- It keeps your tests' imports clean b/c they're importing only your render method and not having to import the various withApollo wrapper functions.
- Along with succinctness and ergonomics, an application-specific render method also provides a "mini-gateway pattern" that isolates your tests to a single abstraction that you control, so you can more easily migrate your entire test suite to new/different providers.
The standard RTL fireEvent.click-style methods are very low-level: typing into a text field requires a somewhat unwieldy fireEvent.input(element, { target: { value } }), and they also fire only their single event (whereas a user typing would also focus and blur the field).
So rtl-utils provides a few higher-level action methods that are more ergonomic:
- type / typeAndWait (focuses, inputs, then blurs)
- click / clickAndWait
- select / selectAndWait
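As an illustration of the "focuses, inputs, then blurs" behavior, here is a minimal sketch of what a type-style action and its ...AndWait variant could look like. This is an assumption about the shape of such helpers, not rtl-utils's code: `typeInto` and `typeIntoAndWait` are hypothetical names, the `Fire` type only mimics the focus/input/blur subset of RTL's fireEvent, and a recording fake replaces the DOM so the example is runnable on its own.

```typescript
// Hypothetical subset of an event API, shaped like fireEvent.focus/input/blur.
type Field = { value: string };
type Fire = {
  focus(el: Field): void;
  input(el: Field, init: { target: { value: string } }): void;
  blur(el: Field): void;
};

/** Types into a field the way a user would: focus, input, then blur. */
function typeInto(fire: Fire, el: Field, value: string): void {
  fire.focus(el);
  fire.input(el, { target: { value } });
  fire.blur(el);
}

/** Same as `typeInto`, then waits for whatever async work the action kicked off. */
async function typeIntoAndWait(
  fire: Fire,
  el: Field,
  value: string,
  wait: () => Promise<void>,
): Promise<void> {
  typeInto(fire, el, value);
  await wait();
}

// A recording fake, so the sketch runs without a DOM.
const events: string[] = [];
const fakeFire: Fire = {
  focus: () => {
    events.push("focus");
  },
  input: (el, init) => {
    el.value = init.target.value;
    events.push("input");
  },
  blur: () => {
    events.push("blur");
  },
};

const field: Field = { value: "" };
typeInto(fakeFire, field, "Ada");
console.log(events.join(","), field.value); // prints "focus,input,blur Ada"
```

Passing `wait` in as a parameter keeps the action itself synchronous and leaves the choice of "settled" entirely to the waiting infrastructure.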
Note that testing-library's user-event serves the same purpose, and admittedly:
- we did not know about it when we started rtl-utils's actions :shrug:, but also
- user-event's actions are quite a bit more sophisticated.
  Arguably this can be a good thing, but our actions are "just smart enough" for our needs.