Test Smells: A Different View

In the last years I have seen many tests written by various project teams. Some of these tests had produced really headache for me. When comparing the problems of the tests with the known anti-patterns of testing (e.g. http://xunitpatterns.com/TestSmells.html) I found some matches. But much more I was surprised, that some really annoying problems I found aren’t known as test smells and even as good practice. That’s why I would like to share a different view to test smells.

Some points I will discuss here are IMHO different for unit tests and integration tests. The usual guides focus more on unit tests while I will focus on integration tests here.

Let’s start with the first smell:

1.) The name of the test method tells what happening in the test

A big NO to this pattern from my side. Having a class with 50 methods and searching the three methods in the class which are failing is really annoying if the method names are like

  • IfIMadeAFligthBookingAndITryToDoASecondBookingStartingLessThan50MinutesAfterItWillFail
  • IfIMadeTwoFlightBookingsAndTryToCancleTheFirstOneItWillSucced
  • IfIMadeAFligthBookingAndIBookASecondOneStartingMoreOrEqualThan50MinutesAfterItWillSucceed

It is really difficult to read these method names while trying to understand what they are doing. At least for me, after reading to the end of the method name I have already forgotten the beginning. Additionally, you are not able to find the searched method in the class without using the search tool.

I suggest building the names of the methods in a class with some kind of name schema

  • testTwoFlightsToLowGapException
  • testCancleFirstFlight
  • testTwoFlightsSufficientGap

If these names are not clear enough there is still the possibility to write one or to lines of inline doc inside the test method.

But actually there is no reason why someone should be able to completely understand the method just from the method name. Usually you are looking at a test method because it is failing, then you have to read the whole method in any case. And If you are searching for a method to duplicate and modify it, well even then the long method names doesn’t really help you.

2.) Test should test exactly one thing

At http://xunitpatterns.com/Obscure%20Test.html#Eager Test the authors define, that a test should not verify too much functionality with a single test. Some developers have interpreted this as “every test should contain only one assert”. This may be right for unit test, but it is really a bad idea for integration test. I would interpret this as “a test should verify an atomic part of a use case”.

What happens, if you insist on only one assert per test?

In case of an integration test of a method doing the booking of a flight, you for example expect

  • Method  throws / doesn’t throw an exception
  • Booking is stored/is not stored into database
  • Transaction information is stored to the database
  • Some business information are written to log-file
  • State of other objects is correct

For different conditions executing the booking method you expect different behaviors of these points. If you just test one or even just a part of these points in one test, there will happen one of three things.

a.) Not everything is tested

You may easily forget to test some points. But it could get much worse: I have seen many tests being green, not because everything was as expected but rather because they tested not all expectations. For example a test that verifies there is no booking stored in the database because there is no free seat left succeeds because a new mandatory attribute of the customer is not filled and the booking is canceled many steps before.

b.) Repeat yourself

Mathematics teach us, that with 7 different conditions and 5 points to test you have to implement 35 tests. It is nut just boring to write 35 times the same test just modifying one or two lines.

But what if with a later change on the system it is necessary to add an additional line to each of these test? And not only the 17 failing tests and the 8 not failing tests (see also 2a.) of these 35, but at 2734 tests of your 15386 test of the whole system?

c.) Building complex code

To avoid the duplication of the test code like in 2b you can refactor your test code and extract the repeated code in some way. Usually the test code will be much more complex then.

Additionally, testing all five points in one test speeds up your integration tests five times because usually the setup and execution of the tests takes much more processing time than the checks.

3.) Tests should create an expected object and compare it

I have often seen test implementations creating an expected object and comparing it with equals or toString to the object returned by the method to test.

In my experience this is not a good idea. Sometimes in the lifecycle of the software there will be the need to modify the object and add or remove attributes which are unessential for your test. But comparing the objects makes it necessary to modify your test.

Of course, you could implement the equals or the toString method to just ignore these attributes, but because it is the equals or toString method of an object from the production code and not from the test code, you can do this only for one case and even that only if you don’t need the method in your production code.

A better and much more flexible solution is to write a method to verifying the returned object to the expectations. If you prefer an expected object you can even give the expected object to the method for the verification.

4.) Tests should contain everything important in the test method

If you write integration tests for complex systems you have often to create complex objects and object structures. It is true that in the test method the important things of the test should be readable. But that doesn’t mean, that everything should be done in the test method. It would be much better to create helper methods creating the needed objects and providing the important values as parameters.

This will reduce the duplication of code, which makes it easier to fix tests after a change on the system made tests fail.

Additionally you will get more stable tests: If you write everything you need for the test into the method you will create your objects just with these informations needed to succeed the test. But in fact the constructed object is not a realistic object. And if you extend your production code sometimes your test will fail because of the unrealistic object. By using helper methods you can reuse the code to create the object structures and create realistic test cases.

5.) Tests could copy code many times

In my opinion it is a bad practice to copy code within test methods. During the hopefully long lifetime of the system there will usually happen many extensions and changes over time.

Depending on the change it may last much longer fixing all failing tests then doing the change. And if you don’t copy code, you have to fix it only on one place instead of many.

6.) Test should be in the form given-when-then

The “given-when-then” form is intended to it make easier for the reader to read and understand the test.

But in fact some assertions after the setup can make a test much easier to understand e.g. if the method to test increases a value it is a good idea to check if the value is before the call two and after the call three.

Additionally, assertions before calling the method to test helps to make sure, that after the setup the conditions are as expected.

And last but not least: When do you read a test? In my experience in 90% of the cases that happens only when the test fails. Usually you need some minutes up to hours to fix the test then. Compared to the amount of time for fixing the test the 30 seconds, you need more time to understand the test because it is not in given-when-then form, is insignificant.

7.) Test should tell what failed

The most important thing for a test to do is telling you, if something is wrong with your system. Of course, it is nice if you can see from the test result exactly where the error happened like in http://xunitpatterns.com/Frequent%20Debugging.html described.

But if you write an integration test there are hundreds of possibilities what could go wrong. And it would make the test much bigger and complex, if you test everything possible of the whole execution to exactly locate the problem.

Usually if an integration test starts to fail you have to start debugging the code to find the problem and understand what’s going wrong. And the knowledge you gain while debugging the code is the knowledge you need to fix the code or test afterwards.

8.) Tests should document the complete behavior

According to http://xunitpatterns.com/Goals%20of%20Test%20Automation.html#Tests as Documentation tests should be a documentation of the behavior.

For me the first and most important documentation of the system is the production code. This is the only source of truth. If the production source code are ten lines of good readable and understandable code, why should I write two hundred lines of test code to define this behavior?

Tests are written to guarantee important behaviors. And perhaps they additionally serve to document some behaviors not obvious visible from the source code. But they’re not intended to define the whole behavior.

This is also the fact why I don’t like the TDD principle: For me TDD focuses too much on the tests. Everything not covered by the test is ignored. IMHO it is important that production code behaves as anticipated also outside the test cases and it is not necessary to enforce every expected behavior with a test (in fact it is even not possible!).

9.) Use assertThat and many matchers

As stated before, I usually look at a test if it has failed. In this situation one of the most annoying things for me is the failing at an assertThat method with a bunch of matchers.

Instead of immediately getting a hint which of the expectations fail, I have to rewrite the statement into multiple statements to see which of the matchers fail. And even then I see only at which expected value the rewritten test fails, not what the real value is.

Alternatively, I could inspect the object with the debugger and compare the inspected values with the expected values myself. But that’s also an error prone and painful approach if you test a complex object hierarchy.

And all this pain while searching the problem arises only because you use the “better readable” assertThat instead of assertEquals and having just one assert statement?

10.) Use static imports

I also don’t like the use of static imports. Especially in tests they are often used to “focus” on the important things.

But in bigger projects there are often different libraries used for testing. Some method names like “any”, “and”, “then”, “equals”, “one”, “single”, … and also class names like “Matcher” are used from these libraries for different methods. It is really confusing if you move from one test class to another and “any” means something completely different.

Bernhard Mähr @ OPITZ-CONSULTING published at https://thecattlecrew.wordpress.com/

Über bmaehr

Ich bin ein Technik-Fan, der in der Informatik seinen Beruf und seine Berufung gefunden hat.
Dieser Beitrag wurde unter Software Engineering abgelegt und mit , , verschlagwortet. Setze ein Lesezeichen auf den Permalink.

Kommentar verfassen

Trage deine Daten unten ein oder klicke ein Icon um dich einzuloggen:

WordPress.com-Logo

Du kommentierst mit Deinem WordPress.com-Konto. Abmelden / Ändern )

Twitter-Bild

Du kommentierst mit Deinem Twitter-Konto. Abmelden / Ändern )

Facebook-Foto

Du kommentierst mit Deinem Facebook-Konto. Abmelden / Ändern )

Google+ Foto

Du kommentierst mit Deinem Google+-Konto. Abmelden / Ändern )

Verbinde mit %s