Friday, August 21, 2009

Integration Testing

My colleagues and I have been learning a lot about testing this week. One of my friends' thoughts are here. He asks the question of whether it is a unit test if the method in question calls other methods. I agree with him and would argue that no, it is not, since the culprit is unclear if the method fails (until you go into debugger). Arguably, this could be worst for code that is properly refactored. This is something I don't think code coverage tools will help you recognize.
StackOverflow has some good definitions of unit and integration testing here, and here. I like Michael Feathers' definition:
A test is not a unit test if:
  • It talks to the database
  • It communicates across the network
  • It touches the file system
  • It can’t run at the same time as any of your other unit tests
  • You have to do special things to your environment (such as editing config files) to run it
Josh Brown, a colleague of mine, also suggests adding:
  • the code under test uses an external framework or library (Hibernate, Spring, some internal library, etc.)
  • the code under test calls other methods (that you didn't mock)
I wholeheartedly agree with both of these.

I had a case that was interesting to me. I was working on some code that I've become the caretaker of to get it better tested (a good thing too, as a bug was discovered). The project takes big files and chunks them into smaller files. It handles any xml or delimited file. The way I was testing it was to take different types of files, chunk them then merge them and make sure the record counts stayed the same. This isn't really (even if I had more granular asserts) a unit test. It relies on an external resource (the sample files) and involves multiple methods. Someone else suggested unit tests using files loaded through classpath, which I disagreed with since they're not really unit tests. Those File objects should be mocked (or possibly stubbed if using Groovy).

I did attempt to make these an integration test with FailSafe. I decided against this when I figured out that I could not override the behavior to not run the integration test on deploys. In my opinion, these should not be run on deploy, integration-test should be later in the Maven lifecycle. The reason for this is that deploys are often done across different environments and the whole idea with integration tests is that they depend on external resources. An example of this is a project my friend is working on, which works with the dev database, where it would be fine to run the integration tests on deploy, but when it is deployed to prod, we definitely don't want to modify our production database as part of running tests. To continue to be able to access the QA environment or dev environment would mean adding special firewall rules. What's more, when I set the configuration to skip the tests, it ran the tests anyway (maybe this is why it's alpha?). According to their documentation, it shouldn't have even run the unit tests. I just don't feel quite comfortable deploying something so young and apparently unstable into production code. Maybe someday this will change. There are some other suggestions here. The page he links to from Codehaus states that there are rumors of a future version of Maven supporting integration tests with a src/it/java and its own integration-test phase. Its kinda surprising that with so many organizations using continuous integration and it not being something all that new.

In the end, I decided to do what others have done, which is to have the integration tests in a separate module that is only built if the argument is passed for it.
The parent pom should have
Also, do not list it in the modules section. This way, the integration module will only be built (and the associated surefire tests run) when the -Dit argument is present. I think for most projects, this makes sense for most projects, though I'm still a little torn on the issue. While Failsafe lets you still build even if the integration test fails, doesn't this defeat the purpose of continuous integration? This is especially the case if your integration tests depend on resources that may be going up and down all the time, it doesn't make much sense to run something every time if half the time you just ignore the results anyway. I also wonder how practical this is for organizations that have resources on multiple subnets, where a deploy from one environment to another can result in failed integration tests not because of any problem with the code, but because of a technical failure.

The next part (and for me, the harder part) will be mocking out (and maybe stubbing with Groovy's metaclass) the pieces needed so I can isolate the methods in the classes for unit tests, as there currently aren't any for this project. I'll post any interesting results I get from that. For other initiates, such as myself, I've found this article helpful: