Should a project have 100% test coverage?

Recorded March 26th 2020 – watch the full episode on YouTube.

Many people believe that projects should aim to have 100% test coverage, but is that really worthwhile? What would you say to a client who asks for it?

Ellen Shapiro: I think one thing that's really helpful in Xcode is that you can show code coverage, and you can show that in a file that's not 100% covered, the pieces that aren't covered are not necessarily pieces that need to be covered. These are emergency guard bailouts. These are things where you can sort of see, “hey, if something's gotten to this phase, things are really messed up.” And so you would need to just bail out of it, but you don't necessarily need to test that that has happened. Because writing that test takes time away from doing more tests on the mission-critical part of the application.

Another thing that I tell people is, 100% test coverage doesn't mean it's never going to not work. It doesn't mean it's never going to not crash. You can have tested every single line of code in your application, but if somebody put some weird input in there that you weren't really prepared for, there's a significant possibility that it might not work. The other thing is if you test everything once rather than testing some of those mission critical things repeatedly, you have this sort of false sense of assurance that, "Oh yeah, I've tested everything." Whereas if you really make sure that you've thoroughly tested a lot of the possibilities of how things could go wrong on your mission critical code, that may not necessarily add up to 100% test coverage, but you are way less likely to have problems.
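
To illustrate testing one important piece of code against many inputs, here is a minimal Swift sketch. It is not code from the episode: `parseQuantity` and its 1 through 100 limit are hypothetical, chosen only to show a handful of ways input can go wrong.

```swift
// Hypothetical validator: accepts whole-number quantities from 1 through 100.
func parseQuantity(_ text: String) -> Int? {
    guard let value = Int(text), (1...100).contains(value) else {
        return nil
    }
    return value
}

// One case per way things could go wrong, not just the happy path.
let cases: [(input: String, expected: Int?)] = [
    ("1", 1),
    ("100", 100),
    ("0", nil),
    ("-1", nil),
    ("999", nil),
    ("1.5", nil),
    ("", nil),
    ("lizard", nil),
]

for testCase in cases {
    let result = parseQuantity(testCase.input)
    let status = result == testCase.expected ? "ok" : "FAILED"
    print(testCase.input.debugDescription, status)
}
```

Every case here runs the same few lines, so the coverage percentage does not move after the first one, yet each case checks a different way the function could be wrong.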

Paul Hudson: It's like that old QA joke, a QA engineer walks into a bar, orders one beer, orders 999 beers, orders minus one beer, orders a lizard. Throwing garbage in, but really hard fuzz testing. The other thing I think is that ultimately it's a number and it's not even a good number, because all it means is while this test was run, these lines of code were run. They could have been completely unrelated to what you're trying to test or not even evaluated even slightly. They were just run as part of the test that was being run. So that doesn't mean anything – it doesn't prove your code works in any meaningful way whatsoever.
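
A small sketch of why "these lines were run" proves so little. The function below is hypothetical and contains a deliberate bug; plain functions stand in for XCTest test methods so the snippet can run on its own.

```swift
// Hypothetical function with a deliberate bug: it adds the discount
// instead of subtracting it.
func discountedPrice(cents: Int, percentOff: Int) -> Int {
    return cents + cents * percentOff / 100   // bug: should subtract
}

// Executes every line of discountedPrice but checks nothing.
// A coverage tool would still count the function as fully covered.
func coverageOnlyTest() {
    _ = discountedPrice(cents: 10_000, percentOff: 20)
}

// Checks the result, so it notices the bug.
func behaviourTest() -> Bool {
    return discountedPrice(cents: 10_000, percentOff: 20) == 8_000
}

coverageOnlyTest()          // "passes": nothing was verified
print(behaviourTest())      // prints false, exposing the bug
```

Both tests produce identical coverage for `discountedPrice`; only the second says anything about whether the code works.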

Ellen Shapiro: Yeah. One thing to remember also with Swift is that the type system handles a bunch of the things that would require more test coverage in other languages. So in Python you might have to test, “is this thing that I'm getting back actually a string?” And in Swift, if it's not a string, it doesn't compile. So it's definitely something where having that context is helpful.

So I think those are some strategies I would use around that. Honestly, the people that I have to convince are usually not the clients but the managers, because managers love the statistic of 100% test coverage. They think it looks amazing. And there are plenty of places where they sort of pick an arbitrary number for test coverage and they try to hit it. And I think that having a number is a lot less important than actually taking the time to go through your code every once in a while and look at what isn't tested. Because some of it's stuff where maybe you might actually be able to delete it. Some of it's stuff where, okay, yeah, it makes sense to not test. Some of it's like, “oh, I should probably test that.”

But it's something where if you're able to do that regularly, that's going to be much more effective than constantly trying to hit an arbitrary goal of a certain percentage of test coverage. In practice, I've generally found somewhere between 75% and 85% is the best in terms of balancing what do I need to test realistically, versus what do I need to sort of move on with my life? But that's a very, very wide range.

And I think it's something where that's also inclusive of an application that has a UI. There are certain things with UI that are going to be harder to test, especially stuff like animations and drag and drop, and stuff like that. Some of that is a real pain to test. And I think it's much easier to have even higher test coverage on an SDK that has no UI involvement.

But it's also something where there's not a number that I think is something that people should shoot for all the time. If people are looking for a number, ask them why they're looking for that number, what is the assurance that they're trying to get from saying that they have 100% test coverage? Is it that it's never going to crash? 100% test coverage doesn't guarantee that. Is it that it's always going to work? 100% test coverage does not guarantee that. And so that's something where finding out the underlying reason why people want something is usually much more helpful.

Paul Hudson: How do you make sure your test coverage percentage doesn’t decline over time?

Ellen Shapiro: I think it is something where I'm looking at, okay, we already have this much coverage. Saying we don't want it to drop below a certain percentage is, I think, reasonable. I think it's also a way of sort of saying, “Hey, with these changes that you're making, you've actually added 5,000 lines of code and you've added zero tests. And these 5,000 lines of code are not actually exercised by anything. Maybe you would like to add some tests to these changes.” I think it's something where you do have to have some wiggle room. You can't just say something must have at least 75% test coverage, but it is something where having the tools to identify when somebody is adding code to your repo that isn't tested is actually really helpful.
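
One way such a check with wiggle room might look, as a rough sketch rather than anything described in the episode. The names `CoverageVerdict` and `reviewCoverage` and the default tolerance of two percentage points are assumptions made for illustration; a real setup would read the numbers from whatever coverage report your CI produces.

```swift
// Hypothetical result type: either nothing to discuss, or a note for the reviewer.
enum CoverageVerdict: Equatable {
    case fine
    case worthDiscussing(String)
}

// Compares coverage before and after a change, in percentage points.
// `allowedDrop` is the wiggle room; the default of 2 is an arbitrary example.
func reviewCoverage(before: Double, after: Double, allowedDrop: Double = 2.0) -> CoverageVerdict {
    let drop = before - after
    if drop > allowedDrop {
        return .worthDiscussing("Coverage fell by \(drop) points; do the new lines need tests?")
    }
    return .fine
}

print(reviewCoverage(before: 80.0, after: 79.0))   // small dip, within tolerance
print(reviewCoverage(before: 80.0, after: 74.0))   // large dip, flagged for a conversation
```

The sketch flags a change for discussion instead of failing it outright, which keeps the number as a prompt to look at untested code and not a hard target.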

Paul Hudson: So let's talk about some of the functions that we have apart from just XCTest, because to back up, the way my brain is thinking is: does my code work the way I intended? I think of things like assertions or preconditions as well as fatalError() – you mentioned that already. Are you also using assertions and preconditions?

Ellen Shapiro: Personally, I tend to prefer to do a guard and then do a fatalError() within the guard rather than using an assertion or precondition, just because personally I find that leads to me figuring out what the problem is a lot easier. I think that's somewhat more of a personal preference than anything else. I think it is something where there are things where you might add an assertion failure in, so that if something happens during development, you find out about it immediately. But if it happens in production, it's not the end of the world to sort of bail out and be done with it. That's sort of the use case that I see for assertions and precondition failures.

But I think honestly at this point, most stuff, if it's doing something that would cause an assertion failure, something has gone pretty wrong. And in many cases it's probably better to have that be a fatal error. But it's also something where the question is, what are the consequences of this not working? Are the consequences of this not working something where this really should work every single time? And if it doesn't, then something is messed up. That should be a fatal error. If it's something where there is a reasonable possibility that this could not work in production, then that's where you use an assertion failure.
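
The two choices described above can be sketched side by side. This is an illustration under assumptions, not code from the episode: the `Account` type and both functions are hypothetical. The behaviour of the standard library calls is as documented: `fatalError()` stops the program in every build configuration, while `assertionFailure()` only stops it in debug (unoptimized) builds and is skipped in optimized ones.

```swift
// Hypothetical model type for illustration.
struct Account {
    var balanceInCents: Int
}

// Should work every single time: a missing ID means something is badly wrong,
// so stop in every build configuration with a descriptive message.
func account(withID id: String, in accounts: [String: Account]) -> Account {
    guard let account = accounts[id] else {
        fatalError("No account found for ID \(id)")
    }
    return account
}

// Could plausibly go wrong in production: stop in debug builds so it is
// noticed during development, but bail out quietly in optimized builds.
func formattedBalance(_ account: Account) -> String {
    guard account.balanceInCents >= 0 else {
        assertionFailure("Unexpected negative balance: \(account.balanceInCents)")
        return "unavailable"
    }
    let dollars = account.balanceInCents / 100
    let cents = account.balanceInCents % 100
    return "$\(dollars)." + (cents < 10 ? "0\(cents)" : "\(cents)")
}

let accounts = ["a1": Account(balanceInCents: 1250)]
print(formattedBalance(account(withID: "a1", in: accounts)))   // prints $12.50
```

Putting the `fatalError()` inside a `guard` keeps the failure message next to the exact condition that failed, which is one reason the pattern can make problems quicker to diagnose.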

This transcript was recorded as part of Swiftly Speaking. You can watch the full original episode on YouTube, or subscribe to the audio version on Apple Podcasts.