Mobile app testing matters and the snowball effect (XCTest and Xcode)

If you are building a 99 cent iPhone app or a free app for a client, you might consider setting up automated tests for maximum code coverage to be unnecessary overhead. Your app only has four screens, right? You can just monkey test through the screens and you're done. You can even upload your app to AWS Device Farm, select the built-in Fuzz test suite, and it will do this for you automatically on different device types. Here is my take on why this is really a false economy, leaving aside the wider software engineering and TDD arguments.


The Snowball Effect



[Diagram: the snowball effect, the testing effort compounding with each release]

Firstly, what exactly is the snowball effect? It is a process that starts out in a clear, low-complexity state and then builds upon itself, becoming larger and more complex like a snowball rolling down a hill. Taking the diagram above, the initial state is version 1.0 of your app. You have spent months working on it, and all the code and dependencies are clear in your mind. Then comes the first bug fix or new feature release, version 1.1. The testing effort now, if you perform a full regression test, is all the testing of version 1.0 plus the new features of version 1.1 and the changes to the dependencies within your code. Each release compounds the amount of effort required to fully regression test, until the entire app is simply too large to test adequately by hand.

As we build software we construct modules of code, implementing the logic piece by piece to solve our problem. Over the course of many months we become so intimate with our codebase that by the time version 1.0 is almost ready we assume we know what to test. Unfortunately, we have not only programmed the computer, we have also programmed our own thinking about how the app should work. So if we only perform manual tests, we are either testing against our functional spec or use cases, which are an abstract representation of our code (assuming they are still up to date), or we are testing against our own programmed assumptions about how the app will be used and how it should behave. The problem is that once the app is released into the wild, someone without our pre-programmed thinking starts using it, and bang, something we would never have thought of causes a crash. It is hard to blame that user for writing a one-star review, and the review both affects your app's reputation in the App Store and lowers your average rating, which the App Store uses to decide how to rank and promote your app in comparison to your competitors' apps.

Of course we can fix the code with a patch, retest those parts, and release version 1.1. Shortly afterwards we may want to add some new features and start implementing version 1.2. This new code will be fresh in our minds, but the old code from version 1.0 becomes fuzzier over time. Our manual tests start to get less accurate as we focus on the new code changes and assume that old code not touched in this release will continue to function as before. However, it does not take long before the volume of code interdependencies exceeds our ability to keep track of them, and then, sure enough, something that has worked just fine since version 1.0 suddenly has a problem in version 1.2.

So the obvious answer is a full regression test of our code with each release. This is the crunch point for manually testing our apps. Setting up unit tests as we developed version 1.0 may have taken more time than just manually testing the functionality of a simple app. But once we are up to version 1.2 or 1.3, manually retesting the same code over and over again, plus the new code with each release, suddenly becomes a major time sink, and since humans are not great at repetitive work, the quality of the manual testing also tends to drop. Each new release adds to the snowball effect on our testing effort.

The solution is to automate your unit testing from the start. It will take a little longer to write the code and unit tests for version 1.0, but for each new version we only need to add new test cases and go back and modify or retire existing ones where the old code base has changed or new interdependencies have been created. A framework such as XCTest for Xcode (iOS and macOS apps) lets us automate the test runs, reporting which tests passed and which failed.
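
As a minimal sketch of what such a unit test might look like: the `PriceFormatter` type and its behaviour below are hypothetical, defined inline purely for illustration (in a real project the type would live in the app target and be pulled in with `@testable import`).

```swift
import XCTest

final class PriceFormatterTests: XCTestCase {

    // Hypothetical type under test, defined inline so the sketch is self-contained.
    struct PriceFormatter {
        func display(cents: Int) -> String {
            String(format: "$%.2f", Double(cents) / 100.0)
        }
    }

    func testDisplaysCentsAsDollars() {
        let formatter = PriceFormatter()
        XCTAssertEqual(formatter.display(cents: 99), "$0.99")
    }

    func testDisplaysWholeDollarAmounts() {
        let formatter = PriceFormatter()
        XCTAssertEqual(formatter.display(cents: 1200), "$12.00")
    }
}
```

When a later version touches the pricing logic, only these cases need revisiting; the rest of the suite simply re-runs with every build.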

Once set up, we can automate further with a continuous integration environment running on Xcode Server, or even upload to AWS Device Farm and run our app and XCTest test cases against real hardware, such as different types of iPhone, making use of device features like the camera.

In this case, the amount of work to implement the test cases will mostly be in line with the amount of work on new or changed features. Minor releases will only require minor additions to the test cases, and major releases will require a major effort to add new test cases to the app's test suite. XCTest will then take all of the existing and new test cases and complete the test run for us. If we rely solely on manual testing, we are essentially re-inventing the wheel for each release by walking through the entire test case list by hand. Whether the release is minor or major, a regression test means manually testing everything each time, so instead of the testing effort growing linearly with new development work, every release carries the new development work plus the full testing effort, and that full effort keeps growing as the code base grows. That is the snowball effect.

So what can you automatically test with the XCTest framework for Xcode?


  • Functional tests. These are test cases in which we test the logic of our code.
  • UI tests. These are test cases in which we exercise and test the UI components of the app, such as buttons and labels (sketched below).
  • Performance tests. We can run a performance test to establish a baseline for the app in version 1.0, then repeat the test in each new release and ensure the performance is the same, if not better. When the performance improves in a release, that becomes the new baseline for future versions to be measured against (sketched below).
  • Code coverage. Using the built-in code coverage tool, we can see which code is exercised by our test cases and which code is not covered due to missing test cases.
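
To make the UI and performance items more concrete, here are two minimal sketches. The first drives the app with XCUITest; the "loginButton" accessibility identifier and the "Welcome" label are hypothetical, standing in for whatever elements your own screens expose. The second records a performance measurement with XCTest's measure block; `sortLargeDataSet()` is likewise a made-up stand-in for the real work you want to benchmark.

```swift
import XCTest

final class LoginScreenUITests: XCTestCase {

    func testTappingLoginShowsWelcomeLabel() {
        let app = XCUIApplication()
        app.launch()

        // "loginButton" is a hypothetical accessibility identifier.
        app.buttons["loginButton"].tap()

        // Verify the UI responded as expected.
        XCTAssertTrue(app.staticTexts["Welcome"].waitForExistence(timeout: 5))
    }
}
```

```swift
import XCTest

final class SortingPerformanceTests: XCTestCase {

    func testSortingPerformance() {
        // measure {} runs the block several times and reports the average;
        // once a baseline has been recorded in Xcode, a significant
        // regression against it fails the test.
        measure {
            sortLargeDataSet()   // hypothetical workload under measurement
        }
    }

    // Hypothetical stand-in for the real work being benchmarked.
    private func sortLargeDataSet() {
        _ = (0..<100_000).map { _ in Int.random(in: 0...1_000_000) }.sorted()
    }
}
```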

In conclusion, with the App Store giving every user the power of Caesar to give our apps a thumbs up or thumbs down in a review for the whole world to see, good automated testing is not just sound programming practice; it is directly related to the success of our apps in the App Store and ultimately to the economic return on the investment of our time.