Remember that our goal in agile BI development is the frequent release of production-quality working software for user feedback and acceptance. At the end of each iteration or sprint, and at each release, our working product is expected to be of shippable quality, even in its most embryonic stages. This objective requires an entirely different approach to quality assurance. Foremost, it means integrating QA efforts right into our iterations.
Traditional BI development methods push system and acceptance testing to the end of the project cycle. This backend testing is typically manually intensive, possibly supplemented by the use of semi-automated tools. We need an entirely different testing discipline for agile BI development.
First and foremost, testing is integrated into the development process. Each development iteration must include plans for QA activities. One of the great things about this is that bugs don’t get a chance to accumulate over the project lifecycle. Quality feedback is immediate and frequent, and bugs are handled as they arise. QA specialists become integral members of the development team rather than gatekeepers at the end of the development cycle. Developers become integral to the QA process and learn sound testing practices as an extension of their technical skills. When I first introduce this notion to new agile teams, I often get pushback from developers who say things like, “I don’t have time to test,” or, “Testing is not my job.” I generally quell the urge to say something like, “If building a high-quality BI system is not your job, then what exactly is your job?” Once developers establish the rhythm of making testing an integral part of development, they usually love it. I’ve had a number of BI developers wonder why they didn’t learn to integrate testing and development long ago.
Essential to integrated testing is test automation. Manual testing is just not practical in a highly iterative and adaptive development environment. There are two key problems with manual testing. First, it takes too damn long and is an inhibitor to the delivery of frequent working software. Teams that rely on manual testing ultimately end up deferring testing until dedicated testing periods, which allows bugs to accumulate. Second, it is not sufficiently repeatable for regression testing. While we seek to embrace and adapt to change, we must always be confident that features that were “Done, done!” in a previous iteration retain their high quality in light of the changing system around them. Test automation requires some initial effort and ongoing diligence, but once technical teams get the hang of it, they can’t live without it.
Teams that do not practice integrated, automated testing are not really agile. It just isn’t feasible to create production quality, working features for user acceptance every 1-3 weeks without integrated and automated testing.
And by the way, quality assurance must be more comprehensive than system-level testing. I’m always surprised by BI teams that treat final system testing as the only testing required in BI development. Agile BI developers test every unit of code, every integration point, every data structure, every user feature, and ultimately the entire working system, no matter how embryonic. Unit testing involves testing the lowest-level components that make up the BI system, such as SQL scripts, ETL modules, stored procedures, etc. Integration testing involves testing all of the data transition points and wherever commercial tools are receiving or returning data. As data is pumped from source systems into staging databases, or from staging into multidimensional databases or OLAP engines, each data structure along the dataflow path must be tested to ensure that data integrity is preserved or enhanced. Simple mistakes like copying a VARCHAR(50) value into a VARCHAR(30) field in the staging database can wreak havoc on data integrity. Finally, each newly developed feature must be tested for acceptance and accuracy. Does it do what the user wants, needs, and expects, and does it do it correctly? While this is the ultimate acid test, we need confidence that our system is behaving well throughout the process flow.
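The VARCHAR(50)-into-VARCHAR(30) pitfall above is exactly the kind of thing an automated integration check can catch before the load runs. Here is a hedged sketch, using hypothetical table and column names; SQLite does not enforce VARCHAR widths, so the test inspects the data itself rather than relying on the engine to complain.

```python
import sqlite3

STAGING_WIDTH = 30  # declared width of the staging column

def find_truncation_risks(conn, table, column, width):
    """Return source values that would be silently cut off in staging."""
    rows = conn.execute(f"SELECT {column} FROM {table}").fetchall()
    return [v for (v,) in rows if v is not None and len(v) > width]

# Demonstration: a source column wider than its staging counterpart.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_customer (name VARCHAR(50))")
conn.executemany(
    "INSERT INTO src_customer VALUES (?)",
    [("Acme Ltd",), ("A very long customer name that exceeds thirty chars",)],
)
risks = find_truncation_risks(conn, "src_customer", "name", STAGING_WIDTH)
assert risks, "expected at least one over-width value in this demo"
print(f"{len(risks)} value(s) would be truncated")
```

A check like this, run at each transition point along the dataflow path, turns a silent data-integrity failure into an immediate, visible test failure.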
The benefits of integrated, automated testing are boosted even further by test-driven BI development. TDD is as much an implementation practice as it is a testing practice. In this approach test cases are written first, and then the code (or script, or configuration, etc.) is written to pass those test cases. When the system passes all of the test cases, and the BI practitioners can’t think of any new test cases, then the implementation work is “done.” That is, it works as the developers think it should and is of production quality. It is now ready for user acceptance to consider it “done, done.” While test-driven development may not be as mandatory as test automation and test integration, this is a technical practice that yields tremendous benefits since testing and development are inextricably linked. The test suite grows alongside the system, and since testing is automated, the suite can be rerun frequently to maintain a high level of confidence in BI product quality.
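The test-first rhythm looks something like the following sketch, assuming a hypothetical clean_phone() transformation used in an ETL step. The test cases are written before the function body exists; the implementation is then written, and refactored, until they pass.

```python
# Step 1: the tests, written before the implementation exists.
def test_clean_phone():
    assert clean_phone("(555) 123-4567") == "5551234567"
    assert clean_phone("555.123.4567") == "5551234567"
    assert clean_phone(None) is None  # pass nulls through untouched

# Step 2: the implementation, written to satisfy the tests above.
def clean_phone(raw):
    """Strip punctuation so phone numbers compare consistently downstream."""
    if raw is None:
        return None
    return "".join(ch for ch in raw if ch.isdigit())

test_clean_phone()
print("all TDD cases pass")
```

When no one can think of another test case that would fail, the unit is “done”; the cases remain in the suite as permanent regression protection.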
TDD is applied most prominently at the lowest component level, or unit level, which ensures that high quality exists in the building blocks that make up the system. Story test-driven development will help ensure that the user acceptance criteria are clearly defined for each user story before development begins.
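A story test can capture those acceptance criteria as executable code before development begins. The sketch below assumes a hypothetical user story, “As an analyst, I want daily sales totals per region,” with the criteria expressed in a given/when/then shape; the aggregation function is the piece that would then be built to make the test pass.

```python
def daily_sales_by_region(rows):
    """Aggregate (region, amount) facts into per-region totals."""
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0.0) + amount
    return totals

def test_story_daily_sales():
    # Given: a day's worth of sales facts
    rows = [("East", 100.0), ("West", 50.0), ("East", 25.0)]
    # When: the analyst requests totals per region
    totals = daily_sales_by_region(rows)
    # Then: each region shows the correct sum
    assert totals == {"East": 125.0, "West": 50.0}

test_story_daily_sales()
print("story acceptance criteria met")
```

Writing the story test first forces the team and the user to agree on what “done, done” means before any code is written.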
Integrated automated testing in database development presents a unique set of challenges. Current automated testing tools designed for software development are not easily adaptable to database development, and large data volumes can make automated testing a daunting task. Data warehouse architectures further complicate these challenges since they involve multiple databases (staging, presentation, and sometimes even pre-staging); special code for data extraction, transformation, and loading (ETL); data cleansing code; and reporting engines and applications.