In our previous blog, 5 keys to nailing a BI Implementation, we focused on achieving strategic success in implementing Business Intelligence applications. In this blog, we turn our attention to a tactical, but important aspect – Testing of Business Intelligence Applications.
Effective integration of testing in the implementation process builds trust and confidence among business users as they make crucial strategic decisions, based on the BI data generated.
Testing of Data Warehouse/Business Intelligence (DW/BI) applications is a little different than testing traditional transactional applications as it requires data-centric testing approach.
The typical challenges an enterprises faces while testing DW/BI implementations include:
- Data volume, variety and complexity
- Data anomalies from disparate data sources
- Data loss during data integration process and handshaking between sources
- Time consuming
- No audit trails, reusability or methodology resulting into high cost of quality
- Specialized skills required to execute data validation and verification process
To ensure data completeness, accuracy, consistency, security and reliability throughout the life cycle, it is important to test all these aspects at each data entry point in the BI architecture and not just at the end through reports or dashboards.
BI Testing Strategy
The goal of testing BI applications is to achieve credible data. And data credibility can be attained by making the testing cycle effective.
A comprehensive test strategy is the stepping stone of an effective test cycle. The strategy should cover test planning for each stage, every time the data moves and state the responsibilities of each stakeholder e.g. business analysts, infrastructure team, QA team, DBA’s, Developers and Business Users. To ensure testing readiness from all aspects the key areas the strategy should focus on are:
- Scope of testing: Describe testing techniques and types to be used.
- Test environment set up.
- Test Data Availability: It is recommended to have production like data covering all/critical business scenarios.
- Data quality and performance acceptance criteria.
The below diagram depicts the data entry points and lists a few sample checks at each stage. – Data Collection, Data Integration, Data Storage and Data Presentation.
The primary aim of data completeness is to ensure that all of the data is extracted that needs to be loaded in the target. During the data acquisition phase it is important to understand the various data sources, the time boundaries of the data selected and any other special cases that need to be considered. The key areas this phase should focus on are:
- Validating the data required and the availability of the data sources from which this data needs to be extracted.
- Data profiling: Embedding data profiling activity helps in understanding the data, especially identifying different data values, boundary value conditions or any data issues at early stages. Identifying data problems early on will considerably reduce the cost of fixing it later in the development cycle.
Testing within the data integration phase is the crux as data transformation takes place at this stage. Business requirements get translated into transformation logic. Once the data is transformed, thorough testing needs to be executed to ensure underlying data complies with the expected transformation logic. Key areas this phase should focus on are:
- Validating the Data Model: This involves validating the data structure with business specifications. This can be done by comparing columns and their data types with business specifications and reporting column requirements ensuring data coverage at source.
- Reviewing the Data Dictionary: Verifying metadata which includes constraints like Nulls, Default Values, Primary Keys, Check Constraints, Referential Integrity, Surrogate keys, Cardinality (1:1, m: n), etc.
- Validating the Source to Target Mapping: Ensuring traceability throughout will help build the quality aspects like consistency, accuracy and reliability.
The data storage phase refers to loading of data within the data warehouse/data mart or OLAP cubes. The data loads can be one time, incrementally or in real-time. Key areas this phase should focus on are:
- Validating data loads based on time intervals.
- Performance and Scalability: Testing of initial and subsequent loads with performance and scalability aspect ensures that the system is within acceptable performance limits and can sustain further data growth.
- Parallel Execution and Precedence: Verifying appropriate parallel execution and precedence during ETL process is important as it may impact directly on performance and scalability of the system.
- Validating the Archival and Purge Policy: Ensures data history based on business requirements.
- Verifying error logging, exception handling and recovery from failure points.
This is the final step of the testing cycle and has the privilege of having a graphical interface to test the data. Key areas this phase should focus on are:
- Validating the Report Model.
- Report layout validation as per mockups and data validation as per business requirements.
- End to End Testing: Although individual components of the data warehouse may be behaving as expected, there is no guarantee that the entire system will behave the same. Thus execution and validation of end-to-end runs are recommended. Along with data reconciliation discrepancies, issues might surface such as resource contention or deadlocks. The end-to-end runs will further help in ensuring the data quality and performance acceptance criteria are met.
While above considerations are given, one important aspect that still remains to be addressed is the issue of ‘Time’. BitWise has created a platform based on DW/BI Testing Best Practices that automates and improves the overall effectiveness of DW/BI Testing. If you’re interested in learning more about this platform, please contact us.
With the features and benefits of this platform, the intention is to address most of the DW/BI testing challenges:
- End-to-end traceability achieved right through source extraction to reports.
- 100% requirement and test data coverage through test cases.
- Test case automation of standard checks achieving considerable time savings.
- Up to 50% time savings through automated regression testing.
- Defect or bug tracking involving all the stakeholders.
- Improved testing cycle time through reusability.
- Process improvements with analytical reporting showcasing test data, test cases & defect trends.
Testing BI applications is different than testing traditional enterprise applications. To achieve truly credible data, each stage of the lifecycle must be tested effectively – Data Collection, Data Integration, Data Storage and Data Presentation. If you’re not comfortable with your internal capabilities to test your BI applications, turn to the BitWise DW/BI Testing platform and lean on BitWise’s expertise and experience gained through testing business intelligence applications for clients over the past decade.
Editor's Note: The blog was originally posted on August 2015 and recently updated on April 2023 for accuracy.