Electronic versus paper-based data collection for conducting health-care research: A cost-comparison analysis
Sirshendu Chaudhuri1, Bhavani Shankara Bagepally2, Ditipriya Bhar1, Uday Kumar Reddy Singam3
1 Consultant, Division of Online Courses, ICMR National Institute of Epidemiology, Chennai, Tamil Nadu, India
2 Scientist E, Division of Non-Communicable Diseases, ICMR-National Institute of Epidemiology, Chennai, Tamil Nadu, India
3 Resident, Department of General Surgery, Apollo Institute of Medical Sciences and Research, Chittoor, Andhra Pradesh, India
Correspondence Address:
Sirshendu Chaudhuri
ICMR-National Institute of Epidemiology, Chennai, Tamil Nadu
India
Source of Support: None, Conflict of Interest: None
DOI: 10.4103/ijph.ijph_1271_21
Background: Containing expenditure and using resources efficiently are essential to limit the increasing costs of health research. Electronic data collection (EDC) is thought to reduce costs compared with paper-based data collection (PDC). Objectives: As economic evidence in this area is scanty, especially in low- and middle-income countries, the objectives of the study were to perform an economic evaluation and compare the costs of EDC and PDC. Methods: A cost-comparison study was conducted to compare EDC and PDC from the institutional perspective for the year 2018, based on a community-based survey. Step-down cost accounting was adopted with a bottom-up approach for cost estimation. Total and unit costs were estimated, with the base-case comparison between EDC and PDC using SPSS software (e-SPSS and p-SPSS, respectively). We conducted scenario analyses based on the use of different software, R and STATA, for both EDC and PDC (e-R, p-R, e-STATA, and p-STATA, respectively). One-way and probabilistic sensitivity analyses (PSA) were performed to examine the robustness of the observed results. Results: In the base-case analysis, the total costs of EDC and PDC were ₹72,617 ($1060.9) and ₹87,717 ($1281.5), respectively, with an estimated cost reduction of ₹15,100 ($220.6). In other scenarios, the estimated cost reduction for e-R, e-STATA, p-R, and p-STATA was ₹−274 ($4.0), ₹98 ($1.4), ₹14,826 ($216.6), and ₹15,002 ($219.2), respectively, when compared to EDC-SPSS. In the one-way sensitivity analysis and PSA, the results of the cost-comparison analysis were robust. Conclusion: EDC minimizes the institutional cost of conducting health research. This finding will help researchers plan their research budgets efficiently.
Keywords: Cost comparison, cost-estimation, cost-minimization analysis, data collection, paper-based data collection
A sustainable health research environment requires integrated contributions at the individual, institutional, national, and international levels. A major proportion of health research costs in low- and middle-income countries (LMICs) is met by donor funding and grants.[1] However, the available funding is far more limited than what is required.[2] Therefore, the cost of research needs to be managed through systematic planning and implementation, as well as efficient use of resources during data collection, management, and analysis.[2]
Paper-based data collection (PDC) using paper case report forms (CRFs) is the conventional method of data collection in health research. However, PDC can be a laborious and error-prone process.[3] With innovations in information technology, an increasing trend in the use of electronic data collection (EDC) has been observed in many regions of the world.[4] Evidence suggests that in LMIC settings, EDC may serve as an effective platform to collect and transfer large quantities of data successfully.[5] In India, although systematic evidence on the use of EDC is lacking, the Indian government's policy on the use of mHealth for service delivery by health workers is an indication of an increasing trend in the application of EDC.[6] With the advent of Health Information Technology in India, surveys and trials ranging from small to large scale increasingly use EDC rather than traditional PDC.[7],[8],[9]
The current literature from different country settings has identified various advantages of EDC when compared to PDC. It is reported that EDC ensures increased accuracy, greater efficiency of data capture, better quality control of data management, rapid analysis of data, as well as a reduction in the cost of data collection.[3],[4],[10],[11],[12] However, the costs of EDC and PDC can be context dependent and study specific.[4] Despite the rapid increase of EDC in India, there is hardly any study examining the cost difference between the two methods. Hence, based on a cross-sectional survey, we conducted an economic evaluation from the institute's perspective to examine the costs incurred with different modes of data collection and analysis methods in health research, and to estimate the cost difference between traditional PDC and EDC.
Materials and Methods
This study used a cost-comparison analysis approach. The economic evaluation is based on a single community-based survey in the state of Andhra Pradesh, India.[13] The survey was conducted in a rural area of Andhra Pradesh by a private medical college from May to September 2018. Data for the survey were collected electronically on an Android device using KoBoCollect, an application based on the Open Data Kit.[14]
Costs were assigned to this survey retrospectively.[13] For costing, step-down cost accounting was adopted with a bottom-up approach. All the costs were estimated from an institutional perspective. We identified different cost centers, and the cost components within each cost center, for the various activities and processes involved in the study. We then assigned a cost to each component. All the costs were estimated in Indian Rupees (₹) for the year 2018 and also converted into United States Dollars (conversion rate: US$1 = ₹68.45 as on July 1, 2018). As the study was of a short duration (<1 year), we did not discount costs.
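As a worked illustration of the currency conversion, the short Python sketch below applies the stated rate (US$1 = ₹68.45) to the base-case totals reported in the abstract. The function name `inr_to_usd` and the rounding to one decimal place are illustrative assumptions, not part of the original analysis, which was performed in Microsoft Excel.

```python
# Conversion rate stated in the text: US$1 = INR 68.45 (as on July 1, 2018).
INR_PER_USD = 68.45


def inr_to_usd(amount_inr: float) -> float:
    """Convert Indian Rupees to US Dollars, rounded to one decimal place.

    The one-decimal rounding is an assumption chosen to match how the
    dollar amounts are printed in the abstract.
    """
    return round(amount_inr / INR_PER_USD, 1)


# Base-case totals reported in the abstract (INR).
edc_total_inr = 72_617
pdc_total_inr = 87_717

print("EDC total (USD):", inr_to_usd(edc_total_inr))
print("PDC total (USD):", inr_to_usd(pdc_total_inr))
print("Cost reduction (USD):", inr_to_usd(pdc_total_inr - edc_total_inr))
```

With these inputs the sketch reproduces the dollar figures quoted in the abstract ($1060.9, $1281.5, and $220.6).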
Cost comparison is the primary outcome of this study. Data collection for the original survey was done electronically, and the analysis was done using the Statistical Package for Social Sciences Software (SPSS), Version 25.0 (Armonk, NY: IBM Corp.). Similarly, we assigned costs to all the relevant cost components for PDC, with a fixed study sample size, the same as that of EDC. Thus, we estimated the costs of the survey (p-SPSS) assuming that PDC had been done instead of EDC in the relevant cost centers. The cost difference was then estimated as the difference in total costs between the EDC and PDC modes of data collection.
We then considered scenarios in which the same study was conducted using different data analysis software. For these scenarios, we estimated the costs of the survey if other software, such as “R” and “STATA,” had been used for data analysis with both the EDC and PDC approaches. These costs were termed e-R and e-STATA when EDC was done and the data analysis software was R and STATA, respectively. Similarly, costs were termed p-R and p-STATA when PDC was done with analysis using R and STATA, respectively. To examine the uncertainty of the results, a one-way sensitivity analysis was conducted by lowering or increasing the individual cost components by 25%. Furthermore, probabilistic sensitivity analysis (PSA) was conducted by simulating the various components of the cost data using a gamma distribution. Assuming that the same survey was conducted in low-, medium-, and high-cost settings, we documented all the respective cost components [Supplementary Table 1]. For example, a medical college can be situated in a rural setting (low-cost setting), in small towns and cities (medium-cost setting), or in metropolitan cities (high-cost setting), and the costs of the same items can vary across these settings. Using the estimates from the three settings, the measures of dispersion were calculated and used as inputs for the PSA. All the analyses were performed in Microsoft Excel 2019.
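The two sensitivity analyses described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Excel workbook: the component names and the low/medium/high costs are hypothetical placeholders (not the values in Supplementary Table 1), the medium-setting value is taken as the base case, the sample standard deviation across the three settings is assumed as the measure of dispersion, and each arm is sampled independently.

```python
import random
import statistics

# HYPOTHETICAL (low, medium, high) setting costs in INR per component.
# Illustrative placeholders only, NOT the study's Supplementary Table 1.
EDC = {
    "transport": (16000, 20000, 24000),
    "researcher_salary": (25000, 30000, 35000),
    "software_licence": (5000, 5000, 5000),  # fixed price, no dispersion
}
PDC = {
    "transport": (16000, 20000, 24000),
    "researcher_salary": (25000, 30000, 35000),
    "software_licence": (5000, 5000, 5000),
    "data_entry": (6000, 8000, 10000),
    "crf_storage": (1500, 2000, 2500),
}


def base_costs(arm):
    """Assumption: the medium-setting value is the base-case cost."""
    return {name: float(levels[1]) for name, levels in arm.items()}


def cost_difference(edc_costs, pdc_costs):
    """Total PDC cost minus total EDC cost."""
    return sum(pdc_costs.values()) - sum(edc_costs.values())


def one_way(edc_arm, pdc_arm, change=0.25):
    """Vary one component of one arm by -change and +change at a time.

    Returns {(arm label, component): (difference at -change, at +change)}.
    """
    results = {}
    for label, arm in (("EDC", edc_arm), ("PDC", pdc_arm)):
        for name in arm:
            bounds = []
            for factor in (1 - change, 1 + change):
                edc, pdc = base_costs(edc_arm), base_costs(pdc_arm)
                target = edc if label == "EDC" else pdc
                target[name] *= factor
                bounds.append(cost_difference(edc, pdc))
            results[(label, name)] = tuple(bounds)
    return results


def gamma_draw(levels, rng):
    """Draw one cost from a gamma distribution matched to mean and SD.

    shape = (mean/sd)^2 and scale = sd^2/mean give the requested mean and SD.
    A component with zero dispersion is returned unchanged.
    """
    mean = statistics.mean(levels)
    sd = statistics.stdev(levels)
    if sd == 0:
        return float(mean)
    return rng.gammavariate((mean / sd) ** 2, sd ** 2 / mean)


def psa_mean_difference(edc_arm, pdc_arm, n_draws=2000, seed=1):
    """Mean simulated cost difference (PDC minus EDC) over n_draws."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_draws):
        edc_total = sum(gamma_draw(lv, rng) for lv in edc_arm.values())
        pdc_total = sum(gamma_draw(lv, rng) for lv in pdc_arm.values())
        diffs.append(pdc_total - edc_total)
    return statistics.mean(diffs)


print("Base-case difference:", cost_difference(base_costs(EDC), base_costs(PDC)))
print("One-way bounds for PDC data entry:", one_way(EDC, PDC)[("PDC", "data_entry")])
print("PSA mean difference:", round(psa_mean_difference(EDC, PDC)))
```

Components with no dispersion, such as a fixed software price, are passed through unchanged, so their simulated and base-case values coincide. Sampling shared components once per draw and applying them to both arms would be an equally defensible design and would narrow the simulated spread of the difference.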
Ethical issues
Institutional ethics committee clearance was obtained to conduct the study (Ref No: IEC04/AIMSR/02/2018).
Results
The primary study was conducted with a sample size of 102 subjects.[13] The steps of the survey were broadly divided into three phases: preparation, data collection, and data analysis and storage [Figure 1].
All identified cost centers and their component costs for both the EDC and PDC modes of data collection are tabulated in [Table 1]. The costs of the preparatory phase and the data collection phase were similar for EDC and PDC, with minimal variation between the two. A substantial difference between EDC and PDC was noticed in the costs related to data entry and CRF storage [Table 1].
Table 1: Cost centers and their component costs for electronic and paper-based data collection
The estimated total costs of EDC-SPSS (e-SPSS) and PDC-SPSS (p-SPSS) were ₹72,617 and ₹87,717, respectively, with an estimated cost difference (EDC minus PDC) of ₹−15,100 [Table 2]. The data analysis and storage cost centers contributed ₹14,154 (93.7%) of the total cost difference. In the scenario analyses, the estimated cost differences for EDC-R, EDC-STATA, PDC-R, and PDC-STATA were ₹−274, ₹−98, ₹14,826, and ₹15,002, respectively, when compared to EDC-SPSS. In the one-way sensitivity analysis, the cost difference between EDC-SPSS and PDC-SPSS was robust, with a maximum variation of <8% even with a 25% change in the individual component costs. The most sensitive cost inputs were the transportation-related vehicular costs for data collection, the salary of the researcher, and the costs of data analysis [Figure 2]. The one-way sensitivity analysis results for all the scenarios were robust, with minimal deviation from the observed base-case results [Supplementary Figure 1], [Supplementary Figure 2], [Supplementary Figure 3].
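The arithmetic behind the base-case figures can be checked with a few lines of Python. The sketch uses only the totals reported in the text; the variable names are illustrative.

```python
# Base-case totals reported in the text (INR).
edc_spss_total = 72_617
pdc_spss_total = 87_717

# Portion of the difference attributed to data analysis and storage (INR).
analysis_storage_difference = 14_154

# Absolute saving when EDC replaces PDC.
saving = pdc_spss_total - edc_spss_total

# Share of the saving that arises in the data analysis and storage phase.
share_pct = round(analysis_storage_difference / saving * 100, 1)

# Saving expressed relative to the total PDC cost.
relative_saving_pct = round(saving / pdc_spss_total * 100, 1)

print("Saving (INR):", saving)
print("Share from analysis and storage (%):", share_pct)
print("Saving relative to PDC total (%):", relative_saving_pct)
```

The saving works out to ₹15,100, of which 93.7% arises in data analysis and storage; relative to the PDC total this is a reduction of about 17.2%.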
Figure 2: One-way sensitivity analysis for comparison between e-SPSS and p-SPSS.
In the PSA [Supplementary Figure 4], the results were robust, as the PSA mean cost difference between e-SPSS and p-SPSS was similar to the base-case value [Table 2]. Similarly, the PSA results in the different scenarios were robust and similar to the cost differences from the respective base-case analyses. Because the costs of the software (R and STATA) could not be varied, we could not calculate measures of dispersion for them, and hence the PSA and base-case values were the same. The results of the PSA of the cost-comparison analysis are shown in a scatter plot, with the X-axis indicating the total costs of the comparator (p-SPSS) and the Y-axis indicating the total costs of the other scenarios (e-SPSS, e-R, e-STATA, p-R, p-STATA) [Supplementary Figure 4].
Discussion
In this economic evaluation, we examined the cost difference that results from the use of EDC with open-source electronic questionnaires when compared with traditional PDC. We observed that the use of EDC reduces the total cost of health research by 17%. The higher costs of PDC were mostly (93.7%) contributed by the cost of data entry and CRF storage. This finding holds for commonly used data analysis software.
Estimating the cost difference due to EDC is complex and study specific. One study from South Africa estimated that EDC could reduce the study cost by 50% over traditional PDC.[4] Contemporary evidence from other parts of the world indicates similar findings, with cost reductions varying between 49% and 62%.[15] This difference could arise at various levels of a study. In the present study, we found that more than 90% of the estimated cost difference arose from double data entry and the storage of paper CRFs. By eliminating tedious data entry and the associated data cleaning, EDC reduces the duration of health research and the resulting costs.[16],[17],[18] Evidence from LMICs such as Ethiopia and Tanzania has shown that the use of EDC reduced costs by 25%.[3],[19] However, the economic evaluation in the Ethiopian study treated the electronic devices as single use, which may have overestimated the cost of EDC.[19]
Although the cost of the data collection device did not influence the cost difference in our study, the choice of device may affect it.[20] The type of electronic device may also affect the accuracy of data collection in the field because of variation in size and specification.[16] The estimated cost difference can also be affected by the initial expenditure associated with establishing a system in an institution for using EDC methods instead of PDC.[11],[21] For example, we used an open-source tool for developing electronic questionnaires in our study. The use of a paid version of a similar tool could have altered the estimate.
The present study highlights the various costs incurred by an institution at the different stages of a short-duration, community-based survey. Such findings will help in planning the set-up an institution requires to adopt the EDC approach instead of PDC in health research. The initial planning includes motivating and training the researchers, identifying the technologies needed, financing, maintaining the technology, and resolving the ethical concerns associated with handling the data.[10],[22],[23] Once an electronic data capture system is established in an institution, it can save resources such as costs, and the researchers can focus better on their research activities.[24]
Limitations
In the present study, we could not account for the costs of a few aspects of health research. These include the cost of data cleaning in the case of PDC and the maintenance charges for storing the collected data. We could not assign such charges because the original study sample size was relatively small; thus, the findings of the present study should be interpreted cautiously for larger studies. As we used an open-source tool for data collection, the actual cost may vary if an institution uses a paid tool for EDC. The study explored the costs of public health research. Furthermore, the study observations may not extend to clinical trials.
Conclusion
We observed a cost reduction with the use of the EDC approach as compared to the PDC approach while conducting a cross-sectional, single-center health research study. For short studies, we strongly recommend using EDC. We also recommend that research institutions assess in advance the potential costs of preparing and maintaining the technologies associated with EDC, involving all the stakeholders. Further cost-minimization analysis is warranted for clinical trials and multi-centric studies in this area to support better policy decisions by all stakeholders.
Financial support and sponsorship
Nil.
Conflicts of interest
There are no conflicts of interest.
References