Reliability in open source software

Ullah, Najeeb

doi:10.6092/polito/porto/2536707

Open source software (OSS) is a component or an application whose source code is freely accessible to and modifiable by its users, subject to the constraints expressed in a number of licensing modes. It implies a global alliance for developing quality software, with quick bug fixing and quick evolution of the software features. In recent years the adoption of OSS in industrial projects has increased swiftly. Many commercial products use OSS in fields such as embedded systems, web management systems, and mobile software. In addition, many OSS components are modified and adopted in software products. According to a Netcraft survey, more than 58% of web servers use an open source web server, Apache. The swift increase in the adoption of open source technology is due to its availability and affordability. Recent empirical research published by Forrester highlighted that, although many European software companies have a clear OSS adoption strategy, there are fears and questions about the adoption. All these fears and concerns can be traced back to the quality and reliability of OSS. Reliability is one of the more important characteristics of software quality when software is considered for commercial use. It is defined as the probability of failure-free operation of software for a specified period of time in a specified environment (IEEE Std. 1633-2008). While open source projects routinely provide information about community activity, the number of developers, and the number of users or downloads, this is not enough to convey information about reliability. Software reliability growth models (SRGM) are frequently used in the literature to characterize the reliability of industrial software. These models assume that reliability grows after a defect has been detected and fixed. SRGM are a prominent class of software reliability models (SRM).
An SRM is a mathematical expression that specifies the general form of the software failure process as a function of factors such as fault introduction, fault removal, and the operational environment. Because defects are identified and removed, the failure rate (failures per unit of time) of a software system generally decreases over time. Software reliability modeling estimates the shape of the failure rate curve by statistically estimating the parameters of the selected model. The purpose of this measure is twofold: 1) to estimate the extra test time required to meet a specified reliability objective and 2) to identify the expected reliability of the software after release (IEEE Std. 1633-2008). SRGM can guide the test board in deciding whether to stop or continue testing. These models are grouped into concave and S-Shaped models on the basis of their assumptions about the pattern of cumulative failure occurrence. The S-Shaped models assume that the cumulative number of failures follows an S-Shaped curve: initially the testers are not familiar with the product, so fault removal increases slowly. As the testers’ skills improve, the rate of uncovering defects increases quickly and then levels off as the residual errors become more difficult to remove. In the concave models, the failure intensity reaches a peak before a decrease in the failure pattern is observed. The concave models therefore indicate that the failure intensity is expected to decrease exponentially after a peak has been reached.
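As a minimal illustration of the two model families, the sketch below evaluates the commonly cited textbook mean value functions of one concave model (Goel Okumoto) and one S-Shaped model (Delayed S-Shaped). The parameter names `a` (expected total number of defects) and `b` (detection rate), the numeric values, and the helper `intensity` are assumptions chosen for illustration; they are not taken from the datasets of this work.

```python
import math

def goel_okumoto(t, a, b):
    """Concave model: expected cumulative failures observed by time t."""
    return a * (1.0 - math.exp(-b * t))

def delayed_s_shaped(t, a, b):
    """S-Shaped model: slow start, faster discovery, then levelling off."""
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))

def intensity(model, t, a, b, h=1e-6):
    """Failure intensity, approximated as the numerical derivative at t."""
    return (model(t + h, a, b) - model(t - h, a, b)) / (2.0 * h)

# Illustrative parameters only (hypothetical, not fitted to real data).
a, b = 100.0, 0.1
for t in (0, 5, 10, 20, 40, 80):
    print(t, round(goel_okumoto(t, a, b), 1), round(delayed_s_shaped(t, a, b), 1))
```

With these forms, the Goel Okumoto intensity decays exponentially from the start, while the Delayed S-Shaped intensity first rises, peaks at t = 1/b, and then falls; both curves approach `a` as t grows.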
From an exhaustive study of the literature I identified three research gaps. SRGM have been widely used for reliability characterization of closed source software (CSS), but 1) there is no universally applicable model that can be applied in all cases, 2) the applicability of SRGM to OSS is unclear, and 3) there is no agreement on how to select the best model among several alternative models, and no specific empirical methodologies have been proposed, especially for OSS. My PhD work mainly focuses on these three research gaps. In the first step, focusing on the first research gap, I comparatively analyzed eight SRGM, including Musa Okumoto, Inflection S-Shaped, Goel Okumoto, Delayed S-Shaped, Logistic, Gompertz and Generalized Goel, in terms of their fitting and prediction capabilities. These models were selected because of their widespread use and because they are the most representative in their category. For this study, 38 failure datasets from 38 projects were used. Of the 38 projects, 6 were OSS and 32 were CSS. Of the 32 CSS datasets, 22 were from the testing phase and the remaining 10 were from the operational phase (i.e. field). The outcomes show that Musa Okumoto remains the best for CSS projects, while Inflection S-Shaped and Gompertz remain the best for OSS projects. Apart from that, I observed that concave models perform better for CSS projects and S-Shaped models perform better for OSS projects. In the second step, focusing on the second research gap, the reliability growth of OSS projects was compared with that of CSS projects. For this purpose 25 OSS and 22 CSS projects were selected with related defect data. Eight SRGM were fitted to the defect data of the selected projects, and the reliability growth was analyzed with respect to the fitted models. I found that all the selected models fitted the defect data of OSS projects in the same manner as that of CSS projects, which confirms that the reliability of OSS projects grows similarly to that of CSS projects.
However, I observed that S-Shaped models perform better for OSS and concave models perform better for CSS. To address the third research gap I proposed a method that selects the best SRGM among several alternative models for predicting the residual defects of an OSS. The method helps practitioners decide whether or not to adopt an OSS component in a project. I tested the method empirically by applying it to twenty-one releases of seven OSS projects. The validation results show that the method selects the best model 17 times out of 21. In the remaining four cases it selects the second best model.
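The general idea of choosing among alternative SRGM can be sketched as follows. This is not the selection method proposed in this work, whose details are not given in this abstract; it is a generic hold-out comparison under stated assumptions: each candidate model is fitted to the first part of a cumulative defect series, and the model with the smallest squared prediction error on the remaining part is chosen. The functions `fit` and `select_model`, the grid bounds, and the synthetic data are all hypothetical.

```python
import math

def goel_okumoto(t, a, b):
    return a * (1.0 - math.exp(-b * t))

def delayed_s_shaped(t, a, b):
    return a * (1.0 - (1.0 + b * t) * math.exp(-b * t))

def sse(model, a, b, times, counts):
    """Sum of squared errors between model predictions and observed counts."""
    return sum((model(t, a, b) - y) ** 2 for t, y in zip(times, counts))

def fit(model, times, counts):
    """Crude grid-search least squares (assumed ranges); a real study
    would use a proper optimizer or maximum likelihood estimation."""
    best = None
    top = max(counts)
    for i in range(61):            # a from top to 4 * top
        a = top * (1.0 + 0.05 * i)
        for j in range(1, 201):    # b from 0.005 to 1.0
            b = 0.005 * j
            err = sse(model, a, b, times, counts)
            if best is None or err < best[0]:
                best = (err, a, b)
    return best[1], best[2]

def select_model(models, times, counts, train_points):
    """Fit on the first train_points observations, score on the rest."""
    scores = {}
    for name, model in models.items():
        a, b = fit(model, times[:train_points], counts[:train_points])
        scores[name] = sse(model, a, b, times[train_points:], counts[train_points:])
    return min(scores, key=scores.get), scores

# Synthetic S-Shaped defect series (hypothetical, for illustration only).
times = list(range(1, 25))
counts = [delayed_s_shaped(t, 120.0, 0.25) for t in times]
models = {"goel_okumoto": goel_okumoto, "delayed_s_shaped": delayed_s_shaped}
best, scores = select_model(models, times, counts, train_points=16)
print(best)
```

Scoring on held-out observations rather than on the fitted portion rewards prediction capability instead of goodness of fit, which matters when the goal is to estimate the defects still to be found.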

PORTO @ Archivio Istituzionale della Ricerca