Investigating the Effectiveness of Clustering for Story Point Estimation

Tawosi, Vali; Al-Subaihin, Afnan; Sarro, Federica; (2022) Investigating the Effectiveness of Clustering for Story Point Estimation. In: 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). (pp. 827-838). IEEE: Honolulu, HI, USA. Green open access

tawosi2022saner.pdf - Accepted Version

Abstract

Automated techniques to estimate Story Points (SP) for user stories in agile software development came to the fore a decade ago. Yet the accuracy of state-of-the-art estimation techniques has room for improvement. In this paper, we present a new approach for SP estimation, based on analysing textual features of software issues by employing latent Dirichlet allocation (LDA) and clustering. We first use LDA to represent issue reports in a new space of generated topics. We then use hierarchical clustering to agglomerate issues into clusters based on their topic similarities. Next, we build estimation models using the issues in each cluster. Then, we find the cluster closest to a newly arriving issue and use the model from that cluster to estimate its SP. Our approach is evaluated on a dataset of 26 open source projects with a total of 31,960 issues and compared against both baselines and state-of-the-art SP estimation techniques. The results show that the estimation performance of our proposed approach is as good as the state-of-the-art. However, none of these approaches is statistically significantly better than more naive estimators in all cases, so their additional complexity is not justified. We therefore encourage future work to develop alternative strategies for story point estimation. The experimental data and scripts we used in this work are publicly available to allow for replication and extension.
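The pipeline described in the abstract (topic vectors, hierarchical clustering, one estimator per cluster, nearest-cluster lookup) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each issue has already been mapped to a topic-proportion vector by some LDA implementation, uses a naive centroid-linkage agglomeration, and lets the median SP of each cluster stand in for the per-cluster estimation model. All function names and the toy data are hypothetical.

```python
from statistics import median


def distance(a, b):
    """Euclidean distance between two topic-proportion vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def agglomerate(vectors, n_clusters):
    """Naive centroid-linkage agglomerative clustering (an assumption;
    the abstract does not state the linkage criterion).

    Starts with one cluster per issue and repeatedly merges the two
    clusters whose centroids are closest, until n_clusters remain.
    Returns a list of clusters, each a list of indices into `vectors`.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = distance(
                    centroid([vectors[k] for k in clusters[i]]),
                    centroid([vectors[k] for k in clusters[j]]),
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def fit(vectors, story_points, n_clusters):
    """Build one (centroid, median SP) pair per cluster.

    The median is a stand-in for the per-cluster estimation model.
    """
    return [
        (centroid([vectors[k] for k in c]), median(story_points[k] for k in c))
        for c in agglomerate(vectors, n_clusters)
    ]


def estimate(models, new_vector):
    """Return the SP estimate of the cluster whose centroid is closest."""
    _, sp = min(models, key=lambda m: distance(m[0], new_vector))
    return sp


# Toy data: five issues over two topics (hypothetical values, not from the paper).
topics = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15], [0.1, 0.9], [0.2, 0.8]]
points = [1, 2, 3, 8, 13]

models = fit(topics, points, n_clusters=2)
near_first_group = estimate(models, [0.75, 0.25])
near_second_group = estimate(models, [0.15, 0.85])
```

On this toy data the first three issues form one cluster and the last two form another, so a new issue is assigned the median SP of whichever group its topic vector is nearer to. Because a cluster median is itself a simple estimator, the sketch also hints at why such pipelines can be hard to separate statistically from naive baselines.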

Type: Proceedings paper
Title: Investigating the Effectiveness of Clustering for Story Point Estimation
Event: 29th IEEE International Conference on Software Analysis, Evolution and Reengineering
Location: Hawaii (Virtual)
Dates: 15 Mar 2022 - 18 Mar 2022
Open access status: An open access version is available from UCL Discovery
DOI: 10.1109/SANER53432.2022.00101
Publisher version: https://doi.org/10.1109/SANER53432.2022.00101
Language: English
Additional information: This version is the author accepted manuscript. For information on re-use, please refer to the publisher's terms and conditions.
Keywords: Software Effort Estimation, Story Point Estimation, Latent Dirichlet Allocation, Hierarchical Clustering
UCL classification: UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science
UCL > Provost and Vice Provost Offices > UCL BEAMS > Faculty of Engineering Science > Dept of Computer Science
UCL > Provost and Vice Provost Offices > UCL BEAMS
UCL
URI: https://discovery.ucl.ac.uk/id/eprint/10143360