Story Point Estimation Using Transformer Based Agents
DOI:
https://doi.org/10.14279/eceasst.v85.2685

Abstract
In software project management, precise estimation of software development effort, commonly measured in story points, is critical for effective sprint planning, resource allocation, and tracking the team capacity consumed in each sprint. Traditional estimation techniques rely heavily on the team's collective experience, which can be lost over time. Large Language Models (LLMs) offer an opportunity to automate and standardize story point estimation, reducing bias and improving predictability. Moreover, LLMs have emerged as powerful tools for human-computer interaction and complex problem solving: models such as GPT-4 can decompose intricate tasks into sequential steps, solve each subproblem, and critically evaluate candidate solutions, demonstrating a degree of autonomous reasoning. However, their closed-system design restricts access to real-time data and domain-specific expertise, occasionally yielding erroneous or misleading outputs (hallucinations). Although fine-tuning can mitigate these issues, it often demands extensive domain-specific data and specialized model weights, potentially compromising generalization. In this study, we address these limitations by enhancing agile story point estimation through an extended positional encoding mechanism and multi-agent weighting strategies within the model's head layers. We build upon a baseline BERT transformer, introducing a knowledge pool
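The abstract does not specify how the per-agent outputs are combined, so the following is only a minimal sketch of one plausible weighting scheme: each specialized agent produces class logits over story-point categories, and a softmax over learned agent weights blends them. All names and the softmax combination here are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def combine_agent_logits(agent_logits, agent_weights):
    """Blend specialized agents' story-point logits (illustrative only).

    agent_logits: (n_agents, n_classes) per-agent class scores
    agent_weights: (n_agents,) raw weights, normalized via softmax
    Returns the index of the predicted story-point class.
    """
    w = softmax(np.asarray(agent_weights, dtype=float))
    combined = (w[:, None] * np.asarray(agent_logits, dtype=float)).sum(axis=0)
    return int(np.argmax(combined))
```

With strongly unequal weights, the prediction follows the dominant agent; with equal weights it reduces to a simple average of the agents' logits.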
of specialized agents, each trained on distinct aspects of its own project's data. Using 12,014 story point records from 8 open-source software projects, we allocated 80% of the dataset for agent training, embedded 80% of the remaining 20% in the shared knowledge pool, and reserved the final 20% of that remainder for evaluation. Both the baseline and the proposed system were trained under identical hyperparameters (epochs = 3, batch size = 16, learning rate = 2 × 10⁻⁵). Empirical results indicate that the enhanced multi-agent architecture attains an average accuracy of 70.81%, a substantial improvement over the 42.62% achieved by the standard BERT model, a relative gain of approximately 48.3%. These findings suggest that integrating domain-specialized agents and refined encoding strategies can significantly bolster LLM performance in software estimation tasks, offering promising directions for augmenting agile project management with AI-driven decision support.
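The nested 80/20 split described above can be made concrete with a short arithmetic sketch. The exact record counts are not reported in the abstract; the figures below are derived from the stated percentages, and the truncating rounding is an assumption.

```python
# Derived counts for the split described in the abstract (rounding assumed):
# 80% for agent training, 80% of the remaining 20% for the shared knowledge
# pool, and the final 20% of that remainder held out for evaluation.
TOTAL = 12014

train = int(TOTAL * 0.8)        # agent training records
remainder = TOTAL - train       # the remaining 20%
pool = int(remainder * 0.8)     # shared knowledge pool
evaluation = remainder - pool   # held-out evaluation set

print(train, pool, evaluation)  # the three parts sum back to TOTAL
```

Under these assumptions the evaluation set is only about 4% of the full dataset, which is worth keeping in mind when interpreting the reported accuracies.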
License
Copyright (c) 2025 Oguzhan Oktay Buyuk, Asst. Prof. Dr Ali Nizam

This work is licensed under a Creative Commons Attribution 4.0 International License.
