The Limitations of Estimation
Linda M. Laird
Abstract: Estimation is a crucial element of software project planning. Unfortunately, the uncertainties inherent in software projects limit how accurately those projects can be estimated.
In 1968, Alfred Pietrasanta of IBM System Research Institute wrote, "Anyone who expects a quick and easy solution to the multifaceted problem of resource estimation is going to be disappointed." Thirty years later, Lionel Briand and his colleagues observed that "Despite the large number of cost factors collected and the rigorous data collection, a lot of uncertainty in the estimates can be observed" (Briand et al., "An Assessment and Comparison of Common Software Cost Estimation Methods," Proc. Int'l Conf. Software Eng. (ICSE), IEEE, 1999, pp. 313-322). It's now 2006, and we still have problems with estimation.
In practical terms, your ability to estimate well comes down to how much you know about a project when you're estimating it, and how much uncertainty is inherent.
In software, we primarily want to estimate three aspects of a project: effort, schedule, and cost. Most methods start by estimating size (as thousands of lines of code (KLOC), function points, or some other proxy) and use that to estimate effort (that is, staff months). From effort, you typically derive staffing, schedule, and cost. Estimating effort is the primary challenge; once you have an effort estimate, wonderful tools are available to help you work through your schedule, staff, costs, and risk, and tie them into your project plan.
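The size-to-effort-to-schedule chain described above can be sketched with a simple power-law model. The coefficients below are the Basic COCOMO organic-mode values from Boehm's 1981 book; choosing that model, and the cost per staff month, are assumptions made purely for illustration, not a recommendation from this article.

```python
def basic_cocomo_organic(kloc: float, cost_per_staff_month: float) -> dict:
    """Derive effort, schedule, staffing, and cost from a size estimate.

    Assumption: Basic COCOMO, organic mode (effort = 2.4 * KLOC^1.05,
    schedule = 2.5 * effort^0.38). Other models use other coefficients.
    """
    effort = 2.4 * kloc ** 1.05             # staff months
    schedule = 2.5 * effort ** 0.38         # calendar months
    staff = effort / schedule               # average head count
    cost = effort * cost_per_staff_month    # same currency as the rate
    return {"effort": effort, "schedule": schedule, "staff": staff, "cost": cost}


# Hypothetical project: 32 KLOC at 10,000 per staff month.
plan = basic_cocomo_organic(kloc=32, cost_per_staff_month=10_000)
for name, value in plan.items():
    print(f"{name}: {value:,.1f}")
```

Everything downstream depends on the size estimate and on how well the coefficients fit your organization, which is why calibration against your own past projects matters.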
Effort Estimation: Where are we?
How well do you estimate? How do you know? Do you track your estimates through the project's life cycle and understand where you under- or overestimated so you can do better the next time?
The Standish Group's 1995 Chaos Report cites overruns on 89 percent of projects. Kjetil Molokken and Magne Jorgensen, who studied software estimation results, describe other surveys as most commonly reporting that 30 to 40 percent of programs overrun (K. Molokken and M. Jorgensen, "A Review of Surveys on Software Effort Estimation," Proc. 2003 Int'l Symp. Empirical Software Eng., IEEE CS Press, 2003, p. 223). They conclude that the overruns' root causes are complex, the data isn't always reliable, and those responding to the surveys "may have a tendency to over report causes that lie outside of their responsibility, for example, customer-related causes." Rarely does just one factor on a project go awry. Rather, the interplay between factors, such as schedules, changing requirements, new technologies, resource unavailability, unforeseen problems, and personnel (management and staff), leads to overruns. And it's often easiest to attribute the overrun to factors over which you have little control - that is, the customer.
The literature lists several common causes for overruns:
- specifying incomplete or unclear requirements (not knowing what to do),
- failing to adjust schedules when scope changes (too much work),
- setting overly aggressive development schedules (too little time), and
- providing insufficient resources (not enough people or equipment).
These reasons are rather generic. In my view, the primary causes of software project overruns are:
- Lack of education and training. Many people don't know how to estimate, have no training in estimation, and receive no feedback on their estimates to help them improve. A developer who knows how to write code well doesn't necessarily know how to estimate.
- Confusion of the desired schedule/effort target with the estimate. Development teams are frequently pushed into dates because of business needs rather than a rational plan to deliver on those dates.
- Hope-based planning. Developers know the "right answer": that the project is on time and on budget, based on the marketing- or management-assigned target.
- Software personnel's inability to credibly communicate and support their estimates. The lack of good estimation processes and knowledge frequently leads to being pushed into the "shortest schedule you can't prove you won't make" instead of a rational schedule based on probable outcomes.
- Incomplete, changing, and creeping requirements. Nothing harms a fixed-price and fixed-schedule project more than changing and growing requirements.
- Quality surprises. Projects can easily spend half of their time in the test-and-fix phase, especially when the need for speed causes the development team to take additional risks and turn over inadequately tested code.
Do We Care How Well We Estimate?
The answer is a resounding yes. First, I'm sure your organization cares. Even if you're doing exploratory development or prototyping, you'll usually need some estimates. Second, if you're in the software business, estimating well is the only way you'll be able to have a personal life instead of working too many nights and weekends.
In most projects, estimates are crucial. If alternative approaches exist, accurate estimates improve the decisions you must make in a project. When bidding for jobs, if you underestimate, you can lose money and damage your company. If you overestimate, you might unnecessarily lose a job to a competitor. Estimating well helps you win the jobs you should win and lose those you should lose.
Estimates give you the opportunity to adjust project parameters to meet budgets and deadlines. If you understand your cost factors, you can adjust them to hit the target. At Stevens Institute of Technology, we have seen a marked improvement in the success rate of the senior projects in the students who also take the estimation and metrics class. Instead of focusing primarily on functionality and design, they now include difficulty and size, and define their projects to be doable in the time available.
Even when you can't adjust your cost factors, estimating well gives you the chance to manage the risk, rather than ignore it or be surprised later.
Software Estimation Methodologies and Models
Hundreds of documented software estimation methodologies, tools, and models exist. Some of the earliest ones estimated the software cost as a percentage of the hardware cost. For most, the output is an estimation of staff effort from a wide variety of inputs. Many of the methods use an analytical formula based on cost drivers, which typically are project characteristics such as system size, system domain, complexity, and development methodology. (Complexity in this context is project complexity, such as the need for high security or reliability, rather than design or code complexity.)
The tools and methodologies are primarily based on data from past projects. Researchers and engineers studied project data and determined equations and formulas that best matched the existing data points. Some of the formulas are extremely simple, others are complex. None are totally accurate with all of the past projects, nor should you expect them to be with their predictions for future projects. Instead, they predict, based on their formulas (which are primarily based upon their data set), the most likely effort required for the project as described. At Stevens, I teach over 10 methods for estimating projects. Some are simple and quick, others take considerable time and effort. My students estimate the same project using the different estimation methodologies and models. Their results vary widely, depending on their definition of the project, their assumptions, and the models used.
Which methodology should you use? Most projects use expert opinion. Excellent results have been achieved with use case points (Bente Anda, "Comparing Effort Estimates Based on Use Case Points with Expert Estimate," Proc. Empirical Assessment in Software Eng. 2002, http://www.simula.no/departments/engineering/publications/SE.5.Anda.2002.a). Function points have strong advocates as well. Whatever methodologies you select, you should test and calibrate them to your organization using your own past projects and determining the methodologies' accuracy level.
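One way to test a method against your own history is to re-estimate completed projects with it and compare the results with what those projects actually cost. The sketch below is only an illustration: it uses the mean magnitude of relative error (MMRE), a measure common in the estimation literature but not one this article prescribes, plus a median actual-to-estimate ratio as a crude calibration factor. The project figures are hypothetical.

```python
from statistics import mean, median


def relative_error(estimate: float, actual: float) -> float:
    """Magnitude of relative error for one completed project."""
    return abs(actual - estimate) / actual


def assess_method(past_projects: list[tuple[float, float]]) -> dict:
    """Summarize how a method performed on (estimate, actual) pairs.

    mmre: average relative miss (lower is better).
    calibration_factor: median actual/estimate ratio; multiplying new
    estimates by it corrects the method's typical bias.
    """
    errors = [relative_error(est, act) for est, act in past_projects]
    ratios = [act / est for est, act in past_projects]
    return {"mmre": mean(errors), "calibration_factor": median(ratios)}


# Hypothetical history: (estimated, actual) effort in staff months.
history = [(100, 130), (50, 60), (200, 260), (80, 80)]
summary = assess_method(history)
print(f"MMRE: {summary['mmre']:.2f}")
print(f"Calibration factor: {summary['calibration_factor']:.2f}")
```

Running the same assessment for each candidate method shows which one has historically fit your organization best and how much correction it needs.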
The current recommendation is that you use more than one methodology - in fact, the standard recommendation is to use at least three methods (F. Heemstra, "Software Cost Estimation," Information and Software Technology, vol. 34, no. 10, Oct. 1992, pp. 627-639; T. Bollinger, "The Interplay of Art and Science in Software," Computer, vol. 30, no. 10, Oct. 1997, pp. 125-128). Each method will add insight and understanding, and using three allows you to triangulate on an answer. (Three methods might seem like overkill, especially when many organizations don't even use one. Obviously, the amount of effort spent on estimation depends on the importance of the estimation's accuracy and the ease of creating the estimate.)
You should choose complementary methods with different biases (M. Jorgensen, "Realism in Assessment of Effort Estimation Uncertainty: It Matters How You Ask," IEEE Trans. Software Eng., vol. 30, no. 4, Apr. 2004, pp. 209-217). Examples of complementary methods are
- expert opinion with analogy,
- expert opinion with function points,
- algorithmic models and use case points,
- expert opinions by experts with different project experiences and responsibilities, and
- use case points and analogy.
Software Estimation as Engineering
With all the available methodologies and methods, why are the estimation results so ragged? In some respects, the question comes down to whether software estimation is an engineering process or not.
Consider Terry Bollinger's statement, "if software estimation is believed to be a codifiable engineering process analogous to house building then litigation is a reasonable and expected consequence of inaccurate estimations" (Bollinger 1997).
The software engineering community is split into two camps:
- the process camp espouses that quality software can be developed on time if a particular software process or programming technology is used, while
- the problem-solving camp believes that programming is fundamentally a process of solving problems and as such intrinsically resists codification.
Or, as Bollinger notes, "The creation of genuinely new software has far more in common with developing a new theory of physics than it does with producing cars or watches on an assembly line" (Bollinger 1997).
Which camp are you in? Problem solving or process execution? Or does it vary depending on the circumstances? In other words, are you developing a new system which requires creativity and invention, or are you developing a system that you already know how to build?
Consider the analogy of building six new apartment buildings, one after the other, all of which have the same design and all of which are being built on the same type of vacant land. The estimate for the first building might have been +/-25 percent of the final cost. By the third, the estimate is probably +/-5 percent. But consider the impact of changing the original plan. Instead of using brick and mortar, you use precast concrete because of the expected cost savings. Or, imagine you need to build on a rocky hillside instead of flat vacant lots. And you decide to add a penthouse. What will happen to your estimate? Plus or minus 25 percent will be an achievement.
The same is true of software projects. If you're building systems that are similar to those you've previously built using similar teams and technologies, and you can manage the requirements processes, your estimates will be quite accurate. If you're building fundamentally new systems, however, where invention is required, +/-25 percent will be an achievement.
Estimate Uncertainties
You must remember that estimates are probabilistic and communicate them appropriately. All projects have some uncertainties. Estimates are typically the 50-percent view, meaning the probability for being under or over budget is the same. (Unfortunately, Parkinson's Law, which states that work expands to fill the time available, holds for software projects. So, this 50-percent view says we'll be on budget 50 percent of the time and over budget the other 50 percent.)
To estimate more accurately, you must
- understand, accept, and manage estimation's inherent uncertainties; and
- change the estimation terminology to include the uncertainties.
The size of the estimate's uncertainty depends on how close you are to the project's end, as Barry Boehm's classic chart - the cone of uncertainty - illustrated in 1981 (Boehm, Software Eng. Economics, Prentice-Hall, 1981). This chart represents the estimate accuracy at different phases in the software life cycle. During the project's initial feasibility phase, the estimate can be off by a factor of 4 in either direction. For example, if you estimate a project to be 200 staff months, it could take anywhere from 50 to 800 staff months. As you progress through the project's life cycle, the estimate accuracy increases, such that at the detailed design phase, it's a factor of approximately 1.3 rather than the initial 4.
Most practitioners agree that improved estimation practices and processes have changed the uncertainty scale. In addition, developers have a much greater tendency to underestimate (due to market and organizational pressures, optimism, and overconfidence) than to overestimate. Estimation is now typically closer to +100/-50 percent at the feasibility stage, but still not much better than +20/-10 percent at the detailed design stage.
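The arithmetic behind these ranges is worth making explicit, because a "factor of 4" range and a "+100/-50 percent" range are computed differently. The following sketch applies both styles to the 200-staff-month example; the function names are illustrative only.

```python
def factor_range(estimate: float, factor: float) -> tuple[float, float]:
    """Symmetric multiplicative range, as in the cone of uncertainty."""
    return estimate / factor, estimate * factor


def percent_range(estimate: float, over_pct: float, under_pct: float) -> tuple[float, float]:
    """Asymmetric range expressed as +over_pct / -under_pct percent."""
    return estimate * (1 - under_pct / 100), estimate * (1 + over_pct / 100)


estimate = 200  # staff months

# Cone of uncertainty: factor of 4 at feasibility, about 1.3 at detailed design.
print("feasibility (x4):       ", factor_range(estimate, 4))
print("detailed design (x1.3): ", factor_range(estimate, 1.3))

# Current practice as described above: +100/-50 and +20/-10 percent.
print("feasibility (+100/-50): ", percent_range(estimate, 100, 50))
print("detailed design (+20/-10):", percent_range(estimate, 20, 10))
```

Note that the percentage ranges are deliberately lopsided: they allow more room above the estimate than below it, reflecting the tendency to underestimate.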
Three types of uncertainties exist: statistical variance, known risks, and unknown risks. You'll always have statistical variance - the only question is its size. Your risk management plan should address the known risks. Unknown risks are obviously the toughest to handle. Organizations typically use experts to minimize the number of unknown risks. The Project Management Institute (PMI) recommends maintaining a 10-percent management contingency buffer in the absence of any other information.
It's also important to effectively communicate your estimates. Instead of giving a number, you should give ranges and views based on probabilistic distributions. The pX view means there is an X percent probability of not exceeding the estimate; thus, a p90 view suggests that the actual results will be less than or equal to the estimate 90 percent of the time (Jorgensen 2004).
Consider the difference in information between the following statements:
- My estimate is 12 staff months.
- My estimate is between 10 and 14 staff months.
- My p50 estimate is 12 staff months.
- I estimate that 95 percent of the time it will be between 10 and 14 staff months.
- I estimate that 98 percent of the time it will be less than 14 staff months.
The first statement is the easiest for others to deal with. It's precise, but you aren't communicating any of the uncertainties or risks. The other statements provide significantly more information and let others help manage both the risks and the uncertainties.
Experts can specify the ranges as part of the estimation process. Unfortunately, experiments have shown that experts tend to be consistently overconfident in their range estimations (M. Jorgensen, "Practical Guidelines for Expert-Judgment-based Software Effort Estimation," IEEE Software, vol. 22, no. 3, May/June 2005, pp. 57-63). The reasons suggested for their overconfidence include the need to appear confident and professional, belief in their abilities, and lack of immediate feedback. Experiments show that you can achieve more accurate ranges using predetermined organizational guidelines or probability distributions based on previous projects. I recommend following the engineering rules in Table 1 until you have your own organizational baseline. For example, if your current p50 estimate is 50 staff months and you're in the design phase, using these engineering rules will give you an expected range between 47 and 56 staff months.
Table 1. Estimation uncertainty engineering rules.

To use the distribution from previous projects to derive the X-percent effort, start with your best estimate (determined through other methods). For example, assume your estimate is 10 staff years and that of previous similar projects,
- 30 percent came in on budget,
- 10 percent came in under budget by 10 percent,
- 40 percent came in over budget by 50 percent, and
- 20 percent came in over budget by less than 50 percent.
Then your estimate percentages are:
- p10 estimate = 9 staff years,
- p40 estimate = 10 staff years, and
- p80 estimate = 15 staff years.
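The derivation above amounts to walking the historical outcomes from best to worst and accumulating their shares until you reach X percent. A minimal sketch follows; the 1.25 ratio used for the "over budget by less than 50 percent" group is an assumption (a midpoint), since the example gives only an upper bound for that group.

```python
def px_estimate(best_estimate: float, history: list[tuple[int, float]], x: float) -> float:
    """Return the pX estimate from a historical outcome distribution.

    history: (percent_of_projects, actual/estimate ratio) pairs that
    together cover 100 percent of past projects.
    """
    if not 0 < x <= 100:
        raise ValueError("x must be greater than 0 and at most 100")
    if sum(share for share, _ in history) != 100:
        raise ValueError("shares must total 100 percent")
    cumulative = 0
    # Walk outcomes from most favorable to least favorable.
    for share, ratio in sorted(history, key=lambda item: item[1]):
        cumulative += share
        if cumulative >= x:
            return best_estimate * ratio
    raise AssertionError("unreachable: shares total 100 percent")


HISTORY = [
    (10, 0.90),  # under budget by 10 percent
    (30, 1.00),  # on budget
    (20, 1.25),  # over by less than 50 percent (midpoint is an assumption)
    (40, 1.50),  # over budget by 50 percent
]

for x in (10, 40, 80):
    print(f"p{x} estimate = {px_estimate(10, HISTORY, x):g} staff years")
```

With a 10-staff-year best estimate, this reproduces the p10, p40, and p80 values listed above; the assumed midpoint affects only percentiles between p40 and p60.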
For agile methods, which embrace change, expect more variation. Todd Little published data for 120 delivered systems based on the agile methodology (T. Little, "Agility, Uncertainty, and Software Project Estimation," www.toddlittleweb.com/Papers/Agility,%20Uncertainty%20and%20Estimation.pdf). For these systems, he found that
- "estimation accuracy follows a log-normal distribution" (that is, it was underestimated far more frequently than overestimated),
- "initial estimates are targets with only a small chance of being met,"
- "the range between the target and an estimate with 90 percent confidence is about four times greater," and
- "this behavior and uncertainty range is nearly identical at all stages in the project lifecycle, in conflict with the 'cone of uncertainty.'"
So, to deal with estimates' uncertainties:
- change your terminology from one number to a pX number;
- use probability intervals to specify the range of estimates;
- use methods other than "gut feelings" for determining the range, because overconfidence is typical; and
- if you're using agile methods, especially if you're following the "responding to change over following a plan" principle, expect a low probability of meeting your initial estimates (perhaps 10 percent) and a large estimate uncertainty throughout your projects.
Managing the Effort Budget
A primary reason for projects coming in over budget is the confusion between the project target - that is, what the senior management or marketing wants the cost to be - and the actual estimate.
When asked to provide an estimate, you must determine whether you're being asked for your view or being told what the answer must be. Frequently, because of business realities, you're told the answer - that is, you're given the job's "effort budget." In this case, the same tools and methodologies still apply. But instead of using them to create an estimate, use them interactively to create a rational plan to reduce the effort to meet the budget. For example, you could reduce the number of adjusted use case points by reducing actors and scenarios, decreasing the technical factors, improving the environmental factors, or even reusing large chunks of code and classes. Infinite potential solutions exist, some feasible, some not.
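To see how such adjustments move the number, here is a sketch of the use case points calculation. It assumes Karner's commonly cited formulation (actor weights 1/2/3, use case weights 5/10/15, TCF = 0.6 + 0.01 x TFactor, ECF = 1.4 - 0.03 x EFactor) and a productivity of 20 staff hours per point; none of these constants come from this article, and the project figures are hypothetical.

```python
ACTOR_WEIGHTS = {"simple": 1, "average": 2, "complex": 3}
USE_CASE_WEIGHTS = {"simple": 5, "average": 10, "complex": 15}
HOURS_PER_POINT = 20  # assumed productivity; calibrate to your own data


def adjusted_use_case_points(actors: dict, use_cases: dict,
                             technical_factor: float,
                             environmental_factor: float) -> float:
    """Adjusted use case points under the assumed Karner formulation."""
    unadjusted = (
        sum(ACTOR_WEIGHTS[kind] * count for kind, count in actors.items())
        + sum(USE_CASE_WEIGHTS[kind] * count for kind, count in use_cases.items())
    )
    tcf = 0.6 + 0.01 * technical_factor      # technical complexity factor
    ecf = 1.4 - 0.03 * environmental_factor  # better environment lowers it
    return unadjusted * tcf * ecf


baseline = adjusted_use_case_points(
    actors={"simple": 2, "average": 2, "complex": 4},
    use_cases={"simple": 5, "average": 10, "complex": 5},
    technical_factor=40, environmental_factor=15,
)
# Replan: drop one complex actor and two complex use cases, and improve
# the environment (for example, a more experienced team).
replanned = adjusted_use_case_points(
    actors={"simple": 2, "average": 2, "complex": 3},
    use_cases={"simple": 5, "average": 10, "complex": 3},
    technical_factor=40, environmental_factor=20,
)
for label, points in (("baseline", baseline), ("replanned", replanned)):
    print(f"{label}: {points:.1f} points, {points * HOURS_PER_POINT:.0f} staff hours")
```

The point of the exercise is that each reduction in the estimate corresponds to a real change in scope or working conditions, not to a quietly altered parameter.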
Some people, when faced with an inadequate budget, try adjusting the estimation model's parameters without changing the work involved. Unless you made mistakes in the initial estimate, this will only create more overtime. Others plan on overtime, which is best reserved for contingency and risk management. Have a plan you believe you can meet without overtime. The third favorite technique is cutting features. This option should be the last resort, unless you can prove that the feature is truly unneeded or optional. Instead, use your creativity and problem-solving skills to simplify the project or increase productivity.
Estimating Early and Often
When should you estimate? And how much effort should you spend on it?
It depends on the project and the methodology. Effort estimation is usually overhead - you aren't producing a deliverable product. So, you need to ensure that the benefit exceeds the cost. If development costs or schedule don't matter, you don't need to estimate. Or if you're dealing with frequently changing requirements and specifications, you need to be able to give ballpark estimates quickly and efficiently, and expect to update them (quickly and efficiently) when the situation changes.
In some projects, estimates greatly impact the quality of the decision making. I'm sure you have horror stories like mine about approaches or projects selected based on estimates that were later found to be totally inaccurate (and low).
For traditional developments, you should estimate the job at least three times:
- macroestimation during the feasibility phase,
- detailed estimation during the requirements phase, and
- refined estimation during the design phase.
The accuracy an estimate needs is related to its purpose at a particular point in time. A cost estimate at a project's feasibility stage need only be accurate enough to support a decision to prepare a more detailed project definition. Detailed product and project definitions aren't available at this phase, so the estimate will be high level. The estimate at the requirements specification stage, however, is a critical project decision point and must be reasonably accurate. You're potentially committing significant resources based on the business case, which is based on the cost estimate. The required accuracy depends on the business model. An in-house project or consulting service might be "pay as you go," so the estimate is really about establishing credibility. For external product sales, or for fixed-cost projects, the required accuracy depends on when a final price will be set. Once the detailed design has been completed, and hopefully most of the unknowns have become known, you can create a reasonably accurate estimate and plan.
Because of the inherent uncertainties in all projects, you might need to re-estimate as you learn more. Whenever major unplanned events occur, you'll need to understand how they impact the schedule and effort. Don't ignore them and just hope it will all work out. It rarely will. At least roughly estimate the impacts to make sure your plan is still viable. By estimating early and often, you can take appropriate action to adjust the program and project plan when you find that the work is not aligned with the budget or the schedule.
Three Golden Rules of Estimation
Three golden rules will significantly improve your estimates and estimation processes.
- Require all estimates to be justified. Gut feeling is not an adequate justification.
- Don't use methods or tools blindly. Try estimating previous (completed) projects to validate and tune the methods.
- Educate your estimators. Knowing how to do something doesn't mean you know how long it will take. Train people in estimation. Accuracy is correlated with training and the ability to see results, not development experience.
Conclusion
Improving your estimation skills, processes, and how you communicate your estimations will help you avoid the difficulties inherent in estimation; however, your ability to estimate well will always be limited by the extent of your projects' uncertainty.
Acknowledgments: The ideas and conclusions in this article are the culmination of my experience and my readings of the works of others. I'd like to thank all those I referenced, as well as those I may have inadvertently missed, for which I apologize. I'd especially like to thank Magne Jorgensen and the folks at Simula Laboratories. Their work is fundamental to my conclusions and recommendations.
Linda M. Laird is the author of Software Measurement and Estimation: A Practical Approach, published recently by IEEE and Wiley. She is an adjunct professor at Stevens Institute of Technology, Hoboken, New Jersey. Contact her at linda_m_laird@msn.com.