Building a Quick Estimation Template When You Have Almost Nothing to Go On
The 4 PM Estimation Crisis
Last month, my PM dropped 73 backlog items on my desk at 4 PM. "Need estimates by tomorrow's planning meeting." Sound familiar?
Here's what I did: I grabbed the scariest item—a machine learning integration nobody on the team had touched before. I rated it:
- Innovation: 8 (we've never done ML)
- Scope: 7 (new infrastructure, data pipelines, APIs)
- People: 7 (data scientists, backend devs, DevOps, plus external vendors)
That's a sum of 22. With my multiplier of 6 for very-high complexity, I got 132, which maps to 144 story points and, just as important, a flag that the item should be split before anyone commits to it.
Took me 90 seconds. By 6 PM, I'd estimated all 73 items. Here's the framework I built and how you can adapt it for your team.
The Simplest Framework That Works
I ditched sophisticated algorithms. What I needed was something I could explain in 30 seconds and calculate in my head. I settled on rating each factor on a 1-10 scale - everyone understands 1-10.
Addition worked better than multiplication. It's intuitive, predictable, and gives a manageable range (3-30 instead of 1-1000). I can always adjust with a multiplier later based on complexity.
The Formula
(Innovation + Scope + People) × Multiplier = Estimate
Task → Rate (I+S+P) → Sum → Apply Multiplier → Map to Format
         (8+7+7)       22       ×6 = 132         → 144 points
No machine learning, no complex weightings. Just simple math that anyone can verify and adapt.
The Three Core Factors (Deep Dive)
Factor 1: Innovation
Definition: Have WE done this before as a team?
The Spectrum:
1-3 (Routine) We have templates, patterns, or completed similar work in the last 3 months
Example: Login form, CRUD endpoint we've built 10 times, standard bug fix
4-6 (Adaptation) We've done similar things but need to learn/adapt new elements
Example: Payment integration using a new provider similar to our last one, React component when we usually do Vue
7-10 (Pioneering) New territory requiring R&D, spikes, or proof-of-concept
Example: First ML model, blockchain integration, new architecture pattern the team has never implemented
Common Mistakes:
- ❌ Rating based on industry standards (GraphQL is common, but if YOUR team hasn't done it, it's high Innovation)
- ❌ Conflating "difficult" with "new" (migrating 10TB is hard but might be low Innovation if you've done it)
Calibration Question:
"If we started this tomorrow, how much would we be Googling vs. copy-pasting from our previous work?"
Factor 2: Scope
Definition: How big is the work itself (independent of who does it)?
The Spectrum:
1-3 (Tiny) Single file, one component, isolated change
Example: Button color change, add one validation rule, fix typo in API response
4-6 (Medium) Multiple components, several files, some integration
Example: New dashboard page with 3-4 widgets, API endpoint + frontend form, add caching layer
7-10 (Massive) System-wide changes, new infrastructure, touches many parts
Example: Migrate database, redesign auth system, build entire microservice, major refactor
Common Mistakes:
- ❌ Confusing Scope with People (a large data migration might only need one DBA - keep them separate)
- ❌ Underestimating "just UI" changes that require state management rewrites
Calibration Question:
"How many files/components/systems will we touch? Can this be done in isolation?"
Factor 3: People
Definition: How complex is the coordination and skill diversity?
The Spectrum:
1-3 (Solo) One person or single team with one skill set
Example: One backend dev doing API work, single frontend dev on UI, solo DBA query optimization
4-6 (Coordinated) 2-3 different skill sets, some handoffs, moderate sync needed
Example: Designer + frontend dev, backend + DevOps, API team + mobile team
7-10 (Orchestra) Many teams, diverse skills, external vendors, heavy coordination
Example: Frontend + backend + DBA + DevOps + security + external API vendor + QA specialists
Common Mistakes:
- ❌ Counting headcount instead of skill diversity (3 backend devs = low People, but backend + frontend + DBA = high)
- ❌ Ignoring external dependencies (vendors, third-party APIs, other company departments)
Calibration Question:
"How many different specialties are needed? How many handoffs or integration points exist?"
Why Progressive Multipliers? (Not Just Arbitrary Numbers)
Why not use a flat 3× multiplier across the board? Because complexity doesn't scale linearly.
A simple bug fix (sum of 5) might take 10 hours—double the base complexity. But a massive integration (sum of 22) doesn't take 44 hours. It explodes to 110+ hours because of:
- Context switching overhead between systems
- Integration testing across multiple platforms
- More stakeholders = more delays and approval cycles
- Unknown unknowns multiply with complexity
- Communication overhead grows exponentially
I arrived at these specific multipliers by analyzing 50 completed projects and calculating implied multipliers. Here's what the data showed:
| Complexity Level | Sum Range | Multiplier | Historical Average | Why This Works |
|---|---|---|---|---|
| Low | 3-9 | 2.5 | 2.3-2.7× | Small tasks stay manageable, minimal overhead |
| Medium | 10-15 | 4 | 3.8-4.3× | Moderate coordination, some integration complexity |
| High | 16-20 | 5 | 4.9-5.5× | Significant complexity, many moving parts |
| Very High | 21+ | 6 | 5.8-7.2× | Strong signal to split the task |
Data-Driven Calibration
Anything above sum of 20 showed wildly unpredictable multipliers (5.8× to 7.2×). Those tasks were consistently underestimated and should have been broken down. That's why 21+ is a red flag in my framework.
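The banding above reduces to a tiny lookup. A minimal sketch (the function name `multiplier_for` is mine, not from any library):

```python
def multiplier_for(total: int) -> float:
    """Map an Innovation + Scope + People sum to its progressive multiplier."""
    if total <= 9:      # Low: small tasks, minimal overhead
        return 2.5
    if total <= 15:     # Medium: moderate coordination
        return 4.0
    if total <= 20:     # High: many moving parts
        return 5.0
    return 6.0          # Very High: strong signal to split the task

# A sum of 22 lands in the "split this" band:
print(multiplier_for(22) * 22)  # 132.0
```

Keeping the thresholds in one function means recalibration later touches exactly one place.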
Mapping to Different Formats
The framework generates consistent relative values that map to any output format. Working in Agile? Map to Fibonacci. T-shirt sizes? Create bands. Need hours? Use a conversion factor (but keep it hidden from stakeholders).
The mapping can be a simple lookup table or nested IF statements. Let AI write these tedious formulas for you.
Fibonacci Mapping
Example conversion:
- Sum 3-6 → 3 points
- Sum 7-10 → 5 points
- Sum 11-14 → 8 points
- Sum 15-18 → 13 points
- Sum 19-22 → 21 points
T-Shirt Sizes
Example bands:
- XS: Sum 3-6
- S: Sum 7-9
- M: Sum 10-13
- L: Sum 14-17
- XL: Sum 18-21
- XXL: Sum 22+
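Both mappings are plain threshold lookups, so `bisect` handles each in a line. A sketch using the example bands above (the function names are mine):

```python
from bisect import bisect_left

# Fibonacci bands from above: sum 3-6 -> 3 pts, 7-10 -> 5, 11-14 -> 8, 15-18 -> 13, 19-22 -> 21
FIB_THRESHOLDS, FIB_POINTS = [6, 10, 14, 18], [3, 5, 8, 13, 21]
# T-shirt bands from above: 3-6 XS, 7-9 S, 10-13 M, 14-17 L, 18-21 XL, 22+ XXL
TEE_THRESHOLDS, TEE_SIZES = [6, 9, 13, 17, 21], ["XS", "S", "M", "L", "XL", "XXL"]

def to_fibonacci(total: int) -> int:
    """Map a factor sum to Fibonacci story points."""
    return FIB_POINTS[bisect_left(FIB_THRESHOLDS, total)]

def to_tshirt(total: int) -> str:
    """Map a factor sum to a T-shirt size."""
    return TEE_SIZES[bisect_left(TEE_THRESHOLDS, total)]

print(to_fibonacci(12), to_tshirt(12))  # 8 M
```

Swapping output formats is then just a different threshold/value pair, with the rating logic untouched.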
Getting Your Team On Board
Introducing a new estimation framework requires buy-in. Here's a practical 3-week adoption roadmap:
Week 1: Run it solo
- Estimate next sprint yourself using the framework
- Track actuals religiously (time spent, unexpected issues)
- Calculate variance for each item
- Document what worked and what felt off
Goal: Prove to yourself it works before asking others to try it
Week 2: Share your results
- Show the team: "Here's what I estimated vs. actuals"
- Walk through 3-4 examples of HOW you rated them
- Ask: "What would you have rated differently?"
- Listen to disagreements—this reveals hidden complexity
Goal: Start the conversation about what drives complexity
Week 3: Estimate together
- Have the team rate 5 items independently
- Compare ratings (expect variance of ±2 points)
- Discuss differences—this is where definitions get refined
- Average the ratings or hold a brief debate
- Calculate final estimates together
Goal: Build shared understanding of Innovation, Scope, and People
Handling Pushback:
🗨️ "This is too simple"
→ "Try it for one sprint. Track accuracy. If it works, keep using it. If not, we'll adjust."
🗨️ "My tasks are too unique"
→ "Then rate them as high Innovation. The framework accommodates uniqueness—that's why we have a 1-10 scale."
🗨️ "People will game the system"
→ "If someone consistently inflates ratings and actuals don't match, the calibration will expose it. Transparency is the defense."
🗨️ "We already use Planning Poker"
→ "Great! Use this for initial estimates, then validate with Planning Poker. Or use this when you don't have time for the full ceremony."
Calibrate with Your Team's History
When you have even 5-10 completed items, work backwards:
- Take a completed item
- Rate it with your Innovation, Scope, and People factors
- Calculate what multiplier would have given you the actual result
- Average across several items
- That's your team's multiplier
Calibration Formula
Implied Multiplier = Actual Effort ÷ (Innovation + Scope + People)
Average this across multiple completed items to get your team's baseline multiplier.
No history? Start with a multiplier of 3 and adjust after your first few completions. The exact number matters less than having a consistent method.
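The backwards-calibration steps above fit in a few lines of Python. A sketch with made-up history tuples (the numbers are illustrative, not real project data):

```python
from statistics import mean

def baseline_multiplier(history):
    """history: (innovation, scope, people, actual_effort) per completed item.
    Returns the average implied multiplier, i.e. your team's baseline."""
    return mean(actual / (i + s + p) for i, s, p, actual in history)

# Hypothetical completed items: (I, S, P, actual story points)
history = [(4, 5, 3, 55), (2, 2, 1, 12), (8, 7, 7, 132)]
print(round(baseline_multiplier(history), 1))  # 4.3
```

With only a handful of items the mean is noisy, so treat the result as a starting point and keep refining as more actuals come in.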
When This Framework Failed Me
Six months in, I was cocky. I'd calibrated my multiplier to 4.2 and was hitting estimates within 10%.
Then came the "simple" mobile app redesign. I rated it:
- Innovation: 3 (we'd done mobile before)
- Scope: 5 (just UI changes)
- People: 2 (one frontend team)
Sum of 10, multiplier 4, estimate of 40 hours. Should've been a week.
It took 6 weeks.
What I Missed:
The Innovation score ignored that our mobile dev had left. The new dev hadn't worked in our codebase. That should've been a 7, not a 3.
The "just UI changes" actually required rewriting the entire state management. Should've been Scope 8, not 5.
The single "frontend team" was actually designer + junior dev + external consultant, all needing coordination. Should've been People 5.
The Critical Lesson
Don't rate based on the task description. Rate based on YOUR TEAM'S current reality. The framework is only as good as your honesty when rating. If your expert left, adjust Innovation. If "simple" hides complexity, adjust Scope. If coordination is messy, adjust People.
Focus on Defendable, Not Perfect
The goal isn't perfect estimates—it's estimates you can explain and adjust. When someone challenges your numbers, you can show:
- The specific Innovation, Scope, and People ratings
- Why you rated each factor that way
- How changing any rating affects the estimate
- Which completed work validates your approach
This transparency builds trust even when estimates are wrong.
Python Implementation
Here's a complete working implementation you can adapt:
```python
def estimate_task(innovation, scope, people):
    """
    Calculate a task estimate using the Innovation, Scope, People framework.

    Args:
        innovation (int): 1-10, have we done this before?
        scope (int): 1-10, how big is the work?
        people (int): 1-10, coordination complexity?

    Returns:
        dict: sum, multiplier, complexity, raw_score, and fibonacci_points
    """
    sum_factors = innovation + scope + people

    # Apply progressive multipliers
    if sum_factors <= 9:
        multiplier = 2.5
        complexity = "Low"
    elif sum_factors <= 15:
        multiplier = 4
        complexity = "Medium"
    elif sum_factors <= 20:
        multiplier = 5
        complexity = "High"
    else:
        multiplier = 6
        complexity = "Very High - Consider Splitting"

    raw_score = sum_factors * multiplier

    # Map the raw score to the nearest Fibonacci number
    fibonacci = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
    fibonacci_points = min(fibonacci, key=lambda x: abs(x - raw_score))

    return {
        "sum": sum_factors,
        "multiplier": multiplier,
        "complexity": complexity,
        "raw_score": raw_score,
        "fibonacci_points": fibonacci_points,
    }

# Example usage:
result = estimate_task(innovation=8, scope=7, people=7)
print(f"ML Integration: {result['fibonacci_points']} points")
# Output: ML Integration: 144 points (and a "Consider Splitting" complexity flag)
```
How This Compares to Other Methods
| Method | Setup Time | Estimation Speed | Team Buy-in | Accuracy | Best For |
|---|---|---|---|---|---|
| Planning Poker | 30 min | 5 min/item | High | Medium | Small backlogs, team building |
| This Framework | 10 min | 90 sec/item | Medium | Medium-High | Large backlogs, rapid estimation |
| Expert Judgment | None | Varies | Low | Low-Medium | Quick guesses only |
| Story Points (Gut) | 5 min | 2 min/item | Low | Low | When nothing else works |
| Historical Analysis | 2+ hours | 10 min/item | Medium | High | Similar, well-documented work |
Key insight: This framework sits in the sweet spot—faster than Planning Poker, more structured than gut feel, and doesn't require extensive historical data.
Build in Automatic Reality Checks
Add a column for "actual effort" and calculate variance automatically. Use conditional formatting to highlight when estimates are off by more than 50%.
Track patterns:
- Are innovative tasks always underestimated?
- Does your team consistently underrate People complexity?
- Do certain types of work always blow up?
These patterns help you adjust ratings going forward, not just multipliers. The framework improves as you learn your team's blindspots.
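A reality-check pass over tracked actuals needs nothing beyond the standard library. A sketch with illustrative numbers (the 50% threshold mirrors the conditional-formatting rule above):

```python
# Each record: (task, estimated points, actual points) -- illustrative, not real data
tracked = [
    ("Bug fix", 8, 7),
    ("Portal feature", 55, 60),
    ("Mobile redesign", 40, 240),  # the six-week surprise
]

for task, estimate, actual in tracked:
    variance = (actual - estimate) / estimate
    flag = "  <-- which factor did we underrate?" if abs(variance) > 0.5 else ""
    print(f"{task}: {variance:+.1%}{flag}")
```

Grouping flagged items by their original Innovation, Scope, and People ratings is what surfaces the systematic biases, not the individual misses.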
Keep Cognitive Load Low
Spending more than 2 minutes per estimate? Your framework is too complex. The power isn't in sophistication—it's in consistency and speed. You should estimate 50 items in under an hour.
Requirements:
- Clear definitions for Innovation, Scope, and People
- Reference examples for common patterns
- No second-guessing or over-analysis
- Trust the framework even when it feels wrong initially
Use AI to Handle Tedious Parts
AI assistants excel at:
- Generating Excel formulas for your ISP logic
- Creating validation rules and conditional formatting
- Building reference tables and lookup functions
- Writing calibration calculations
- Producing multiple format outputs from the same data
Don't waste time on formula syntax. Describe what you want and let AI write the implementation. See the LLM Prompts section below for specific examples.
Embrace Continuous Refinement
Your first version will be wrong. The framework gives you something to be wrong WITH, which is infinitely better than being wrong without structure.
After each sprint/milestone/project:
- Compare estimates to actuals
- Look for systematic bias in Innovation, Scope, or People ratings
- Adjust either rating definitions or multipliers
- Document what you learned
Within 3-4 cycles, your accuracy improves dramatically. The framework evolves with your team's reality.
The Psychology Matters
Having a framework changes the conversation from "how did you guess that?" to "let's discuss these ratings." It moves you from defending random numbers to discussing specific complexity factors.
Stakeholders can engage with "I rated People complexity as 7 because we need frontend, backend, DBA, and DevOps coordination" in a way they can't with "my gut says 3 weeks."
Transparency is a Feature, Not a Bug
Be upfront that this is a rapid estimation tool based on simple math. Don't pretend it's more sophisticated than it is. When presenting estimates, show the Innovation + Scope + People ratings openly and explain it takes 30 seconds per item. Honesty builds credibility.
LLM Prompts to Get You Started
Getting started with an AI assistant to build your framework? Here are proven prompts that will generate exactly what you need:
"I need to estimate 50 IT tasks with minimal information. Create an Excel formula that:
- Takes 3 factors rated 1-10 (Innovation, Scope, People)
- Innovation = have we done this before? (1=many times, 10=never)
- Scope = how big is it? (1=tiny, 10=massive)
- People = coordination complexity, teams, skills needed (1=single person, 10=many teams)
- Adds them together
- Applies progressive multipliers based on the sum
- Maps the result to Fibonacci numbers (1,2,3,5,8,13,21,34,55,89)"
"Create a Python script using openpyxl that generates an Excel estimation template with:
- Headers for task name, category, Innovation (have we done this?), Scope (how big?), People (teams/skills needed)
- All three factors rated 1-10
- Data validation limiting ratings to 1-10
- Automatic calculation: (Innovation + Scope + People) × Progressive Multiplier
- Conditional formatting for variance over 50%
- A calibration sheet that calculates suggested multipliers from historical data"
"Create a reference table showing typical Innovation, Scope, and People ratings for common IT tasks:
- Simple bug fix (done many times, small scope, one developer)
- New API endpoint (done before but needs adaptation, medium scope, backend team)
- Database migration (somewhat new, large scope, multiple teams)
- Full microservice (never done, large scope, many teams and skills)
- UI dashboard (done similar, medium scope, designer + frontend)
Include suggested ratings for each factor"
"Write an Excel formula that:
- Takes completed story points in column A
- Takes Innovation rating in column B
- Takes Scope rating in column C
- Takes People rating in column D
- Calculates the sum in column E
- Calculates the implied multiplier in column F
- Provides an average multiplier recommendation at the bottom"
Common Mistakes to Avoid
- ❌ Over-complicating the factors: Don't split Innovation into "technical innovation" and "business innovation." Don't divide People into "internal teams" and "external vendors." Keep it simple: Have we done this? How big? How many people/skills involved?
- ❌ Conflating Scope with People: Scope is about the size of the work itself. People is about coordination complexity. A large data migration (Scope=8) might only need one DBA (People=2). Keep them separate.
- ❌ Rating Innovation against the industry: Innovation means "have WE done this before?" - not whether it exists in the world. If your team has never built a REST API, that's high Innovation for you, even though it's standard in the industry.
- ❌ Multiplying instead of adding: Multiplying Innovation × Scope × People gives you a range of 1-1000. That's too wide. Addition (Innovation + Scope + People) gives you 3-30, which is much more manageable.
- ❌ Counting heads instead of skills: People complexity isn't just about headcount. Three developers with the same skills = low People complexity. Frontend + backend + DBA + DevOps = high People complexity, even if it's still just four people.
Real Calculation Examples
Let's walk through exactly how the math works with concrete numbers:
Example 1: Simple Bug Fix
- Innovation: 2 (we fix similar bugs weekly)
- Scope: 2 (single component affected)
- People: 1 (one developer, no coordination)
- Sum: 5
- Multiplier: ×2.5 (low complexity)
- Raw Score: 12.5
- Result: → 13 points
Example 2: Customer Portal Feature
- Innovation: 5 (done similar, has new elements)
- Scope: 6 (multiple screens, DB, API)
- People: 4 (frontend, backend, UX, PO)
- Sum: 15
- Multiplier: ×4 (medium-high)
- Raw Score: 60
- Result: → 55 points
Example 3: ML Integration
- Innovation: 8 (never done ML)
- Scope: 7 (new infra, pipeline, API)
- People: 7 (data scientists, backend, DevOps, vendors)
- Sum: 22
- Multiplier: ×6 (very high complexity)
- Raw Score: 132
- Result: → 144 points (consider splitting)
Example 4: OAuth Implementation (Calibrated)
- Innovation: 8 (new OAuth pattern)
- Scope: 5 (touches auth everywhere)
- People: 6 (frontend, backend, security, PM)
- Sum: 19
- Multiplier: ×4.7 (calibrated from actuals)
- Raw Score: 89.3
- Actual: 89 points ✓
Calibration Example
You estimated a task using a flat starter multiplier of 3 (no calibration yet):
- Innovation: 4, Scope: 5, People: 3
- Sum: 12, Multiplier: 3, Estimate: 36 → 34 points
It actually took 55 points.
Implied multiplier:
55÷12 = 4.6
This tells you to increase your multiplier for similar complexity from 3 to about 4.5.
Quick Reference Card (Print This!)
Here's everything you need on one page:
═══════════════════════════════════════════════════════════════
QUICK ESTIMATION FRAMEWORK
═══════════════════════════════════════════════════════════════
STEP 1: Rate Each Factor (1-10)
─────────────────────────────────────────────────────────────
Innovation: Have WE done this?
1-3 = Many times (templates exist)
4-6 = Similar work (need adaptation)
7-10 = Never attempted (R&D required)
Scope: How big is it?
1-3 = Small (one component)
4-6 = Medium (multiple components)
7-10 = Massive (system-wide changes)
People: Coordination complexity?
1-3 = Solo or single team
4-6 = 2-3 teams, some handoffs
7-10 = Many teams, diverse skills
STEP 2: Add Them Up
─────────────────────────────────────────────────────────────
Sum = Innovation + Scope + People
STEP 3: Apply Multiplier Based on Sum
─────────────────────────────────────────────────────────────
Sum 3-9: × 2.5 (Low complexity)
Sum 10-15: × 4 (Medium complexity)
Sum 16-20: × 5 (High complexity)
Sum 21+: × 6 (Flag for splitting!)
STEP 4: Map to Your Format
─────────────────────────────────────────────────────────────
Fibonacci: 3-6→3, 7-10→5, 11-14→8, 15-18→13, 19-22→21
T-Shirt: 3-6→XS, 7-9→S, 10-13→M, 14-17→L, 18-21→XL, 22+→XXL
RED FLAGS
─────────────────────────────────────────────────────────────
⚠ Sum over 20? Split the task
⚠ Actual 2× estimate? Check which factor you underrated
⚠ Consistently off? Adjust your multiplier
CALIBRATION
─────────────────────────────────────────────────────────────
Implied Multiplier = Actual Effort ÷ (I + S + P)
Average across 5-10 tasks → Your team's multiplier
EXAMPLE
─────────────────────────────────────────────────────────────
ML Integration: I=8, S=7, P=7
Sum = 22, Multiplier = 6
Raw = 132 → Maps to 144 story points (flag for splitting)
═══════════════════════════════════════════════════════════════