DevSpark in Practice: A NuGet Package Case Study
The DevSpark series describes the methodology. This article shows it. Four consecutive feature specifications on WebSpark.HttpClientUtility — a production .NET NuGet package — cover a documentation site, compiler warning cleanup, a package split, and a new batch execution feature. Each spec illuminated something different about what spec-driven development costs, what it saves, and what it preserves.
The Problem with Methodology Articles
The rest of this series describes the methodology. What it is, why it matters, how it evolved, what it costs, what it saves. That's the right place to start. But describing a process and showing it are two different things.
This article shows it.
WebSpark.HttpClientUtility is a production .NET NuGet package I maintain — utilities for managing HttpClient with resilience, caching, telemetry, concurrent execution, and web crawling. It has real consumers, runs against multiple target frameworks (.NET 8, 9, and 10), and has years of accumulated decisions baked into it. A genuinely brownfield codebase, not a demo project.
Over the past several months, I applied DevSpark to four consecutive features in this package. Each spec is in the repository under /specs/. Each solved a different type of problem — a documentation site, a quality cleanup, an architectural split, and a new feature. Each one illuminated something different about what the methodology costs, what it saves, and what it preserves.
This is what it actually looked like.
The Codebase Before Specs
Before the first spec was written, HttpClientUtility was a working, tested, useful library with a real problem: its institutional knowledge lived entirely in my head.
Why did the FireAndForget implementation use a specific exception-handling pattern? Commit history told you what changed, not why. Why did the resilience policies distinguish transient from non-transient errors in that particular way? There was a reason — a production incident that taught it — but it was undocumented.
This is the standard state for solo-maintained open-source libraries. The author knows everything. The documentation covers the happy path. The edge cases and architectural decisions that shaped the code exist nowhere except memory that fades.
The four specs changed that.
Spec 001: The Documentation Site
Type of problem: Greenfield within brownfield — a new artifact for an existing project
The initial requirement was approximately: "Build a documentation website for the NuGet package." That's a vague enough prompt that an AI assistant would generate something functional and wrong — a generic static site that breaks immediately on GitHub Pages because subdirectory deployment requires relative paths, and "standard" static site configurations use absolute paths.
The spec forced a different conversation first.
By the time the clarification loop was done, the requirement had become specific: Eleventy 3.0 (over Hugo, Jekyll, and Astro — with the evaluation criteria documented), GitHub Pages deployment at a specific subdirectory path, all paths relative and dynamically calculated, NuGet API integration with cache fallback for build resilience, Prism.js with exactly four target languages (C#, JavaScript, JSON, PowerShell — no kitchen sink), and performance targets of 90+ Lighthouse scores across all categories.
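The cache-fallback requirement is worth making concrete. The site itself builds with Eleventy, but the pattern translates directly into C# terms. A minimal sketch — the endpoint is NuGet's public registration API, while the cache path and names are assumptions for illustration, not the site's actual build code:

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

static class NuGetMetadataFetcher
{
    // Assumed cache location; the real build keeps its own.
    private const string CachePath = "cache/nuget-metadata.json";

    public static async Task<string> GetMetadataAsync(HttpClient http, string packageId)
    {
        // NuGet's public registration endpoint expects a lower-case package id.
        string url = $"https://api.nuget.org/v3/registration5-semver1/{packageId.ToLowerInvariant()}/index.json";
        try
        {
            string json = await http.GetStringAsync(url);
            Directory.CreateDirectory(Path.GetDirectoryName(CachePath)!);
            await File.WriteAllTextAsync(CachePath, json); // refresh the cache on every successful fetch
            return json;
        }
        catch (HttpRequestException) when (File.Exists(CachePath))
        {
            // API unavailable: build from the last known-good response instead of failing.
            return await File.ReadAllTextAsync(CachePath);
        }
    }
}
```

The point of the pattern is that a NuGet.org outage degrades the build to slightly stale metadata rather than breaking it.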
The specification document alone ran to over 37KB. That sounds like a lot until you consider what it replaced: multiple iterations of generate-test-debug-regenerate trying to make a static site deploy correctly to a subdirectory.
What survived in the spec artifacts that wouldn't have survived otherwise: the rationale for choosing Eleventy over the alternatives. When I revisit this in a year, I won't have to reconstruct why Astro was considered and rejected, or why Jekyll was ruled out. The decision is documented where the decision lives.
What the spec caught: The path resolution requirement — every path must be relative, calculated dynamically — would have been invisible until the first deployment attempt. Catching it at spec time cost an hour of thinking. Catching it at deployment time would have cost debugging time plus the confidence tax of watching generated code fail in production.
Spec 002: Zero Compiler Warnings
Type of problem: Quality remediation — cleaning up what had accumulated
"Fix the compiler warnings" is a deceptively vague requirement. It sounds obvious. What it doesn't specify: what constitutes a proper fix. The obvious answer — #pragma warning disable — silences the warning. The right answer depends on a quality standard that has to be stated explicitly, not assumed.
The spec for this feature had to answer eight clarification questions before planning could begin. Among them:
- What counts as an acceptable fix for a nullable reference type warning?
- Are `#pragma` suppressions ever acceptable, and if so, under what conditions and with what documentation?
- Do test methods require XML documentation? If so, what level of detail?
- What is the sequencing when one warning fix reveals another?
These aren't edge cases. They're the substance of the work. Without explicit answers, an AI assistant will make reasonable-sounding choices that may or may not match your actual standards — and you won't know until you're reviewing code.
The spec answered them with specificity. The fix strategy: `ArgumentNullException.ThrowIfNull()` for null guards (not conditional checks), full XML documentation on all public APIs (including test methods that document what and why, not just what), `TreatWarningsAsErrors` enabled in `Directory.Build.props`, and suppressions limited to a maximum of five with documented justification for each.
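As a minimal sketch of what that standard looks like in code — the class and method names here are illustrative, not the package's actual API:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class HttpClientExtensions
{
    /// <summary>
    /// Sends <paramref name="request"/> and returns the response body as a string.
    /// </summary>
    /// <param name="client">The client used to send the request.</param>
    /// <param name="request">The request message to send.</param>
    /// <returns>The response body, after a success status check.</returns>
    /// <exception cref="ArgumentNullException">
    /// Thrown when <paramref name="client"/> or <paramref name="request"/> is null.
    /// </exception>
    public static async Task<string> SendForStringAsync(this HttpClient client, HttpRequestMessage request)
    {
        // Spec 002's fix strategy: BCL guard clauses, not conditional checks,
        // and not a #pragma that merely hides the nullable warning.
        ArgumentNullException.ThrowIfNull(client);
        ArgumentNullException.ThrowIfNull(request);

        using HttpResponseMessage response = await client.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return await response.Content.ReadAsStringAsync();
    }
}
```

With `TreatWarningsAsErrors` set in `Directory.Build.props`, code that falls short of this pattern fails the build instead of accumulating as warnings.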
The result — a zero-warning build across net8.0 and net9.0, with 711 tests passing — reflected those standards exactly. More importantly: the standards are now in the spec artifacts. The next feature that touches public APIs inherits the documentation requirement. The quality bar compounds.
What the spec caught: The instinct to silence warnings rather than fix them. That instinct is fast and feels productive. The spec created a gate before that shortcut could be taken — not by prohibiting it, but by forcing explicit articulation of the alternative. When you have to write down "no pragma suppressions except with documented justification," it changes what feels like a reasonable default.
Spec 003: Splitting the Package
Type of problem: Architectural — changes with real consequences for real consumers
This is where spec-driven development most clearly earned back its overhead.
The core question: WebSpark.HttpClientUtility had grown into a monolithic package. The SiteCrawler and related web crawling components pulled in five additional specialized dependencies that core HTTP utility users didn't need. Every consumer was carrying those dependencies regardless of whether they ever crawled a site.
The solution — split into a base package and a crawler extension package — was obvious in principle and genuinely complex in execution. Architectural splits affect consumers. Breaking changes in NuGet packages are painful for everyone downstream. Package versioning, strong-name signing, GitHub Actions atomic publishing, and consumer migration all had to be designed before the first line changed.
The spec surfaced the decisions that had to be made:
- Lockstep versioning: Both packages would always release at the same version number, eliminating the consumer confusion of mismatched versions
- Zero breaking changes for core users: Consumers who never imported crawling features would upgrade without a single code change
- Explicit breaking change communication for crawler users: They would need to add the new package and register the crawler service separately — documented in the migration guide before implementation began (see the registration sketch after this list)
- Atomic publishing: The GitHub Actions workflow would publish both packages in a single atomic operation or roll back both, eliminating the split-brain state where a consumer could install mismatched versions
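To make the consumer impact concrete, here is a hypothetical sketch of the post-split registration surface. The extension method names are placeholders, not the package's confirmed API — the migration guide in the repository has the real ones — and the stubs exist only so the sketch compiles:

```csharp
using Microsoft.Extensions.DependencyInjection;

public static class CoreRegistrationStub     // stands in for the base package
{
    public static IServiceCollection AddHttpClientUtility(this IServiceCollection s) => s;
}

public static class CrawlerRegistrationStub  // stands in for the crawler extension package
{
    public static IServiceCollection AddHttpClientCrawler(this IServiceCollection s) => s;
}

public static class MigrationSketch
{
    public static void Configure(IServiceCollection services)
    {
        services.AddHttpClientUtility();   // core consumers: unchanged by the split

        services.AddHttpClientCrawler();   // crawler consumers: one added call,
                                           // plus the new package reference
    }
}
```

Core consumers compile and run with zero changes; crawler consumers make one explicit, visible addition — which is exactly the breaking-change boundary the spec defined.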
The implementation-complete note records the outcome: base package at 9 dependencies (reduced from 13), crawler package with 5 additional specialized dependencies, a package size reduction of over 40%, and zero unexpected breaking changes.
What would have happened without a spec? The split would have happened — the problem was real enough that it couldn't be ignored. But the decisions about versioning, breaking change communication, and atomic publishing would have been made ad hoc during implementation, under pressure, when the cost of getting them wrong was highest. The spec moved those decisions to the cheapest possible point: before any code changed.
What the spec caught: The atomic publishing requirement. It's not obvious until you're staring at a scenario where package A is published and package B fails — and you've just shipped a state to NuGet.org that consumers can encounter. The spec forced thinking through the failure modes before they could occur.
| Decision | Where It Was Made | Cost of Revising It There |
|---|---|---|
| Lockstep versioning | Spec time | Documentation update |
| Zero breaking changes for core | Spec time | Spec revision |
| Atomic publish or rollback | Spec time | GitHub Actions configuration |
| Migration guide for crawler users | Spec time | Writing the guide |
| Any of the above | Post-release | Consumer complaints, hotfix, comms, apology |
Spec 004: Batch Execution Orchestration
Type of problem: New feature — capability that didn't exist before
The batch execution spec is the most recent, and the most instructive about how specs evolve during implementation.
The feature: template-based batch HTTP request execution with combinatorial parameter substitution, configurable concurrency, and percentile statistics collection (P50, P95, P99). Think load-testing tooling that runs inside the library itself — useful for developers characterizing API behavior.
The spec captured seven design decisions that would have been made silently otherwise:
- The orchestrator is independent from the existing decorator chain (not an extension of the resilience or caching layers, but a separate service above them)
- Template substitution uses `{placeholder}` tokens with a special `{{encoded_user_name}}` token for URL-safe encoding — a single-pass renderer with no recursive processing (sketched after this list)
- Statistics use snapshot-based percentile calculation — samples collected into a sorted array at completion rather than streaming approximations (also sketched below)
- Hashing uses SHA-256 from the BCL (no external dependency added for response body comparison)
- Progress reporting uses REST-friendly polling for the demo UI, not a SignalR dependency — keeping the core library free of real-time infrastructure concerns
- The batch runner is opt-in via `HttpClientUtilityOptions`, not registered by default
- Demo runs are capped at 50 requests to prevent the sample application from being used as an unintentional load-testing tool
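The single-pass substitution decision is small enough to sketch. Assuming `{name}` inserts a raw value and `{{name}}` inserts the URL-encoded value, a minimal renderer might look like this — an illustration of the rule, not the package's implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class TemplateRenderer
{
    // One left-to-right scan; substituted values are never re-scanned,
    // so a value containing "{" cannot trigger recursive expansion.
    public static string Render(string template, IReadOnlyDictionary<string, string> values)
    {
        ArgumentNullException.ThrowIfNull(template);
        ArgumentNullException.ThrowIfNull(values);

        var sb = new StringBuilder(template.Length);
        int i = 0;
        while (i < template.Length)
        {
            if (template[i] != '{') { sb.Append(template[i++]); continue; }

            bool encoded = i + 1 < template.Length && template[i + 1] == '{';
            int start = i + (encoded ? 2 : 1);
            int end = encoded
                ? template.IndexOf("}}", start, StringComparison.Ordinal)
                : template.IndexOf('}', start);
            if (end < 0) { sb.Append(template[i++]); continue; }  // unterminated brace: emit literally

            string key = template[start..end];
            int next = end + (encoded ? 2 : 1);
            if (values.TryGetValue(key, out var value))
                sb.Append(encoded ? Uri.EscapeDataString(value) : value);
            else
                sb.Append(template, i, next - i);                 // unknown token kept verbatim
            i = next;
        }
        return sb.ToString();
    }
}
```

So `Render("/users/{{encoded_user_name}}?page={page}", ...)` encodes the user name for the URL, substitutes the page number raw, and never re-processes its own output.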
Each of these decisions has a reason. The spec recorded the reasons at the moment they were made.
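Two of those reasons fit in a few lines of illustrative code: the snapshot percentile model and the BCL-only hashing decision. Again a sketch under the same caveat — this shows the approach, not the shipped implementation:

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

static class BatchStatistics
{
    // Snapshot model: collect every sample, sort once at batch completion,
    // then read percentiles by rank — no streaming approximation.
    public static double Percentile(IReadOnlyList<double> samplesMs, double p)
    {
        ArgumentNullException.ThrowIfNull(samplesMs);
        if (samplesMs.Count == 0)
            throw new InvalidOperationException("No samples collected.");

        var sorted = new List<double>(samplesMs);
        sorted.Sort();

        // Nearest-rank method: for P95 over 100 samples, rank 95 -> index 94.
        int rank = (int)Math.Ceiling(p / 100.0 * sorted.Count);
        return sorted[Math.Clamp(rank - 1, 0, sorted.Count - 1)];
    }

    // Response-body comparison with the BCL's SHA-256 — the "no new dependency"
    // decision means no external hashing package gets pulled in.
    public static string BodyFingerprint(byte[] body) =>
        Convert.ToHexString(SHA256.HashData(body));
}

// Usage: var p95 = BatchStatistics.Percentile(durations, 95);
//        var p99 = BatchStatistics.Percentile(durations, 99);
```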
The implementation deviated from the original plan in one significant area: the SignalR progress reporting, initially scoped as a core feature, was moved to the demo application only. That deviation was fed back into the spec artifacts — the implementation notes document the why. Future work on the batch execution feature inherits that boundary without having to rediscover it.
What the spec caught: The scope creep tendency. Batch execution is adjacent to load testing, which is adjacent to reporting, which is adjacent to dashboards. Without a spec defining the boundaries explicitly, each of those adjacencies is an invitation to expand scope. The spec's explicit opt-in registration requirement was the clearest signal that this was a focused utility feature, not a platform.
What Four Specs Built
Looking across the four specifications, something more durable than the features themselves accumulated: a set of documented decisions that form the beginning of a real constitution for this package.
Not a formal constitution yet. But the raw material:
- Path resolution standard: All links and asset references must be relative, calculated dynamically. No environment-specific configuration in templates. (From Spec 001)
- Documentation standard: All public APIs require XML documentation. Test methods require documentation explaining what and why, not just what. (From Spec 002)
- Warning standard: Zero warnings in all target frameworks. `TreatWarningsAsErrors` enabled. Suppressions require documented justification with a maximum of five. (From Spec 002)
- Package boundaries: Core utilities ship without crawling dependencies. Crawler extensions ship separately. Both packages version in lockstep. (From Spec 003)
- Dependency discipline: No external dependencies added for functionality available in the BCL. New dependencies require explicit justification in the spec. (From Spec 004)
- Scope control: New features default to opt-in registration. Demo-specific functionality stays in the demo project, not the library. (From Spec 004)
These aren't aspirational principles. They're documented decisions that shaped real code — with the rationale preserved in the spec artifacts. The next contributor who opens this repository won't have to rediscover them. The next AI session that reads the specs won't assume generic patterns where specific decisions already exist.
That's what four specs built, beyond the four features.
The Honest Accounting
The overhead is real. Each spec took several hours to write and refine before any implementation began. The documentation site spec alone runs to 37KB before the plan or tasks — that's a substantial investment for a feature that a competent developer could prototype in an afternoon.
But the accounting has to include both sides:
| Feature | What the Spec Prevented | Estimated Cost of Discovery Without Spec |
|---|---|---|
| Documentation site | Wrong deployment path strategy | 2-4 hours debugging post-deployment |
| Compiler warnings | Suppression-first instinct | Quality debt that compounds silently |
| Package split | Ad-hoc versioning decisions under pressure | Consumer-facing breaking changes, hotfix cycle |
| Batch execution | Scope drift into load-testing platform | Weeks of scope creep, architectural coupling |
The documentation site spec paid for itself the first time the GitHub Pages deployment succeeded without iteration. The package split spec paid for itself by moving the consumer-impact decisions to the cheapest possible moment.
The compiler warnings spec paid for itself differently — not in prevented debugging, but in accumulated quality standards that compound across every subsequent feature. That's the payoff that's hardest to measure but most durable.
Resources
The full specification artifacts for all four features are available in the WebSpark.HttpClientUtility repository — including spec, research, data model, plan, tasks, and quickstart documents for each feature. Reading the actual artifacts alongside this article shows what spec-driven development produces at each stage of the pipeline.
Explore More
- DevSpark: Months Later, Lessons Learned -- What months of real spec-driven development actually taught me about AI
- Why I Built DevSpark -- Building the tool I needed to survive the reality of brownfield development
- DevSpark: Constitution-Driven AI for Software Development -- DevSpark aligns AI coding agents with project architecture and governance
- Taking DevSpark to the Next Level -- A practitioner's guide to bridging three decades of enterprise development
- DevSpark: The Evolution of AI-Assisted Software Development -- From requirements discipline to continuous governance — a complete framework
