The Invisible Tax: Why AI Made Engineering Faster and Measurement Harder

A reading of Harness’s State of Engineering Excellence 2026, with a few ideas about what to do next.

There is a strange thing happening inside software engineering organizations right now. Productivity dashboards are green. Cycle times are shrinking. And yet, walk over to the developers and you’ll hear a different story: longer reviews, more debugging of code they didn’t write, and a creeping sense that the work has shifted and the time it now takes is not accounted for anywhere.

Harness’s State of Engineering Excellence 2026, based on a survey of 700 practitioners and managers across five countries, puts numbers to the dissonance.

The paradox
89% of engineering leaders say productivity metrics have improved since deploying AI. 81% say code review time has increased. Both can be true. AI generates more code; cycle times shorten. But the cost shows up downstream, in the review queue.

Roughly 31% of a developer’s day is now consumed by AI-related work that doesn’t appear in any standard metric: reviewing AI code for accuracy (53%), fixing subtle bugs from AI output (52%), explaining AI-generated code to teammates (48%), and context-switching (45%). Only 38% of organizations track AI code review time at all.

Then the measurement gap itself: 94% say tech debt, validation time, and burnout are missing from their frameworks. Only 6% think those frameworks are sufficient. And underneath, a trust problem: 54% of developers fear AI productivity data will be used in their performance reviews. Managers don’t share the worry. The people designing the measurement systems are the people who feel safest from them.

The real story
The gains from AI are real, but they’re being booked on the wrong line of the ledger. Gross output is up. Net output (output minus the validation tax) is the number we don’t have.
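
One way to make that missing number concrete is a simple time ledger. The sketch below is illustrative only: it assumes the validation tax can be expressed in hours, and the `WeeklyLedger` class, its field names, and the sample figures are hypothetical, not taken from the Harness report.

```python
from dataclasses import dataclass


@dataclass
class WeeklyLedger:
    """Hypothetical per-team weekly ledger; every field is an assumed input."""

    hours_saved_generating: float    # estimated hours AI saved on writing code
    hours_reviewing_ai_code: float   # time spent checking AI output for accuracy
    hours_fixing_ai_bugs: float      # time spent fixing subtle bugs from AI output
    hours_explaining_ai_code: float  # time spent explaining AI code to teammates

    @property
    def validation_tax(self) -> float:
        """Total hours spent validating AI output."""
        return (
            self.hours_reviewing_ai_code
            + self.hours_fixing_ai_bugs
            + self.hours_explaining_ai_code
        )

    @property
    def net_hours(self) -> float:
        """Gross savings minus the validation tax."""
        return self.hours_saved_generating - self.validation_tax


# Made-up sample week, for illustration only.
ledger = WeeklyLedger(
    hours_saved_generating=12.0,
    hours_reviewing_ai_code=5.0,
    hours_fixing_ai_bugs=3.0,
    hours_explaining_ai_code=2.0,
)
print(ledger.validation_tax)  # 10.0
print(ledger.net_hours)       # 2.0
```

With these made-up inputs, twelve hours of gross savings shrink to two once review, fixes, and explanations are charged against them. A dashboard that reports only the first figure is reporting gross output.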

Six ideas worth trying

Make Validation a formal SDLC stage. Right now, code review is an event between “done writing” and “merged.” In an AI-augmented workflow, validation is its own discipline. Name it, resource it, measure it. What gets a name on the SDLC diagram gets a budget.
Track the Review-to-Generation Ratio (RGR). If AI generates 10x more code but review effort scales 8x, you’ve relocated work. If it scales 2x, you’ve won. Minutes of review per 100 lines of AI-assisted code will tell you more about real ROI than most current dashboards.
Build a firewall between improvement data and performance data. The 54% who fear individual evaluations aren’t paranoid. They’re rational. Keep aggregated data about tools, models, and processes separate from individual review data, with an explicit policy that the two never cross. Make the firewall a feature, not a vibe.
Let practitioners design the framework. The people bearing the validation cost should define what counts as productivity. A working group of senior ICs, a quarter to propose, leadership ratifies. Better framework, and buy-in no amount of comms can manufacture.
Instrument cognitive load, not just cycle time. A weekly two-minute pulse: how much time on review or fixes? How often did you override the AI? How confident in what shipped? Pair self-report with quantitative data and you start to see net effort instead of gross throughput.
Measure decay, not just snapshots. The five ideas above are point-in-time. The real costs of AI-assisted engineering accumulate. Three longitudinal metrics worth adding: validation quality decay (does reviewer accuracy degrade as AI review volume grows?), false-confidence rates (the gap between ship-time confidence and downstream defects, a calibration score for human-AI collaboration), and long-term maintainability of AI-generated architectures (12-24 month tech debt traceable to AI-authored code). Reviewer fatigue and trust calibration drift (teams becoming over-trusting or over-skeptical) extend the same frame. AI’s productivity gains are short-term; its costs are long-term. Measure only the first and you mis-price every investment.
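
Two of the ideas above, the Review-to-Generation Ratio and the false-confidence rate, reduce to small calculations. The following is a minimal sketch, assuming review minutes, AI-assisted line counts, and ship-time confidence scores are already being collected. The function names and sample values are hypothetical, and the report does not prescribe these exact formulas.

```python
from typing import Iterable, Tuple


def review_minutes_per_100_lines(review_minutes: float, ai_lines: int) -> float:
    """Minutes of review spent per 100 lines of AI-assisted code."""
    if ai_lines <= 0:
        raise ValueError("ai_lines must be positive")
    return 100.0 * review_minutes / ai_lines


def scaling_ratio(review_growth: float, generation_growth: float) -> float:
    """Growth in review effort divided by growth in generated code.

    Values near 1.0 suggest the work was relocated; values well below 1.0
    suggest a real gain. The cut-off between the two is a judgment call.
    """
    if generation_growth <= 0:
        raise ValueError("generation_growth must be positive")
    return review_growth / generation_growth


def false_confidence_gap(records: Iterable[Tuple[float, bool]]) -> float:
    """Mean ship-time confidence minus the observed defect-free rate.

    Each record is (confidence in [0, 1], had_downstream_defect).
    A positive result means the team was more confident than outcomes justified.
    """
    rows = list(records)
    if not rows:
        raise ValueError("records must not be empty")
    mean_confidence = sum(conf for conf, _ in rows) / len(rows)
    defect_free_rate = sum(1 for _, defect in rows if not defect) / len(rows)
    return mean_confidence - defect_free_rate


# Made-up sample values, for illustration only.
print(review_minutes_per_100_lines(45, 300))  # 15.0
print(scaling_ratio(8, 10))                   # 0.8 (mostly relocated work)
print(scaling_ratio(2, 10))                   # 0.2 (a real gain)

shipped = [(0.9, False), (0.9, True), (0.8, False), (0.8, True)]
print(round(false_confidence_gap(shipped), 2))  # 0.35
```

Tracking the gap quarter over quarter, rather than reading it once, is what turns it into the longitudinal signal the sixth idea asks for.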

The reframe
For thirty years, engineering productivity meant throughput: how much code, how fast, at what defect rate. That framing made sense when humans were the bottleneck. The bottleneck has moved. AI removes friction from generation and adds it to judgment: validation, integration, accountability. The work hasn’t disappeared. It has migrated to a place where none of our instruments are pointed.

The 2026 assignment isn’t “instrument more.” It’s instrument differently.

Authored by: Manish Verma, Head of Global Ops and Delivery, Lirik
