How to Evaluate Code Quality Without Reading Code

Priit Kallas

Deadlines slip. Bugs recur. The team says they need “more time to refactor.” But you can’t tell whether the codebase is actually struggling or whether the team is just slow.

You’re responsible for the software, whether as an engineering manager, a founder who hired developers, or a project owner who outsourced the build. But you can’t evaluate the code yourself. And the only people who can tell you what’s going on are the same people whose work you’re trying to evaluate.

This is a guide to code quality for non-technical managers. No code reading required.

6 things you can check without being technical

1. Dependency health

Every software project relies on dozens or hundreds of external libraries: open-source packages the team didn’t write but depends on. These libraries receive security patches and bug fixes. When a project falls behind on updates, risk accumulates silently.

What to look for: Ask how many dependencies are outdated and how many have known security vulnerabilities. Automated dependency scanners can report both numbers. It’s a factual measurement, not an opinion. For reference, when we analyzed the React codebase, 52% of dependencies were outdated, with 60 known vulnerabilities.

What it means: A project with 50%+ outdated dependencies isn’t being maintained proactively. Known critical vulnerabilities are unaddressed security risks. That doesn’t mean the team is bad. It often just means nobody’s been paying attention to this dimension.

Red flag: “We’ll update dependencies when we have time.” This means it never gets prioritized until something breaks.
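
If your team wants to show you how simple this measurement is, the arithmetic fits in a few lines. The sketch below is a minimal illustration in Python, not the output of any particular scanner. The Dependency record, the dependency_health function, and the package names are all hypothetical, and it treats "installed version differs from latest version" as "outdated", which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    """One external library, as a scanner might report it (hypothetical shape)."""
    name: str
    installed: str
    latest: str
    known_vulnerabilities: int = 0


def dependency_health(deps):
    """Summarize how many dependencies are outdated and how many known
    vulnerabilities they carry. A dependency counts as outdated when its
    installed version string differs from the latest one."""
    total = len(deps)
    outdated = sum(1 for d in deps if d.installed != d.latest)
    return {
        "total": total,
        "outdated": outdated,
        "outdated_pct": round(100 * outdated / total) if total else 0,
        "known_vulnerabilities": sum(d.known_vulnerabilities for d in deps),
    }


# Made-up example data, for illustration only.
deps = [
    Dependency("web-framework", "4.1.0", "4.1.0"),
    Dependency("http-client", "2.3.1", "2.9.0", known_vulnerabilities=2),
    Dependency("date-utils", "1.0.0", "1.0.0"),
    Dependency("image-resizer", "0.8.2", "1.4.0", known_vulnerabilities=1),
]
print(dependency_health(deps))
```

The two numbers to ask for are outdated_pct and known_vulnerabilities. If the first is above 50, or the second is above zero for critical issues, you have something concrete to discuss.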

2. Test coverage on critical paths

Tests are automated checks that verify the software works correctly. Not all code needs tests, but the paths that handle money, user data, or core business logic absolutely do.

What to look for: Ask which user workflows have automated tests and which don’t. You don’t need to understand the tests — you need to know whether the team has a safety net for the things that matter most.

What it means: If the payment flow or the signup flow has no tests, bugs in those areas will reach your users before anyone catches them. The team is relying on manual testing or luck.

Red flag: “We test manually before each release.” Manual testing doesn’t scale, misses edge cases, and depends on whoever’s doing it remembering to check everything.

3. Knowledge concentration (bus factor)

If one developer wrote 80% of a critical subsystem and they leave, you have a serious problem. This is called “bus factor” — how many people need to get hit by a bus before nobody understands the code.

What to look for: Ask who owns each major area of the codebase. If the answer is one name for everything, that’s a risk. Git history analysis can quantify this automatically. A code health report will surface it as a “team knowledge” or “ownership concentration” metric.

What it means: High knowledge concentration isn’t a code quality issue. It’s an organizational risk. The code might be excellent, but if only one person understands it, you’re one resignation away from a crisis.

Red flag: “Only Alex can work on that part of the system.” This should trigger immediate knowledge-sharing, pair programming, or documentation efforts.
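
To make "ownership concentration" less abstract, here is a rough sketch of how it could be computed from commit history. It assumes the team has already exported a list of (author, file path) pairs from version control. The function name, the sample names, and the choice to group files by top-level folder are illustrative assumptions. The 80% threshold simply mirrors the example above; it is not a standard. The sketch counts changes rather than lines of code, so treat the result as a conversation starter.

```python
from collections import Counter, defaultdict


def ownership_concentration(changes, threshold=0.8):
    """For each top-level area of the codebase, find the author with the most
    changes and what share of all changes in that area they made.

    `changes` is an iterable of (author, file_path) pairs, one per changed
    file per commit. An area is flagged when one author's share reaches
    `threshold`."""
    by_area = defaultdict(Counter)
    for author, path in changes:
        area = path.split("/", 1)[0]  # top-level folder stands in for "subsystem"
        by_area[area][author] += 1

    report = {}
    for area, counts in by_area.items():
        top_author, top_count = counts.most_common(1)[0]
        share = top_count / sum(counts.values())
        report[area] = {
            "top_author": top_author,
            "share": round(share, 2),
            "at_risk": share >= threshold,
        }
    return report


# Made-up history: Alex made 8 of the 10 changes in billing.
changes = (
    [("alex", "billing/invoice.py")] * 8
    + [("sam", "billing/tax.py")] * 2
    + [("alex", "web/home.py"), ("sam", "web/login.py"), ("kim", "web/cart.py")]
)
for area, row in ownership_concentration(changes).items():
    print(area, row)
```

In this made-up data, billing is flagged because one person made 80% of the changes, while web is spread evenly across three people.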

4. Commit patterns and velocity

How the team works is visible in their commit history: how often they ship, how large the changes are, and whether the pattern is consistent or erratic.

What to look for: Ask for a summary of commit frequency over the last 3-6 months. Look for consistency, not raw speed. Healthy projects have steady, granular commits. Unhealthy projects have long silences followed by massive changes.

What it means: Large, infrequent commits suggest the team is batching work and shipping risky big-bang releases. Steady, small commits suggest confidence and a functioning CI/CD pipeline. A declining trend might signal team burnout, scope creep, or mounting technical debt.

Red flag: “We deployed a major update last Friday.” Large deployments right before the weekend suggest work is being batched and pushed out against a deadline rather than released in small, routine steps the team trusts.
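
The "long silences followed by massive changes" pattern can be spotted with very little code. The following is a minimal sketch, assuming the team can export each commit as a date plus the number of lines changed. The function name commit_rhythm and the rule of thumb that a commit is "big-bang" when it is more than ten times the median size are assumptions chosen for illustration, not established thresholds.

```python
from datetime import date
from statistics import median


def commit_rhythm(commits, big_change_factor=10):
    """Summarize commit consistency.

    `commits` is a list of (commit_date, lines_changed) pairs. Returns the
    longest gap between consecutive commits in days, the median commit size,
    and how many commits exceed `big_change_factor` times that median."""
    if not commits:
        return {"commits": 0, "longest_gap_days": 0,
                "median_lines_changed": 0, "big_bang_commits": 0}

    ordered = sorted(commits)
    dates = [d for d, _ in ordered]
    sizes = [s for _, s in ordered]
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    typical = median(sizes)
    return {
        "commits": len(ordered),
        "longest_gap_days": max(gaps, default=0),
        "median_lines_changed": typical,
        "big_bang_commits": sum(1 for s in sizes if s > big_change_factor * typical),
    }


# Made-up history: steady small commits, then a long silence and one huge change.
history = [
    (date(2024, 1, 1), 40),
    (date(2024, 1, 3), 60),
    (date(2024, 1, 4), 50),
    (date(2024, 2, 20), 5000),
]
print(commit_rhythm(history))
```

For this sample the longest gap is 47 days and one commit is flagged as big-bang. A healthy history would show short gaps and no flagged commits.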

5. Architecture complexity

You don’t need to understand architecture diagrams in detail. But you should know whether the system is getting simpler or more complex over time, and whether the team can explain it clearly.

What to look for: Ask the lead developer to draw the system architecture in 5 minutes. If they can’t, or if the diagram looks like a bowl of spaghetti, that’s a signal. Also ask: “If we needed to replace module X, how hard would that be?”

What it means: Good architecture means components are separated and replaceable. Bad architecture means everything is tangled together. Changing one thing breaks three others. The team’s ability to explain it simply is itself a quality signal.

Red flag: “It’s complicated, you wouldn’t understand.” If the team can’t explain the architecture to a non-technical person, they may not fully understand it themselves.

6. Technical debt trajectory

Technical debt is the gap between “how the code should be” and “how the code actually is.” Every project has some. What matters is whether it’s growing or shrinking.

What to look for: Ask whether the team spends time on cleanup/refactoring, and how much. A team that spends zero time on maintenance is accumulating debt. A team that spends 100% on maintenance isn’t shipping value. The sweet spot is 15-25% of capacity on technical health.

What it means: Growing technical debt is like compound interest on a loan. It makes everything slower and more expensive over time. If the team says “features are taking longer than they used to,” technical debt is likely the cause.

Red flag: “We’ll clean it up after the next release.” Technical debt cleanup that’s always deferred is never done.
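
The capacity check above is plain arithmetic, so it is easy to track sprint by sprint. Here is a small sketch, assuming the team labels its work as either maintenance or feature work and measures both in the same unit, such as story points or hours. The function name and the verdict labels are hypothetical. The 15-25% band comes from this section, and the 40% ceiling comes from question 8 in the list of questions in this article.

```python
def maintenance_share(maintenance_points, feature_points):
    """Return the fraction of capacity spent on maintenance and a verdict
    based on the 15-25% sweet spot and the 40% warning level."""
    total = maintenance_points + feature_points
    if total <= 0:
        raise ValueError("no work recorded for this period")

    share = maintenance_points / total
    if share == 0:
        verdict = "accumulating debt"
    elif share < 0.15:
        verdict = "below sweet spot"
    elif share <= 0.25:
        verdict = "sweet spot"
    elif share <= 0.40:
        verdict = "above sweet spot"
    else:
        verdict = "possible structural problem"
    return round(share, 2), verdict


# Example: 8 points of cleanup alongside 32 points of feature work.
print(maintenance_share(8, 32))
```

A single sprint proves little. The useful signal is the trend of this number over several months.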

10 questions to ask your engineering team

Bring these to your next 1:1 or team meeting. You don’t need to understand the technical details. The way the team answers matters as much as the answer itself.

  1. “How many of our dependencies have known security vulnerabilities?” — If they don’t know, that’s an answer too.
  2. “What percentage of our critical user flows have automated tests?” — Look for specifics, not “we have good coverage.”
  3. “If Sarah left tomorrow, which parts of the system would nobody else understand?” — Names the bus factor risk directly.
  4. “When was the last time we updated our dependencies?” — Monthly is healthy. “I don’t remember” is not.
  5. “What’s the biggest risk in our codebase right now?” — Good teams have a ready answer. Worried teams deflect.
  6. “How long would it take a new developer to make their first meaningful contribution?” — Measures codebase approachability. Over 2 weeks suggests documentation and architecture issues.
  7. “What would break if we doubled our user count tomorrow?” — Tests whether the team thinks about scalability or just current needs.
  8. “How much of our sprint capacity goes to maintenance vs new features?” — Zero maintenance = debt accumulation. Over 40% = something is structurally wrong.
  9. “Can you show me a trend of our code health metrics over the last 3 months?” — If they can’t, they’re not tracking it.
  10. “What’s one thing you’d fix if you had a free week?” — The answer reveals what the team knows is wrong but hasn’t been able to address.

Tools that translate code into language you understand

If you want objective data instead of (or alongside) the team’s self-assessment, several tools can help. They serve different purposes:

Developer-focused code quality tools:

  • SonarQube — the industry standard for rule-based code analysis. Thorough, but the output is designed for developers. You’ll need an engineer to interpret the dashboards.
  • CodeClimate — maintainability scores and velocity metrics. More accessible than SonarQube but still developer-oriented.

Engineering analytics platforms:

  • Jellyfish — engineering investment and team analytics for VPs and CTOs. Powerful but enterprise-priced ($50K+/year).
  • LinearB — engineering metrics and workflow automation. Similar tier to Jellyfish.

AI-powered code health reports:

  • StackGrit — AI analyzes your codebase and produces plain-language health reports designed for non-technical stakeholders. Covers architecture, security, team dynamics, and code quality in one report. $29-299/mo with a free first analysis.

Manual code audits:

  • Hiring a consultant ($200+/hr) for a one-time audit gives you a human expert’s opinion but no ongoing monitoring. Good for due diligence; expensive for regular health checks.

The right choice depends on your situation. If you have a technical team that just needs better tooling, SonarQube or CodeClimate work well. If you need to understand project health without technical interpretation, StackGrit or a consultant is the better fit. If you’re managing a large engineering organization, Jellyfish or LinearB provide executive-level analytics.

You don’t need to read code. You need to read the signals.

The teams that perform best aren’t the ones with the cleanest code. They’re the ones where leadership has visibility into what’s actually happening. The six signals and ten questions above give you that visibility. The tools automate the measurement.

The goal isn’t to become technical. It’s to stop guessing.


Want to see where your project actually stands? StackGrit produces a plain-language health report covering all six signals above. No technical background required. First report is free, no credit card.
