7.5 — Program Metrics & Continuous Improvement

Response & Improvement · 90 min · All Roles

Learning Objectives

  • Design a metrics program that measures the effectiveness of the SSDLC program using leading and lagging indicators.
  • Conduct an OWASP SAMM self-assessment to establish baseline maturity and define target states.
  • Use BSIMM benchmarking to compare your program against industry peers.
  • Define and track AI-specific metrics for AI-augmented development.
  • Build executive and technical dashboards that communicate program value and guide resource allocation.
  • Implement a continuous improvement cycle that uses data to drive program evolution.

1. Why Metrics Matter

“You can’t improve what you don’t measure” is a cliché because it is true. An SSDLC program without metrics is operating on faith — faith that training is effective, faith that tools are catching vulnerabilities, faith that the investment is paying off. Faith is not a strategy.

Metrics serve four critical functions:

Demonstrate ROI: Security programs compete for budget with every other business priority. Metrics that show measurable improvement in vulnerability density, detection rates, and remediation times make the case for continued — or increased — investment.

Identify trends: A single data point tells you where you are. A trend tells you where you are going. Rising vulnerability density may indicate scaling problems, technology migration challenges, or training gaps. Declining MTTR may confirm that process improvements are working.

Inform resource allocation: Limited resources must be applied where they have the greatest impact. Metrics reveal which teams, technologies, or vulnerability types need the most attention. Without data, resource allocation is guesswork.

Create accountability: Metrics make commitments measurable. “We will improve our security posture” is a wish. “We will reduce our critical vulnerability MTTR from 14 days to 7 days by Q3” is a goal. Goals can be tracked, reported on, and used to hold teams accountable.

CIS Control 16.1 — Secure Application Development Process

CIS 16.1 requires establishing and maintaining a secure application development process. Metrics are how you demonstrate that the process is functioning and improving.

CIS Control 16.9 — Training Program for Application Developers

CIS 16.9 requires that developers receive application security training appropriate to their role. Metrics demonstrate that training is being delivered, completed, and having an impact on security outcomes.


2. OWASP SAMM Self-Assessment

The OWASP Software Assurance Maturity Model (SAMM) is the most widely used framework for assessing and improving software security programs. It provides a structured way to evaluate where you are, define where you want to be, and build a roadmap to get there.

Figure: OWASP SAMM Maturity Model — five business functions, each with three security practices, assessed across four maturity levels.

SAMM Structure

SAMM is organized into five business functions, each with three security practices:

| Business Function | Security Practices |
| --- | --- |
| Governance | Strategy & Metrics, Policy & Compliance, Education & Guidance |
| Design | Threat Assessment, Security Requirements, Security Architecture |
| Implementation | Secure Build, Secure Deployment, Defect Management |
| Verification | Architecture Assessment, Requirements-driven Testing, Security Testing |
| Operations | Incident Management, Environment Management, Operational Management |

Each practice is assessed at a maturity level from 0 (none) to 3 (mature):

Level 0 — Implicit: No formal activities. Security happens ad hoc, if at all.

Level 1 — Initial: Basic security practices are in place. Activities are performed but may not be consistent, documented, or comprehensive.

Level 2 — Managed: Security practices are defined, documented, and applied consistently across the organization. Metrics are collected.

Level 3 — Optimized: Security practices are continuously improved based on data. Activities are automated where possible. The program is integrated into the overall business strategy.

Conducting a SAMM Assessment

Step 1: Assemble the assessment team. Include representatives from development, security, operations, QA, and management. Each perspective contributes to an accurate assessment.

Step 2: Score each practice. For each of the 15 practices, evaluate the current state against the SAMM criteria. Use the SAMM assessment questionnaire (available at owaspsamm.org) as a structured guide.

Be honest. Overrating your current state defeats the purpose. Common error: confusing “we have a policy” with “we consistently follow the policy.” SAMM scores the practice, not the documentation.

Step 3: Document the baseline. Create a radar chart or scorecard showing the current maturity level for each practice. This is your baseline — the starting point from which all improvement will be measured.

Step 4: Define target maturity levels. Not every practice needs to be at Level 3. Target levels should reflect:

  • Business risk profile: higher-risk organizations need higher maturity.
  • Regulatory requirements: compliance mandates may require specific maturity levels.
  • Resource availability: improvement requires investment.
  • Industry benchmarks: what peer organizations have achieved.

A common starting target is Level 1 across all practices (eliminating gaps) with Level 2 in the highest-risk areas.

Step 5: Create a roadmap. For each practice gap (current level vs. target level), define:

  • Specific activities needed to reach the target.
  • Resource requirements (people, tools, budget).
  • Timeline with milestones.
  • Success criteria and metrics.
  • Dependencies on other practices.

Step 6: Reassess periodically. Conduct a full SAMM reassessment annually. Track progress against the roadmap. Adjust targets and priorities based on business changes, new threats, and lessons learned.
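
A gap scorecard makes Steps 3-5 concrete. Below is a minimal sketch in Python, assuming hypothetical current and target levels for the 15 practices; real values would come from your own questionnaire results, and the prioritization rule (flag gaps of 2 or more) is an illustrative choice, not part of SAMM itself.

```python
# Minimal SAMM gap scorecard. Scores are hypothetical; replace with the
# results of your own assessment questionnaire.

SAMM_SCORES = {
    # practice: (current_level, target_level) -- illustrative values only
    "Strategy & Metrics":          (1, 2),
    "Policy & Compliance":         (1, 2),
    "Education & Guidance":        (0, 1),
    "Threat Assessment":           (0, 2),
    "Security Requirements":       (1, 1),
    "Security Architecture":       (1, 2),
    "Secure Build":                (2, 2),
    "Secure Deployment":           (1, 2),
    "Defect Management":           (1, 2),
    "Architecture Assessment":     (0, 1),
    "Requirements-driven Testing": (0, 1),
    "Security Testing":            (2, 3),
    "Incident Management":         (1, 2),
    "Environment Management":      (1, 1),
    "Operational Management":      (1, 1),
}

def gap_report(scores: dict) -> None:
    """Print practices sorted by largest maturity gap first."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1][1] - kv[1][0], reverse=True)
    for practice, (current, target) in ranked:
        gap = target - current
        flag = "<-- prioritize" if gap >= 2 else ""
        print(f"{practice:<28} current={current} target={target} gap={gap} {flag}")

gap_report(SAMM_SCORES)
```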


3. BSIMM Benchmarking

The Building Security In Maturity Model (BSIMM) is a descriptive model based on real-world data from over 100 organizations. Unlike SAMM (which is prescriptive — “here is what you should do”), BSIMM is descriptive — “here is what organizations actually do.”

BSIMM Structure

BSIMM organizes software security activities into four domains and twelve practices:

| Domain | Practices |
| --- | --- |
| Governance | Strategy & Metrics, Compliance & Policy, Training |
| Intelligence | Attack Models, Security Features & Design, Standards & Requirements |
| SSDL Touchpoints | Architecture Analysis, Code Review, Security Testing |
| Deployment | Penetration Testing, Software Environment, Configuration Management & Vulnerability Management |

How to Use BSIMM

Benchmarking: Compare your organization’s activities against the BSIMM dataset. For each activity, see what percentage of participating organizations perform it. This tells you whether you are ahead of, behind, or aligned with industry practice.

Prioritization: Activities performed by the highest percentage of organizations are likely foundational — you should do them too. Activities performed by fewer organizations may represent advanced maturity or niche requirements.

Trend analysis: BSIMM is updated periodically with new data. Track which activities are increasing or decreasing in adoption to understand industry direction.
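
To illustrate the benchmarking mechanic, here is a minimal sketch comparing a hypothetical activity inventory against placeholder adoption percentages; the activity names and figures are assumptions, and real numbers come from the BSIMM report for your industry vertical.

```python
# Sketch of a BSIMM-style benchmark comparison. Adoption percentages are
# placeholders; substitute figures from the current BSIMM dataset.

OBSERVED_ADOPTION = {  # hypothetical share of participants performing each activity
    "SAST in CI pipeline": 0.85,
    "SBOM generation": 0.55,
    "Manual design review for high-risk apps": 0.70,
    "Adversarial (abuse case) testing": 0.25,
}

WE_PERFORM = {"SAST in CI pipeline", "SBOM generation"}

for activity, adoption in sorted(OBSERVED_ADOPTION.items(), key=lambda kv: -kv[1]):
    status = "yes" if activity in WE_PERFORM else "NO"
    # Widely adopted activities we skip are likely foundational gaps.
    note = "likely foundational -- close this gap" if adoption >= 0.6 and status == "NO" else ""
    print(f"{adoption:>5.0%}  we do: {status:<3}  {activity}  {note}")
```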

Key BSIMM Findings

Several findings from BSIMM data are particularly relevant to program managers:

Security awareness training is declining. Despite being nearly universal among early BSIMM participants, formal security awareness training has been declining in adoption. This does not mean training is less important — it means organizations are shifting toward more targeted, role-specific training and away from generic annual awareness programs. The insight is that generic security awareness training has limited measurable impact. Targeted training tied to specific roles, technologies, and vulnerability patterns is more effective.

Automation is increasing. Organizations are increasingly automating security activities (SAST, SCA, DAST integration into CI/CD) and decreasing reliance on manual security reviews. The shift toward “security as code” continues to accelerate.

Cloud security activities are now mainstream. Container security, infrastructure-as-code scanning, and cloud configuration review have moved from innovative to standard practice.

Software supply chain security is the fastest-growing area. SBOM generation, dependency analysis, and supply chain risk management have seen the largest increase in adoption.


4. Leading Indicators (Predictive)

Leading indicators measure activities and behaviors that predict future security outcomes. They answer the question: “Are we doing the things that will lead to better security?”

Training Metrics

| Metric | Target | Measurement Method |
| --- | --- | --- |
| Training completion rate | >95% of eligible developers | LMS completion records |
| Assessment pass rate | >80% first attempt | Assessment platform scores |
| Assessment score trends | Upward over time | Average score per cohort per quarter |
| Time to complete modules | Within expected range | LMS time tracking (too fast = not engaging, too slow = struggling) |
| Training relevance rating | >4.0/5.0 | Post-training survey |
| Knowledge retention | >70% on 90-day retest | Scheduled reassessment |

Why these matter: Training completion alone does not indicate learning. A developer who clicks through slides in 10 minutes has “completed” training but learned nothing. Assessment scores, score trends, and engagement depth (time spent) provide a more accurate picture of actual learning.
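
As an illustration of engagement-depth tracking, the sketch below flags completions outside the expected time range. The record format, module names, and the 25%/200% thresholds are assumptions, not features of any particular LMS.

```python
# Sketch: flag training completions that are suspiciously fast or slow
# relative to the module's expected duration. All data is illustrative.

EXPECTED_MINUTES = {"sqli-basics": 45, "xss-deep-dive": 60}

completions = [  # hypothetical LMS export: (developer, module, minutes_spent)
    ("alice", "sqli-basics", 48),
    ("bob",   "sqli-basics", 6),      # clicked through
    ("carol", "xss-deep-dive", 140),  # possibly struggling
]

for dev, module, minutes in completions:
    expected = EXPECTED_MINUTES[module]
    if minutes < 0.25 * expected:
        print(f"{dev}/{module}: {minutes} min (<25% of expected) -> not engaging")
    elif minutes > 2 * expected:
        print(f"{dev}/{module}: {minutes} min (>2x expected) -> may be struggling")
```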

Engagement Metrics

| Metric | Target | Measurement Method |
| --- | --- | --- |
| CTF/tournament participation rate | >50% of developers | Platform participation records |
| Security champion coverage | 1 champion per 10-20 developers | Champion program roster |
| Voluntary training uptake | Increasing trend | LMS enrollment records for optional content |
| Security question volume | Increasing trend | Slack channel analytics, office hours attendance |
| Knowledge sharing events | At least monthly per champion | Event calendar tracking |
| Platform login frequency | Weekly for active developers | Platform analytics |

Why these matter: Engagement metrics measure security culture. An organization where developers voluntarily participate in CTFs, ask security questions, and share security knowledge has a fundamentally different security posture than one where developers do the minimum required training and move on.

Process Metrics

| Metric | Target | Measurement Method |
| --- | --- | --- |
| Code review security comments | Increasing trend | Code review platform analytics |
| Threat model coverage | 100% of new features | Threat model registry |
| Security requirements in user stories | >80% of stories for sensitive features | Story tracking analysis |
| Tool adoption rate (CI) | >90% of repos | CI/CD pipeline configuration audit |
| Tool adoption rate (IDE) | >50% of developers | IDE plugin telemetry |

5. Lagging Indicators (Outcome-Based)

Lagging indicators measure results — the actual security outcomes the program is designed to improve. They answer the question: “Is our security actually getting better?”

Vulnerability Metrics

| Metric | Target | Measurement Method |
| --- | --- | --- |
| Vulnerability density | Decreasing trend (vulns per 1,000 LOC) | Scanner data normalized by codebase size |
| Mean Time to Remediate (MTTR) | Decreasing, within SLA targets | Vulnerability tracking system |
| Mean Time to Detect (MTTD) | Approaching zero (shift-left) | Time from code commit to vulnerability detection |
| Pre-production detection rate | >80% caught before production | Compare pre-prod findings to production findings |
| Security debt ratio | Decreasing (unresolved vulns / total open issues) | Vulnerability tracking system |
| Recurrence rate | Decreasing (same vuln type after training) | CWE-based tracking over time |
| Production security incidents | Decreasing (incidents from code-level vulns) | Incident tracking system |
| False positive rate | Decreasing (indicates better tool tuning) | Scanner validation data |

MTTR Targets by Severity

| Severity | MTTR Target | Industry Benchmark (Median) |
| --- | --- | --- |
| Critical | 24 hours | 5-7 days |
| High | 7 days | 30 days |
| Medium | 30 days | 90 days |
| Low | 90 days | 180+ days |

Organizations that consistently meet these targets are in the top quartile of their peers. The gap between targets and industry benchmarks shows both the ambition required and the improvement opportunity.
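
A minimal sketch of how per-severity MTTR and SLA compliance can be computed from a findings export. The field layout and sample data are hypothetical; the SLA values mirror the targets in the table above.

```python
# Sketch: MTTR and SLA compliance per severity from remediated findings.

from datetime import date
from statistics import mean

SLA_DAYS = {"critical": 1, "high": 7, "medium": 30, "low": 90}

findings = [  # (severity, detected, remediated) -- illustrative data
    ("critical", date(2025, 3, 1), date(2025, 3, 2)),
    ("high",     date(2025, 3, 1), date(2025, 3, 12)),
    ("high",     date(2025, 3, 5), date(2025, 3, 9)),
]

for sev in SLA_DAYS:
    ages = [(fixed - found).days for s, found, fixed in findings if s == sev]
    if not ages:
        continue
    mttr = mean(ages)
    within_sla = sum(a <= SLA_DAYS[sev] for a in ages) / len(ages)
    print(f"{sev:<8} MTTR={mttr:.1f}d  SLA compliance={within_sla:.0%} (target {SLA_DAYS[sev]}d)")
```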

Detection Phase Distribution

Track where vulnerabilities are discovered in the SDLC:

| Phase | Ideal Target | Why |
| --- | --- | --- |
| Design (threat modeling) | 15-20% | Cheapest to fix, prevents entire classes of issues |
| Development (IDE + commit) | 30-40% | Fixed immediately by the developer who wrote the code |
| Build (CI pipeline) | 20-30% | Caught before code reaches shared environments |
| Test (staging/QA) | 5-10% | Caught before production but more expensive to fix |
| Production (monitoring/pen test) | <5% | Most expensive, may require incident response |
| External (researchers/customers) | <2% | Indicates external parties finding what you missed |

A mature program shifts left: the majority of vulnerabilities are discovered during design and development, not in production. Track this distribution over time to measure shift-left progress.
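
The sketch below compares a hypothetical quarter of finding counts against the ideal targets above; the counts are illustrative, and the phase names match the table.

```python
# Sketch: check your detection-phase distribution against the ideal ranges.

from collections import Counter

IDEAL = {"design": (0.15, 0.20), "development": (0.30, 0.40), "build": (0.20, 0.30),
         "test": (0.05, 0.10), "production": (0.0, 0.05), "external": (0.0, 0.02)}

found_in = Counter(design=12, development=35, build=28, test=14, production=9, external=2)
total = sum(found_in.values())

for phase, (lo, hi) in IDEAL.items():
    share = found_in[phase] / total
    ok = lo <= share <= hi
    print(f"{phase:<12} {share:>5.1%}  target {lo:.0%}-{hi:.0%}  {'OK' if ok else 'off target'}")
```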


6. AI-Specific Metrics

The integration of AI into the development lifecycle requires a new category of metrics that did not exist before 2023. These metrics are essential for understanding whether AI is helping or hurting your security posture.

AI Code Quality Metrics

| Metric | Baseline | Target | Source |
| --- | --- | --- | --- |
| AI suggestion acceptance rate | ~30% (GitHub Copilot reported) | Not a target — understand context | AI tool telemetry |
| AI vulnerability introduction rate | 2.74x human baseline (Veracode 2025) | Approaching 1.0x (parity with human code) | Scanner data + code attribution |
| AI code mutation score | Varies | >60% | Mutation testing of AI-generated code |
| Human override rate | Varies | Tracked for trends | Code review data + AI telemetry |
| AI code review coverage | Varies | 100% for security-sensitive code | Code review records |
| Secret leakage rate | 6.4% baseline (Copilot-generated code) | <1% | Secret scanner data + code attribution |

AI Process Metrics

| Metric | Target | Source |
| --- | --- | --- |
| AI-attributable defect rate | Decreasing trend | Post-deployment defect analysis + code attribution |
| Time-to-detection: AI bugs vs human bugs | Equal or better for AI bugs | Defect tracking system |
| Shadow AI detection rate | >90% of unauthorized AI tool usage detected | Network monitoring, DLP, CASB |
| AI policy compliance rate | >95% | Policy audit data |
| AI tool coverage (approved vs unapproved) | >95% usage on approved tools | Tool inventory + usage monitoring |
| AI training completion rate | >95% of developers | LMS completion records |
| AI incident rate | Decreasing trend | Incident tracking system |

Interpreting AI Metrics

AI suggestion acceptance rate (~30%): This is not inherently good or bad. A very low rate may indicate the AI tool is not providing value. A very high rate may indicate developers are accepting suggestions without critical evaluation (automation bias). The key is to understand the context: what types of suggestions are being accepted, and are accepted suggestions later found to be problematic?

AI vulnerability introduction rate (2.74x human baseline): The Veracode 2025 data showed that AI-generated code contained vulnerabilities at 2.74 times the rate of human-written code. This is a critical metric. If your organization is using AI coding tools without additional safeguards, you may be introducing more vulnerabilities than you are catching. Track this metric over time as AI tools improve and as your guardrails (SAST, code review, AI-specific policies) take effect.

Secret leakage rate (6.4% baseline): The Stanford/UC Berkeley research found that GitHub Copilot-generated code contained hardcoded secrets (API keys, passwords, tokens) in approximately 6.4% of suggestions. This metric must be tracked and driven down through AI tool configuration, secret scanning in the IDE and CI/CD pipeline, and developer training on secret management.
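
A minimal sketch of the two headline AI metrics, assuming findings and suggestions have already been attributed to AI- versus human-written code upstream (for example via tool telemetry or commit metadata). All counts are illustrative.

```python
# Sketch: AI vulnerability introduction ratio and secret leakage rate,
# computed from findings attributed to AI vs. human code. Attribution
# itself is assumed to happen upstream.

def density(vulns: int, kloc: float) -> float:
    """Vulnerabilities per 1,000 lines of code."""
    return vulns / kloc

ai_density = density(vulns=41, kloc=120)      # hypothetical scanner counts
human_density = density(vulns=55, kloc=400)

ratio = ai_density / human_density
print(f"AI vulnerability introduction rate: {ratio:.2f}x human baseline")

ai_suggestions = 10_000          # hypothetical AI tool telemetry
suggestions_with_secrets = 310   # flagged by secret scanner
print(f"Secret leakage rate: {suggestions_with_secrets / ai_suggestions:.1%} (target <1%)")
```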


7. Correlation Analysis

Individual metrics are useful. Correlated metrics are powerful. Correlation analysis connects program activities (leading indicators) to security outcomes (lagging indicators), building the evidence for cause and effect.

Training Effectiveness Correlation

Vulnerability density: trained vs. untrained developers. Compare the vulnerability density (vulns per 1,000 LOC) of code written by developers who have completed security training against those who have not. Significant difference validates training investment.

Control for confounding factors:

  • Seniority (more experienced developers may write fewer vulnerabilities regardless of training).
  • Technology (some languages/frameworks are inherently more secure).
  • Code complexity (security-critical code is harder to write correctly).

MTTR before and after training cohorts. Track the average MTTR for vulnerability remediation by developers before and after they complete security training. Faster MTTR after training indicates developers can identify and fix security issues more effectively.

Recurrence rates after targeted training. When a developer receives targeted training on a specific vulnerability type (e.g., XSS), track whether they continue to introduce that vulnerability type. Decreasing recurrence validates the training’s effectiveness for that specific topic.
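
A minimal sketch of the trained-versus-untrained comparison. In practice you would stratify by the confounders listed above rather than compare raw cohort means; all records here are hypothetical.

```python
# Sketch: compare vulnerability density between trained and untrained cohorts.

from statistics import mean

# (developer, trained?, vulns introduced, KLOC written) -- hypothetical data
records = [
    ("alice", True, 3, 14.0), ("bob", True, 1, 9.5), ("carol", True, 2, 11.2),
    ("dave", False, 6, 10.1), ("erin", False, 5, 8.7), ("frank", False, 7, 12.3),
]

def cohort_density(trained: bool) -> float:
    """Mean per-developer vulnerability density for one cohort."""
    return mean(v / k for _, t, v, k in records if t == trained)

trained_d, untrained_d = cohort_density(True), cohort_density(False)
print(f"trained: {trained_d:.2f}  untrained: {untrained_d:.2f} vulns/KLOC")
print(f"relative reduction: {1 - trained_d / untrained_d:.0%}")
```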

Benchmarking Against Peers

Secure Code Warrior Trust Score: If using Secure Code Warrior for training, the Trust Score provides a developer-level and team-level metric that measures security knowledge, training engagement, and code review activity. Benchmark against industry data.

BSIMM percentile ranking: If participating in BSIMM, compare your activity adoption against the full dataset. Identify activities where you are below the 50th percentile — these are areas where most of your peers have matured beyond your current state.

Verizon DBIR data: Use the annual Verizon Data Breach Investigations Report to benchmark your vulnerability types against industry-wide breach data. If 30% of breaches in your industry involve web application attacks, and your production web application vulnerability rate is declining, you are reducing your exposure to the most common attack vectors.

Research Correlation

Published research provides additional correlation evidence:

  • 70% fewer security incidents with robust, ongoing security training programs (SANS/GIAC research).
  • 3x less likely to suffer breach with exposure-based vulnerability prioritization (Tenable/Cyentia research).
  • 60-80% reduction in effective remediation workload with EPSS-based prioritization (FIRST research).
  • 50% reduction in vulnerability escape rate when SAST is integrated in IDE + CI/CD (Synopsys BSIMM data).

These correlations support the business case for program investment by connecting activities to measurable outcomes.


8. Security Champions Program Metrics

The Security Champions program is a force multiplier for the SSDLC program. Metrics track its health and impact.

Coverage and Structure

| Metric | Target | Description |
| --- | --- | --- |
| Champion coverage ratio | 1:10-20 developers | Every team should have at least one champion |
| Champion distribution | Every product team represented | No teams without champion coverage |
| Champion tenure | >12 months average | Champions need time to be effective |
| Champion churn rate | <20% annually | High churn indicates burnout or lack of support |

Activity and Growth

| Metric | Target | Description |
| --- | --- | --- |
| Belt progression rates | Upward trend | Champions advancing through training levels |
| Mentoring activity | Each champion mentoring 2+ developers | Active mentoring drives team-wide improvement |
| Security review participation | Champions involved in >80% of team security reviews | Champions as security review participants |
| Knowledge sharing events | Monthly per champion | Presentations, workshops, brown bags |
| Tool champion activities | Each champion proficient in 2+ security tools | Champions as tool experts for their teams |

Impact Metrics

| Metric | Target | Description |
| --- | --- | --- |
| Security culture survey | Improving trend | Annual survey measuring developer security attitudes |
| Champion team vs non-champion team vuln density | Champion teams lower | Compare vulnerability density across teams |
| Time to adopt new security practices | Champion teams faster | Measure how quickly teams implement new security requirements |
| Security question routing | >50% resolved by champions (not AppSec) | Champions as first line of security guidance |

9. Reporting

Metrics are only useful if they reach the right audience in the right format at the right time.

Executive Dashboards

Audience: CTO, CISO, VP Engineering, Board members. Frequency: Monthly summary, quarterly deep dive. Focus: Business risk, trends, compliance, ROI.

Content:

  • Overall risk posture: Single metric or traffic light showing current risk level and trend direction.
  • Vulnerability backlog trend: Total open vulnerabilities by severity over time (should be stable or decreasing).
  • SLA compliance: Percentage of vulnerabilities remediated within SLA, by severity.
  • Production incident trend: Security incidents from code-level vulnerabilities over time.
  • Program maturity progress: SAMM or BSIMM scores compared to target.
  • Investment efficiency: Cost per vulnerability found and fixed, compared to cost of security incidents.
  • Compliance status: Current compliance posture against applicable frameworks.

Rules for executive dashboards:

  • No more than 7 metrics on a single view.
  • Every metric must have a trend (up/down/stable) and a context (good/bad/neutral).
  • Use red/yellow/green sparingly and only with clear, documented thresholds (see the sketch after this list).
  • Include narrative: what do the numbers mean? What action is needed?
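
One way to keep traffic lights honest is to derive them from explicit rules rather than judgment calls. A minimal sketch, where the `status` helper and its thresholds are illustrative assumptions:

```python
# Sketch: map a metric value to red/yellow/green from documented thresholds,
# so every light on the dashboard traces back to an explicit rule.

def status(value: float, green_at: float, red_at: float,
           higher_is_better: bool = True) -> str:
    if not higher_is_better:  # flip the scale so one comparison works for both
        value, green_at, red_at = -value, -green_at, -red_at
    if value >= green_at:
        return "green"
    return "red" if value <= red_at else "yellow"

print(status(0.92, green_at=0.95, red_at=0.80))                         # SLA compliance -> yellow
print(status(14.0, green_at=7.0, red_at=21.0, higher_is_better=False))  # high-sev MTTR days -> yellow
```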

Technical Dashboards

Audience: AppSec team, engineering managers, team leads, security champions. Frequency: Real-time to weekly. Focus: Tool-specific metrics, finding details, SLA compliance, team performance.

Content:

  • Scanner findings by tool, severity, and CWE: What are the scanners finding?
  • False positive rates by tool and rule: Are the tools tuned correctly?
  • MTTR by team and severity: Which teams are fast/slow at remediation?
  • Pipeline gate blocks: How many builds are being blocked by security gates? What is the override rate?
  • Dependency vulnerability status: Known vulnerable dependencies by severity and EPSS.
  • AI code metrics: AI suggestion acceptance rates, AI-attributable defects, shadow AI detections.
  • Training completion and assessment scores by team: Which teams need attention?

Quarterly Program Reviews

Format: 60-minute meeting with stakeholders (engineering leadership, security leadership, product management).

Agenda:

  1. Program status (10 min): Key metrics, trends, highlights.
  2. Accomplishments (10 min): What was achieved this quarter? What milestones were hit?
  3. Challenges (10 min): What is not working? Where are we falling short?
  4. Metrics deep dive (15 min): Detailed analysis of key metrics and correlations.
  5. Next quarter plan (10 min): Priorities, resource needs, targets.
  6. Discussion (5 min): Questions, feedback, decisions needed.

Annual Maturity Assessment

Format: Formal SAMM or BSIMM reassessment.

Deliverable: Comprehensive report including:

  • Current maturity scores compared to baseline and previous year.
  • Progress against roadmap milestones.
  • Updated roadmap for the next year.
  • Budget request with ROI justification based on measured outcomes.
  • Industry benchmarking comparison.

10. Continuous Improvement Cycle

Metrics without action are just expensive dashboards. The continuous improvement cycle ensures that data drives change.

The Metrics-Action Loop

Measure → Analyze → Identify → Plan → Execute → Verify → Repeat
  1. Measure: Collect leading and lagging indicators per the metrics program.
  2. Analyze: Identify trends, anomalies, and correlations. Compare against targets and benchmarks.
  3. Identify: Determine what is working (reinforce), what is not working (address), and what is missing (add).
  4. Plan: Define specific improvement actions with owners, timelines, and success criteria.
  5. Execute: Implement the improvements.
  6. Verify: Measure the impact of the improvements. Did the metrics move in the expected direction?
  7. Repeat: Continuously cycle through the loop.

Common Improvement Actions

| Metric Signal | Possible Root Cause | Improvement Action |
| --- | --- | --- |
| Training completion low | Content not engaging, time not allocated | Gamify training, allocate sprint time, manager accountability |
| Assessment scores declining | Content stale, new technology gaps | Update content, add new modules, adjust difficulty |
| Vulnerability density increasing | New technology adoption, developer turnover | Targeted training, improved onboarding, tool configuration |
| MTTR increasing | Process bottleneck, capacity constraint | Automate triage, increase AppSec capacity, improve tooling |
| Pre-production detection low | Tool gaps, insufficient coverage | Add tools, expand scan scope, improve rule configuration |
| Recurrence rate high | Training not addressing root cause | Redesign training with hands-on labs, enforce SAST rules |
| AI defect rate high | Insufficient AI code review | Mandate human review for AI code, improve SAST for AI patterns |
| Champion churn high | Burnout, insufficient recognition | Reduce champion load, add recognition/rewards, management support |
| False positive rate high | Scanner misconfiguration | Tune rules, adjust thresholds, add context filters |
| SLA compliance declining | Priorities competing, unrealistic SLAs | Review SLAs, protect remediation time, executive support |

Avoiding Metric Gaming

Metrics that are tied to performance evaluations or compensation will be gamed. Common gaming behaviors:

  • Closing vulnerabilities without fixing them: Marking findings as “won’t fix” or “false positive” to improve MTTR. Counter with audit of closed findings.
  • Splitting large fixes into multiple tickets: Inflating “vulnerabilities fixed” count. Counter with deduplication logic.
  • Racing through training: Completing modules in minimum time without engagement. Counter with assessment scores as the primary metric, not completion alone.
  • Downgrading severity: Classifying High vulnerabilities as Medium to avoid SLA pressure. Counter with independent severity validation.
  • Counting work-in-progress as complete: Counter with verification requirements before closure.

The antidote to gaming is measuring outcomes (vulnerability density, production incidents, breach probability) rather than activities (tickets closed, training completed, scans run). Outcomes are harder to game because they reflect reality.
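
A minimal sketch of the closed-findings audit mentioned above, flagging non-fix closures that land suspiciously close to the SLA deadline. The export format and the two-day window are assumptions.

```python
# Sketch: audit closed findings for gaming signals -- "won't fix" or
# "false positive" closures just before the SLA deadline.

from collections import Counter

closed = [  # (team, resolution, days_open, sla_days) -- illustrative export
    ("payments", "fixed", 5, 7), ("payments", "wont_fix", 6, 7),
    ("payments", "false_positive", 7, 7), ("search", "fixed", 3, 7),
]

suspicious = Counter()
for team, resolution, days_open, sla in closed:
    # Closed without a fix within 2 days of the SLA deadline.
    if resolution != "fixed" and sla - days_open <= 2:
        suspicious[team] += 1

for team, count in suspicious.most_common():
    print(f"{team}: {count} non-fix closure(s) near SLA deadline -> sample for manual audit")
```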


Key Takeaways

  1. Metrics are infrastructure, not overhead. Without metrics, you cannot demonstrate improvement, justify investment, or make informed decisions.
  2. Use SAMM for prescriptive guidance and BSIMM for descriptive benchmarking. SAMM tells you what to do. BSIMM tells you what your peers are doing. Both are valuable.
  3. Lead with leading indicators, prove with lagging indicators. Leading indicators predict future outcomes. Lagging indicators confirm past results. You need both.
  4. AI-specific metrics are non-negotiable in 2026. AI vulnerability introduction rate (2.74x baseline), secret leakage rate (6.4% baseline), shadow AI detection, and AI-attributable defects must be tracked.
  5. Correlation builds the case for causation over time. Comparing vulnerability density between trained and untrained developers, tracking MTTR before and after training, and measuring recurrence rates after targeted training connect program activities to security outcomes.
  6. Dashboards must match the audience. Executives need 7 metrics with trends and narrative. Technical teams need detailed, real-time data. Quarterly reviews bring everyone together.
  7. Measure outcomes, not just activities. Activities can be gamed. Outcomes reflect reality. A 95% training completion rate means nothing if vulnerability density is increasing.

Practical Exercise

Exercise 1: SAMM Baseline Assessment

Using the OWASP SAMM assessment tool (owaspsamm.org), conduct a baseline assessment for your organization. Score each of the 15 practices. Document the current state, evidence supporting the score, and gaps identified. Present the results as a radar chart.

Exercise 2: Metrics Dashboard Design

Design an executive dashboard and a technical dashboard for your SSDLC program. For each metric:

  • Define the metric precisely (name, formula, data source).
  • Set a target with justification.
  • Define the reporting frequency.
  • Determine red/yellow/green thresholds.
  • Create a mock visualization.

Exercise 3: AI Metrics Baseline

Establish baseline measurements for your organization’s AI-specific metrics:

  • What is the current AI suggestion acceptance rate?
  • What is the current vulnerability rate in AI-generated code vs. human-written code?
  • How much shadow AI usage exists?
  • What percentage of developers have completed AI security training?

Identify the three highest-priority AI metrics for your organization and define improvement targets for the next quarter.

Exercise 4: Correlation Analysis

Select one leading indicator and one lagging indicator. Collect data for the past 6-12 months. Analyze the correlation between them. Present your findings: is the leading indicator predictive of the lagging indicator? What confounding factors might affect the relationship? What additional data would strengthen the analysis?


Study Guide

Key Takeaways

  1. Metrics serve four functions — Demonstrate ROI, identify trends, inform resource allocation, create accountability.
  2. SAMM is prescriptive, BSIMM is descriptive — SAMM tells you what to do; BSIMM tells you what peers actually do; both are valuable.
  3. SAMM has four maturity levels — Level 0 (Implicit), Level 1 (Initial), Level 2 (Managed), Level 3 (Optimized).
  4. AI-specific metrics are non-negotiable in 2026 — AI vulnerability rate (2.74x), secret leakage (6.4%), override rate, AI-attributable defects.
  5. Measure outcomes, not just activities — Activities can be gamed; outcomes reflect reality; 95% training completion means nothing if vulnerability density increases.
  6. Security champion ratio 1:10-20 developers — Every team represented; 12+ months tenure needed; <20% annual churn target.
  7. 70% fewer incidents with robust training — SANS/GIAC research; 3x less likely to suffer breach with exposure-based vulnerability prioritization (Tenable/Cyentia).

Important Definitions

| Term | Definition |
| --- | --- |
| OWASP SAMM | Software Assurance Maturity Model — prescriptive framework for assessing and improving security programs |
| BSIMM | Building Security In Maturity Model — descriptive model based on real-world data from 100+ organizations |
| Leading Indicator | Measures activities and behaviors predicting future security outcomes |
| Lagging Indicator | Measures actual security outcomes confirming past results |
| Vulnerability Density | Vulnerabilities per 1,000 lines of code; primary outcome metric |
| Detection Phase Distribution | Where vulnerabilities are found in the SDLC (design through external) |
| Security Champion | Developer serving as persistent security contact for their team |
| Correlation Analysis | Connecting program activities (leading) to security outcomes (lagging) |
| MTTR | Mean Time to Remediate by severity; key lagging indicator |
| Metric Gaming | Manipulating metrics without improving outcomes (e.g., closing vulns as “won’t fix”) |

Quick Reference

  • MTTR Targets: Critical 24h (industry median 5-7d), High 7d (30d), Medium 30d (90d), Low 90d (180d+)
  • Ideal Detection Distribution: Design 15-20%, Development 30-40%, Build 20-30%, Test 5-10%, Production <5%, External <2%
  • Executive Dashboard: Max 7 metrics, every metric has trend + context, includes narrative
  • AI Metrics: Acceptance rate (~30%), vulnerability rate (2.74x), mutation score (>60%), override rate (30-50%), secret leakage (<1%)
  • Common Pitfalls: Measuring activities not outcomes, too many metrics on dashboards, no correlation analysis, gaming via “won’t fix” closures, generic training without role-specific content

Review Questions

  1. Explain the difference between SAMM and BSIMM, and describe a scenario where you would use each.
  2. Design an executive dashboard with 7 or fewer metrics that demonstrates SSDLC program value, including trends and thresholds.
  3. How would you correlate training completion (leading) with vulnerability density (lagging) to prove training ROI?
  4. Your organization’s AI code comprises 30% of the codebase but 60% of defects — which metrics would you examine and what actions would you take?
  5. Describe three common metric gaming behaviors and how measuring outcomes instead of activities prevents each.
Knowledge Check
Q1. What is the difference between OWASP SAMM and BSIMM?

Q2. What are the four OWASP SAMM maturity levels?

Q3. According to Veracode 2025 data, at what rate does AI-generated code introduce vulnerabilities compared to human-written code?

Q4. What is the ideal percentage of vulnerabilities that should be discovered in production in a mature program?

Q5. What is the leading indicator for measuring security culture in an organization?

Q6. What are the four functions metrics serve according to the module?

Q7. What is the recommended Security Champion coverage ratio?

Q8. What is the most common way organizations game vulnerability management metrics?

Q9. How many metrics should be on an executive dashboard at most?

Q10. What correlation research shows approximately 70% fewer security incidents?

Answered: 0 of 10 · Score: 0/0 (0%)