The OWASP Benchmark is a test suite designed to evaluate the speed, coverage, and accuracy of automated vulnerability detection tools. Without a way to measure these tools, it is difficult to understand their strengths and weaknesses or to compare them with each other. The Benchmark contains thousands of test cases that are fully runnable and exploitable. The scorecard below shows how the commercial tools collectively performed against versions 1.1 and 1.2 of the Benchmark. For each vulnerability category, it shows the lowest, average, and highest scores across all the commercial tools included in this scorecard calculation.
For more information, please visit the OWASP Benchmark Project Site.
Vulnerability Category | Low Tool Type | Low Score | Ave Score | High Score | High Tool Type |
---|---|---|---|---|---|
Command Injection | SAST | 0 | 16 | 29 | SAST |
Cross-Site Scripting | SAST | 9 | 23 | 39 | SAST |
Insecure Cookie | SAST | 0 | 33 | 100 | SAST |
LDAP Injection | SAST | 0 | 17 | 54 | SAST |
Path Traversal | SAST | 1 | 19 | 35 | SAST |
SQL Injection | SAST | 9 | 24 | 34 | SAST |
Trust Boundary Violation | SAST | 0 | 8 | 16 | SAST |
Weak Encryption Algorithm | SAST | 0 | 39 | 74 | SAST |
Weak Hash Algorithm | SAST | 0 | 38 | 77 | SAST |
Weak Random Number | SAST | 0 | 36 | 90 | SAST |
XPath Injection | SAST | 0 | 27 | 59 | SAST |
Average across all categories for 6 tools | | 1.7 | 25.5 | 55.2 | |

Term | Definition |
---|---|
Tool Type | SAST - Static Application Security Testing. DAST - Dynamic Application Security Testing. IAST - Interactive Application Security Testing. These terms were coined by Gartner. |
True Positive (TP) | Tests with real vulnerabilities that were correctly reported as vulnerable by the tool. |
False Negative (FN) | Tests with real vulnerabilities that the tool failed to report as vulnerable. |
True Negative (TN) | Tests with fake vulnerabilities that were correctly not reported as vulnerable by the tool. |
False Positive (FP) | Tests with fake vulnerabilities that were incorrectly reported as vulnerable by the tool. |
True Positive Rate (TPR) = TP / ( TP + FN ) | The rate at which the tool correctly reports real vulnerabilities. Also known as recall. |
False Positive Rate (FPR) = FP / ( FP + TN ) | The rate at which the tool incorrectly reports fake vulnerabilities as real. |
Score = TPR - FPR | Normalized distance from the random guess line, which is the line where TPR equals FPR. A score of 0 means the tool does no better than guessing. |
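
The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the Benchmark's own scoring code: the `CategoryResult` class and the counts used are hypothetical, and it assumes the scorecard reports Score as a percentage (TPR - FPR multiplied by 100).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategoryResult:
    """Hypothetical container for one tool's counts in one vulnerability category."""
    tp: int  # real vulnerabilities the tool reported
    fn: int  # real vulnerabilities the tool missed
    tn: int  # fake vulnerabilities the tool correctly did not report
    fp: int  # fake vulnerabilities the tool reported anyway

    def tpr(self) -> float:
        """True Positive Rate = TP / (TP + FN)."""
        real = self.tp + self.fn
        return self.tp / real if real else 0.0

    def fpr(self) -> float:
        """False Positive Rate = FP / (FP + TN)."""
        fake = self.fp + self.tn
        return self.fp / fake if fake else 0.0

    def score(self) -> float:
        """Score = TPR - FPR, in the range -1.0 to 1.0."""
        return self.tpr() - self.fpr()


# Made-up counts for illustration only; not taken from any real tool.
example = CategoryResult(tp=60, fn=40, tn=70, fp=30)
print(f"TPR   = {example.tpr():.0%}")
print(f"FPR   = {example.fpr():.0%}")
print(f"Score = {example.score():.0%}")
```

Under these definitions, a tool that flags every test case as vulnerable has TPR = 1 and FPR = 1, so its score is 0, the same as random guessing. Only tools that report real vulnerabilities more often than fake ones score above 0.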