Gain insight into some of the details of the OWASP Top 10 Call for Data and industry survey, and what we were attempting to learn. Hear what was learned from collecting and analyzing widely varying industry data and from attempting to build a dataset for comparison and analysis. This talk discusses tips and common pitfalls for structuring vulnerability data and the subsequent analysis. Learn what the data can tell us and what questions are still left unanswered. Uncover some of the differences in collecting metrics at different stages of the software lifecycle, along with recommendations for handling them.
Wrangling OWASP Top 10 Data at BSides Pittsburgh (PGH)
2. OWASP TOP 10 OVERVIEW
- First version was released in 2003
- Updated in 2004, 2007, 2010, 2013, 2017
- Started as an awareness document
- Now widely considered the global baseline
- Is a standard for vendors to measure against
3. OWASP TOP 10-2017 RC1
- April 2017
- Controversy over the first release candidate
- Two new categories in RC1:
  - A7 Insufficient Attack Protection
  - A10 Underprotected APIs
- Social media got ugly
4. BLOG POSTS
- Decided to do a little research and analysis
- Reviewed the history of Top 10 development
- Analyzed the public data
- Wrote two blog posts
5. DATA COLLECTION
- Original desire was for full public attribution
- This meant many contributors didn't participate
- Contributions ended up coming mostly from consultants and vendors
- Hope to figure out a better way for 2020
6. HUMAN-AUGMENTED TOOLS (HAT) VS. TOOL-AUGMENTED HUMANS (TAH)
- Frequency of findings
- Context (or lack thereof)
- Natural curiosity
- Scalability
- Consistency
13. OWASP SUMMIT JUNE 2017
- Original leadership resigned right before the Summit
- I was there for the SAMM working sessions
- The Top 10 had working sessions as well
- Asked to help with data analysis for the Top 10
14. OWASP TOP 10-2017
- New plan:
  - Expanded data call, one of the largest ever at ~114k applications
  - Industry survey to select 2 of the 10 categories
  - Fully open process on GitHub
  - Actively translating into multiple languages: en, es, fr, he, id, ja, ko
18. DATA CALL RESULTS
- A change from frequency to incidence rate
- Extended data call added more contributions: Veracode, Checkmarx, Micro Focus (Fortify), Synopsys, Bugcrowd
- Data for over 114,000 applications
22. DATA CALL RESULTS
- Percentage of submitting organizations that found at least one instance in that vulnerability category
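
That metric is easy to reproduce once submissions are in a common shape. Below is a minimal sketch in Python, assuming a hypothetical list of per-organization submissions with per-CWE finding counts; the organization names, field names, and numbers are illustrative, not the actual data call format.

```python
from collections import defaultdict

# Hypothetical submissions: one entry per contributing organization,
# with the number of applications in which each CWE category was found.
submissions = [
    {"org": "VendorA",      "findings": {"CWE-79": 120, "CWE-89": 45}},
    {"org": "ConsultancyB", "findings": {"CWE-79": 3,   "CWE-287": 7}},
    {"org": "VendorC",      "findings": {"CWE-89": 0,   "CWE-287": 12}},
]

def incidence_by_org(subs):
    """Percentage of submitting organizations that found at least one
    instance in each vulnerability category (the metric on this slide)."""
    orgs_with_hit = defaultdict(int)
    for sub in subs:
        for cwe, count in sub["findings"].items():
            if count > 0:
                orgs_with_hit[cwe] += 1
    return {cwe: 100.0 * hits / len(subs) for cwe, hits in orgs_with_hit.items()}

print(incidence_by_org(submissions))
# Roughly: {'CWE-79': 66.7, 'CWE-89': 33.3, 'CWE-287': 66.7}
```

Note how raw frequency would rank CWE-79 far ahead of CWE-287 in this toy data, while the incidence-style view treats them the same; that is the kind of shift the change from frequency to incidence rate produces.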
23. WHAT CAN THE DATA TELL US
- Humans still find more diverse vulnerabilities
- Tools only look for what they know about
- Tools can scale on a subset of tests
- You need both
- We aren't looking for everything
24. WHAT CAN THE DATA NOT TELL US
- Is a language or framework more susceptible?
- Are the problems systemic or one-off?
- Is developer training effective?
- Are IDE plug-ins effective?
- How unique are the findings?
- Consistent mapping?
- Still only seeing part of the picture
25. VULN DATA IN PROD VS TESTING
[Chart: Number of Vulnerabilities in Production]
26. VULN DATA IN PROD VS TESTING
[Chart: Security Defects in Testing]
27. VULN DATA STRUCTURES
- CWE reference
- Related app
- Date
- Language/framework
- Point in the process found
- Severity (CVSS/CWSS/something)
- Verified
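
One way to keep those fields consistent across contributors is to define a record type up front. Here is a minimal sketch using Python dataclasses; the class name, field names, and Phase values simply mirror the list above and are not an official submission schema.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class Phase(Enum):
    """Point in the process where the finding was identified."""
    DESIGN = "design"
    DEVELOPMENT = "development"
    TESTING = "testing"
    PRODUCTION = "production"

@dataclass
class VulnRecord:
    cwe_id: str              # CWE reference, e.g. "CWE-79"
    app_id: str              # related app (anonymized identifier)
    found_on: date           # date the finding was recorded
    language: str            # language/framework, e.g. "Java/Spring"
    phase: Phase             # point in the process found
    severity: str            # CVSS/CWSS vector or score, kept as a plain string here
    verified: bool = False   # has a human confirmed the finding?

# Example record (illustrative values only)
record = VulnRecord("CWE-89", "app-0042", date(2017, 6, 1),
                    "Java/Spring", Phase.TESTING, "CVSS:3.0 7.5", True)
```

Keeping severity as a raw string avoids forcing every contributor into one scoring system, at the cost of normalizing it later during analysis.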
29. WHAT ABOUT TRAINING DATA?
- How are you measuring training?
- Are you correlating data from training to testing automation? (see the sketch after this list)
- Can you track down to the dev?
- Do you know your Top 10?
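
If findings can be tracked down to the developer, correlating training with later testing results is a simple before/after comparison. A minimal sketch, assuming hypothetical training-completion dates and finding records keyed by developer ID; none of this reflects a real data set.

```python
from datetime import date

# Hypothetical records: training completion date and test findings per developer.
training_completed = {"alice": date(2017, 3, 1), "bob": date(2017, 4, 15)}
findings = [
    {"dev": "alice", "cwe": "CWE-89", "found": date(2017, 2, 10)},
    {"dev": "alice", "cwe": "CWE-89", "found": date(2017, 6, 20)},
    {"dev": "bob",   "cwe": "CWE-79", "found": date(2017, 5, 2)},
]

def before_after_counts(training_completed, findings):
    """Per developer, count findings logged before vs. after their training date."""
    counts = {dev: {"before": 0, "after": 0} for dev in training_completed}
    for f in findings:
        trained_on = training_completed.get(f["dev"])
        if trained_on is None:
            continue  # no training record for this developer; skip the comparison
        counts[f["dev"]]["before" if f["found"] < trained_on else "after"] += 1
    return counts

print(before_after_counts(training_completed, findings))
# {'alice': {'before': 1, 'after': 1}, 'bob': {'before': 0, 'after': 1}}
```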
30. WHAT CAN YOU DO?
- Think about what story to tell, then figure out what data is needed to tell that story
- Structure your data collection
- Keep your data as clean and accurate as possible (see the sketch after this list)
- Write stories
- Consider contributing to Top 10 2020
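
Structuring the collection and keeping it clean mostly comes down to rejecting malformed records early. Below is a minimal sketch of one possible validation pass over dict-shaped records; the required-field list and the CWE format check are illustrative assumptions, not a prescribed standard.

```python
import re

REQUIRED_FIELDS = {"cwe_id", "app_id", "found_on", "phase"}  # assumed minimum
CWE_PATTERN = re.compile(r"^CWE-\d+$")

def clean(records):
    """Split records into kept (required fields present, well-formed CWE id) and rejected."""
    kept, rejected = [], []
    for rec in records:
        ok = REQUIRED_FIELDS <= rec.keys() and CWE_PATTERN.match(str(rec.get("cwe_id", "")))
        (kept if ok else rejected).append(rec)
    return kept, rejected

kept, rejected = clean([
    {"cwe_id": "CWE-79", "app_id": "app-1", "found_on": "2017-06-01", "phase": "testing"},
    {"cwe_id": "xss", "app_id": "app-2"},  # bad CWE id and missing fields -> rejected
])
print(len(kept), len(rejected))  # 1 1
```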