OWASP TOP 10 Data Call Submission

The OWASP community is in the process of finalising its effort to update the TOP 10 since its last publication in 2013. It's a perfect opportunity for us to contribute back to the community. We had a close look at our vulnerability data set and submitted the raw version to OWASP, which plans to publish it by early 2017 at the latest. Besides submitting the questionnaire, we would also like to share our views on the vulnerability data set that was part of our submission and the conclusions that can be drawn from it.

Data set background 

Statistics are great, but without background on how a specific data set has been compiled, they can be misleading. The requirement to publish the raw data set for the OWASP TOP 10 is definitely a step forward. It will provide greater transparency for the AppSec community, allow individuals and organisations to review the submissions and, maybe in the future, let them derive additional useful conclusions beyond the TOP 10.

First, let’s look at the context of the data set from which the statistics below are generated. It was collected throughout 2015, predominantly across the Finance, Government, Technology and Telecommunication industries. Vantage Point is a Singapore-based company, so the majority of the findings come from local assessments in Singapore and the SEA region. Using the terminology of the OWASP submission questionnaire, our AppSec assessment methodology is best described as manual expert penetration testing and/or code review, coupled with commercial as well as free DAST and SAST tools. In terms of risk, we only include vulnerabilities that have a direct impact on the system and score above zero under CVSSv2; observational findings with no risk are therefore excluded from the data set.

Counting numbers

The graph below shows our top vulnerability types, counted by how often instances occur across the entire data set. As expected, Cross Site Scripting leads the pack: Reflected and Stored XSS together account for more than a quarter of the pie. Using Components with Known Vulnerabilities in our case refers to outdated 3rd party software or libraries. Our AppSec assessments include application source code most of the time, and we frequently see outdated and vulnerable libraries being used in applications. Managing those 3rd party libraries in the SDLC is definitely a pain point for a lot of organisations. Other vulnerability types such as SQL Injection, authentication and authorization issues, and Information Exposure still occur fairly frequently, which is not surprising. The label Others covers vulnerability types that each account for less than 3% of the data set.


Vulnerability count based on instances per application

Why 1+1 is not always 2 

Analysing the data set reveals some really interesting aspects that even we, who feed vulnerabilities into our vulnerability database on a daily basis, were not aware of. An important consideration when you work with vulnerability metrics is how to count vulnerabilities. The question seems trivial, but it's an important one to ask and understand. Consider the following simplified example: a penetration test reveals 10 instances of Cross Site Scripting, 5 instances of Insecure Direct Object Reference, 3 instances of Information Disclosure and 2 instances of SQL Injection in a web application. Ranked purely by instance count, SQL Injection does not make the TOP 3 of the tested web application. Yet for an attacker, one instance of SQL Injection or one insecure File Upload is usually enough; you don't need 10 to compromise an application. Therefore, counting each vulnerability type once per application in which it occurs, regardless of how many instances there are, reveals a slightly different and I would say richer picture. Vulnerability types such as insecure File Uploads, Cleartext Transmission of Sensitive Information and Insufficient Anti-automation make it into the statistics. In general, it is important to be mindful of the fact that counting raw numbers does not always reveal all aspects of an underlying vulnerability data set.
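The two counting methods can be compared with a minimal Python sketch. The data is purely illustrative: "app-a" mirrors the simplified example above, while "app-b" and all application names are hypothetical and do not come from our data set.

```python
from collections import Counter

# Hypothetical findings: (application, vulnerability type, instance count).
# "app-a" mirrors the simplified example in the text; "app-b" is invented.
FINDINGS = [
    ("app-a", "Cross Site Scripting", 10),
    ("app-a", "Insecure Direct Object Reference", 5),
    ("app-a", "Information Disclosure", 3),
    ("app-a", "SQL Injection", 2),
    ("app-b", "Cross Site Scripting", 6),
    ("app-b", "SQL Injection", 1),
    ("app-b", "Insecure File Upload", 1),
]


def count_instances(findings):
    """Sum every reported instance of each vulnerability type."""
    totals = Counter()
    for _app, vuln_type, instances in findings:
        totals[vuln_type] += instances
    return totals


def count_applications(findings):
    """Count each vulnerability type once per affected application."""
    apps_by_type = {}
    for app, vuln_type, _instances in findings:
        apps_by_type.setdefault(vuln_type, set()).add(app)
    return Counter({t: len(apps) for t, apps in apps_by_type.items()})


if __name__ == "__main__":
    print("By instances:   ", count_instances(FINDINGS).most_common())
    print("By applications:", count_applications(FINDINGS).most_common())
```

In this toy data, SQL Injection has fewer instances than Insecure Direct Object Reference but affects more applications, which is the kind of shift the per-application view brings out.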


Vulnerability type only counted one time per application regardless of instance count

Another interesting metric that can be derived from the data set is the likelihood of discovering a specific vulnerability type during an AppSec assessment. Cross Site Scripting (Reflected and Stored) is again leading the way here. It can be observed in more than half of all AppSec assessments.
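One way to compute such a likelihood is to divide the number of assessments in which a type was found at least once by the total number of assessments. The sketch below assumes that definition; the assessment names and findings are hypothetical, not figures from our data set.

```python
from collections import Counter

# Hypothetical input: assessment name -> vulnerability types found in it.
# A type may appear several times if several instances were reported.
ASSESSMENTS = {
    "assessment-1": ["Cross Site Scripting", "Cross Site Scripting", "SQL Injection"],
    "assessment-2": ["Cross Site Scripting", "Information Exposure"],
    "assessment-3": ["Information Exposure"],
    "assessment-4": ["Cross Site Scripting", "Insecure File Upload"],
}


def discovery_likelihood(assessments):
    """Return the fraction of assessments in which each type occurs at least once."""
    total = len(assessments)
    if total == 0:
        return {}
    seen_in = Counter()
    for vuln_types in assessments.values():
        # set() ensures a type is counted once per assessment.
        seen_in.update(set(vuln_types))
    return {vuln_type: count / total for vuln_type, count in seen_in.items()}


if __name__ == "__main__":
    for vuln_type, share in sorted(discovery_likelihood(ASSESSMENTS).items()):
        print(f"{vuln_type}: {share:.0%}")
```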


Considering risk 

Another good example of why different vulnerability metrics are important is risk. Being able to categorise vulnerability types by risk category is very important for organisations when they coordinate their remediation efforts. Resources, especially skilled information security experts, are scarce in organisations. Therefore, being able to prioritise efforts based on risk is crucial, because in most cases there will be more vulnerabilities in the backlog than hands available to fix them. Another aspect is the mitigation of vulnerabilities, taking into account when in the SDLC they occur. This also requires prioritisation, and typically higher-risk issues need to be addressed first.

All vulnerabilities in the presented data set are scored with CVSSv2. v2 has several drawbacks, which I discussed in a previous article a few weeks back. Oftentimes, vulnerability types are scored as medium-risk findings even though their real-world impact differs substantially. In the current data set, more than 60% of the vulnerability types score as medium risk and about 30% as low, leaving less than 9% as high-risk findings. CVSSv3 should address that issue by introducing the critical-risk category and by improving the calculation formula to give more spread between vulnerability risks. This way, the majority of vulnerabilities are not crammed into one category. Hopefully, in the future, vulnerability data sets that rely on v3 scoring will reveal new relevant trends and statistics in the respective risk categories.
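To illustrate the difference in categories, the sketch below maps a base score to a severity label using the commonly used NVD ranges for v2 (Low 0.0-3.9, Medium 4.0-6.9, High 7.0-10.0) and the qualitative rating scale of the CVSSv3 specification (None 0.0, Low 0.1-3.9, Medium 4.0-6.9, High 7.0-8.9, Critical 9.0-10.0). It only shows the banding: the function names are our own, and since v2 and v3 scores are calculated with different formulas, a v2 score cannot simply be relabelled as a v3 score.

```python
def _check_score(score):
    """Reject scores outside the 0.0-10.0 range used by CVSS."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")


def severity_v2(score):
    """Severity label for a CVSSv2 base score (NVD ranges)."""
    _check_score(score)
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    return "High"


def severity_v3(score):
    """Severity label for a CVSSv3 base score (qualitative rating scale)."""
    _check_score(score)
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


if __name__ == "__main__":
    # Illustrative scores only, not taken from our data set.
    for score in (2.6, 5.0, 7.5, 9.3):
        print(score, severity_v2(score), severity_v3(score))
```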