- Collect the CVEs recorded in the SAP-kb project. Each entry has the patch commits, including the actual source files that were changed.
- Then look up the version ranges using the GHSA API (or, if this does not work, Snyk or NVD). Open: the query needs CVEs mapped to GHSA ids. We can clone https://github.com/github/advisory-database/ and then get this info from the "aliases" field of the JSON entries.
- This would give us the vulnerable and fixed version ranges, which we can use to locate the built jars in Maven Central. The advantage of this approach is that we avoid building from source, which is known to be tricky at scale (todo: look up Crista Lopes' work for a supporting ref).
- Caveat: the changes between the latest vulnerable release and the fixed release may be much larger than the patch commit itself.
- Issue 1: more classes could be touched. This can be mitigated by using the project-kb data, which records the classes modified in the security patch.
- Issue 2: the classes modified in the security patch might have further changes applied to them in other commits between the two versions. There is nothing we can do here other than argue that this won't have a major impact.
- SAP-KB entry URL
- GHSA entry URL
- CVE
- commit URL
- latest vulnerable version GAV
- fixed version GAV
- source paths affected in commit (from sap-kb)
- relative path to latest vulnerable version jar (downloaded from Maven Central)
- relative path to fixed version jar (downloaded from Maven Central)
- path of affected .class file relative to latest vulnerable version jar
- name of affected .class file
- Named inner classes are separated out into separate records (rows within the CSV file).
- Anonymous inner classes are considered to be part of their parent class.
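
The open question about mapping CVEs to GHSA ids could be handled roughly as follows. This is a minimal sketch, assuming a local clone of the advisory database whose JSON entries carry an "id" and an "aliases" list; the function name `build_cve_to_ghsa_index` and the directory argument are hypothetical.

```python
import json
from pathlib import Path


def build_cve_to_ghsa_index(advisory_root):
    """Map each CVE id to the GHSA ids that list it under "aliases".

    advisory_root is assumed to be a local clone of
    github/advisory-database (a directory tree of JSON advisories).
    """
    index = {}
    for json_file in Path(advisory_root).rglob("*.json"):
        try:
            entry = json.loads(json_file.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or malformed files
        if not isinstance(entry, dict):
            continue
        ghsa_id = entry.get("id", "")
        if not ghsa_id.startswith("GHSA-"):
            continue
        for alias in entry.get("aliases", []):
            if alias.startswith("CVE-"):
                # A CVE may in principle be aliased by more than one advisory.
                index.setdefault(alias, []).append(ghsa_id)
    return index
```

Building the index once and looking up each CVE locally avoids one API call per CVE; the version ranges themselves would still have to be read from the matched entries.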
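
Locating the built jars for the two GAV fields could look like the sketch below. It assumes GAVs are written as `groupId:artifactId:version` without a classifier (the notes do not fix a format) and relies on the standard Maven Central repository layout; `jar_url` is a hypothetical helper name.

```python
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def jar_url(gav, base_url=MAVEN_CENTRAL):
    """Build the download URL of the main jar for a GAV string.

    Assumes the form "groupId:artifactId:version" (no classifier).
    """
    group_id, artifact_id, version = gav.split(":")
    # In the repository layout, dots in the groupId become path separators.
    group_path = group_id.replace(".", "/")
    return (
        f"{base_url}/{group_path}/{artifact_id}/{version}/"
        f"{artifact_id}-{version}.jar"
    )
```

For example, `jar_url("org.apache.commons:commons-lang3:3.12.0")` yields `https://repo1.maven.org/maven2/org/apache/commons/commons-lang3/3.12.0/commons-lang3-3.12.0.jar`.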
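
The two inner-class rules above can be approximated from the .class entry names inside a jar. The sketch below is a heuristic, not a definitive implementation: it assumes javac's usual naming, where anonymous classes appear as `Outer$1.class`, `Outer$2.class` and so on, and it treats only purely numeric `$` segments as anonymous. Local classes (for example `Outer$1Local.class`) are not covered by the notes and are left as their own records here. The function names are hypothetical.

```python
def owning_record(class_entry):
    """Return the .class entry whose record this entry belongs to.

    Named inner classes keep their own record; anonymous classes
    (trailing numeric "$" segments) are folded into their parent.
    """
    if not class_entry.endswith(".class"):
        raise ValueError(f"not a .class entry: {class_entry}")
    directory, _, file_name = class_entry.rpartition("/")
    parts = file_name[: -len(".class")].split("$")
    # Strip trailing numeric segments, e.g. Bar$Inner$1 -> Bar$Inner.
    while len(parts) > 1 and parts[-1].isdigit():
        parts.pop()
    owner = "$".join(parts) + ".class"
    return f"{directory}/{owner}" if directory else owner


def group_by_record(class_entries):
    """Group jar entries by the record (CSV row) they belong to."""
    records = {}
    for entry in class_entries:
        records.setdefault(owning_record(entry), []).append(entry)
    return records
```

With this grouping, `com/foo/Bar$1.class` is attached to the row for `com/foo/Bar.class`, while `com/foo/Bar$Inner.class` gets a row of its own.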