Tom Alrich published a blog last year titled “The Global Vulnerability Database won’t be a ‘database’ at all”. It is basically his outline for how to make an international database that many can contribute to, to replace the inadequate CVE / NVD database. He said he welcomes any comments, and when it comes to vulnerability databases (VDBs), that’s my jam. Buckle up, this may be a long one. Note that I will use ‘GVD’ to refer to Tom’s idea, despite that not being an official or even truly proposed name.

GVD Organization
Tom starts out suggesting that this database be open source, managed by a 501(c)(3) non-profit, and open to commercial financial support. Does that sound familiar to anyone? Perhaps it is too old for some these days, but that is precisely what the Open Sourced Vulnerability Database (OSVDB) was, and it was even managed by the 501(c)(3) Open Security Foundation (OSF) (note: I was heavily involved in both). So what happened to it? It went away and underwent a metamorphosis into a commercial, subscription VDB (note: I am heavily involved in VulnDB). Why? Because getting support from both volunteers and companies was next to impossible. Everyone wanted this free, often superior, data but virtually no one wanted to work or pay for it. So the project just couldn’t sustain itself.
So the first part of this idea is already at risk. Granted, with enough time passing and the community now seeing that we were right all along, that CVE is not viable, perhaps it will work. That said, I will give a strong caution that if you move forward with this, plan for the worst. Expect it to be difficult to find volunteers, expect it to be difficult to get reliable funding, expect people to use the data against whatever license you declare, and expect there to be disagreements with how it moves forward. How will this be managed? Who has ultimate say? What VDB expertise will be at hand? Who will actually do the daily grind of aggregating and analyzing vulnerability disclosures? What’s the plan if your setup can’t keep up with the rapidly increasing disclosures? Just a few things to think about beforehand.
CPE and the Intro
This part comes from the second paragraph in the introduction, not the proposal, but it is important to speak to.
Specifically, a new identifier is needed for proprietary software, since I (and others) regard CPE as a dead end, even though it was pioneering in its time.
I have mixed feelings here, and lean more toward agreeing overall, but want to point out a few details that give this more context. First, the biggest problem with the Common Platform Enumeration (CPE) in my eyes is that it is singularly controlled by NVD. No one else can add official CPE at this time, although at the recent VulnCon, Tanya Brewer (director of NVD) said that they do plan to open it up somewhat so that CVE Numbering Authorities (CNAs) can submit. That’s a good start but still severely limits CPE in general. The CNAs are already submitting vulns to CVE, which get passed to NVD, which assigns the CPEs. What about the over 100,000 vulnerabilities without a CVE? Companies that track those have to use unofficial CPEs and then deal with the nightmare of trying to manage them all when NVD eventually gets around to creating a CPE for a product they already tracked.
Next, the actual format of CPE naming is too loosely defined, so two professionals can derive two different strings from the same vulnerability disclosure. Consider MITRE’s example of “cpe:2.3:a:microsoft:internet_explorer:8.0.6001:beta:*:*:*:*:*:*”, which could trivially be interpreted by another as “cpe:2.3:a:microsoft:internet_explorer:8.0.6001-beta:*:*:*:*:*:*:*”. Now those strings don’t match, and that has a cascading effect when trying to match a vulnerability to an asset programmatically. I consistently see vendors refer to their own product in three or more ways, with different capitalization, dashes, spaces, and more. Trying to standardize that requires an open standard, and one that anyone can contribute CPE to. So as Tom says, CPE is likely dead pending an overhaul or ‘fork’ of the standard that is then executed better.
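To make the matching problem concrete, here is a minimal Python sketch of why two plausible CPE strings for the same product defeat exact matching. The strings mirror the Internet Explorer example above; the naive colon-splitting parser is my own illustration, not any official CPE library, and it ignores details like escaped colons that real parsers must handle.

```python
# Two CPE 2.3 strings a human might reasonably derive from the same
# disclosure: one puts "beta" in the update field, the other folds it
# into the version field.
cpe_a = "cpe:2.3:a:microsoft:internet_explorer:8.0.6001:beta:*:*:*:*:*:*"
cpe_b = "cpe:2.3:a:microsoft:internet_explorer:8.0.6001-beta:*:*:*:*:*:*:*"

def cpe_fields(cpe: str) -> dict:
    """Split a CPE 2.3 string into its named components (naive split on
    ':'; real parsers must also handle escaped colons)."""
    names = ["part", "vendor", "product", "version", "update", "edition",
             "language", "sw_edition", "target_sw", "target_hw", "other"]
    return dict(zip(names, cpe.split(":")[2:]))

# Exact string comparison -- the basis of most programmatic asset
# matching -- treats these as two different products.
print(cpe_a == cpe_b)                # False
print(cpe_fields(cpe_a)["version"])  # 8.0.6001
print(cpe_fields(cpe_b)["version"])  # 8.0.6001-beta
```

Both strings describe the same beta build, but any tool keying on the exact string, or even just the version field, will silently miss one of them.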

Identifying the Problem
Currently, there is no easy way to identify vulnerabilities of all types (CVE, OSV, etc.) that apply to a single software product or component of a software product.
I’ll have to disagree with the very first line of this proposal, because either I do not understand the meaning, or it is simply wrong in my eyes. CVE has the capability to do this for any software. OSV has the capability to do it for open source software (OSS). It’s extremely simple; there is a unique identifier that goes with a vulnerability. That is the easy way! What I hope you meant is “the rest of it”, which is the actual aggregation, analysis, and normalization of that data. That is what is extremely difficult to do, especially at the scale we’re seeing these days.
Also, because of the naming problem, there is no easy way to identify all products affected by a particular vulnerability. Achieving either of these goals requires multiple database searches and manual correlation of the results; even after doing that, there is no guarantee that the user will be able to achieve either goal.
This too is wrong, of sorts. I think it is basically a post hoc logical fallacy, even though the first premise isn’t stated explicitly in the same paragraph. In short, CVE / NVD have set the stage, and Tom calls CVE “by far the dominant [vulnerability] type” in the introduction. He then goes on to say there is a naming problem and a requirement of going to multiple databases to get information. So CVE / NVD become the problem, and he then argues that there is “no easy way” to do all of this. Just because the current system is bad doesn’t mean there isn’t a better solution, and it doesn’t mean it is necessarily any more difficult than what others face right now.
The CPE <-> PURL Dilemma
Rather than quoting a significant amount of text, I will point to the third paragraph of the proposal that starts with “There needs to be a globally accessible vulnerability database…”. Part of what Tom describes is understandable and the frustration is real. However, he starts out saying “this simply can’t be done now” and ends with a proposal that kind of does it. My issue here is that it can be done now, despite being painful, much as he proposed. Instead of a direct CPE <-> PURL mapping, he proposes a CPE <-> Vuln ID <-> PURL mapping, even if indirectly; that is what his solution amounts to in the space of two paragraphs. If you can pull off that mapping, then you have the initial mapping you want, right?
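As a rough sketch of what that indirect mapping amounts to, here is a toy Python example of pivoting from a CPE to a purl through the vulnerability ID. The vulnerability ID, product, CPE, and purl below are all invented for illustration; real records would come from NVD and an OSS database.

```python
# Hypothetical vulnerability records: each vuln ID carries both the
# CPE(s) a database like NVD assigned and the purl(s) an OSS database
# uses. All identifiers here are made up.
records = {
    "CVE-0000-0001": {
        "cpes": ["cpe:2.3:a:example:widget:1.2.0:*:*:*:*:*:*:*"],
        "purls": ["pkg:pypi/example-widget@1.2.0"],
    },
}

def cpe_to_purls(cpe: str, db: dict) -> list:
    """Derive a CPE -> purl mapping by pivoting through the vuln ID:
    find every record listing this CPE and collect its purls."""
    return [purl
            for rec in db.values()
            if cpe in rec["cpes"]
            for purl in rec["purls"]]

print(cpe_to_purls("cpe:2.3:a:example:widget:1.2.0:*:*:*:*:*:*:*", records))
# ['pkg:pypi/example-widget@1.2.0']
```

Once the CPE <-> Vuln ID <-> PURL links exist, the “impossible” direct CPE <-> PURL mapping falls out of a simple join like this one.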
In fact, the data don’t even need to reside in a single database.
This line is a bit baffling, since Tom opens the proposal griping that the current system requires multiple databases and even then “there is no guarantee that the user will be able to achieve either goal.”
The various constituent databases (NVD, OSV, OSS Index, etc.) can simply be referenced through a single smart query engine, which is titled the “Global Vulnerability Database” (GVD).
Amusingly, this is exactly what CVE aimed to be. For those in the industry familiar with CVE, MITRE has always claimed it is a dictionary, not a database. I still take exception to their choice of terminology, but the implementation is more of an index if anything. Regardless, MITRE had hoped to be that single place that binds the references to everyone else. Unfortunately, that went out the window long ago; they make no effort to do so today and haven’t for over a decade. In the modern day, doing that correctly becomes extremely challenging, but if that is really all you are doing, then MITRE has more than enough resources to do it.

Building on my reply to the quote above, and the paragraph it is found in along with the subsequent paragraph, I really walk away with a feeling of a certain South Park episode regarding gnomes and a business plan. Alrich starts his premise by saying other databases simply aren’t enough, that what he wants can’t be done, and ends up proposing something that is half there while the other half is a big question mark.
In his model, the GVD would be a query engine that would tap into NVD, OSV, Sonatype’s OSS Index, and more. He goes on to say that it is possible to improve CVE by adding PURLs, among other unnamed improvement projects. The proposal is rounded out by saying that the GVD could query all of this without degrading current capability, all the while leveraging e.g. the CVE JSON 5.1 spec to bring it closer to being that universal VDB. These are some of the moving pieces, yes, but I feel there is a lot of material that isn’t spoken to yet is required to make this even close to reality.
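The “query engine” piece itself can be sketched in a few lines of Python. The source names below are real databases, but the records and IDs are invented, and a real implementation would call each source’s API over the network rather than look up in-memory dicts; this only shows the fan-out shape, not the hard parts (normalization, rate limits, differing schemas).

```python
# Toy federated lookup over multiple constituent databases, each keyed
# by its own identifier scheme. Records and IDs are invented for
# illustration only.
SOURCES = {
    "NVD": {"CVE-0000-0001": {"summary": "example flaw in widget"}},
    "OSV": {"GHSA-xxxx-yyyy": {"summary": "example flaw in widget"}},
}

def query_all(vuln_id: str) -> dict:
    """Fan a lookup out to every source and collect whichever ones
    recognize the identifier."""
    return {name: db[vuln_id]
            for name, db in SOURCES.items()
            if vuln_id in db}

print(query_all("CVE-0000-0001"))   # only NVD answers
print(query_all("GHSA-xxxx-yyyy"))  # only OSV answers
```

The fan-out is trivial; the unstated work is reconciling the answers, which is exactly the aggregation and normalization grind mentioned earlier.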

The best way to achieve the goal of a GVD is through a global effort, funded by private industry, nonprofit organizations and government. It is likely that, as long as one or two well-known organizations lead the initial effort, there will be substantial interest worldwide. Therefore, obtaining adequate funding may not pose a big problem.
This summary goes back to the gnomes above, to a tee. Let’s start by examining why the government would fund this when they already sink over $12 million into CVE and NVD, which appear to be mismanaged based on their budgets. For them to fund another initiative would call into question whether there is “waste and abuse”, which would fall under the purview of the DHS OIG and/or DOC OIG, I presume, to determine if that is a level of redundancy that is not acceptable. Next, if this is all doable and achievable, and would create so much benefit to the industry, why haven’t those “one or two well-known organizations” done it yet? As I mentioned before, OSF and OSVDB were attempting to go in that direction two decades ago and ran into these same challenges.
Below are likely goals to be achieved by this project:
Access to the database needs to be free, although high-volume commercial uses may be tariffed in some way.
This too may become a much bigger problem than you imagine. While OSVDB had broader data, any given entry was typically incomplete due to resources. We attempted to publish entries for everything, even if only title, date, and references (basically the CVE “dictionary” model), while filling out other entries to 100% as we could. Despite having incomplete entries, we still faced non-stop abuse of our license. Small and large companies were scraping our data and using it commercially while not providing any support or financing to us. By 2014 I personally had enough and called out two companies, one small and one large. Be prepared to have a lawyer or two volunteer to help manage this potential issue.
The database should be easily accessible worldwide, except in remote areas, etc. In general, no country should have their access to the database restricted, although there might be reasons to do so in some cases, like active support of terrorism.
Just go ahead and say you will restrict it based on the Department of State’s list of State Sponsors of Terrorism and/or the Department of State’s Countries of Particular Concern, Special Watch List Countries, and Entities of Particular Concern and/or the Department of the Treasury’s Sanctions Programs and Country Information. In this day and age, we all know this is a token effort at best, as any country on the list can trivially use a VPN or hosting provider in another country to access the database. This is perhaps the most pedantic issue brought up in proposing the GVD.
Because there are errors in the current databases (e.g., off-spec CPE names), there should be an ongoing effort to clean up errors. There should also be an effort to make strategic enhancements to the database, such as adding purl identifiers to existing and new CVE reports. However, these efforts need to be undertaken as funds and time permit. It is possible that volunteers can be found to assist in these efforts, such as college cybersecurity majors.
I feel like I need to keep making variations on the South Park gnome meme here. Proposing an effort to “clean up errors” after specifically citing CPE just doesn’t work. The CPE dictionary is closed and exclusively maintained by NIST for the time being. Even with Tanya Brewer’s proposal at VulnCon, opening it up would be limited to CNAs so they could contribute their own. While she didn’t propose the ability for them to police their own product’s CPE strings, that would be a logical thing to do. That said, it still would be out of reach of the GVD effort unless CNAs were involved. This is distant-future planning at this point.
Alrich’s Grand Finale
To sum this up, there needs to be a single searchable database of vulnerabilities worldwide. This will probably not be a single physical database implemented in a single facility. Instead, it might in effect be an AI-based “switching center”, through which searches would be coordinated among different vulnerability databases, using diverse identifiers for software and vulnerabilities.
I bolded the part that completely contradicts the idea of a global vulnerability database. What Alrich starts with, a global vulnerability database, ends up being described as much more of a specialized search engine of vulnerability information. This proposal concludes with a patchwork solution that has many hurdles, including changing licenses, resources going away, staffing, funding, and much more.
As someone who spent a ridiculous amount of time working on a free, open, community-maintained database, I really am a fan of that general idea. As is, I just don’t see the currently proposed GVD solving many problems.
