For almost two decades, CVE has been considered an industry standard for vulnerability tracking. A CVE ID can be affiliated with many vulnerabilities, in a format like CVE-2014-54321. Note my choice in ID, from 2014 with a consecutive set of numbers. That is because I specifically chose a ‘sample’ CVE that was set aside as an example of the CVE ID Syntax Change in 2014. This change occurred when it was determined that 9,999 IDs for a single year was not going to be sufficient. Technical guidance on this is available, as well as more basic information and the announcement about the change. Starting out with this hopefully demonstrates that there may be more to an ID than meets the eye.
Fundamentally, the ID is simple; you have the CVE prefix, followed by a year identifier, and a numeric identifier. In the CVE used above, it would represent ID 54321 with a 2014 year identifier. Fairly simple! But you are reading an entire blog on these IDs by me so the spoiler is here. It isn’t so simple unfortunately. I want to give a rundown of what a CVE ID really is, and set the record straight. Why? Because I don’t think MITRE has done a good job with that, and worse, actively works against what could be a clear and simple policy. We’ll use CVE-YEAR-12345 as a representative example for the purpose of discussing these IDs to be clear about which part of an ID we’re talking about.
When CVE was started in 1999, assignments were made based on a public disclosure. However, from the beginning, the YEAR portion almost immediately was not guaranteed to represent the year of disclosure. This was because MITRE’s policy was to assign an ID for a pre-1999 vulnerability using a CVE-1999 ID. We can see this with CVE-1999-0145 which was assigned for the infamous Sendmail WIZ command, allowing remote root access. This feature was publicly disclosed as a vulnerability on November 26, 1983 as best I have determined (the Sendmail changelog). While it was a known vulnerability and used before that, it was privately shared. If there is a public reference to this vulnerability before that date, leave a comment please!
The takeaway is that a vulnerability from 1983 has a CVE-1999 identifier. So from the very first year, MITRE set a clear precedent that the YEAR portion of an ID does not represent the year of discovery or disclosure. You may think this only happened for vulnerabilities prior to 1999, but that isn’t the case. In the big picture, meaning the 22 years of CVE running, an ID typically does represent the disclosure year. However, per one of CVE’s founders, “because of CVE reservation, sometimes it aligned with year of discovery“. That is entirely logical and expected as a CVE ID could be used to track a vulnerability internally at a company before it was disclosed. For example, BigVendor could use the CVE ID not only for their internal teams, such as communicating between security and engineering, but when discussing a vulnerability with the researcher. If a researcher reported several vulnerabilities, using an ID to refer to one of them was much easier than the file/function/vector.
For the early CVE Numbering Authorities (CNA), companies that were authorized to assign a CVE without going through MITRE, this was a common side effect of assigning. If a researcher discovered a vulnerability on December 25 and immediately reported it to the vendor, it may be given e.g. a CVE-2020 ID. When the vendor fixed the vulnerability and the disclosure was coordinated, that might happen in 2021. The founder of CVE I spoke to told me there “weren’t any hard and fast rules for CNAs” even at the start. So one CNA might assign upon learning of the vulnerability while another might assign on public disclosure.
Not convinced for some reason? Let’s check the CVE FAQ about “year portion of a CVE ID”!
What is the significance and meaning of the YEAR portion of a CVE ID
CVE IDs have the format CVE-YYYY-NNNNN. The YYYY portion is the year that the CVE ID was assigned OR the year the vulnerability was made public (if before the CVE ID was assigned).The year portion is not used to indicate when the vulnerability was discovered, but only when it was made public or assigned.
Examples:
A vulnerability is discovered in 2016, and a CVE ID is requested for that vulnerability in 2016. The CVE ID would be of the form “CVE-2016-NNNN”.
A vulnerability is discovered in 2015 and made public in 2016. If the CVE ID is requested in 2016, the CVE ID would be of the form “CVE-2016-NNNNN”.
All clear, no doubts, case closed!
That clear policy is conflicting or may introduce confusion in places. Looking at MITRE’s page on CVE Identifiers, we see that the “The process of creating a CVE Record begins with the discovery of a potential cybersecurity vulnerability.” My emphasis on ‘discovery’ as that means the ID would reflect when it was discovered, and not necessarily even when it was reported to the vendor. There are many cases where a researcher finds a vulnerability but may wait days, weeks, months, or even years before reporting it to the vendor for different reasons. So it is more applicable that the ID will be assigned based on when the vendor learns of the vulnerability in cases of coordinated disclosure with a CNA. Otherwise, a bulk of CVEs are assigned based on the disclosure year.
It gets messier. At the beginning of each year, each CNA will get a pool of CVE IDs assigned. The size of the pool varies by CNA and is roughly based on the prior year of assignments. A CNA that disclosed 10 vulnerabilities in the prior year is likely to get 10 – 15 IDs the subsequent year. Per section 5.1.4 of the CNA rules, any IDs that are not used in a calendar year should be REJECTed if they were not assigned to an issue. “Those CVE IDs that were unused would be rejected.” But then, it stipulates that the CNA can get “CVE IDs for previous calendar years can always be requested if
necessary.” So per current rules, a CNA can request a new ID from a prior year despite REJECTing IDs that were previously included in their pool. That means it is entirely optional, up to each CNA, on how they assign.
[Update: Note that the pool of IDs a CNA gets one year may not be the same the next. Not only in regards to the size of the pool, but the first ID may be in an entirely different range. e.g. 2019-1000 vs 2020-8000.]
The take-away from all this is that we now have many reasons why a CVE ID YEAR component does not necessarily tie to when it was disclosed. The more important take-away? If you are generating statistics based on the YEAR component, you are doing it wrong. Any statistics you generate are immediately inaccurate and cannot be trusted. So please don’t do it!
Finally, a brief overview of the numeric string used after the YEAR. Going back to our example, CVE-YEAR-12345, it is easy to start to make assumptions about 12345. The most prevalent assumption, and completely incorrect, is that IDs are issued in a sequential order. This is not true! Covered above, CNAs are given pools of IDs at the beginning of each year. Oracle and IBM assign over 700 vulnerabilities a year, so the pool of IDs they receive is substantial. There are over 160 participating CNAs currently, and if each only received 100 IDs, that is over 16,000 IDs that are assigned before January 1st.
In 2021, the effect of this can be seen very clearly. Halfway through April and we’re already seeing public IDs in the 30k range. For example, CVE-2021-30030 is open and represents a vulnerability first disclosed on March 28th. According to VulnDB, there are only 7,074 total vulnerabilities disclosed this year so far. That means we can clearly see that CVE IDs are not assigned in order.