CVE Farming – Problem & Solution

jericho

1 year ago

Blog Origins

In the last year or two, I have increasingly used the term “CVE farming” in conversations and LinkedIn posts [1]. This has led a few people to ask what it means, and I gave a very cliff notes version of the answer. I started taking notes for this blog a while back, expecting the question to come up more and more. A few weeks back the topic came up again on LinkedIn, and I evidently did not communicate my point very well, because a colleague asked a question that seemed unrelated to it. That was the real impetus to prioritize this blog. Note that Jonathan Kuskos used this term over a year ago, in what seems to be a slightly different context but very much along the same lines as I do.

In a reply, Delta Regeer asked if I was against assigning IDs for what many of us refer to as ‘crap’ software. This term applies to software that, while functional, was written as a hobby by amateurs and distributed on platforms known for low-quality, vulnerability-ridden code. This isn’t software that will see the light of even medium-size businesses, possibly not even small ones. Worse, we frequently see the same base software just re-skinned with minor coding changes to support the new purpose. So “Alice’s PHP Farm Supply Software” becomes “Alice’s PHP Airplane Maintenance Software” with a few tweaks and little work. While the visible look of the software changes and some of the script names or parameters change, the underlying code remains identical. That includes vulnerabilities that become common across the entire range of “products”.

This leads to understanding the concept of abstraction as applied to vulnerability databases. Know it or not, this has been a headache for many organizations for a couple decades. In the early days MITRE had fairly clear rules for abstracting vulnerabilities to CVE IDs and more importantly, they were fairly consistent. In the last five to ten years, that consistency has gone away as MITRE has become almost completely hands-off in the assignment and publication process. That means one researcher may do it one way, the next the opposite, and MITRE has no quality assurance or sanity checks in place to prevent anything.

Abstraction Rules

So what is abstraction? The following rules are from the VulnDB portal guide I wrote almost 15 years ago now:

How a vulnerability database abstracts disclosures (i.e. splitting issues and giving them unique identifiers) varies greatly. Some VDBs group multiple vulnerabilities under a single identifier (e.g. CVE), while others will group some but not other disclosures (e.g. Symantec). Yet other databases will create multiple entries for the same vulnerability simply because it is used on different operating systems (e.g. Secunia). These methods of abstraction are not conducive to logical and comprehensive vulnerability alerting. Such methods also make it positively impossible to perform any form of statistical analysis of vulnerability disclosures. VulnDB is based on a relatively simple concept: one entry per unique vulnerability. This methodology allows for true per-vulnerability tracking within your organization, allows you to generate accurate statistics, and better evaluate the security of a vendor or product.

There are times where we deviate from this rule, primarily when the abstraction does not prove beneficial to anyone. It can be argued that every single function or parameter is a distinct vulnerability, but abstracting to that degree would become a burden on both RBS and customers. In many cases, it isn’t clear from a disclosure if several issues are truly distinct, or the cause of a single base entry. We do our best to evaluate each disclosure and abstract based on the available information along with the value it provides.

Examples

With that in mind, consider that someone discloses a vulnerability in Microsoft Windows. You would expect it to get one CVE ID and it does. If that same vulnerability impacts Windows 95, Windows 2000, Windows 2007, and Windows 11, do you expect a separate CVE ID for each operating system? No, of course not. Why? It is the same code base, and the same vulnerability manifests in each operating system because of that shared code (e.g. CVE-2006-4696, CVE-2024-27856). So jump back to Alice’s software, where we see a cross-site scripting (XSS) vulnerability in /classes/function.php. If that is disclosed in both the ‘Farm’ software and the ‘Airplane’ software, it gets one ID. It doesn’t matter if the disclosures are done in the same blog post, separated by days / weeks / years, or made by two different people. It’s the same vulnerability.

Let’s look at some IDs, mostly historical, to better understand how inconsistent CVE is. Using these examples, consider which ones make your analysis and triage more or less difficult.

CVE-2004-2735 – 1 product, 56 scripts, multiple parameter XSS
CVE-2022-48325 – 1 product, 1 script, 86 parameter XSS
CVE-2006-4976 – 1 product, 79 scripts, one parameter IDOR
CVE-2005-2045, CVE-2005-2046, CVE-2005-2047 – 1 vendor, 3 products, 1 script, 1 parameter

Because MITRE has no guardrails, and some CNAs are more than happy to cater to the practice I call CVE farming, a new ID gets assigned every time that same vulnerability is disclosed, as long as it is in a different piece of software by Alice. Most of the time, if you evaluate the codebase of each product you will see that many files are virtually identical; it’s the same as the Windows scenario mentioned above. So why treat one differently than the other? Any mature vulnerability database will typically catch these and lump them together. What started as “Alice’s Critter Rescue Software” becomes “Alice Multiple Software” for the title, and then each vulnerable product is associated with that one entry.
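To make the lumping concrete, here is a minimal sketch of that kind of abstraction. It is not how VulnDB or any CNA actually implements it; the `Disclosure` fields, the `abstract` function, and the `id` parameter name are all assumptions made up for illustration. The only idea taken from the text is that the product name is not part of what makes a vulnerability unique when the code is shared.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Disclosure:
    vendor: str       # the developer, not the site hosting the code
    product: str
    script: str       # vulnerable file, relative to the install root
    parameter: str    # hypothetical parameter name for this example
    vuln_class: str   # e.g. "XSS", "SQLi"


def abstract(disclosures):
    """Group disclosures so that each unique vulnerability gets one entry."""
    entries = defaultdict(set)
    for d in disclosures:
        # The product is deliberately NOT part of the key: shared code
        # means a shared vulnerability, whatever the product is called.
        key = (d.vendor, d.script, d.parameter, d.vuln_class)
        entries[key].add(d.product)
    return dict(entries)


reports = [
    Disclosure("Alice", "Farm Supply", "/classes/function.php", "id", "XSS"),
    Disclosure("Alice", "Airplane Maintenance", "/classes/function.php", "id", "XSS"),
    Disclosure("Alice", "Critter Rescue", "/classes/function.php", "id", "XSS"),
]

for (vendor, script, parameter, vuln_class), products in abstract(reports).items():
    if len(products) > 1:
        title = f"{vendor} Multiple Software"
    else:
        title = f"{vendor} {next(iter(products))}"
    print(f"{title} {script} {parameter} Parameter {vuln_class}", sorted(products))
```

Three disclosures collapse into a single entry with three associated products. A matching key only flags candidates; confirming that the files really are the same code is still a manual step.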

So let’s take Bob, a new researcher just setting out to find vulnerabilities and get his very first CVE ID. He finds a simple XSS vulnerability in Alice’s software and requests a CVE ID. If he goes via MITRE, the request is made through a form and is largely automated, so Bob gets the ID, no questions asked. He’s even asked to write the description, even if he doesn’t understand the vulnerability and English is his fourth language. So we get a really bad CVE description, a new ID, and a vulnerability that will never impact the corporate world. Yay, he got his first CVE, let the resume padding begin! So he downloads the next piece of Alice’s software and finds the exact same vulnerability, due to the exact same code. Bob requests his second ID, gets it, and there we go. Now he can churn out a string of CVEs by ‘finding’ the same vulnerability over and over. Before too long, Bob is proudly announcing he has 50 CVEs under his belt. Meanwhile, all Bob has shown is that he can’t properly audit software and is not qualified to find all of the vulnerabilities in it, within reason.

Does it mean Bob knows the first thing about web-app testing? Does it mean that Bob has improved the state of security in any way? No and no. In fact, Bob has actually degraded the state of security, as security teams all over the world have to read those CVEs, determine what they represent, and decide whether they impact the organization. Because the National Vulnerability Database is so far behind, they can’t rely on timely vulnerability intelligence with proper metadata to make that process faster. So Bob has now wasted, quite literally, hundreds or perhaps thousands of hours of security professionals’ time.

We can blame Bob to some degree, but maybe he just doesn’t know how this works other than “CVE on resume good!” We certainly can blame the assigning CNA, as well as MITRE’s current implementation of the CVE program. If either would do the most basic of due diligence, they would just add new reports for this same vulnerability to an existing CVE ID instead. CVE volume would drop a bit, workloads would be relaxed a tad, and the security ecosystem would overall be slightly improved by that tiny bit of due diligence.

There are two types of farming that I see: what I call “copycat” farming, and what I consider outright abuse of CVE for personal gain. Copycat farming is where Carol finds a vulnerability in Alice’s software, discloses it, and then moves on. Doug sees that disclosure and decides to check a different piece of Alice’s software and conveniently “finds” the exact same vulnerability. He then discloses it and moves on. Eve repeats the same process Doug did and the cycle continues until all of Alice’s software has been enumerated. Sometimes the same researchers will move to Frank’s software and the cycle repeats.

The second type of farm, outright abuse of the CNA, typically happens when a single researcher does all of the above. They either request IDs through MITRE’s automated form, which has no QA and no guardrails, or they stagger their requests to another CNA by days or weeks. Given the volume some CNAs deal with, they would be less likely to notice the duplication even if they cared, which the complicit CNAs do not.
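The distinction between the two types can be sketched in a few lines, assuming you already have a database that tracks who is credited with each disclosure and has grouped reports by underlying vulnerability. The function name, the labels it returns, and the input shape are all hypothetical; this is a rough illustration, not a detection tool.

```python
def classify_cluster(reports):
    """Label one cluster of reports that all describe the SAME vulnerability.

    reports: list of (researcher, product) pairs.
    """
    researchers = {researcher for researcher, _ in reports}
    products = {product for _, product in reports}
    if len(products) < 2:
        # One product only: nothing has been farmed here.
        return "single disclosure"
    if len(researchers) == 1:
        # One person walking the same bug through product after product.
        return "single-researcher farming"
    # Several people each "finding" the same bug in a different product.
    return "copycat farming"


copycats = [
    ("Carol", "Farm Supply"),
    ("Doug", "Airplane Maintenance"),
    ("Eve", "Critter Rescue"),
]
solo = [
    ("Bob", "Farm Supply"),
    ("Bob", "Airplane Maintenance"),
]

print(classify_cluster(copycats))
print(classify_cluster(solo))
```

Real clusters are often mixed, with one researcher farming and others copying later, so a label like this is only a starting point for a human look at the disclosure timeline.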

What the farmers and those CNAs don’t understand is that many orgs, including some Fortune 500, have to deal with every CVE ID disclosed in some fashion. They have to digest it to determine if it impacts them. At minimum they acknowledge it and flag it as “not affected” before moving on; otherwise they have to start a real triage process because it might affect them. That first step takes time and it is a burden on thousands of people, all so one person can claim “I have a hundred CVEs to my name”, which ultimately means nothing to anyone that understands the CVE ecosystem, while impressing anyone that does not.

For anyone that actually analyzes every newly published CVE ID, and there aren’t that many of us in the world, you may have doubts as to just how often these vulnerabilities should be considered a duplicate. But there are many considerations to take into account during that analysis and ultimately the assignment of a new ID versus merging to an existing ID.

Paths can mean little, e.g. /adms/ vs /pms/ vs /whatever/, where the vulnerability reporter installed under different paths, or the software does just to distinguish it from another. Fundamentally the rest of the path is what matters.
How the vulnerable file is invoked can vary, e.g. /manage_user.php vs /adms/admin/?page=manage_user. The actual vulnerable code may be in the same file in both scenarios, and that is what matters.
“A vulnerability was found in SourceCodester Online Food Ordering System 2.0.” – we frequently see attribution to the hosting site. This is like attributing anything hosted on GitHub to GitHub as the vendor. Using SourceCodester as an example, these two packages are hosted on the same site but written by different developers:
- https://www.sourcecodester.com/php/15434/library-management-system-qr-code-attendance-and-auto-generate-library-card.html by kingbhob02
- https://www.sourcecodester.com/php/15346/online-fire-reporting-system-phpoop-free-source-code.html by oretnom23
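The first two considerations above can be approximated with a coarse normalization step. This is a heuristic sketch only: the `script_key` function and the assumption that the router parameter is called `page` are mine, based on the example paths above, and keeping just the script name will sometimes merge files that are genuinely different.

```python
from pathlib import PurePosixPath
from urllib.parse import parse_qs, urlsplit


def script_key(location, router_param="page"):
    """Reduce a reported path to a rough key for the vulnerable script.

    Two reports sharing a key are only CANDIDATES for being the same
    vulnerability; the actual code still has to be compared.
    """
    parts = urlsplit(location)
    routed = parse_qs(parts.query).get(router_param)
    if routed:
        # /adms/admin/?page=manage_user -> a router includes manage_user
        name = routed[0]
    else:
        # /pms/admin/manage_user.php -> the file is requested directly
        name = parts.path
    # Drop directories (install prefixes like /adms/ or /pms/) and the extension.
    return PurePosixPath(name).stem.lower()


for location in ("/manage_user.php",
                 "/adms/admin/?page=manage_user",
                 "/pms/admin/manage_user.php"):
    print(location, "->", script_key(location))
```

All three locations reduce to the same key, which is the signal to go compare the underlying files before deciding between a new ID and a merge.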

Jumping back to abstraction briefly, consider the prior examples along with the considerations above. Now let’s look at VulnDB ID 282464, which is titled “oretnom23 Multiple Products /admin/ user/manage_user Page id Parameter SQL Injection”. First, I am fairly sure this vendor’s software will not see corporate networks. If it does, that IT department really needs to reconsider their methodology for vetting software. Second, that entry covers one vendor, one file, one parameter, and most importantly one vulnerability. Associated with that entry are 34 different oretnom23 products that all have the same vulnerable code. That entry also has 34 CVE IDs linked to it. That one vulnerability has now skewed any vulnerability statistics you derive from CVE data to be off by 33. Imagine this happening dozens of times with that single vendor, then try to extrapolate how bad it might be in the bigger picture. CVE statistics are not giving information on vulnerability disclosures; they are giving statistics on the CVE project, and those two are not synonymous.
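The arithmetic behind “off by 33” is simple enough to sketch. The mapping below is a made-up stand-in for a database export: the function name and the placeholder CVE strings are assumptions, and only the count of 34 linked IDs comes from the entry described above.

```python
def cve_inflation(entries):
    """Return how many CVE IDs exceed the number of unique vulnerabilities.

    entries: dict mapping one unique-vulnerability ID to the list of
    CVE IDs linked to it.
    """
    unique_vulns = len(entries)
    cve_ids = sum(len(cves) for cves in entries.values())
    return cve_ids - unique_vulns


# Placeholder strings, NOT real CVE IDs; only the count matters here.
entries = {
    "VulnDB 282464": [f"CVE-PLACEHOLDER-{n}" for n in range(34)],
}

print(cve_inflation(entries))  # 34 IDs - 1 vulnerability = 33
```

Run the same subtraction across every merged entry in a database and you get a rough measure of how far CVE counts drift from actual vulnerability counts.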

As for the farmers, we see them grabbing software from the same sites (e.g. tsourcecode.com, sourcecodester.com, 1000projects.org, codeastro.com, [..]). Each time these sites seem to cater to amateur PHP developers, and each time those developers seem to have no understanding of secure coding principles. Worse, you might think after over 1,000 disclosures that “oretnom23” might take an interest in improving that code, right? You would be right if oretnom23 actually received many security reports. Only a tiny fraction of those 1,000+ disclosures include anything about contacting the vendor. Those that did report may have done so in a way that immediately sours the process due to the tone of the mail. They can often come across as condescending and demanding gratitude for this amazing code auditing, despite disclosing a single vulnerability.

Case Study

It’s typically pretty easy to spot the CVE farmers based on their disclosure history. This, of course, requires a database that tracks the creditee of the disclosure, so CVE data alone will not provide this. We’ll take E1CHO as an example; just one look at their GitHub repo will show they find vulnerabilities in these hobby PHP software packages with considerable repetition.

One of the most abused in this category is Carlo Montero, aka “oretnom23”. With over 1,000 vulnerabilities disclosed in their software it squarely puts them up there with large commercial vendors. Given that over 99% of disclosures against their software were uncoordinated, meaning the vendor was not contacted, there is no reason to suspect that these vulnerabilities have been patched. Meanwhile, oretnom23 continues to produce more software with the same base vulnerable code. Using our farming analogy, that means that new fertile land keeps being found.

It’s easier to see these patterns with more structured data, so I have created a spreadsheet with three tabs that illustrate three separate CVE farming instances. First, we have many ‘copycat’ farmers disclosing the same SQL injection issue in the same script across many packages:

Next, we have one researcher that likely saw a pattern and began farming different software himself:

Last, we have the same researcher farming CVEs and ultimately leading to other copycats:

Solutions

This is a case where the solution to this problem is absurdly simple. Any mature vulnerability database already does this, while CVE does not.

Remind CNAs of abstraction policy (CNA Rules 4.2.13 bullet 1)
Stick to it when assigning IDs
Hold CNAs accountable to that policy (looking at you SCIP/VulDB!)
Bring back simple QA on all CVE entries before publication

Yes, it is that simple. No, that is hardly out of reach for MITRE. Yes, they already have a laughable amount of money they could throw at the problem. If MITRE disagrees with any of that, then I personally don’t feel they are qualified to run a vulnerability database and they need to hire some folks that are.
