15-year-old Python flaw found in ‘over 350,000’ projects • The Register

At least 350,000 open source projects are believed to be potentially vulnerable to exploitation via a Python module flaw that has remained unfixed for 15 years.

On Tuesday, security firm Trellix said its threat researchers had encountered a vulnerability in Python’s tarfile module, which provides a way to read and write compressed bundles of files known as tar archives. Initially, the bug hunters thought they’d chanced upon a zero-day.

It turned out to be about a 5,500-day issue: the bug has been living its best life for the past decade-and-a-half while awaiting extinction.

Identified as CVE-2007-4559, the vulnerability surfaced on August 24, 2007, in a Python mailing list post from Jan Matejek, who was at the time the Python package maintainer for SUSE. It can be exploited to potentially overwrite and hijack files on a victim’s machine when a vulnerable application opens a malicious tar archive via tarfile.

“The vulnerability goes basically like this: If you tar a file named "../../../../../etc/passwd" and then make the admin untar it, /etc/passwd gets overwritten,” explained Matejek at the time.
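Plain path arithmetic shows why the member name matters. The sketch below is illustrative only: the extraction directory /srv/uploads is a made-up example, and posixpath is used so the result is the same on any platform.

```python
import posixpath

# Hypothetical directory the admin extracts into (an assumption for illustration).
dest = "/srv/uploads"

# Member name taken from Matejek's description of the flaw.
member_name = "../../../../../etc/passwd"

# Joining the two and collapsing the ".." components shows where the file lands.
target = posixpath.normpath(posixpath.join(dest, member_name))
print(target)  # prints /etc/passwd
```

Each ".." climbs one level, and any that would go above the root are simply dropped, so the extra ones are harmless padding for the attacker.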

The tarfile directory traversal flaw was reported on August 29, 2007 by Tomas Hoger, a software engineer at Red Hat.

But it had already been addressed, sort of. One day earlier, Lars Gustäbel, maintainer of the tarfile module, committed a code change that adds a default true check_paths parameter and a helper function to the TarFile.extractall() method that throws an error if a tar archive file path is insecure.

But the fix didn’t address the TarFile.extract() method, which Gustäbel said “shouldn’t be used at all,” and left open the possibility that extracting data from untrusted archives could cause problems.

In a comment thread, Gustäbel explained that he no longer considers this a security issue. “tarfile.py does nothing wrong, its behavior conforms to the pax definition and pathname resolution guidelines in POSIX,” he wrote.

“There is no known or possible practical exploit. I [updated] the documentation with a warning that it might be dangerous to extract archives from untrusted sources. That is the only thing to be done IMO.”

Indeed, the documentation describes this footgun:

Warning: Never extract archives from untrusted sources without prior inspection. It is possible that files are created outside of path, e.g. members that have absolute filenames starting with "/" or filenames with two dots "..".
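What “prior inspection” might look like can be sketched in a few lines. This is a minimal illustration, not code from the Python documentation or from Trellix: the helper names find_unsafe_members and build_demo_archive are hypothetical, and the check covers only the two cases the warning names (absolute paths and ".." components), not symbolic or hard links.

```python
import io
import tarfile


def find_unsafe_members(tar):
    """Return names of members that are absolute or contain a '..' component."""
    unsafe = []
    for member in tar.getmembers():
        name = member.name
        parts = name.replace("\\", "/").split("/")
        if name.startswith("/") or ".." in parts:
            unsafe.append(name)
    return unsafe


def build_demo_archive():
    """Build a small in-memory archive with one benign and one traversal entry."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in ("notes.txt", "../escape.txt"):
            data = b"demo"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


# Inspect the archive before deciding whether to extract anything.
with tarfile.open(fileobj=build_demo_archive(), mode="r") as tar:
    print(find_unsafe_members(tar))  # prints ['../escape.txt']
```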

And yet here we are, with both extract() and extractall() still posing the threat of arbitrary path traversal.

“The vulnerability is a path traversal attack in the extract and extractall functions in the tarfile module that allow an attacker to overwrite arbitrary files by adding the ‘..’ sequence to filenames in a tar archive,” explained Kasimir Schulz, a vulnerability researcher for Trellix, in a blog post.

The “..” sequence changes the current working path to the parent directory. So using code like the six-line snippet below, Schulz says, the tarfile module can be told to read and modify a file’s metadata before it is added to the tar archive. And the result is an exploit.

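import tarfile

def change_name(tarinfo):
    # Prefix the stored member name with "../" so it resolves one level up on extraction
    tarinfo.name = "../" + tarinfo.name
    return tarinfo

# The filter is applied to each TarInfo before it is written to the archive
with tarfile.open("exploit.tar", "w:xz") as tar:
    tar.add("malicious_file", filter=change_name)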
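For contrast, here is a hedged sketch of how an application could refuse such an archive. It is not Trellix’s patch or the tarfile module’s own check: safe_extractall and prefix_parent are hypothetical names, the demo uses an uncompressed archive in a temporary directory, and the check does not cover symbolic or hard links that point outside the destination.

```python
import os
import tarfile
import tempfile


def safe_extractall(tar, dest):
    """Extract tar into dest, refusing any member that would land outside it."""
    dest_root = os.path.realpath(dest)
    for member in tar.getmembers():
        target = os.path.realpath(os.path.join(dest_root, member.name))
        if os.path.commonpath([dest_root, target]) != dest_root:
            raise ValueError(f"blocked path traversal: {member.name!r}")
    tar.extractall(dest_root)


def prefix_parent(tarinfo):
    # Same trick as the snippet above: prepend "../" to the stored name.
    tarinfo.name = "../" + tarinfo.name
    return tarinfo


with tempfile.TemporaryDirectory() as workdir:
    # Build a throwaway archive whose only member is named "../malicious_file".
    src = os.path.join(workdir, "malicious_file")
    with open(src, "w") as fh:
        fh.write("demo")
    archive = os.path.join(workdir, "exploit.tar")
    with tarfile.open(archive, "w") as tar:
        tar.add(src, arcname="malicious_file", filter=prefix_parent)

    # Attempt extraction into a subdirectory; the wrapper should refuse.
    dest = os.path.join(workdir, "out")
    os.mkdir(dest)
    with tarfile.open(archive) as tar:
        try:
            safe_extractall(tar, dest)
            blocked = False
        except ValueError:
            blocked = True
    print(blocked)  # prints True
```

The wrapper checks every member before extracting any of them, so a single bad entry stops the whole archive rather than leaving a partial extraction behind.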

According to Schulz, Trellix built a free tool called Creosote to scan for CVE-2007-4559. The software has already found the bug lurking in applications like Spyder IDE, an open-source scientific environment written for Python, and Polemarch, an IT infrastructure management service for Linux and Docker.

The company estimates the tarfile flaw can be found “in over 350,000 open-source projects and prevalent in closed-source projects.” It also points out that tarfile is a default module in any Python project and is present in frameworks created by AWS, Facebook, Google, and Intel, and in applications for machine learning, automation, and Docker containers.

Trellix says it is working to make repaired code available to affected projects.

“Using our tools, we currently have patches for 11,005 repositories, ready for pull requests,” explained Charles McFarland, a vulnerability researcher for Trellix, in a blog post. “Each patch will be added to a forked repository and a pull request made over time. This will help individuals and organizations alike become aware of the problem and give them a one-click fix.

“Due to the size of vulnerable projects we expect to continue this process over the next few weeks. This is expected to hit 12.06 percent of all vulnerable projects, a little over 70K projects by the time of completion.”

The remaining 87.94 percent of affected projects will need to consider other possible options. ®
