What is Hadoop

Hadoop is an Apache open-source framework, written in Java, for storing, processing, and analyzing massive data sets in a distributed manner across large clusters of commodity hardware. It is primarily used for batch processing: large data sets are spread across clusters of inexpensive, readily available commodity machines, and applications built on Hadoop run on them, scaling computing power at low cost. Hadoop is widely adopted because it provides a robust framework for distributed storage and processing of big data, harnessing a network of computers to tackle complex computations over vast data sets.

However, Apache Hadoop versions 3.3.1 through 3.3.4 on Linux are vulnerable to CVE-2023-26031. This critical privilege-escalation vulnerability allows local users to gain root privileges and may allow remote users to gain similar access.

The vulnerability stems from the “YARN Secure Containers” feature introduced in Hadoop 3.3.0, which executes user-submitted applications in isolated Linux containers. A change to the native-library loading path, introduced in the patch “YARN-10495”, created a loophole.

The vulnerable binary, HADOOP_HOME/bin/container-executor, is owned by root with the setuid bit set, which allows YARN processes to run containers as the users who submitted the jobs. The changed library path lets lower-privileged local users cause the binary to load a malicious libcrypto library as root, potentially leading to a full security breach.

Assessing and Addressing the Vulnerability

To determine whether your copy of container-executor is vulnerable, use the readelf command to check whether its RUNPATH or RPATH value includes “./lib/native/”. If it does, your system is at risk. The potential for remote users to gain root privileges when YARN clusters accept work from them adds an extra layer of urgency to this issue.
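The check above can be sketched as a short script. The install path below is an assumption (set HADOOP_HOME to your actual install root); the substance of the test is grepping readelf's dynamic-section output for a relative “./lib/native” entry in RPATH or RUNPATH.

```shell
# Hypothetical path; adjust HADOOP_HOME for your cluster.
CE="${HADOOP_HOME:-/opt/hadoop}/bin/container-executor"

# Inspect the dynamic section for a relative "./lib/native" entry in
# RPATH or RUNPATH; a match indicates the binary is vulnerable.
if readelf -d "$CE" 2>/dev/null | grep -E 'R(UN)?PATH' | grep -q '\./lib/native'; then
    echo "VULNERABLE: $CE resolves libraries from a relative path"
else
    echo "No relative ./lib/native entry found (or binary not present)"
fi
```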

Remediation and Workarounds

Apache Hadoop has addressed this vulnerability in version 3.3.5. Therefore, upgrading to this version is the most effective solution. However, if immediate patching is not feasible, alternative workarounds can be implemented to mitigate the risk:

  • Remove execute permissions from the bin/container-executor binary. This prevents the vulnerable binary from running, effectively cutting off the exploit path.
  • Change the ownership of bin/container-executor away from root. On Linux this clears the setuid bit, so the binary can no longer run as root.
  • Delete the bin/container-executor binary entirely. This removes the vulnerable binary, ensuring that no exploitation can occur.