IMPORTANT NOTE: Our XDR solution provides protection and detection for Log4j across both networks and endpoints. Contact us.
"The Log4j vulnerabilities pose an unacceptable risk to (US) federal network security. CISA has issued this emergency directive to drive federal civilian agencies to take action now to protect their networks, focusing first on internet-facing devices that pose the greatest immediate risk."
- CISA Director Jen Easterly
Note: This article attempts to explain the Log4j vulnerability to non-IT readers, who form the majority of top management and boards of directors, since many of them are concerned about it. The aim is to minimise jargon and explain the issue as simply as possible. The article is therefore not technical guidance; it is written in support of Information Security Management.
Primer - How is software built?
Before we delve into the technicalities, it is important to understand how software is built. Modern software runs to millions of lines of code. A complete software product is NOT written word by word; it is built on previously written code, called libraries. It is a bit like manufacturing: even someone building an innovative product does not start by mining the earth but takes previously built items as inputs, such as nuts, bolts, metal and fabric. It is even better when the source code of such libraries is freely available, which is called open source. These input libraries may themselves have been built on other libraries and pieces of code. The software developer assembles these pieces in a structured manner and writes their own code to connect them and create new functionality or applications.
Once the coding is complete, the software is compiled. Compiling means that code which humans can understand is converted into a form that computers can read and act upon. The user of the code may or may not be able to decompile it (suppliers generally do not let the user decompile the code unless it is open source). It is therefore possible, and in fact common, for higher level code to embed precompiled code. Some parts of the code may not be inside the software at all but are fetched from somewhere on the net or from a pre-declared folder on the computer. As a result, running one piece of software may cause thousands of other pieces of software to run, and some of these may not be visible to the user or even to the authors of the final software.
Another concept that needs to be understood before we discuss the main issue is logging. Logging means recording a specified activity or function. A system may log all activities, only failures (negative outcomes), or only successes (positive outcomes). Logging is required to:
trace activities,
debug the software or a function,
keep records and information,
issue a warning if certain conditions are met, and
report an error or, if the software crashes, find the cause of the failure.
Logging is therefore important for keeping software healthy. However, logging is rarely the main focus of an application, so it is very rare for software developers to write the logging components themselves. It is practically standard practice to pick up this piece of code from an open source project and use it.
Logging has another challenge. If logging is too verbose, it not only slows down the system but also consumes storage space, and finding errors becomes complicated and time consuming. Too little logging, on the other hand, may miss the critical part. If the logging instructions are hardcoded, it is difficult to manage logs as needs change. Logging software should therefore be able to adapt while the main application is running; these are called runtime changes. What to log, what not to log, when to log, where to store the log and how much log space to consume all need to change dynamically at run-time, without human intervention.
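For readers who would like to see what a runtime change looks like, here is a minimal sketch. It uses Python's built-in logging module rather than Log4j, purely for illustration, and the application name payments_demo and the messages are made up. It shows the same running program first recording only warnings and then, after one setting is changed, recording fine detail as well.

```python
import io
import logging

# Capture log output in memory so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

logger = logging.getLogger("payments_demo")  # hypothetical application name
logger.addHandler(handler)
logger.propagate = False  # keep this demo's output out of the root logger

# Quiet setting: only warnings and errors are recorded.
logger.setLevel(logging.WARNING)
logger.debug("card number format checked")   # too detailed, ignored
logger.warning("payment gateway is slow")    # recorded

# Runtime change: switch to verbose logging without restarting anything.
logger.setLevel(logging.DEBUG)
logger.debug("retrying payment, attempt 2")  # now recorded

recorded = stream.getvalue().splitlines()
print(recorded)
```

The application code is untouched between the two phases; only the logging level changes. Log4j offers the same kind of control through external configuration files.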
To make these dynamic changes, the logging part of the software must get its instructions from somewhere. These instructions are generally created not by the software developers but by the users. In addition, as the environment changes from development to full scale deployment, the physical locations of the various databases, services etc. change. To overcome this challenge, a common interface is required that can take a human-readable name, apply the context and interact with the right database or service. A naming and directory interface is therefore required, which either understands the context or takes instructions.
What is Log4j?
Prior to September 2000, several logging tools existed, and software developers could also write their own print commands. However, that approach had several issues. In September 2000, Apache released the log4j logging API (API is the acronym for Application Programming Interface, a software intermediary that allows two applications to talk to each other). The release statement of Apache was: “Log4j is an open source project based on the work of many authors. It allows the developer to control which log statements are output with arbitrary granularity. It is fully configurable at runtime using external configuration files…. It offers several advantages. It provides precise context about a run of the application. Once inserted into the code, the generation of logging output requires no human intervention. Moreover, log output can be saved in persistent medium to be studied at a later time. In addition to its use in the development cycle, a sufficiently rich logging package can also be viewed as an auditing tool”.
The source code was made fully open, meaning anyone could use it freely. The best part was that log4j was ported to practically all popular computer languages of that time, such as C, C++, C#, Perl, Python, Ruby, and Eiffel. Practically all software developers started using it in their software. As time passed, it became deeply embedded in later software that was built on those earlier libraries.
For dynamic configuration, log4j uses the JNDI API. The Java Naming and Directory Interface™ (JNDI) is an application programming interface (API) that provides naming and directory functionality to applications written in the Java™ programming language. It is defined to be independent of any specific directory service implementation, so a variety of directories (new, emerging, and already deployed) can be accessed in a common way. Log4j is one such piece of software written in Java™. With the JNDI API, log4j can access many types of data held in naming and directory services, such as objects, devices and files; for example, EJB uses JNDI to find remote objects. JNDI is designed to provide a common interface to existing services such as DNS, NDS, LDAP, CORBA and RMI, or to user defined services at a user defined location (URL).
What is the Log4j Vulnerability?
In the simplest terms: instead of entering his name, a user with malicious intentions can enter an argument (statement) that calls the JNDI function to fetch further information from a location (URL) of his choice. At that URL, which the malicious actor controls, he can place malicious (arbitrary) code, which the system will fetch and may execute if the parameters match.
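The idea can be illustrated with a toy simulation. This is not Log4j code and not a working attack: the ${lookup:...} syntax, the function names and the "remote directory" (an ordinary in-memory dictionary) are all invented for the illustration. Real attack strings took a form such as ${jndi:ldap://...}, and in the real vulnerability the fetched content could be run as code, whereas here it is only pasted into the log line.

```python
import re

# Stand-in for "the internet". In a real attack this would be a server
# controlled by the attacker; here it is just a dictionary (hypothetical).
FAKE_REMOTE_DIRECTORY = {
    "attacker.example/a": "<instructions chosen by the attacker>",
}

# Hypothetical lookup syntax for this toy: ${lookup:LOCATION}
LOOKUP = re.compile(r"\$\{lookup:([^}]+)\}")


def naive_log(message: str) -> str:
    """Toy logger that expands lookups found anywhere in the text it logs."""
    def resolve(match):
        location = match.group(1)
        return FAKE_REMOTE_DIRECTORY.get(location, "")
    return "LOG: " + LOOKUP.sub(resolve, message)


def safe_log(message: str) -> str:
    """Toy logger that treats the text purely as data."""
    return "LOG: " + message


# An honest user types a name; a hostile user types an instruction instead.
honest = naive_log("login attempt by Alice")
hostile_input = "${lookup:attacker.example/a}"
tricked = naive_log("login attempt by " + hostile_input)
untouched = safe_log("login attempt by " + hostile_input)

print(honest)
print(tricked)
print(untouched)
```

The point to notice is that the naive logger lets whatever the user typed decide where the system goes to fetch content, while the safe logger simply records what was typed.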
This vulnerability has been given the highest severity score of 10 (on the CVSS scale) and is identified as CVE-2021-44228. The impacted versions are Log4j 2.0-beta9 up to 2.14.1. Apache issued two patches, but these had problems of their own, including vulnerability to denial of service attacks. Today, 19 Dec 2021, the latest patched version, 2.17.0, has been issued. It is yet to be analysed for its effectiveness and for any inbuilt vulnerabilities.
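For IT teams taking inventory, the version range above can be turned into a simple check. The sketch below is a simplification under stated assumptions: the function names are made up, it understands only plain version numbers and "-betaN" pre-releases, and it checks only the range quoted in this article for CVE-2021-44228. Always confirm against the Apache security page before acting on it.

```python
import re


def parse_version(text: str):
    """Turn '2.14.1' or '2.0-beta9' into a tuple that sorts correctly.

    Simplification: only plain releases and '-betaN' pre-releases are handled.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.(\d+))?(?:-beta(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognised version: {text}")
    major, minor, patch, beta = match.groups()
    # A beta sorts before the final release with the same number.
    stage = (0, int(beta)) if beta is not None else (1, 0)
    return (int(major), int(minor), int(patch or 0)) + stage


# Range as stated in the article for CVE-2021-44228.
FIRST_AFFECTED = parse_version("2.0-beta9")
LAST_AFFECTED = parse_version("2.14.1")


def is_affected(version: str) -> bool:
    """True if the version falls inside the range quoted in the article."""
    return FIRST_AFFECTED <= parse_version(version) <= LAST_AFFECTED


for v in ["2.0-beta8", "2.0-beta9", "2.14.1", "2.17.0"]:
    print(v, is_affected(v))
```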
To grasp the scale of the problem, one can look at a recent Google report on log4j released on 17 Dec 2021. “So far, nearly 5,000 artifacts have been patched, leaving more than 30,000 more. It will be difficult to address the issue because of how deep Log4j is embedded in some products. … Most artifacts that depend on log4j do so indirectly. The deeper the vulnerability is in a dependency chain, the more steps are required for it to be fixed. For greater than 80% of the packages, the vulnerability is more than one level deep, with a majority affected five levels down (and some as many as nine levels down),” Wetter and Ringland of Google wrote in the report.
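What "levels down" means can be shown with a small sketch. The dependency map below is entirely hypothetical (package names such as billing-app and http-client are invented); the code simply counts how many steps separate a product from log4j.

```python
from collections import deque

# Hypothetical dependency map: each package lists what it is built on.
DEPENDS_ON = {
    "billing-app": ["web-framework", "pdf-export"],
    "web-framework": ["http-client"],
    "http-client": ["log4j"],
    "pdf-export": ["font-tools"],
    "font-tools": [],
    "log4j": [],
}


def depth_of(package, target, graph):
    """Return the fewest dependency steps from package to target, or None."""
    queue = deque([(package, 0)])
    seen = {package}
    while queue:
        name, depth = queue.popleft()
        if name == target:
            return depth
        for dependency in graph.get(name, []):
            if dependency not in seen:
                seen.add(dependency)
                queue.append((dependency, depth + 1))
    return None  # the package does not depend on the target at all


print(depth_of("billing-app", "log4j", DEPENDS_ON))  # three levels down
print(depth_of("pdf-export", "log4j", DEPENDS_ON))   # not affected
```

In this made-up example the authors of billing-app never chose log4j themselves; it arrived three levels down. They cannot be fully patched until http-client and then web-framework have each released fixed versions, which is why deep dependency chains slow the clean-up.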
Millions of software products use log4j. Some of their vendors in fact no longer exist, yet their software remains embedded deep inside other products, so such software may never be patched. The whole ecosystem is creaking, and it may take years to overcome this.
There is a small reason for relief as well as a far more serious threat. The attacker needs to know not only that log4j is present but also the environment it runs in, in order to create appropriate malware. As of today, general purpose malware will be ineffective if the targeted organisation has its cyber hygiene in place. Both parties are therefore racing against time: the IT teams to patch, and the attackers to find appropriate malware. Well evolved attackers with high end surveillance mechanisms are likely to succeed. The Conti ransomware team is leading the attacker pack. According to CrowdStrike reports, even some nation states have jumped in to take advantage of the vulnerability.
Mitigation Approach
This is an active and highly complex vulnerability, and the appropriate response may change continuously, so prescribing a static mitigation approach could be disastrous. All Information Technology and IT Security staff need to keep checking for updates. The most effective and continuously updated defensive guidance is provided by CISA of the USA at this link - https://www.cisa.gov/uscert/apache-log4j-vulnerability-guidance
Apache is also providing the necessary patches at this link - https://logging.apache.org/log4j/2.x/security.html
CISA Recommendations
Additionally, the following steps may be taken.
Enhance your overall cybersecurity posture
Minimise use of VPN
Use two factor authentication for VPN and cloud services.
Consider all firewalls as valid targets; the first general purpose attacks are expected on firewalls and RADIUS. Stop firewalls from initiating outbound traffic, and keep a manual watch on any outbound traffic created by firewalls and LDAP/AD servers.
Switzerland's CERT has issued the following mitigation approach
Bug bounty hunter Anton (@therceman) has shared the following cheatsheet, which professionals may use with appropriate caution.
Impacted Companies and their Products
Practically all companies are impacted, because the vulnerability may not be on the surface but buried deep inside. The following are the major organisations and products impacted by the log4j vulnerability:
Minecraft
Steam
Apple iCloud
Tencent
Twitter
Cloudflare
Amazon
Tesla
Elasticsearch
Google
Alcatel
Cisco
Dell
Palo Alto Networks
Rapid7
Fortinet
IBM
Intel
McAfee
Microsoft
Red Hat
RSA
Salesforce
Siemens
SolarWinds
SonicWall
Sophos
CISA maintains a comprehensive list of affected products/companies on GitHub here - https://github.com/cisagov/log4j-affected-db