Over the past 2 weeks many people working in IT have been dealing with the fallout of the vulnerabilities and exploits being carried out against servers and applications using the popular Log4j Java library. Information security people have been responding 24/7 to the incident, operations folks have been patching servers at record speeds, and software developers have been upgrading, removing libraries and crossing their fingers. WAFs are being deployed, CDN (Content Delivery Network) rules updated, and we are definitely not out of the woods yet.
Those of you who know me realize I’m going to skip right over anything to do with servers and head right onto the software angle. Forgive me; I know servers are equally important. But they are not my speciality…
Although I already posted in my newsletter, on this blog and my YouTube channel, I have more to say. I want to talk about some of the things that I and other incident responders ‘discovered’ as part of investigations for log4j. Things I’ve seen for years, that need to change.
After speaking privately to a few CISOs, AppSec pros and incident responders, I can say there is a LOT going on with this vulnerability, but it’s being compounded by systemic problems in our industry. If you want to share a story with me about this topic, please reach out to me.
Let’s get into some systemic problems.
Inventory: Not just for Netflix Anymore
I realize that I am constantly telling people that having a complete inventory of all of your IT assets (including Web apps and APIs) is the #1 most important AppSec activity you can do, but people still don’t seem to be listening… Or maybe it’s on their “to do” list? Marked as “for later”? I find it defeating at times that having current and accurate inventory is still a challenge for even major players, such as Netflix and other large companies/teams who I admire. If they find it hard, how can smaller companies with fewer resources get it done? When responding to this incident this problem has never been more obvious.
Imagine past me, searching repos, not finding log4j and then foolishly thinking she could go home. WRONG! It turns out that even though one of my clients had done a large inventory activity earlier in the year, we had missed a few things (none containing log4j, luckily). When I spoke to other folks I heard of people finding custom code in all SORTS of fun places it was not supposed to be. Such as:
- Public Repos that should have been private
- Every type of cloud-based version control or code repo you can think of: GitLab, GitHub, Bitbucket, Azure DevOps, etc. And of course, most of them were not approved/on the official list…
- On-prem, saved to a file server – some with backups and some without
- In the same repos everyone else is using, but locked down so that only one dev or one team could see it (meaning no AppSec tool coverage)
- SVN (Subversion), ClearCase, SourceSafe and other repos I thought no one was using anymore… that are incompatible with the AppSec tools I (and many others) had at hand.
Having it take over a week just to get access to all the various places the code is kept meant those incident responders couldn’t give accurate answers to management and customers alike. It also meant that some of them were vulnerable, but they had no way of knowing.
Many have brought up the concept of SBOM (software bill of materials, the list of all dependencies a piece of software has) at this time. Yes, having a complete SBOM for every app would be wonderful, but I would have settled for a complete list of apps and where their code was stored. Then I can figure out the SBOM stuff myself… But I digress.
Inventory is valuable for more than just incident response. You can’t be sure your tools have complete coverage if you don’t know your assets. Imagine if you painted *almost* all of a fence. That one part you missed would become damaged and age faster than the rest of the fence, because it’s missing the protection of the paint. Imagine year after year, you refresh the paint, except that one spot you don’t know about. Perhaps it gets water damage or starts to rot? It’s the same with applications; they don’t always age well.
We need a real solution for inventory of web assets. Manually tracking this stuff in MS Excel is not working folks. This is a systemic problem in our industry.
Lack of Support and Governance for Open-Source Libraries
This may or may not be the biggest issue, but it is certainly the most talked-about throughout this situation. The question posed most often is “Why are so many huge businesses and large products depending on a library supported by only three volunteer programmers?” and I would argue the answer is “because it works and it’s free”. This is how open-source stuff works. Why not use free stuff? I did it all the time when I was a dev and I’m not going to trash other devs for doing it now…. I will let others harp on this issue, hoping they will find a good solution, and I will continue on to other topics for the rest of this article.
Lack of Tooling Coverage
The second problem incident responders walked into was their tools not being able to scan all the things. Let’s say you’re amazing and you have a complete and current inventory (I’m not jealous, YOU’RE JEALOUS), that doesn’t mean your tools can see everything. Maybe there’s a firewall in the way? Maybe the service account for your tool isn’t granted access or has access but the incorrect set of rights? There are dozens of reasons your tool might not have complete coverage. I heard from too many teams that they “couldn’t see” various parts of the network, or their scanning tools weren’t authorized for various repos, etc. It hurts just to think about; it’s so frustrating.
Luckily for me I’m in AppSec and I used to be a dev, meaning finding workarounds is second nature for me. I grabbed code from all over the place, zipping it up and downloading it, throwing it into Azure DevOps and scanning it with my tools. I also unzipped code locally and searched simply for “log4j”. I know it’s a snapshot in time, I know it’s not perfect or a good long-term plan. But for this situation, it was good enough for me. ** This doesn’t work with servers or non-custom software though, sorry folks. **
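For anyone who wants to repeat the "unzip it and search for log4j" step, here is a minimal sketch of what that search could look like in Python. It is not the exact tooling I used; the function name `find_log4j_mentions` is made up for illustration, and it assumes the code has already been unzipped into a local folder. It is a plain byte search, so it can miss things buried inside compressed archives.

```python
from pathlib import Path

NEEDLE = b"log4j"  # matched case-insensitively against file names and contents


def find_log4j_mentions(root):
    """Return sorted relative paths under root whose name or contents mention log4j.

    Hypothetical helper for illustration only: a crude snapshot-in-time search,
    not a replacement for a proper dependency scanner.
    """
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # A file name such as log4j-core-2.14.1.jar is already a hit.
        if NEEDLE in path.name.lower().encode("utf-8"):
            hits.append(path)
            continue
        try:
            # Reads the whole file into memory; fine for a quick sketch.
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file: skip it rather than abort the search
        if NEEDLE in data.lower():
            hits.append(path)
    return sorted(p.relative_to(root).as_posix() for p in hits)
```

Calling `find_log4j_mentions("unzipped_code")` (the folder name is just an example) gives you a list of files to look at by hand. Like the manual approach, it is a snapshot in time and says nothing about servers or non-custom software.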
But this points to another industry issue: why were our tools not set up to see everything already? How can we tell if our tool has complete coverage? We (theoretically) should be able to reach all assets with every security tool, but this is not the case at most enterprises, I assure you.
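One rough way to answer "does our tool have complete coverage?" is to compare the inventory against the list of assets each tool reports it can reach. The sketch below assumes you can export both lists; the function name `coverage_gaps` and the repo and tool names in the usage example are hypothetical.

```python
def coverage_gaps(inventory, tool_coverage):
    """Map each tool name to the sorted list of inventoried assets it cannot see.

    inventory: iterable of asset names (assumed to be the complete inventory).
    tool_coverage: dict of tool name -> iterable of asset names that tool can reach.
    """
    known = set(inventory)
    return {
        tool: sorted(known - set(seen))
        for tool, seen in tool_coverage.items()
    }


# Usage with made-up names: the SAST tool cannot see two of the three repos.
gaps = coverage_gaps(
    ["repo-a", "repo-b", "repo-c"],
    {"sca": ["repo-a", "repo-b", "repo-c"], "sast": ["repo-a"]},
)
print(gaps)
```

Of course this is only as good as the inventory you feed it: anything missing from the inventory is invisible here too.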
This might sound odd, but the more places I looked, the more I found code that was undeployed, “not in use” (whyyyyyyy is it in prod then?), the project was paused, “Oh, that’s been archived” (except it’s not marked that way), etc. I asked around and it turns out this is common, it’s not just that one client… It’s basically everyone. Code all over the place, with no labels or other useful data about where else it may live.
Then I went onto Twitter, and it turns out there isn’t a common mechanism to keep track of this. WHAT!??!?! Our industry doesn’t have a standardized place to keep track of what code is where, whether it’s paused, just an example, deployed, etc. I feel that this is another industry-level problem we need to solve; not a product we need to buy, but part of the system development life cycle that ensures this information is tracked. Perhaps a new phase or something?
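To make the idea concrete, here is a minimal sketch of the kind of information I wish had been tracked for every piece of code. This is not an existing standard; the names `Status`, `CodeRecord` and `deployed_names`, and all the fields, are assumptions for illustration only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    """Hypothetical lifecycle states; a real process may need more."""
    DEPLOYED = "deployed"
    PAUSED = "paused"
    ARCHIVED = "archived"
    EXAMPLE = "example"


@dataclass
class CodeRecord:
    """One entry per application: what it is, where its code lives, its state."""
    name: str
    locations: List[str]  # every repo, file share, etc. that holds a copy
    status: Status
    owner: str = "unknown"


def deployed_names(records):
    """Names of records marked as deployed, i.e. the ones to check first."""
    return [r.name for r in records if r.status is Status.DEPLOYED]
```

Even a record this small would have told responders which copies of the code mattered first, and who to ask about the rest.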
Lack of Incident Response/Investigation Training
Many people I spoke to who are part of the investigations did not have training in incident response or investigation. This includes operations folks and software developers, who had no idea what we need or want from them during such a crucial moment. When I first started responding to incidents, I was also untrained. I’ve honestly not had nearly as much training as I would like, with most of what I have learned coming from on-the-job experience and job shadowing. That said, I created a FREE mini course on incident response that you can sign up for here. It can at least teach you what security wants and needs from you.
The most important part of an incident is appointing someone to be in charge (the incident manager). I saw too many places where no one person was IN CHARGE of what was happening. Multiple people giving quotes to the media, to customers, or other teams. Different status reports that don’t make sense going to management. If you take one thing away from this article it should be that you really need to speak with one voice when the crap hits the fan….
For those attempting to protect very old applications (for instance, any apps using log4j 1.X versions), you should consider getting a shield for your application. And by “shield” I mean put it behind a CDN (Content Delivery Network) like Cloudflare, behind a WAF (Web Application Firewall) or a RASP (Runtime Application Self-Protection).
Is putting a shield in front of your application as good as writing secure code? No. But it’s way better than nothing, and that’s what I saw a lot of while responding and talking to colleagues about log4j. NOTHING to protect very old applications… Which leads to the next issue I will mention.
Several teams I advised had what I would call “Ancient Dependencies”; dependencies so old that the application would require re-architecting in order to upgrade them. I don’t have a solution for this, but it is part of why Log4J is going to take a very, very long time to square away.
“Technical debt is security debt.” – Me
I usually try not to share problems without solutions, but these issues are bigger than me or the handful of clients I serve. These problems are systemic. I invite you to comment with solutions or ideas about how we could try to solve these problems.