Enhancing Cloud Firewall Security
Every now and then I walk into a Zscaler Internet Access (ZIA) customer environment where the default firewall rule is set to allow, rather than deny. Generally this is a holdover from the pre-sales proof-of-concept (PoC), and it's quite understandable how that would linger. But now, in production, we need to achieve the best practice, especially if we are about to start bringing any sessions over from previous firewalls that do have the default deny enabled.
So, here's my simple 3-step process for getting that accomplished in minimal time and with the greatest positive impact.
The biggest challenge is going to be finding the right person to take charge of this very real problem. Because while this process is about getting default deny implemented in minimal time, I didn't say it would be all that easy. You will need that one special individual who will take full ownership of the entire process from start to finish. Relentless. Thorough. Driven. Professional. Zscaler Certified. But, more importantly, deeply experienced in firewall traffic and analysis.
Oh yeah, it also couldn't hurt to have someone who clearly wants the program to succeed and is going to do it the right way: using next generation firewall (NGFW) methods over the old 5-tuple skill set, promoting training and communication, and really keeping a tight rein on each and every firewall rule.
If possible, someone else should be assigned to shadow the leader, providing hours of direct support each business day. There will be hours upon hours spent analyzing traffic using tools such as:
- IPlocation.net to determine the destination of each questionable IP address.
- ARIN.net to validate what was learned about that IP range and to grab just the right IP ranges.
- Internal databases such as ServiceNow, existing firewall rules, and asset databases, which are especially helpful in understanding source IPs.
- Microsoft Excel (pivot tables, specifically) to quickly crunch through all the log information and see the traffic much more clearly.
Finally, make sure every stakeholder is aware of what you are working toward. Your CISO will likely be interested to the point of wanting a weekly status update. And be prepared to fend off those who might try to slow you down with unnecessary processes, such as totally meaningless change control (more on that later).
Now, let's go...
Step 1: Catalog
What you will immediately notice when trying to get your arms around just how big this problem has become is that the logs for the default rule are immense, with, dare I guess, most of your traffic hitting this rule. Regardless of where you measure this (Dashboard, Insights, or Logs), it's just going to be obvious.
Right about now you are recalling the saying "How do you eat an elephant? One bite at a time." And that's basically it. One. Bite. At. A. Time.
Enter the "INVESTIGATE" Rules
Start off by building a core set of INVESTIGATE rules. Go all CAPS so that everyone can tell what you are doing and spell that out in the rule descriptions so that it is perfectly clear. Rules such as INVESTIGATE applications and INVESTIGATE IPs should be pretty intuitive. These are simply the rules that you will use to pull traffic out of the default rule at the bottom.
Your goal, and the success of this endeavor, hinges on you properly pulling traffic out of the default rule, up to individual INVESTIGATE rules that will let you and everyone else see what is going on.
TIP: Resist the urge to create change control updates for each INVESTIGATE rule that you add. Seriously, from here on in you are about to hit a point where you, alone, could quite possibly be pushing the Activate button a thousand times a month. And being crushed by a process that is built for material changes, where here there really are none, will kill the program. If someone insists, push back. Escalate if you must. Let them know that you are not actually changing anything yet (that's Step 2). Here you are only pulling already allowed traffic into newer preceding rules so that it can be cataloged, understood, and prepared for what's next to come. One bite at a time, remember?
Also worth noting is that if you have guest networks going through the Zscaler firewall (great if you do!), you will also have to create location-based rules for the guest sub-locations, generally ahead of most of your INVESTIGATE rules. These will be permanent, so for those you might actually want to follow your standard change control process, as building a permanent rule does constitute a material change.
But really, the best strategy and tactic is where the CISO knows that the right team is assembled and has just given carte-blanche to get it done.
With all this in place, just spend hours a day doing what you must to get the lingering Default rule traffic up into other rules. Step 3 will address how to go about emptying those rules, but for now you need them to transition from ALLOW to DENY, so just do it.
To do this as efficiently as possible, I simply run a log report that shows all the traffic in the Default rule over the previous 24-hour window. You will notice from the screenshot that I use Inbound Bytes min/max to get rid of all the traffic that comes back as 0 bytes (not really talking to anything anyway). That gets exported and then goes into a pivot table where I can filter it as needed, using that as my guide to research each IP address before even returning to the Zscaler admin console (a great chore for an intern or junior associate).
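If you would rather script the pivot than build it in Excel, here is a minimal sketch of the same idea in Python: drop the 0-byte rows, then group what is left by destination IP. The column names (dest_ip, inbound_bytes) and the sample rows are my own placeholders, not the actual headers of a Zscaler log export, so adjust them to whatever your export really contains.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; change these to match the headers in your export.
DEST_COL = "dest_ip"
BYTES_IN_COL = "inbound_bytes"


def summarize_by_destination(csv_text):
    """Group exported log rows by destination IP, skipping 0-byte sessions."""
    totals = defaultdict(lambda: {"sessions": 0, "inbound_bytes": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        inbound = int(row[BYTES_IN_COL] or 0)
        if inbound <= 0:
            # Nothing came back, so the client was not really talking to anything.
            continue
        entry = totals[row[DEST_COL]]
        entry["sessions"] += 1
        entry["inbound_bytes"] += inbound
    # Busiest destinations first, so research starts with the biggest bites.
    return sorted(totals.items(), key=lambda kv: kv[1]["sessions"], reverse=True)


# Made-up sample rows using documentation-only IP ranges.
SAMPLE = """dest_ip,inbound_bytes
203.0.113.10,1200
203.0.113.10,800
198.51.100.7,0
192.0.2.55,64
"""

summary = summarize_by_destination(SAMPLE)
for dest, stats in summary:
    print(dest, stats["sessions"], stats["inbound_bytes"])
```

The sorted output doubles as a research worklist: start at the top and look up each destination before touching the admin console.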
As you now do your investigation into each IP address (most likely using the tools I mentioned above), take the time to build a destination group for absolutely each and every destination range. Yes, many will be Amazon, Google, Microsoft and every other large network on the planet. But again, this is not the time to create lasting policy, merely to learn...and catalog all that you have learned. So take that traffic, create as many destination groups (IP and/or FQDN) as you can find and that make any sense whatsoever to add, then add them.
Then just go add that friendly new destination group to whatever firewall rule it best applies to. There should be no IP addresses in any of the firewall rules (that's old school), only the named destinations. Otherwise, you are practically flushing all that IP lookup knowledge right down the toilet, only making it harder for others to pick up on later. By the time you make it to Step 2, you will not only have an impressive database that will serve the organization for generations, but will have also set the bar for the entire firewall and security team to follow. Well done!
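To keep that catalog useful outside the admin console, it can help to hold the same name-to-range mapping in a small script and check new log entries against it. This is only a sketch using Python's standard ipaddress module. The group names and CIDR ranges are invented placeholders (documentation ranges), and nothing here talks to Zscaler. It simply tells you which destinations you have already researched and which still need a lookup.

```python
import ipaddress

# Hypothetical catalog: destination group name -> CIDR ranges you researched.
DESTINATION_GROUPS = {
    "DST_Example_Partner": ["203.0.113.0/24"],
    "DST_Example_SaaS": ["198.51.100.0/25", "192.0.2.0/28"],
}


def build_index(groups):
    """Flatten the catalog into (network, group name) pairs for lookups."""
    return [
        (ipaddress.ip_network(cidr), name)
        for name, cidrs in groups.items()
        for cidr in cidrs
    ]


def classify(ip, index):
    """Return the named group containing this IP, or None if uncataloged."""
    addr = ipaddress.ip_address(ip)
    for network, name in index:
        if addr in network:
            return name
    return None


index = build_index(DESTINATION_GROUPS)
for ip in ["203.0.113.10", "198.51.100.7", "198.51.100.200"]:
    print(ip, classify(ip, index) or "UNCATALOGED: needs research")
```

Anything that comes back uncataloged is the next bite: research it, name it, add the range to a group, and run the check again.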
Step 2: Activate
Congratulations on now reaching this major milestone! Here you have reached a point where you are not at all finished with the daily chore of watching over the Default rule, but have reached the minimally viable product (MVP) of being able to state, with your reputation on the line, that the organization can enable the default deny with acceptable risk.
As each organization will have its own standards for change control approvals, now is where your ability to communicate and sell comes into play. Everyone will want to know what it means, what's at stake, what both success and failure look like...all of it.
Expect to answer the following:
- How will this impact the service desk? Answer: Really not at all. We have taken great steps to ensure that all the traffic that was previously being allowed in the Default rule has, at the very least, found an intermediate home in the many INVESTIGATE rules that we created. In fact, we don't even want or need the Service Desk knowing. They already have processes for users that might call in with an application that doesn't work and will just follow those as needed. But really, maybe only one or two cases should even surface over the coming months, such as a single user who needs to upload a large amount of data on a quarterly basis to a business partner using a special SSH client/session that we just haven't seen in any of our prior daily log analysis. But those are very few and far between.
- What if an application does break? Answer: Simple. Our team is trained and ready to go. We will look at whether any traffic is being blocked by the Zscaler firewall and quickly determine if it should be allowed. More or less immediately we will drop that traffic into an appropriate INVESTIGATE rule so that the user/group is happy. Then it's just a matter of following the change control process to get it into a permanent rule.
- How much traffic is hitting the rule now? Answer: Almost none. And what is hitting it day after day (a few hundred events is all we see over a 24-hour period) is largely classed as just ambient garbage from web sites as they collect unnecessary bits and bytes of user information. Where we do see big spikes at times, that is generally some client machine that managed to get some sort of torrent software running, which is traffic we would want and expect to block anyway.
- When do you want to apply this rule? Answer: Just as soon as the change control is approved. No hesitation. And certainly no downtime window. We are ready, so if it was approved right now I would enable it right now, with immediate effect.
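For the "how much traffic" question above, it helps to walk into the change control meeting with a number rather than a feeling. Here is a rough sketch of how that could be counted from a day's worth of Default rule events. The list of source IPs and the spike threshold of 100 are made-up assumptions for illustration, so tune both to your own environment.

```python
from collections import Counter


def flag_spikes(source_ips, threshold):
    """Return sources whose event count in the window meets the threshold."""
    counts = Counter(source_ips)
    return {src: n for src, n in counts.items() if n >= threshold}


# Made-up 24-hour sample: one source IP per Default rule event.
events = ["10.0.0.5"] * 3 + ["10.0.0.9"] * 250 + ["10.0.0.12"]

total = len(events)
spikes = flag_spikes(events, threshold=100)  # hypothetical threshold

print("events in window:", total)
for src, n in spikes.items():
    print("spike:", src, n)
```

A single noisy client like the one flagged here is usually worth a quick look (torrent software, for example) before you quote the overall total.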
Step 3: Remediate
And now we have reached the point where we have to get all the traffic out of the INVESTIGATE rules, finding each a permanent home. And of course that home could be a specific or grouped allow rule via granular firewall policies (especially cloud app), or just letting it now fall down into the default deny rule.
When I lead these efforts for clients myself, this is the point where I shift away from me as the lead and more toward the standing firewall processes. Why? Because having achieved Step 2, we need to expand all this great knowledge and cybersecurity acumen to the rest of the team. Also because many key decisions are going to have to be made, such as:
- Should we allow enterprise clients to initiate applications outbound such as AnyDesk, LDAP, and SSH?
- Do we have a list of which applications are denied by default across all enterprise users?
- Likewise, do we have a list of applications that are denied across even guest network users?
- Do we need to adjust any policies so that we are covered as we shift them out of the INVESTIGATE rule(s), setting them up to instead hit the default deny rule?
- What do we need to communicate at each step?
- And of course most importantly, who is accountable and what is our daily risk/exposure as long as any potentially bad traffic is still hitting the INVESTIGATE rules? If we are breached, what might an auditor, especially an outside auditor, find that would damage our credibility? Did we stay committed to the remediation, or relax and hit cruise control after achieving Step 2?
Enjoy all that you have achieved, and all that you can now better focus on, as you take on advanced threats, scale your traffic, and prepare for all that lies ahead (such as the threats that lie in encrypted traffic).