Why SecOps Automation and SOAR Initiatives Fail

Five lessons on security automation risks that will save you energy, time, and money.

Aug 03, 2024

3 out of 4 organizations I encounter waste their investments in security automation.

After two years of automating our SOC, we decreased the alerts requiring human interaction by 40% and the time spent on alerts by 70%.

I will analyze 5 pitfalls we experienced internally or with customers to help you maximize your investments.

1. Not using other products at their fullest first

2. Starting with SecOps automation for the wrong reasons or assumptions

3. Not having the right people or focus

4. Focusing on containment orchestrations too soon

5. Not knowing how to measure success

Over the past 8 years, I’ve helped cybersecurity service companies boost their revenue and profits while enhancing security for 60+ organizations worldwide.


My Automation Journey

I work for a company that offers 24x7 SOC as a Service and Incident Response.

When talking about security automation, it’s impossible not to mention SOAR. Whether open source or proprietary, a SOAR involves significant effort and challenges.

In 2022, we decided to add a SOAR to our stack. We had automations based on Python and containers. Maintaining them was becoming complicated.

We were handling around 600-800 alerts per month from different customers. None of our analysts experienced 'alert fatigue' or burnout. Still, we felt there was room for improvement in our processes.

We had two foundational needs that aimed to move the company to the next level:

Skyrocket efficiency: We were transitioning our customers to a fixed-price model for our SOC service. Costs can spiral out of control when managing alerts and incidents in a 24x7 operation. We needed a new way of working and a stack to support it.
Increase technical flexibility: Our previous technology stack wasn’t the best at working with cloud-native applications and APIs.

We started our quest to find the perfect SOAR. After an extensive POC with three products, we chose a winner and implemented it in December 2022.

And we suffered. We wasted time. Until we got better.

Nowadays, the SOAR is the backbone of our operation.

Let’s analyze the pitfalls and key lessons that we learned along the way.

1. Not Using Other Products at Their Fullest First

The world is full of tales of companies that bought technology and didn’t deploy it. I'm sure you experienced something similar at least once in your career.

Organizations of all sizes underestimate the technologies they have running in their environment. After a basic implementation, they become a story of the past, and we start looking for the next shiny thing.

You prefer to invest in other tools rather than in expertise on the existing ones.

This trap is especially damaging for one single reason:

The return on your automations depends on your knowledge of, and capabilities with, your other technologies.

But there’s a more pressing issue: a mirage around SOAR that makes companies believe it is the key to fixing their security issues.

“You are able to contain users and workstations.”

“SOAR removes the pain of phishing attacks.”

SOAR is not the problem; the misinterpretation of its promise is.

You don’t need automation for your firewall to block traffic. Nor for your WAF to apply certain rate limiting, or for your EDR to block a malicious file.

If you bought a SOAR for that, I’m sorry to tell you, you’ve wasted your budget.

You don’t need anything new to do most of the containment and prevention actions you have in mind. You need to spend more time increasing the potential of your existing solutions.

You will be surprised how much you can do with your current M365 licenses, anti-phishing, EDR, Next-Gen firewall, and cloud applications.

Focus on the features that are not enabled. Once you hit a roadblock, before thinking that a SOAR will save you, make sure that it’s not a product or license limitation.

You must maximize containment within a product before adding an extra layer.

2. Starting With SecOps Automation for the Wrong Reasons or Assumptions

You are using your existing technologies at their best, but there’s still a high chance that you are doing SecOps automation for the wrong reasons.

Consider a scenario: you work for Company X, a fintech company looking into SOAR because it believes the tool will help you analyze phishing cases faster.

But… are you actually having this problem? How many cases do you analyze daily? How many hours are you losing to this issue? Is it really a bottleneck for your team?

And… do you understand the implications of analyzing phishing cases with external tools? You might need additional, and potentially paid, threat intelligence.
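A quick back-of-the-envelope calculation can answer the volume questions above before any tool is bought. The sketch below is only an illustration: the function names, the 22 working days, the 160 hours per analyst, and the sample figures are all assumptions, not numbers from a real team.

```python
def monthly_hours_lost(cases_per_day: float, minutes_per_case: float,
                       working_days: int = 22) -> float:
    """Hours per month spent on one type of case (assumed 22 working days)."""
    return cases_per_day * minutes_per_case * working_days / 60


def share_of_capacity(hours_lost: float, analysts: int,
                      hours_per_analyst: float = 160.0) -> float:
    """Fraction of the team's monthly capacity consumed by those hours."""
    return hours_lost / (analysts * hours_per_analyst)


# Hypothetical input: 6 phishing cases a day, 15 minutes each, 4 analysts.
hours = monthly_hours_lost(cases_per_day=6, minutes_per_case=15)
share = share_of_capacity(hours, analysts=4)
print(f"{hours:.1f} hours per month, {share:.1%} of team capacity")
```

With these made-up inputs the result is 33 hours a month, roughly 5% of the team's capacity. A number like that is a reason to question whether phishing analysis is really the bottleneck.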

Let’s take another example: Company Y, an MSSP offering 24x7 SOC services, wants to automate its level 1 operations.

First, what does that even mean? Every organization I’ve met has a different definition for level 1 and its corresponding responsibilities.

How much are you spending on your first line? What processes do you want to automate? Is it even possible? Do you have the required access with your customers to trigger certain automations?

Or… do you just want to resell the tool because your customers are asking for it? Be honest about what you expect.

Yes, SecOps automation can have amazing returns, but only if you identify your true constraints and understand the potential effort and time required to address them.

The best advice I can give you is to identify your top three bottlenecks and do a proof of concept around them before investing in new technology or profiles.

For us, it was only two, and it was hard enough.

The journey takes time, lots of iterations, and money.

3. Not Having the Right People or Focus

This was our first challenge.

We'd been doing automations for two years before formalizing our SOAR investment and realigning our SOC strategy.

We had people who knew how to code and make APIs dance.

Still, progress was slow during the first months. After various discussions with my team, we found that there was a lack of focus.

We automated tasks on an ad-hoc basis whenever necessary. The same team members also handled alerts, did consultancy, and performed incident response.

Things improved when we decided to formalize the role of Automation Engineering. We allowed team members to focus and we set clear accountability and goals.

This is the first mistake I see in different teams. I get it, you don't have a big team and try to concentrate capacities as much as possible. But it doesn’t work. Security automation is a complex domain that requires plenty of focus.

Don't just take my word for it. Google's course on Modern Security Operations (2024) stresses the significance of this role in a SOC team.

Allow at least one team member to focus on automation engineering.

The second mistake I see is the underestimation of knowledge when automating processes.

A SOC team may be proficient in SIEM, incident response, EDR and detection. But it's not going to be enough.

SecOps automation requires a different set of skills and mentality. An almost impossible conjunction of knowledge in different domains.

A solid security automation initiative requires harmony between four key domains:

Blue team knowledge: Most SOC teams already have this. You need proficiency in detections, alert handling, and the associated workload challenges. After all, what are you planning to automate?
Automation knowledge: You need people who can code, use Git, and work proficiently with the SOAR itself. Above all, they need to know how to think in automations. What if you push a playbook that completely breaks your organization?
End-Product knowledge: It's not only about knowing your SIEM and SOAR. You need deep product knowledge of other systems within your company: Ticketing systems, EDR, firewalls, cloud applications, and so on.
End-Product API knowledge: Often overlooked. Being proficient with a technology doesn't necessarily mean understanding its API. Deploying a product and working with its API require different skill sets. Failing to understand End-Product APIs can lead to limitations and production impacts. We'll explore this further in the next section.

It's nearly impossible to have all this knowledge in one individual. Still, various teams try to make it happen and fail.

The challenge is that it's hard to distribute these domains among different collaborators. You need a certain background in more than one domain if you want to be successful.

If you're starting, stack blue team and automation knowledge into one collaborator. Your engineer will need to work with other product specialists to ensure the validity and safety of your ideas.

4. Focusing On Containment Orchestrations Too Soon

This was our second challenge.

Orchestration refers to the coordination and management of multiple automated tasks or processes working together to achieve a larger goal.

We wanted what every vendor promises:

The End-To-End orchestration containment experience. To detect a threat and automatically block users in Active Directory and in every other application. All while pushing rules to firewalls to block traffic from compromised users. And of course, trigger an automatic threat hunt to find more indicators.

From the MSSP perspective, the premise is beautiful. Containment orchestration can offload your SOC tremendously.

For instance, even if your SOC operates 24/7, your customers might not. No one wants to receive a call at 3 AM. That’s why one of our main KPIs is the percentage of alerts that involve contacting the customer. With End-To-End orchestration, you could take automatic containment actions to mitigate risks outside business hours, so you don’t wake your customers in the middle of the night.

Our customers wanted it, and so did we. We assigned part of our team to work on these playbooks, but we started to hit roadblocks.

Buy-In From Stakeholders

You are removing control and adding a point of failure.

Everyone agreed, but in the end, customers didn’t want to commit to its implications. Even with guardrails and controls, giving admin rights to an external automation that you don’t control isn’t easy to digest.

End-Product API Limitations & Knowledge

First, none of the promised built-in containment playbooks worked out of the box. We ended up creating our own from scratch.

We encountered various API limitations from other products. For example, in Check Point firewalls, you need to push policies to apply rulebase changes. This can increase CPU usage of the appliance or apply unwanted changes.

Your SOAR will use the APIs of a target product. You might know how to use your stack, but you can’t expect your team to know the API of each product and its implications.

Increased Complexity

The complexity factor plays a big role. As soon as you need to stitch two technologies together, complexity skyrockets. More than 2... well...

Let’s say that you make it work. Who’s going to check for changes in the API? Or the product modifications that might affect your automations? Your integrations might not update with the same frequency.

And what if, by overlooking this, you start affecting core operations?

That’s why maximizing your current technologies and End-Product knowledge is so important.
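One way to answer "who's going to check for changes in the API?" is a small contract check that runs on a schedule and compares a sample response against the fields your playbooks rely on. The sketch below is a minimal illustration only: the field names and the sample payload are hypothetical and do not describe any real product's API.

```python
from typing import Any


def missing_fields(payload: dict[str, Any], required_paths: list[str]) -> list[str]:
    """Return the dotted paths a playbook relies on that are absent from payload."""
    missing = []
    for path in required_paths:
        node: Any = payload
        for key in path.split("."):
            if isinstance(node, dict) and key in node:
                node = node[key]
            else:
                missing.append(path)
                break
    return missing


# Hypothetical fields a containment playbook reads from an EDR response.
REQUIRED = ["device.id", "device.hostname", "status"]

# Hypothetical response after a vendor renamed "hostname" to "name".
sample = {"device": {"id": "42", "name": "ws-01"}, "status": "online"}

drift = missing_fields(sample, REQUIRED)
if drift:
    print("API drift detected, review playbooks using:", drift)
```

A check like this does not remove the maintenance burden, but it turns a silent breakage into an explicit warning before a playbook acts on incomplete data.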

Automated containment is a logical evolution, but it requires high organizational and SecOps maturity to make it work.

Instead, starting with what I call facilitation orchestrations is a better approach.

Facilitation orchestrations improve your operational efficiency with almost no risk of affecting production.

These orchestrations also use external APIs, though not for containment purposes.

We decided to switch our focus to using SOAR to streamline the process and make the lives of our analysts easier.

Our first eureka moment with the SOAR was when we were able to enforce processes. An alert is always handled in the same way with a consistent communication template. This approach allowed us to decrease the time spent on alerts by more than 70% in one year.

The second big leap came when we were able to close alerts without human intervention based on enrichment.

Finally, we overhauled our SOC analytics by applying context in our ticketing system.
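To make the idea of closing alerts based on enrichment more concrete, here is a minimal sketch of a conservative triage rule. It is not our production logic: the `Alert` structure, the verdict labels, and the "low severity only" condition are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    alert_id: str
    severity: str  # assumed labels: "low", "medium", "high"
    # indicator -> verdict returned by enrichment (assumed labels:
    # "benign", "suspicious", "malicious", "unknown")
    verdicts: dict[str, str] = field(default_factory=dict)


def triage(alert: Alert) -> str:
    """Return "auto_close" only when every check passes, else "escalate"."""
    if not alert.verdicts:
        return "escalate"  # no enrichment data: a human must look
    if alert.severity != "low":
        return "escalate"  # never auto-close above low severity
    if all(v == "benign" for v in alert.verdicts.values()):
        return "auto_close"
    return "escalate"  # any non-benign or unknown verdict


benign = Alert("A-1", "low", {"198.51.100.7": "benign", "example.org": "benign"})
mixed = Alert("A-2", "low", {"198.51.100.7": "benign", "example.net": "unknown"})
print(triage(benign), triage(mixed))
```

The design choice is that the default outcome is escalation. The automation only closes an alert when enrichment is present and unanimous, so a missing or failed lookup never hides a true positive.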

Your first months should focus on facilitation. It's not an easy task. You will need to tinker with your ticketing system, threat intelligence tools, SOAR, and so on.

Leave containment as much as possible to the End-Product.

5. Not Knowing How to Measure Success

Justifying technology investment and defending your hard-earned budget is a complex undertaking.

Most organizations are not metric-driven. Some want to be but don't have the necessary systems in place to track what matters.

Once you start, it's easy to get lost in the endless KPIs of cybersecurity. You might convince your stakeholders of effectiveness with a pie chart of detections.

But sooner or later, management will challenge your return on investment.

You need to define between 2 and 4 measurements to assess the success of your initiatives.

In my experience, there are two key performance indicators to achieve this:

Average time per alert (Goal: Decrease): This is a great measurement to assess alert handling optimization in your SOC. It's a basic metric for every SOC. If you don't have it, don't start with SecOps automation.
Alerts without human intervention (Goal: Increase): The golden metric. It encompasses alerts that were filtered, auto-closed, or handled without your analyst even seeing them. This metric will help you assess your progress and identify areas where you need to focus in terms of processes.

Less is better. You will see that getting these metrics right is hard enough. Don't invest further if you don't have the means to gather these measurements.
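Both indicators can be derived from a plain export of your ticketing system. The sketch below assumes each record carries the analyst minutes spent and a flag for whether a human touched the alert. Those field names, and the choice to average over all alerts including auto-handled ones, are assumptions you should adapt to your own definitions.

```python
from dataclasses import dataclass


@dataclass
class AlertRecord:
    minutes_spent: float  # analyst time; 0 when no human touched the alert
    human_touched: bool


def average_minutes_per_alert(records: list[AlertRecord]) -> float:
    """Average analyst minutes across all alerts (goal: decrease)."""
    if not records:
        return 0.0
    return sum(r.minutes_spent for r in records) / len(records)


def share_without_human(records: list[AlertRecord]) -> float:
    """Fraction of alerts filtered or closed with no analyst (goal: increase)."""
    if not records:
        return 0.0
    return sum(1 for r in records if not r.human_touched) / len(records)


# Hypothetical month: two alerts handled by analysts, two auto-closed.
month = [
    AlertRecord(30, True),
    AlertRecord(10, True),
    AlertRecord(0, False),
    AlertRecord(0, False),
]
print(average_minutes_per_alert(month), share_without_human(month))
```

Tracking the two numbers month over month matters more than their absolute values, since the definition of "time spent" differs between teams.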

After getting these metrics you will need to establish safety indicators. Be aware that these are more challenging to measure. Some examples are:

Number of failed playbook executions.
Number of missed true positives due to automations.
Production incidents due to automations.
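If you log these events somewhere, even in a spreadsheet, counting them is straightforward. The sketch below assumes a list of event records with a `kind` field. The event names are hypothetical labels, and the hard part in practice is recording missed true positives and production incidents reliably, not counting them.

```python
from collections import Counter


def safety_indicators(events: list[dict]) -> dict[str, int]:
    """Count the three safety indicators from a list of logged events."""
    counts = Counter(e["kind"] for e in events)
    return {
        "failed_playbook_executions": counts["playbook_failed"],
        "missed_true_positives": counts["missed_true_positive"],
        "production_incidents": counts["production_incident"],
    }


# Hypothetical event log for one month.
log = [
    {"kind": "playbook_failed", "playbook": "enrich-ip"},
    {"kind": "playbook_failed", "playbook": "close-duplicate"},
    {"kind": "missed_true_positive", "alert_id": "A-17"},
]
print(safety_indicators(log))
```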

Define which metrics represent operational efficiency improvements in your organization. Then, ensure you have the means to track them.

Conclusions

The described pitfalls stem from my experience working with different SecOps teams.

Security automation is a mandatory quest for every SOC. Sadly, everyone talks about the benefits and not the downsides.

I encourage you to start taking the first steps towards automation. Just the effort of getting the foundations right will already benefit your organization.

Wasting some effort will be inevitable; after all, you must improve through iterations. I hope that my ideas allow you to find a strategy that maximizes returns.

Feel free to reach out if you want to discuss further. I'm always up for meeting interesting people with cool ideas.

Fede's Nexus: Mixing Business, SecOps & MSSPs
