We make IT, security, or any business decision by weighing the risks and the rewards. What investments can we make to drive down costs or increase sales? Or as is often the case with security, what costs can we skip and still escape big penalties later?
Unfortunately for those of us indulging in wishful thinking, the likelihood and costs of data breaches continue to increase. The Ponemon Institute estimates that the average cost of a data breach rose to $4.24 million per incident, a 10% increase over the previous year.
This dollar amount is scary enough, but we also need to add on the consequences of other incidents such as business email compromise (BEC), distributed denial of service (DDoS), or even something as mundane as severed internet access.
Dealing with an incident is a matter of “when and how bad,” not “if.” Yet, it can get worse. Research conducted by the National Cyber Security Alliance estimated that 60% of small and medium-sized businesses go out of business within six months of being hacked.
No one wants to go out of business because of sloppy preparation. Instead, let’s prepare for incident response, so the damage from the inevitable can be contained.
Creating an Incident Response Plan
Developing and executing an incident response plan is a complicated undertaking, so we’ve broken up the process into a number of steps:
- Incident Response Defined
- Incident Response Preparation
- Tips for Effective Incident Response Preparation
- Setting Up an Incident Response Team
- Incident Response Execution
- Incident Response Vendors
What is an Incident Response Plan?
Let’s start by defining the scope. What is an incident, and what is a response? An incident is an event that affects our scope of responsibility, and a response is how we deal with the incident.
For cybersecurity personnel, our scope of responsibility may be limited to cyberattacks on IT systems, such as ransomware attacks, phishing attacks, and DDoS attacks. For IT managers, the scope might expand to encompass physical IT systems and events such as a flooded data center, a lost executive laptop, or squirrels chewing on network cables.
In small companies where managers cover many roles, an incident might broaden to include personnel and business processes with events such as insider data theft, sexual harassment, embezzlement, or the failure of a machine on an assembly line.
Regardless of the incident scope, our goal is to perform the necessary steps while accounting for unexpected contingencies. For that we need an incident response plan, because our response needs to be as quick and thorough as if we’d practiced it (which we should). The foundational principles of incident response preparation and execution outlined below will help you develop your plan.
Incident Response Preparation
At a minimum, our incident response preparation process should:
- Define incident response responsibilities
- List incident response contacts
- Document the incident response process as a plan
- Circulate the incident response plan
- Keep the incident response plan current
When we define incident response responsibilities, we nominate the employee, manager, department, or vendor who needs to manage a specific incident, asset, or threat. For example, we might nominate:
- The IT security manager to handle a ransomware incident;
- Our external accountant to investigate financial fraud; or
- The building manager to handle threats to physical security at a specific office
After deciding responsibilities, we need to create a way for an incident to trigger notification to the responsible parties by listing incident response contact information. Tools can be configured to send automatic notices via text or email, but we should also make sure we list phone numbers as a contingency for tool failure.
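The notification fallback described above can be sketched in a few lines of Python. This is a minimal illustration, not a real alerting integration: the `notify` function, the stand-in `send_email` and `send_text` senders, and the contact details are all hypothetical placeholders.

```python
def notify(contact, message, channels):
    """Try each channel in order and return the name of the first that works.

    `channels` is a list of (name, send_function) pairs. Each hypothetical
    send_function takes (address, message) and raises an exception on failure.
    """
    for name, send in channels:
        address = contact.get(name)
        if address is None:
            continue  # the contact list has no entry for this channel
        try:
            send(address, message)
            return name
        except Exception as error:  # broad on purpose: any failure means move on
            print(f"{name} failed for {contact['role']}: {error}")
    return None  # every automated channel failed


# Stand-in senders for illustration; real ones would call a mail or SMS service.
def send_email(address, message):
    raise ConnectionError("mail server unreachable")


def send_text(address, message):
    print(f"Text to {address}: {message}")


contact = {
    "role": "IT security manager",
    "email": "secmgr@example.com",  # placeholder address
    "text": "555-0100",             # placeholder number
    "phone": "555-0100",            # listed for manual calls if tools fail
}

used = notify(
    contact,
    "Possible ransomware on file server",
    [("email", send_email), ("text", send_text)],
)
if used is None:
    print(f"Automated notices failed; call {contact['phone']}")
else:
    print(f"Delivered via: {used}")
```

Here the simulated email failure falls through to the text channel. If every automated channel fails, the listed phone number is the contingency.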
Some of us don’t formally document our processes. We trust in our competence to act appropriately in the moment to handle any incident. This might work fine for calling 911, but beyond that, having a written process can help enormously.
Surgeons and aircraft pilots use checklists to handle the basics so they can concentrate on the nuances that require their expertise. Creating a written incident response policy or checklist will similarly help staff dealing with the stress of an incident so that no basic steps will be overlooked.
A serious incident will not happen at our convenience nor will it often allow for a relaxed schedule. Once the staff is in panic mode at 3 a.m., there’s no time to check if the contact list has the old CIO’s phone number or to print out the latest instructions for responding to a company-wide ransomware attack.
We need to regularly update our documentation on a quarterly, annual, or event-driven schedule. Then we must effectively circulate the incident response documents. The circulation can be through a shared file server, but we should probably also use email and printed versions, so key information will remain available for a wide variety of emergencies.
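One way to keep the contact list from going stale is to record when each entry was last confirmed and flag anything older than the review interval. The sketch below assumes a quarterly (90-day) interval. The `Contact` fields, the names, and the dates are illustrative only.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Contact:
    role: str            # e.g. "IT security manager"
    phone: str           # kept as a fallback if automated tools fail
    last_verified: date  # when someone last confirmed the entry


def stale_contacts(contacts, today, max_age_days=90):
    """Return contacts whose details were last confirmed before the cutoff."""
    cutoff = today - timedelta(days=max_age_days)
    return [c for c in contacts if c.last_verified < cutoff]


contacts = [
    Contact("IT security manager", "555-0100", date(2024, 1, 10)),
    Contact("Building manager", "555-0101", date(2024, 5, 2)),
]

for c in stale_contacts(contacts, today=date(2024, 6, 1)):
    print(f"Re-verify contact details for: {c.role}")
```

A check like this could run on a schedule or before each tabletop exercise. It only reports which entries need attention, so a person still has to confirm the details.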
Tips for Effective Incident Response Preparation
The basics alone will produce an adequate plan, but to be truly effective, an incident response plan should:
- Eliminate grey zones
- Document contingencies
- Incorporate stakeholder feedback
- Be in line with insurance policies
- Plan at a high level
- Be tested
Eliminate grey zones
When assigning responsibility, any grey zone or gap in responsibility can lead to confusion or even cause an incident to be overlooked. To prevent any vagueness, assign secondary responsibilities with overlap for every incident, asset, or threat.
In large organizations, some potential incidents, such as a misconfigured cloud data bucket exposed to the internet, may fall between departments. Ultimately, someone will need to step up and take responsibility for those items—and therefore, those incidents as well. For example, assign the cloud team to initially respond to incidents involving cloud assets with the cybersecurity team providing backup resources.
The assignment of backup resources will also be useful as a contingency. If our cloud team is based in an office currently disabled by a widespread blackout, a cybersecurity team member in another office assigned as a backup already knows to step up and address cloud issues without delay.
Contingencies for likely communication issues (internet or phone disruption, email server crippling, etc.), infrastructure issues (power outages, cloud outage, etc.), or personnel issues (illness, unreachable, etc.) should also be incorporated into our planning.
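The primary and backup assignments described above can be expressed as a small lookup. This is a sketch under assumed names: the `RESPONDERS` table and the team labels are hypothetical, and a real plan would draw them from the maintained contact list.

```python
from typing import Optional

# Hypothetical assignments: asset type -> (primary team, backup team)
RESPONDERS = {
    "cloud": ("cloud team", "cybersecurity team"),
    "endpoint": ("cybersecurity team", "IT systems team"),
}


def pick_responder(asset_type: str, unavailable: set) -> Optional[str]:
    """Return the first reachable team for an asset type, primary first."""
    if asset_type not in RESPONDERS:
        # A grey zone: nobody has been assigned to this asset type.
        raise KeyError(f"No responder assigned for {asset_type!r}")
    for team in RESPONDERS[asset_type]:
        if team not in unavailable:
            return team
    return None  # primary and backup are both unreachable; escalate manually


# The cloud team's office is blacked out, so the backup steps in.
print(pick_responder("cloud", unavailable={"cloud team"}))
```

Raising an error for an unassigned asset type is a deliberate choice in this sketch. It surfaces grey zones during planning instead of during an incident.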
Incorporate stakeholder feedback
Plans developed only by those assigned direct responsibility will suit their needs and expectations; however, they might overlook the needs and issues of others. A drafted plan should be circulated among business executives, legal counsel, key vendors, and possibly even affected key customers for feedback. These stakeholders may point out additional considerations to protect the organization against lawsuits, violating regulations, or unnecessary business disruptions.
Be in line with insurance policies
Insurance policies can also heavily influence how we respond to an incident, particularly a cybersecurity incident. Some policies require initial contact to be made with an insurer who will deploy their own incident response team. Others might require specific documentation and forensic evidence to pay out on expenses related to an incident. Work with legal counsel and insurance representatives to make sure the requirements are well understood and incorporated into our incident response plans.
Plan at a high level
When planning, some of us will be tempted to create prescriptive and detailed plans for specific types of attacks. This school of thought wants to include an appendix for each type of incident, but this may not be as helpful as we might like. Plans, especially for smaller organizations without the resources to change documentation frequently, should be less specific and more generalized for two key reasons.
First, technology and personnel changes happen too quickly to be easily captured in a static document. A server web shell attack incident response plan designed last year when we had our on-site data center quickly becomes obsolete when we transition some of our servers to the cloud and transition others onto virtual machines.
Second, we may not know what types of attack we are dealing with until we complete the investigation. Sure, it may be obvious that our St. Louis office has been struck by ransomware, but how long will it take to determine if the source of that attack came from a phishing attack on the VP of sales for the office, a web shell we missed while cleaning our Kansas City Exchange server after it was patched, or an insider attack facilitated by a disgruntled IT employee in our Phoenix office? A rigid step-by-step process will not be flexible and may lead an incident response team away from the evidence at hand in favor of checking off the steps in the process.
The goal of an incident response document is to be useful, not to consume hours of our time keeping it current or to misdirect us. However, checklists and decision trees can be helpful in keeping the team focused and reducing errors. The trick will be to strike a balance between details and generalizations to maximize utility and minimize obsolescence.
For example, instead of creating a specific plan for a “ransomware attack on a desktop computer” incident, generalize it to a “malware attack on an endpoint device” and possibly have a secondary checklist for ransomware attacks. Similarly, instead of escalating a detected server data breach incident to “Janet in IT and Dickinson in legal,” escalate incidents to “data breach team leader and legal,” create a shared email address for the data breach team, and hyperlink documentation to the continuously updated contact list.
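Escalating to roles rather than named people can be illustrated by keeping two separate tables: one that maps generalized incident types to roles, and one that maps roles to whoever currently fills them. The table contents and the `who_to_notify` helper below are hypothetical, with the names borrowed from the example above.

```python
# Escalation rules change rarely: generalized incident type -> roles.
ESCALATION = {
    "data breach": ["data breach team leader", "legal"],
    "malware attack on an endpoint device": ["information security analyst"],
}

# The contact list changes often and is maintained separately.
CONTACTS = {
    "data breach team leader": "Janet",
    "legal": "Dickinson",
    "information security analyst": "(unassigned)",
}


def who_to_notify(incident_type, contacts=CONTACTS):
    """Resolve roles to current people at lookup time, not in the plan itself."""
    return [(role, contacts[role]) for role in ESCALATION[incident_type]]


print(who_to_notify("data breach"))

# When legal counsel changes, only the contact list needs an update.
updated = dict(CONTACTS, legal="New counsel")
print(who_to_notify("data breach", updated))
```

Because the plan references roles, a personnel change touches one entry in the contact list and leaves the escalation rules alone.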
Test your plan
Lastly, test the plan. Run a simulation as a tabletop exercise to see how easily the plan can be understood and how readily the team can use the document. Most plans make sense when they are read, but when trying to execute the plan, people may notice items that were overlooked.
For example, it may make perfect sense to route a web server DDoS attack to the cybersecurity team on paper, but in an exercise, the cybersecurity team may realize that only the marketing department and their outsourced designers hold the passwords to access the server. It is much better to discover such details in a meeting room than during an attack over the 4th of July weekend when the marketing department is on a no-phone, team-building whitewater rafting trip.
Setting Up an Incident Response Team
When we build our contact list, we define roles. If those roles have individual people assigned to them, then that is our team. However, if we list a department, then we also need to go to each department and determine the specific team members with the primary responsibility for incident response.
When selecting members, volunteers are nice, but competence is essential. Some members may need to be drafted so the incident response team can be effective. Technical savvy is required for technical roles, but there should also be non-technical roles.
Typical roles include:
- Company executive to authorize costs, approve escalation, etc.
- Information security analyst and specialist to track and contain IT attacks
- IT systems specialist to manage recovery, temporary systems, etc.
- Operations to help the organization deal with operational challenges
- Legal to deal with legal issues and advise about evidence
- Human resources to deal with internal personnel issues
- Finance to deal with financial incidents and the financial consequences of other incidents
- Public relations and marketing to handle internal and external communications
- Incident-specific specialists (often outsourced) such as:
  - Firemen for fires
  - Physical security for physical breaches
  - Forensic IT specialists to gather computer evidence
Not all roles need to be filled by employees. PR can be handled by PR agencies, and some aspects of incident response can be outsourced to technical consultants. However, the company should retain overall supervision and coordinate communication. After all, no one else can be as motivated as the victim to have a successful result.
Incident Response Steps
Regardless of the type of incident, the basic steps will always be:
- Identification
- Containment
- Eradication
- Recovery
- Lessons Learned
These steps become easier with thorough preparation, but even completely unprepared organizations need to deal with these steps once an incident occurs.
Identifying the nature of the attack may seem simple, but it can be an evolving process. For example, an IT team dealing with a ransomware attack might discover that the source of the attack was an unpatched firewall vulnerability or that their data has been exfiltrated. Identification may need to recognize several different types of incidents (ransomware, web shell, data breach, etc.) and immediately address them upon discovery.
Some incidents turn out to be false alarms. A SIEM may produce an alert indicating a compromised machine, or a user may desperately call the help desk and mistakenly report that a huge amount of data has been deleted.
Each of these instances triggers the beginning of the incident process, and the first level of incident response must identify if the issue is valid. The alert might be a false positive, and the “deleted data” may be a folder on the shared server that was accidentally dropped into a subdirectory out of sight. Only an investigation into the details can properly identify how the organization should proceed.
Containment involves minimizing the scope of the incident. As the identification process determines the nature of an incident, the team can move to isolate the incident to prevent further infection, damage to other systems, etc.
Often, we need to assume the worst condition and take down many more resources than are initially perceived to be infected. While this can severely disrupt business operations, infections can spread faster than detections and thereby justify drastic action.
Eradication requires the incident response team to remove the cause of the incident. For a physical event, such as a flood, we can easily see when the water has been removed. However, for digital attacks, the traces of the event can be much more subtle and may involve checking many systems for possible infections that were never there.
Our businesses need to be patient to ensure we do not rush to declare the event over before we remove hidden back doors, reset everyone’s passwords, and take the other steps needed to eradicate advanced persistent threats (APTs).
The process of eradication naturally requires us to trace the infection or malicious activity through log files and other records. In today’s world, a significant incident could also lead to lawsuits (customer, shareholder, privacy breach, etc.) or regulatory action (HIPAA, GDPR, etc.), so we should also ensure we capture and preserve possible evidence during the eradication phase.
Recovery must restore systems and operations to full functionality. Depending on the type of incident, this may include restoring data from backups, acquiring replacement systems, reinstalling software, or even physical restoration from flood or fire damage.
Some recovery processes may be delayed by regulatory, investigation, or eradication processes. Recovery teams need to communicate with legal and other advisors to verify how to safely recover without harming other initiatives (law enforcement investigation, preparation of data for lawsuit defense, etc.).
Lessons Learned involves gathering information after recovery and using it to improve future incident response. Trace each step and progression of the incident to identify possible mitigations that could, in the future, prevent that step or trigger alerts earlier.
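The phases described above lend themselves to a simple tracker that records which phase an incident is in and keeps timestamp-free notes for the later review. This is only a sketch: the `Phase` and `Incident` names are invented for illustration. Real incidents may revisit or overlap phases, and sending a false alarm straight to review is an assumption of this sketch, not a rule from any standard.

```python
from enum import IntEnum


class Phase(IntEnum):
    IDENTIFICATION = 1
    CONTAINMENT = 2
    ERADICATION = 3
    RECOVERY = 4
    LESSONS_LEARNED = 5


class Incident:
    def __init__(self, summary):
        self.summary = summary
        self.phase = Phase.IDENTIFICATION
        self.log = []  # (phase name, note) pairs for the lessons-learned review

    def note(self, text):
        """Record an observation against the current phase."""
        self.log.append((self.phase.name, text))

    def advance(self):
        """Move to the next phase in order."""
        if self.phase is Phase.LESSONS_LEARNED:
            raise ValueError("Incident is already in its final phase")
        self.phase = Phase(self.phase + 1)

    def close_as_false_alarm(self, reason):
        """Identification may show the alert was not a real incident."""
        if self.phase is not Phase.IDENTIFICATION:
            raise ValueError("Only an incident still being identified can be dismissed")
        self.note(f"False alarm: {reason}")
        self.phase = Phase.LESSONS_LEARNED


incident = Incident("Ransomware reported in one office")
incident.note("Source traced to an unpatched firewall vulnerability")
incident.advance()
print(incident.phase.name)  # CONTAINMENT
print(incident.log)
```

The tracker deliberately records only the phase and the notes. It does not prescribe actions, which keeps it compatible with a high-level plan.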
Incident Response Services
Most companies cannot afford to hire experts for every possible future need. Incident response tends to be infrequent, so most do not prioritize retaining incident response specialists.
Fortunately, we don’t have to limit incident response to internal resources. We can always outsource part or all of our incident response needs to consultants, managed service providers (MSPs), managed security service providers (MSSPs), or corporations specializing in incident response and forensics investigations.
The incident response market exceeds $1 billion in annual revenue, and those in need can consider many small and large corporations. Massive brands such as Kaspersky Lab, AT&T, Carbon Black, Cisco, CrowdStrike, IBM, Broadcom-Symantec, Trustwave, and Verizon have service offerings available.
Ideally, an organization should look ahead and screen potential incident response vendors in advance. Many incident response vendors can prepare Master Service Agreements that lock down the legal terms and basic pricing in advance. Once an incident has started, companies lose the luxury of being selective.
When creating a starting list of candidates, discuss options with legal counsel and with the cybersecurity insurance company. Often, they may limit selection to a pre-approved list of vendors.
Whether during an incident or in advance, the same criteria will require consideration:
- Technical skills
- Understanding of the specific technology in use
- Availability on short notice
- Company culture match
The prioritization and weight for each of these categories will vary from organization to organization. Fortunately, the earlier a company prepares for an incident, the more likely it will be able to handle it.
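Since the weight of each criterion varies by organization, one simple way to compare screened vendors is a weighted average of ratings. The sketch below is illustrative only: the weights, the 1 to 5 rating scale, and the vendor names are made up, and a real evaluation would also include the constraints set by legal counsel and the insurer.

```python
CRITERIA = [
    "Technical skills",
    "Understanding of the specific technology in use",
    "Availability on short notice",
    "Company culture match",
]


def weighted_score(ratings, weights):
    """Weighted average of a vendor's ratings across all criteria."""
    total_weight = sum(weights[c] for c in CRITERIA)
    return sum(ratings[c] * weights[c] for c in CRITERIA) / total_weight


# Hypothetical weights for one organization (higher means more important).
weights = dict(zip(CRITERIA, [4, 3, 2, 1]))

# Hypothetical ratings on a 1 to 5 scale.
vendors = {
    "Vendor A": dict(zip(CRITERIA, [5, 3, 4, 2])),
    "Vendor B": dict(zip(CRITERIA, [3, 5, 5, 4])),
}

ranked = sorted(vendors, key=lambda v: weighted_score(vendors[v], weights), reverse=True)
for name in ranked:
    print(f"{name}: {weighted_score(vendors[name], weights):.2f}")
```

Changing the weights changes the ranking, which is the point: two organizations can reasonably pick different vendors from the same shortlist.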