The rapid digitization of financial services in the last few years, in both products and operations, has led to a sharp rise in the number of incidents firms are dealing with, be they internal software bugs, third-party vendor vulnerabilities, or cyberattacks. For firms to stay competitive in an ever-evolving market and cyber environment, they must excel at incident response.
Incident response today must be rapid, focused, and streamlined. It must occur in a high-trust environment that allows for control over what is often mission-critical data. It must allow multiple teams that may have minimal day-to-day contact (developers, security operations, communications, legal, compliance, executives, and external partners such as crisis management firms) to communicate and collaborate with the incident response team and with each other to solve the issue. And it must enable clear handoff procedures for ongoing escalations so that there is never a question of who is in charge of what at any given time.
Financial institutions tend to fit into one of three categories in terms of incident response maturity:
1. Ad hoc: The firm has a binder or, with the rise of remote work, a shared Word or Google document or online folder with standard operating procedures for incident response as well as a roster of the responsible people. A new hire may be given the doc upon onboarding and told, “This is what you go to when something bad happens.” This approach is slow and cannot adapt to changing contexts. Nine times out of ten, the manually maintained procedures and rosters are out of date. That said, this is the bare minimum a firm should have.
2. Point solutions: In the rapid move to remote work, the market has exploded with digital replacements for what we all used to do in person. The flow for incident response may include separate tools for collaboration, on-call, escalations, reports, archiving, compliance, etc. Many of these tools tend not to integrate, requiring constant context switching and adding tedious work like copying and pasting text, troubleshooting broken links, and provisioning the right access to the right systems and files in a fluid situation with frequent handoffs. There are multiple points of potential failure and multi-way dependencies. Gaps between systems are especially dangerous in a situation where control over data is paramount. So much energy is required to make the system work that once the incident is resolved, there is little energy to do retrospectives and figure out where improvements can be made for next time.
3. Integrated solution: Firms use one incident response system integrated with their primary communication tools and systems. This allows all involved to see who the incident commander is at any given time, what playbook is running, and what steps are completed, in process, or outstanding. Executives can see both the high-level picture and technical details in real time. With these efficiencies, teams do have the energy to do post-incident retrospectives and figure out where the bottlenecks are and what might be automated to streamline the process moving forward. Most importantly, the processes and rosters are updated and improved upon with every cycle.
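As a rough illustration of the visibility described for the integrated approach, the sketch below models a single incident record that tracks the current incident commander, the running playbook, and the status of each step. Every name here (`Incident`, `Step`, `Handoff`, `hand_off`, `summary`) is hypothetical and chosen for illustration; it does not reflect any particular product's data model.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Dict, List, Optional


class StepStatus(Enum):
    OUTSTANDING = "outstanding"
    IN_PROCESS = "in process"
    COMPLETED = "completed"


@dataclass
class Step:
    title: str
    status: StepStatus = StepStatus.OUTSTANDING


@dataclass
class Handoff:
    commander: str
    at: datetime


@dataclass
class Incident:
    """Hypothetical incident record; field names are assumptions."""
    name: str
    playbook: str
    steps: List[Step]
    handoffs: List[Handoff] = field(default_factory=list)

    def hand_off(self, commander: str) -> None:
        # Append rather than overwrite, so earlier commanders remain
        # available for the post-incident retrospective.
        self.handoffs.append(Handoff(commander, datetime.now(timezone.utc)))

    @property
    def commander(self) -> Optional[str]:
        # The most recent handoff identifies who is in charge right now.
        return self.handoffs[-1].commander if self.handoffs else None

    def summary(self) -> Dict[str, int]:
        # Count steps per status for a high-level, executive-style view.
        counts = {status.value: 0 for status in StepStatus}
        for step in self.steps:
            counts[step.status.value] += 1
        return counts


incident = Incident(
    name="Vendor API outage",
    playbook="Third-party outage",
    steps=[
        Step("Confirm scope"),
        Step("Notify compliance"),
        Step("Run retrospective"),
    ],
)
incident.hand_off("alice")
incident.steps[0].status = StepStatus.COMPLETED
incident.steps[1].status = StepStatus.IN_PROCESS
incident.hand_off("bob")

print(incident.commander)
print(incident.summary())
```

Keeping the handoff history as an append-only list is one possible design choice: it answers "who is in charge right now" while also preserving the record that a retrospective would need.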
When the overarching goal is reducing the frequency and severity of incidents, an integrated incident workflow collaboration and response solution allows for a continuous improvement cycle that can save precious time in a crisis. However, it is critical that the integrated solution does not include the same general collaboration or chat tool that could be the target of an attack. If that main system goes down, the incident response team (the “firefighters”) is trying to put out the fire from within the burning building. If an attack wipes out the ability of the incident response team to collaborate remotely, or compromises the integrity of the platform, that will lengthen the time to resolution of the existing problem, sometimes dramatically.
When it comes to incident response, there are two levels of efficiency – how fast you can respond, and the utilization rate of assets used in responding. The more outages you have, the more resources you must divert from other value-creating initiatives. Every organization must forecast and budget for incident response and make the tradeoffs it deems suitable. For example, you can go light on load testing, but recognize you may need more resources to respond to increased outages later on. An integrated incident response platform allows you to do the data analysis necessary to make the best decisions in real time.
A flexible one-stop shop for incident response is not just about excelling at incident response. It is also a signal to the talent market. Developers and security operations professionals have gone from being a back-office cost center to powering the digital transformation necessary to stay ahead in an increasingly competitive financial services landscape. With C-suites focused on digital transformation and innovation, high-caliber tech talent is increasingly central to the business.
Cybersecurity has a near-zero unemployment rate, and the community is strong. When someone can change jobs in one minute by closing one laptop and opening another, talent retention requires authentic investment in workplace and culture. One way firms can do this is by arming their talent with the best tools to do their jobs. Teams too often rely on ill-fitting collaboration tools that fall short of supporting the fast-moving, highly technical workflows required for effective incident response. Ensuring that incident response happens quickly, smoothly, and predictably with the right integrated and dynamic tools signals a thoughtful firm that appreciates the teams delivering technical innovation.
Rapid digitization is inevitably accompanied by an increase in outages, vulnerabilities, and cyberattacks, making robust incident response increasingly critical to financial firms. Incident response must happen quickly and smoothly, with integrated tools that allow for maximum collaboration between all relevant functions, real-time data analysis, and executive visibility into both the big picture and technical details. Strong incident response is not only an efficiency gain but also a technical talent retention strategy and an overall vehicle for growth and innovation.
© 2022 FS-ISAC, Inc. All rights reserved.
Ian is CEO and Co-Founder of Mattermost. He previously founded SpinPunch, Inc., an online video game company with millions of players across 190 countries. Prior to SpinPunch, Ian was VP of Product at Flickme, a movie streaming startup backed by Sequoia Capital, Warner Brothers, and Sony Pictures. He also ran product management for Microsoft SkyDrive (now “OneDrive”) and Hotmail (now “Outlook.com”) and led engineering teams for Microsoft Office. Ian holds over a dozen patents in analytic applications and is an alumnus of the University of Waterloo, where he worked at Trilogy Software during school, and the Stanford Graduate School of Business, where he served as a teaching assistant for Andy Grove and Myron Scholes.