A Process-Based Approach to Handling Risks
Wayne Jones
Al Gallo

Abstract: Good cybersecurity risk management is critical to successful management in all areas of the enterprise. The only feasible approach to this is one built on well-defined yet flexible processes.

Select any number of IT professionals at random, and one of the few topics of agreement among them is the critical role that managing risk plays in job success. Such agreement, however, is short-lived since the phrase risk management has come to mean something different to everyone; it's both perceived and expressed in terms of specific disciplines.

Definition of Risk

In 2002, the US National Institute of Standards and Technology released Special Publication 800-30, Risk Management Guide for Information Technology Systems, to help a wide range of IT professionals within federal organizations. The guide defines risk as "the net negative impact" resulting from the combination of both probability of occurrence and the adverse impact the event would have. US federal agencies such as the Department of Defense, NASA, and the Environmental Protection Agency have developed formal policies and guidelines that require the implementation of active risk management programs across their entire enterprises. In the private sector, where risk is a threat to the bottom line, everyone from the airlines to drug companies and from demolition experts to investment bankers has developed working definitions of risk and the optimal approach to managing it based on the specifics of their domains.

IT and cybersecurity professionals are no different when it comes to a need for effective approaches to the day-to-day management of risks. Regardless of the specific application area, there is general concurrence that good risk management is critical to successful management in all areas. Yet there is almost no agreement among similar organizations when it comes to exactly what is appropriate or how much should be done to ensure organizational success. This is true partially because cybersecurity groups are shaped to fit their parent organizations and also because threats come in many forms and generally occur at the worst possible times. It would be impossible for any CIO, IT manager, or professional team to be 100 percent prepared for each of the risks that inevitably arise in the course of performing their day-to-day functions. Therefore, in the absence of complete, essential foreknowledge, the only feasible approach is one built on well-defined yet flexible processes for managing risks on a continuous basis.

NNSA Background

Congress established the National Nuclear Security Administration (NNSA) in 2000 as a semiautonomous agency within the US Department of Energy. It's responsible for enhancing national security through the military application of nuclear energy. The NNSA has four missions with regard to national security:

  • To provide the US Navy with safe, militarily effective nuclear propulsion plants and to ensure the safe and reliable operation of those plants.
  • To promote international nuclear safety and nonproliferation.
  • To reduce global danger from weapons of mass destruction.
  • To support US leadership in science and technology.

As outlined before Congress in its Complex 2030 plan, NNSA's future path is to establish a smaller, more efficient nuclear weapons complex that is able to respond to changing national and global security challenges.

The NNSA's Office of the Chief Information Officer (OCIO) provides policy and guidance pertaining to safe and secure information technology and information management operations throughout the NNSA computing enterprise. The OCIO develops, oversees, and manages systems for the safe and secure transportation of all DOE- or NNSA-controlled data and information in strategic or significant quantities for the NNSA Nuclear Weapons Complex. To ensure compliance with agency goals and objectives for Complex 2030, the OCIO will implement a risk-based approach to the ongoing challenges associated with cybersecurity.

Risk-Based Approach

The NNSA OCIO cybersecurity management incorporates a risk-based approach. This requires the analysis of threats to information; the search for any vulnerability that might exist in networks, hosts, and applications used to process that information; and the assessment of the value of information and computing resources to the NNSA enterprise. We can think of risk as a combined function of threat, vulnerability, and value, where risk increases proportionally with the degree of each. Each site must determine its risk based on the value of the information it's responsible for protecting, the threats, and the vulnerability of its systems. The results of this assessment help determine appropriate countermeasures, while ensuring that risks accepted by one domain will not be imposed inadvertently on other domains within the enterprise.
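
One way to make this "combined function" concrete is a simple product of three ratings. The sketch below is illustrative only: the multiplicative form, the 0.0-to-1.0 scales, and the name risk_score are assumptions made for this example, not part of NNSA guidance.

```python
def risk_score(threat: float, vulnerability: float, value: float) -> float:
    """Hypothetical risk score: the product of three ratings on a 0.0-1.0 scale."""
    for name, rating in (("threat", threat),
                         ("vulnerability", vulnerability),
                         ("value", value)):
        if not 0.0 <= rating <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0, got {rating}")
    return threat * vulnerability * value


# Doubling a single factor doubles the score; a zero factor removes the risk.
low = risk_score(threat=0.2, vulnerability=0.5, value=0.8)
high = risk_score(threat=0.4, vulnerability=0.5, value=0.8)
```

With a product, a site that holds high-value information but faces little threat can score the same as a site with modest information under heavy threat, which is one reason each site has to work through all three inputs for itself.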

An analysis of these threats within the NNSA's computing environment leads to an awareness of the vulnerabilities and the determination of appropriate countermeasures. NNSA's sites and laboratories perform risk assessments and evaluations of cybersecurity threats relevant to their local conditions, information, missions, and environments. Site- or lab-specific risks must be mitigated in a manner consistent with local cybersecurity policy; NNSA-wide cybersecurity guidance, policies, and procedures; and the cyber architecture implemented at each site or lab. However, this is only one step in the process of implementing a risk-based approach to cybersecurity. Each location must ensure that there are programs in place that not only test and evaluate the efficiency of the implemented architecture and controls but also provide continued effectiveness in the face of any evolving risks.

The risk management process begins with the development of a policy, including a clear definition of acceptable risk tolerance, which then guides the operational procedures, the output of which is crucial to both decision makers and stakeholders. The risk identification process uses near real-time data to identify emerging vulnerabilities and threats related to security technology, people, and processes. Through risk analysis, potential threats are examined and quantified according to the likelihood of attack, the asset value to the business, the asset's location on the network, and any potential legal or compliance issues related to the risk. The response plan and risk mitigation road map then prioritize appropriate actions to reduce risk as quickly and cost effectively as possible.

Having a clear understanding of the potential benefits and risks associated with cybersecurity has become a critical mission for today's CIO and senior management. Such things as cyber attacks, computer abuse, privacy issues, and identity theft have not only raised the level of agency awareness but also have led to a new wave of regulatory requirements, which must be clearly understood and implemented by the OCIO. Failure to understand either the threats or implementation regulations can result in serious consequences for the agency and its leadership.

The OCIO must perform a clear and decisive leadership role in the area of cybersecurity to ensure the protection of the agency's assets. CIOs and their staffs must recognize that threats are not simply technology issues but have become serious concerns for the entire agency enterprise. The ability to address information risk and security at an enterprise level requires sophisticated thinking that cuts across people, processes, technologies, and multiple locations. The technology leadership must develop a program that not only invests in technology solutions but also provides for similar levels of investment in supporting processes and human resources.

One of the biggest problems facing CIOs is the ability of their staffs not only to interface with, but also to align with the agency's primary business community. The ability to adopt an organizational view of cybersecurity risk provides agencies the opportunity to design improved cybersecurity architectures that will encompass policies, technologies, and human capital. A key element of this approach is the willingness of senior management to ensure that appropriate investments are made in protecting the agency's critical information assets. Other keys to good cybersecurity risk management include commitment to employee training programs at all levels and the integration of IT management with traditional risk management of such things as physical security.

Cast a Wide Net

A comprehensive risk management approach to cybersecurity requires the early identification of threats and vulnerabilities most likely to occur, the ability to qualify and quantify the potential harm to the agency, and the development and implementation of appropriate mitigation steps to achieve an acceptable risk level. It's too simplistic to say adequate risk management is the ability to manage devices, push sets of rule changes, or update patch levels. Comprehensive risk management also requires knowing which assets to patch first, determining what controls to implement, assessing whether or not patching occurred, and anticipating what effects remediation efforts will have on the organization's overall risk exposure.

In a distributed organization - whether hierarchical or federated in nature - senior managers will be most effective in their decision-making if risk-related information is derived and treated in the same way across all parts of the enterprise. In other words, they should be able to compare similar elements quickly and with a high degree of confidence that the same processes and standards were applied to the management of risks regardless of locale. To achieve this, a carefully crafted policy must be clearly articulated and the staff trained in the gathering, evaluating, managing, and reporting of emerging risks.

NNSA is working toward both these objectives by implementing the Software Engineering Institute's (SEI's) process-based approach, known as continuous risk management (CRM). Both the US Department of Defense and NASA successfully use this approach. This model makes some basic assumptions regarding the nature of risk. The first assumption is that risk is a feature that is ever present and continuously changing in all work environments; therefore, the processes needed to manage it must be equally present and continuously applied. The second assumption is that all risks are not created equal, and thus, don't require the same amount of an organization's generally dwindling resources. Another important concept is that risks go through a lifecycle of their own - that over the course of any business or mission activity, some risks come while others go; some might be relatively constant whereas others evolve and morph over time.

In designing any risk management program, it's crucial to make a clear distinction between a risk and a problem. A risk is an adverse condition that might or might not happen, while a problem is one that has occurred and is now fact. They are not the same and are not managed the same way.

Essence of Risk Management

The process model for continuously managing risk is essentially the same whether it is applied to a program or project with discrete start and end dates and well-defined deliverables or to an ongoing service or infrastructure support activity. The CRM process model is built from a series of subprocesses that collectively form a closed loop. The resulting process is then continuously applied over the life of an activity, as Figure 1 shows. The entry point into the model occurs at the upper right part of the figure, where a potential risk is first identified. A certain amount of preliminary examination of the potential risk takes place, and if the issue is determined to be an actual risk, that is, relevant within the context of the specific business area, it will go through the cycle until it's appropriately retired. Each subsequent step is intended to increase understanding and better management of the risk profile.


Figure 1. Graphic representation of continuous risk management

The following steps comprise the CRM process:

  • identify,
  • analyze,
  • plan,
  • track, and
  • control.

First, IT staff members must identify all issues relevant to risks. These are expressed in a standardized two-part format that states a condition (which either might develop or has already occurred) and the possible consequences that are or could be associated with that condition. If the issue passes initial screening and is judged relevant, it goes to the next step; otherwise, it's rejected, and nothing further is done with this issue.
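
The two-part format and the initial screening can be sketched as a small record type. This is a minimal illustration, not a prescribed schema: the class name RiskStatement, the sentence template in render, and the screen helper are all hypothetical, and the relevance check is left to the caller because the criteria are specific to each business area.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RiskStatement:
    condition: str    # what might develop or has already occurred
    consequence: str  # the possible negative outcome of that condition

    def render(self) -> str:
        # Assumed wording for the combined statement; adapt to local convention.
        return (f"Given that {self.condition}, "
                f"there is a possibility that {self.consequence}.")


def screen(statement: RiskStatement,
           is_relevant: Callable[[RiskStatement], bool]) -> bool:
    """Initial screening: True sends the issue on to analysis, False rejects it."""
    return bool(is_relevant(statement))


stmt = RiskStatement(
    condition="several servers are running an unsupported operating system",
    consequence="unpatched vulnerabilities will be exploited",
)
print(stmt.render())
```

Keeping the condition and the consequence as separate fields makes it easier to spot when different parts of the organization have recorded the same condition.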

Next, staff analyzes the risk. The same evaluation criteria are applied to all risks regardless of subject domain or location within the enterprise, and attributes are assigned to the risk. Each risk is described in terms of its probability of occurrence, its impact (or severity) if it were to happen, and the time frame in which mitigation action needs to begin.
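
As a rough sketch, the three attributes can be captured on a common scale so that risks from different sites are directly comparable. The three-level scales, the exposure formula (probability times impact), and the tie-breaking rule below are assumptions chosen for illustration; an organization's own criteria would replace them.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class TimeFrame(IntEnum):
    FAR = 1   # mitigation can begin later
    MID = 2
    NEAR = 3  # mitigation needs to begin soon


@dataclass
class AnalyzedRisk:
    title: str
    probability: Level
    impact: Level
    time_frame: TimeFrame

    def exposure(self) -> int:
        # Assumed convention: exposure = probability x impact, giving 1 to 9.
        return int(self.probability) * int(self.impact)


def prioritize(risks):
    """Order risks by exposure, breaking ties by how soon action is needed."""
    return sorted(risks, key=lambda r: (r.exposure(), r.time_frame), reverse=True)


risks = [
    AnalyzedRisk("Backup tapes stored on site", Level.LOW, Level.HIGH, TimeFrame.FAR),
    AnalyzedRisk("Unpatched web server", Level.HIGH, Level.HIGH, TimeFrame.MID),
    AnalyzedRisk("Expiring certificates", Level.HIGH, Level.LOW, TimeFrame.NEAR),
]
ordered = [r.title for r in prioritize(risks)]
```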

The next step involves planning and implementing the best course of action. There are four risk planning options:

  • accepting the risk as it exists today,
  • actively managing the risk,
  • watching for changes in risk characteristics (and modifying the plan as needed), and
  • investigating more about the risk until enough is known that one of the first three planning options becomes appropriate.

Once management decides on the plan of action, it assigns each open risk to a responsible entity, designated as the risk owner. This party is responsible for implementing the action plan and providing routine status reporting about the risk.
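
The planning step can be sketched as a choice among the four options plus a mandatory owner. The names PlanOption, OpenRisk, and assign, and the short labels given to the options, are hypothetical and used here only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PlanOption(Enum):
    ACCEPT = "accept"      # accept the risk as it exists today
    MITIGATE = "mitigate"  # actively manage the risk
    WATCH = "watch"        # watch for changes in risk characteristics
    RESEARCH = "research"  # investigate until another option becomes appropriate


@dataclass
class OpenRisk:
    title: str
    option: Optional[PlanOption] = None
    owner: Optional[str] = None


def assign(risk: OpenRisk, option: PlanOption, owner: str) -> OpenRisk:
    """Record the chosen plan option and the risk owner responsible for it."""
    if not owner.strip():
        raise ValueError("every open risk needs a named owner")
    risk.option = option
    risk.owner = owner
    return risk


planned = assign(OpenRisk("Unpatched web server"), PlanOption.MITIGATE, "Web operations team")
```

Rejecting an empty owner reflects the point that each open risk is assigned to a responsible entity; how that entity is named or looked up is left open.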

In the next step, staff members monitor and gather relevant metrics data. All open risks are routinely reported on along with other established reporting mechanisms. Provisions should also be made for exception reporting should something unexpected occur. This routine tracking provides information not only on the risks and their statuses but also on the effectiveness of the mitigation actions themselves.

In the final phase, managers must ensure that the management of risk, as with any other management process, is working as intended. Here, risk-related information is reviewed and assessed at the overall process level. Managers have the following four actions available to them based on the updated information and the evolving state of risks:

  • Continue the current plan option. This is when data indicates the risk is under control.
  • Invoke contingency. This is appropriate when the original plan did not work as expected.
  • Replan. Using updated information, go back to the planning step and change course.
  • Close the risk. This happens when the risk's probability and/or potential impact have gone to (nearly) zero.
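
The choice among these four actions can be sketched as a simple decision function. The inputs, the 0.01 closing threshold, and the rule that a contingency is invoked only when one has been prepared are assumptions for illustration; in practice this is a management judgment informed by the tracking data.

```python
from enum import Enum


class ControlAction(Enum):
    CONTINUE = "continue the current plan"
    INVOKE_CONTINGENCY = "invoke contingency"
    REPLAN = "replan"
    CLOSE = "close the risk"


def choose_action(probability: float, impact: float, plan_on_track: bool,
                  has_contingency: bool, close_threshold: float = 0.01) -> ControlAction:
    """Pick a control action from the latest tracking data (illustrative rules)."""
    # Close when probability and/or impact have gone to (nearly) zero.
    if probability <= close_threshold or impact <= close_threshold:
        return ControlAction.CLOSE
    # The data indicate the risk is under control.
    if plan_on_track:
        return ControlAction.CONTINUE
    # The plan did not work as expected: fall back to a prepared contingency,
    # or return to the planning step if none exists (an assumed distinction).
    if has_contingency:
        return ControlAction.INVOKE_CONTINGENCY
    return ControlAction.REPLAN


decision = choose_action(probability=0.6, impact=0.7, plan_on_track=False, has_contingency=True)
```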

Management's commitment to the overarching activities of communicating and documenting risk is the element that makes this process cohesive. This includes commitment to related information at each step of the CRM cycle, until a risk has reached the end of its lifecycle. This process continues with the identification of new or reemerging risks for as long as the infrastructure activity or business process is ongoing.

Conclusion

In dynamic domains such as cybersecurity, risks are highly changeable, and the time frame for managing them can be quite short; whereas, in other areas, more opportunity to plan ahead might exist. In either case, a single process spanning an enterprise can not only play a normalizing role, but also serve as a decision-support structure to managers. Investing in a comprehensive, yet flexible, CRM process that cuts across all cybersecurity activities strengthens an organization's effectiveness while paying high dividends well into the future.

Wayne Jones is Associate CIO for Cyber Security at the National Nuclear Security Administration. Contact him at wayne.jones@nnsa.doe.gov.

Al Gallo is Senior Program Manager at the National Nuclear Security Administration. Contact him at al.gallo@nnsa.doe.gov.



Commonly Asked Questions

The following are questions and answers about a process-based approach to managing cybersecurity risks:

Q: What's the best way to get started?

A: As with everything in the work environment, it starts with senior management. They must not only issue a policy statement, but also be seen as actively supporting the implementation of an organization-wide program of continuous risk management (CRM). After that, a procedure must be articulated and training must be provided. To be successfully incorporated into the organization, the CRM process should be designed to mesh with existing processes and activities.

Q: Where is the best place to look for risks?

A: The short answer is "anywhere and everywhere." Initially, a set of baseline risks will be identified, and, generally speaking, these are the issues that are obvious to anyone with any experience. Beyond that, mine relevant lessons learned, textbooks, benchmarks of similar organizations, professional publications, and peers - basically, make use of any relevant resource.

Above all, however, engage people at all levels of the organization - especially, the line folks. They are the ones who are most often the first to know when something is not working the way it should, and they probably have a good sense of what might be wrong. People are one of the most important organizational resources and must be included in every step of the CRM process.

Q: Are there any other ways to discover risks?

A: Several tools exist to help kick-start the identification process. Consider such things as industry standards, checklists, lessons learned, and best practices databases when trying to anticipate what could go wrong. In domains concerned with safety issues or the building of complex technical systems, organizations routinely use analytical techniques such as failure modes and effects analyses, probabilistic risk assessments, fault tree analyses, and/or failure modes, effects, and criticality analyses. Reviewing any of these engineering products can help anticipate future risk conditions.

Q: What's the difference between a risk and a problem?

A: A risk is a potential problem. Even when chances are 90 percent that something adverse might happen, there is still a 10 percent possibility that it might not; so, strictly speaking, it's still only a risk. The moment in which the potential becomes a reality, it's a problem, and you must deal with it as such. Remember, people manage risks, but they deal with problems.

Q: What is meant by the statement: all risks are not created equal?

A: Not every risk is equally critical; nor is each one equally imminent. Since money and time are limited resources, the CRM process ranks risks in order of their importance. Generally, most organizations tend to follow the 80-20 rule, which implies that 80 percent of an organization's risk profile comes from the top 20 percent of its risks. Thus, it only makes sense to focus and allocate resources to the most important risks, while remaining aware of the others.
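
A minimal sketch of this kind of ranking follows. It assumes each risk already has a numeric exposure value (for example, probability times impact); the function name top_risks, the 0.8 default share, and the sample figures are hypothetical.

```python
from typing import Dict, List


def top_risks(exposures: Dict[str, float], share: float = 0.8) -> List[str]:
    """Return the highest-exposure risks that together cover `share` of the total."""
    total = sum(exposures.values())
    if total <= 0:
        return []
    selected: List[str] = []
    running = 0.0
    # Walk the risks from largest to smallest exposure until the share is covered.
    for name, exposure in sorted(exposures.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(name)
        running += exposure
        if running >= share * total:
            break
    return selected


# Made-up exposure values, for illustration only.
sample = {
    "phishing": 50.0,
    "unpatched servers": 35.0,
    "lost laptops": 8.0,
    "weak passwords": 4.0,
    "tailgating": 3.0,
}
focus = top_risks(sample)
```

In this made-up sample, two of the five risks account for 85 percent of the total exposure, so they would receive most of the attention while the remaining three stay on the watch list.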

Q: What is meant by expressing a risk in a standardized two-part format?

A: In large organizations, especially ones with multiple geographically dispersed divisions, standardizing the statements of risk has several benefits. Beyond enabling Division A to instantly comprehend the issues facing Division B, standardization generally helps break down stovepipes and organizational barriers. It also facilitates communication and can lead to increased efficiency, as in the case where multiple parts of the same organization identify similar risks - a frequent occurrence since they are generally working toward common goals and share the same constraints.

Q: What is an example of the two-part format?

A: A risk is made up of both a condition and its logical consequences. The condition is a single phrase stating some assumption regarding key circumstances or situations that are causes of concern; the consequence is a single phrase that describes the possible negative outcome. Figure A shows a way of viewing a risk statement.


Figure A. A risk statement.

Q: What if a condition is so broad that it has multiple consequences?

A: If you will be mitigating the condition, then focus on that, and limit the number of risks you record. If, however, you cannot influence the condition, as with an impending budget cut, you will likely have to manage a whole range of consequences on almost a case-by-case basis. For example, a 10 percent budget cut might have to be spread across several areas within the organization. R&D activities, plant expansion, outreach, bonuses, and so on could all be affected, and each of these consequences would have to be managed differently. Thus, it would be appropriate to identify more risks so that their management could be assigned and tracked by the most appropriate entity.

Q: What are risk rating criteria?

A: Risk rating criteria are frequently represented in a matrix and are clearly established as a fundamental part of a risk management plan. The criteria and rating system are carefully crafted by senior management and expressed in terms that can be used across various integrated teams or entire organizations. It is the common rating criteria that enable the comparison of like elements to each other.
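
A rating matrix of this kind can be sketched as a lookup table keyed by probability and impact. The three-by-three layout, the level names, and the green/yellow/red ratings below are assumptions for illustration; the actual criteria are whatever senior management establishes.

```python
# Hypothetical 3x3 rating matrix: (probability, impact) -> rating.
RATING_MATRIX = {
    ("low", "low"): "green",
    ("low", "medium"): "green",
    ("low", "high"): "yellow",
    ("medium", "low"): "green",
    ("medium", "medium"): "yellow",
    ("medium", "high"): "red",
    ("high", "low"): "yellow",
    ("high", "medium"): "red",
    ("high", "high"): "red",
}


def rate(probability: str, impact: str) -> str:
    """Look up the rating for a probability/impact pair (case-insensitive)."""
    key = (probability.lower(), impact.lower())
    if key not in RATING_MATRIX:
        raise ValueError(f"unknown probability/impact pair: {key}")
    return RATING_MATRIX[key]


rating = rate("Medium", "High")
```

Because every team uses the same table, a "red" risk reported by one site means the same thing as a "red" risk reported by another.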

Q: Is there a recommended way of summarizing and reporting a large number of risks to upper management?

A: The best thing to do is to use existing reports and formats; however, Figure B illustrates one highly effective style for communicating risks at some point in time. The numbers in the grid indicate the total number of risks in each category of attributes, and the numbers in the stoplight are the sum of the numbers by group.
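
The two sets of numbers in such a summary can be sketched with simple counting. Figure B's exact layout is not reproduced here; the attribute pairs, the scoring thresholds in stoplight, and the sample data are assumptions made for illustration.

```python
from collections import Counter

LEVELS = {"low": 1, "medium": 2, "high": 3}


def stoplight(probability: str, impact: str) -> str:
    """Assumed grouping rule: score = probability level x impact level."""
    score = LEVELS[probability] * LEVELS[impact]
    if score >= 6:
        return "red"
    if score >= 3:
        return "yellow"
    return "green"


def summarize(risks):
    """Return (grid, lights): risk counts per attribute pair and per stoplight group."""
    grid = Counter(risks)  # number of risks in each (probability, impact) cell
    lights = Counter()
    for (probability, impact), count in grid.items():
        lights[stoplight(probability, impact)] += count
    return grid, lights


# Made-up open risks, each reduced to its (probability, impact) pair.
open_risks = [
    ("high", "high"), ("high", "medium"),
    ("medium", "medium"), ("medium", "medium"), ("low", "high"),
    ("low", "low"),
]
grid, lights = summarize(open_risks)
```

Here grid holds the numbers that would appear in the cells, and lights holds the sums shown in the stoplight.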


Figure B. A format for summarizing and reporting a large number of risks.



Reprinted from IT Professional, vol. 9, no. 2, March/April 2007, pp. 10-15.
