Several weeks ago, the Colonial Pipeline cybersecurity event took place and reverberated through our everyday lives. Many folks found it difficult, if not impossible, to get fuel on the eastern seaboard. In my first blog about smart infrastructure, I focused on the multiple benefits of bringing both Internet of Things (IoT) and Operational Technology (OT) into the public infrastructure space. If you are not familiar with the term OT, Gartner defines it as hardware and software that detects or causes a change, through the direct monitoring and/or control of industrial equipment, assets, processes, and events.
As we move towards intelligent infrastructures and smart communities, we must realize that many of these systems are ‘critical’ systems, meaning that they are instrumental to human health and safety. Examples include water and power distribution as well as intelligent traffic and transit management systems. However, as the Colonial Pipeline event illustrates, a disruption to one critical resource can spread and exacerbate problems in other critical systems. For example, a drastic shortage of fuel could result in eventual food shortages in our local markets. Fortunately, Colonial was able to get back online. But the organization made several stumbles in its response to the event. We need not go into those details here, but the lesson is always to prepare for incident response.
Instead, I would like to take a proactive stance and focus on the practices and technologies that we can use to implement a proper ‘security shield’ for our smart systems. The security of infrastructure systems raises a few obvious challenges, which I have listed below:
These systems tend to be highly interactive and depend upon sensor feedback and, where required, control actuation. In a manufacturing facility, systems are typically environmentally controlled and concentrated at the production sites. In contrast, Intelligent Transportation Systems (ITS) or water and power distribution infrastructure often need to deal with extreme environments while remaining operational.
Infrastructure networks typically need to span large geographic areas, which brings maintenance challenges and increases the need for central operations and control. Conversely, in many instances, sensor and control streams do not go back to central operations but are instead acted upon closer to the remote location. Edge computing comes into play to shorten the command and control loops required to manage the systems in remote areas. This is particularly true with ITS. Central control will typically provide monitoring of the overall system, the Human-Machine Interface (HMI), and intervention if necessary. Edge compute might still be needed at distributed sites.
Critical systems cannot afford any outages. Many of these systems have strict regulations and standards for uptime that must be complied with. The response of the network to outages is critical in ensuring systems communication. We also must understand that not all outage events will be cyber in nature. Other circumstances, such as extreme weather, can have a significant impact on our infrastructure systems and are just as dramatic in their effect. Dealing with a weather emergency can rapidly take us out of the bounds of normal IT best practices.
I could go on, but I will not digress. If you think about the three bullets above, there should be several extensions that you can rapidly draw out in your mind if you run a county or city infrastructure. So, the challenge is clear. How do we provide security to these large, complex systems spread across large geographies in sometimes remote locations?
Again, I will provide a list of approaches and best practices below:
This seems to be an obvious point. But surprisingly, it is often neglected or implemented incorrectly. High-resolution cameras are of the utmost importance, particularly those capable of night vision. License plate recognition and even facial recording can be a requirement in certain enhanced systems. Again, the network plays a critical role in conveying such information or indications of systems tampering to the appropriate teams.
This is a much larger subject, but we will condense it into a few precious nuggets. Maybe you used a video surveillance network to get the information that you needed to react. But let’s just assume that these perpetrators still managed to compromise and disable the service. This has been known to occur in the past, but attackers rarely use such methods unless they are amateurs. Most often, the adversary is hundreds if not thousands of miles away, perhaps on a different continent. This situation requires a different mode of thinking, which we will focus on for the rest of this article.
Several practices can vastly simplify the security implementation of smart infrastructure. First, many industry consortiums help define a framework for the proper implementation of IoT and Industrial Control Systems (ICS). One of the first is the Center for Internet Security (CIS) Controls. CIS has some great resources that assist in both planning and implementation. I also highly recommend the CIS specialized handbook for IoT and ICS security. The handbook is an excellent condensation based on a few principles that I will mention later.
Another resource to consider is the overall systems threat framework provided by MITRE ATT&CK, a globally accessible knowledge base of adversary tactics and techniques based on real-world observations. The MITRE ATT&CK database is extremely comprehensive, and unless you are a security professional, it might seem overwhelming. The MITRE knowledge base maps the potential ‘attack tree’ or the steps and methods that an attacker might use at a particular stage. Consider this valuable information to have if you have an early indication of compromise.
We need to consider application security as well. Vulnerabilities in any application’s implementation can result in a security compromise. We need to consider how the application is sourced and/or created and have the appropriate scanning techniques to identify code vulnerabilities before malicious actors discover them. The Open Web Application Security Project (OWASP) can be beneficial in understanding what to look for and how to address the issue at hand. The project periodically releases its list of the top ten critical risks to web applications.
Let’s not forget cloud security. How cloud services are deployed is another area where vulnerabilities can be introduced. To address cloud security, both a proper architecture and best practices can significantly assist in alleviating this concern. The Cloud Security Alliance is the place to go. The CSA provides recommendations and a community where training and certifications can be acquired.
If you reference these resources, you will find a treasure trove of guidance and practices that you can implement to secure your critical infrastructure. Here is my summary of some necessary steps:
Get an inventory of all IoT/OT/ICS systems in your environment. Understand what they are for and who owns them. Ownership is a very important point that you absolutely must address. If an IoT device has no owner, the device has no place in your network. Remember the golden rule. You cannot secure what you are not aware of.
From this inventory, gain knowledge of what the normal systems communications patterns are. If possible, get this directly from the manufacturer but validate it in a lab. You should possess a solid understanding of each system’s normal communication profile. Additionally, if you have deployed IoT equipment whose manufacturer has since gone defunct, those devices should be removed, as any potential for patching or upgrades is non-existent.
Once you have the system communication profile, you should implement policies and a segmentation design for the systems in question. Share this information only on a need-to-know basis. Frankly, unless someone can give me three valid reasons why we need remote systems access for workers, I always opt for complete systems isolation with very tight controls at the security demarcation. Furthermore, you should perform constant monitoring for any anomalous behavior at this level. You can garner valuable information by comparing the ‘normalized profile’ against what the device is actually doing. Consider this one of the best early indicators of systems compromise. Anomalies in behavior caught at this early stage can make a huge difference.
Do not forget about your application and cloud implementations. Be sure that they are accounted for and designate best practices with the right tools to provide consistency and assurance. It never hurts to run internal and external penetration tests with red teams, which can be staffed by employees or outsourced to a third party. The value of using outsourced professionals is that you remove any potential bias that could occur with internal teams. Also, scanning should be performed by other teams, or at least by different individuals than the developers themselves. This avoids the “fox in the chicken coop” scenario.
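The inventory, ownership, and baseline steps above can be sketched in a few lines of Python. This is only a minimal illustration under assumptions of my own: the `Device` record, the device names, the hostnames, and the flow format (destination, protocol, port) are all hypothetical, and a real deployment would pull this data from an asset database and network flow records rather than hard-coded values.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

# Hypothetical flow format: (destination host, protocol, destination port).
Flow = Tuple[str, str, int]


@dataclass
class Device:
    name: str
    purpose: str
    owner: Optional[str]            # None means nobody has claimed the device
    vendor_supported: bool = True   # False if the manufacturer is defunct
    baseline: Set[Flow] = field(default_factory=set)  # lab-validated profile


def devices_to_remove(inventory: List[Device]) -> List[str]:
    """Names of devices that have no owner or no remaining vendor support."""
    return [d.name for d in inventory
            if d.owner is None or not d.vendor_supported]


def anomalous_flows(device: Device, observed: Set[Flow]) -> Set[Flow]:
    """Observed flows that are not part of the device's normal profile."""
    return observed - device.baseline


# Illustrative inventory; every value here is made up.
inventory = [
    Device("pump-ctrl-01", "water pump control", "water-ops",
           baseline={("scada.example.internal", "tcp", 502)}),
    Device("cam-17", "intersection camera", None),
    Device("sensor-old", "air quality", "public-works",
           vendor_supported=False),
]

# Illustrative traffic seen from pump-ctrl-01.
observed = {
    ("scada.example.internal", "tcp", 502),
    ("203.0.113.50", "tcp", 4444),
}

print(devices_to_remove(inventory))
print(anomalous_flows(inventory[0], observed))
```

With these sample values, the first call flags the unowned camera and the unsupported sensor, and the second reports the single connection that falls outside the pump controller's baseline. The set difference is deliberately simple; it treats anything not in the profile as suspicious, which fits the isolation-first stance described above.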
Recently, the National Institute of Standards and Technology (NIST) has entered this discussion with Special Publication 1800-15 and the corresponding RFC 8520. This is a consortium effort for IoT manufacturers to publish what are known as ‘Manufacturer Usage Descriptions’ (MUD), which are defined in RFC 8520. The proposal is for a vendor to post a standardized description of a system, what it does, and a set of recommended policies, represented in a YANG-based JSON description. NIST 1800-15 was published and finalized just this past May, so we have yet to see how much industry traction it will gain. One concern is that it places a lot of responsibility, and perhaps legal liability, on the vendor for a component that might not be of high monetary value. One of two things could happen: NIST 1800-15 gets little acceptance, or the cost of inexpensive IoT devices might increase. We will need to watch this. As always, cost, convenience, and general malaise are the enemies of the security for which we strive. Sometimes we need to make up for those differences.
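To make the idea of a vendor-published, YANG-based JSON description more concrete, here is a rough Python sketch that reads such a file and lists the outbound connections the vendor says the device needs. The JSON is a trimmed, made-up example modeled on the structure in RFC 8520 (which calls it a MUD file); it omits fields a complete file would carry, the vendor domain is fictional, and the field names should be checked against the RFC before relying on them. The `allowed_outbound` helper is my own illustration, not part of any standard tooling.

```python
import json

# Trimmed, illustrative description file. The domain names are fictional and
# the layout is modeled on RFC 8520; verify field names against the RFC.
MUD_JSON = """
{
  "ietf-mud:mud": {
    "mud-version": 1,
    "mud-url": "https://vendor.example.com/sensor.json",
    "systeminfo": "Example roadside sensor",
    "from-device-policy": {
      "access-lists": {"access-list": [{"name": "from-sensor"}]}
    }
  },
  "ietf-access-control-list:acls": {
    "acl": [
      {
        "name": "from-sensor",
        "type": "ipv4-acl-type",
        "aces": {
          "ace": [
            {
              "name": "to-cloud",
              "matches": {
                "ipv4": {
                  "ietf-acldns:dst-dnsname": "telemetry.vendor.example.com",
                  "protocol": 6
                },
                "tcp": {
                  "destination-port": {"operator": "eq", "port": 443}
                }
              },
              "actions": {"forwarding": "accept"}
            }
          ]
        }
      }
    ]
  }
}
"""


def allowed_outbound(mud_text):
    """Return (dns name, IP protocol number, port) for each accepted rule
    referenced by the file's from-device policy."""
    doc = json.loads(mud_text)
    policy = doc["ietf-mud:mud"]["from-device-policy"]
    wanted = {a["name"] for a in policy["access-lists"]["access-list"]}
    rules = []
    for acl in doc["ietf-access-control-list:acls"]["acl"]:
        if acl["name"] not in wanted:
            continue
        for ace in acl["aces"]["ace"]:
            if ace["actions"]["forwarding"] != "accept":
                continue
            matches = ace["matches"]
            ip = matches.get("ipv4") or matches.get("ipv6") or {}
            tcp = matches.get("tcp", {})
            port = tcp.get("destination-port", {}).get("port")
            rules.append((ip.get("ietf-acldns:dst-dnsname"),
                          ip.get("protocol"), port))
    return rules


print(allowed_outbound(MUD_JSON))
```

A list like this could seed the segmentation policy for a device, but it should be treated as the vendor's claim rather than ground truth. A real deployment would also need to fetch the file securely and verify its signature, which this sketch skips.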
It’s important to normalize behavior on your own and not be dependent on what a vendor may or may not post. But still, I think that these industry efforts show promise, and we will be taking security for these systems more seriously in the future. Time will tell. In the meantime, we give you the tools to do it on your own. Remember, as a CIO or IT/OT director, malaise is always an option. That is, until something like Colonial happens. Then your day, your year, or even your whole career could change drastically.