CYA – the Starting Point for Every Security Question
October 19, 2017
No, we don’t mean ‘cover your …’. We mean ‘comprehend your assets’ – and specifically the minimum set of inventories that are vital to accurate metrics and security reporting: devices, users and applications.
In every business, no matter how large, CISOs talk about the problems they have with this. Depending on the maturity of processes in IT operations there are three typical scenarios.
- Inventories are held in disparate Excel spreadsheets, which both security and the people who own them recognise as being out of date by the time they’ve been put together (usually by an army of consultants).
- There is a recognised ‘golden source’ inventory, which is owned by IT operations and is believed to be largely current based on experience, but which cannot be trusted as totally complete. Even in the best cases, unless there is a rigorous process for getting devices onto the network, there’s usually a lag between ‘appears in network logs’ and ‘is registered and up to date in inventory’.
- There are businesses that have mapped business service owners to applications to infrastructure, and have mapped workflows for who is accountable to manage issues across the technology stack (note: they may still share the issues mentioned in the point above).
Businesses in this last category are rare beasts indeed. According to a CISO speaking at the recent FS-ISAC meeting in Baltimore, Finance as an industry weighs in at 0.5 on a 1-5 maturity scale for device inventory management. That’s a problem, because without understanding the size and current state of the ‘things you do security to’, security teams run into difficulties:
- It’s hard to know if controls have the coverage they should do. For example, is your vulnerability scanner hitting every asset with the regularity you expect? Are there assets it never hits? What vulnerabilities do those devices have? Where are they in relation to attack surfaces that are available to threats?
- It’s hard to know if your controls are operating consistently (i.e. as you expect) in terms of how they are rolled out. For example, does the patch process cover every device in the same way, or is a percentage of devices being left out for some reason? Is that happening regularly, or totally randomly? And what is the management issue that’s causing it?
When you layer on exceptions (e.g. temporary risk acceptance) and exemptions (e.g. permanent risk acceptance), it quickly becomes obvious that without the ability to cumulatively track this – and understand whether compensating controls are doing their job – you have no idea if you’re at risk of compromise from ‘death from 1000 cuts’: a build-up of compound issues that, taken by themselves, are acceptable; but which cumulatively are not.
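To make the ‘cumulative’ part concrete, here is a minimal sketch of counting risk acceptances per asset. The record layout, the asset and control names, and the threshold of three are all assumptions made up for illustration; a real programme would pull these from its own exception register and choose its own threshold.

```python
from collections import Counter

# Hypothetical risk-acceptance records: (asset, control, kind).
# 'exception' = temporary acceptance, 'exemption' = permanent acceptance.
acceptances = [
    ("db01", "patching", "exception"),
    ("db01", "av", "exemption"),
    ("db01", "vuln_scan", "exception"),
    ("web01", "patching", "exception"),
]


def compound_risk(acceptances, threshold=3):
    """Return assets whose total number of acceptances reaches the threshold.

    The threshold is an illustrative assumption, not a recommended value.
    """
    counts = Counter(asset for asset, _control, _kind in acceptances)
    return {asset: n for asset, n in counts.items() if n >= threshold}


flagged = compound_risk(acceptances)
```

Each record here might be acceptable on its own; the point of the count is to surface assets such as `db01` where several have piled up.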
And this is just at the ‘operations’ layer, long before you throw in any probabilistic analysis of financial loss, such as FAIR and other Bayesian / Monte Carlo methods.
Not only is current inventory important to establishing baselines for more complex risk and threat analytics; accuracy is also important to ensure that things that shouldn’t be there are removed. This could be stale hosts, users who have long left the firm, and other data that creates ‘garbage in’ to the results of metrics and analytics that will end up with at least some percentage of ‘garbage out’.
In the example of device inventory, security teams can look far more to their data to do the work for them. Take readily available sources that are discovery-led (Nmap, CMDB discovery modules, vulnerability scanners), network-based (i.e. what is talking on the wire? NetFlow, Windows event logs, DHCP), endpoint detection & response technology (e.g. AV), and change management (patching) endpoint tools (e.g. SCCM, BigFix), and you can quickly get a rough cut of which devices appear in one source but not in another. You can measure this continuously for temporal ‘check-ins’ (i.e. you’d expect AV on a device to register daily in a database, whereas vulnerability scanning may only touch devices on a 7-day cycle). That gives you your picture of operational consistency. And with exceptions and exemptions thrown into the mix, you can see assets where controls are not functioning as they should and address it, measuring the burn-down to mitigation using telemetry rather than tickets.
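A rough cut of this comparison can be sketched in a few lines. The sketch below assumes each source can be exported as a mapping of hostname to last check-in time; the source names, hostnames and expected intervals are hypothetical, chosen only to mirror the daily AV and 7-day scanning cadence mentioned above.

```python
from datetime import datetime, timedelta

# Hypothetical exports: source name -> {hostname: last time the source saw it}.
NOW = datetime(2017, 10, 19, 12, 0)
sources = {
    "cmdb": {"web01": NOW, "db01": NOW, "hr-laptop7": NOW},
    "av": {"web01": NOW - timedelta(hours=3), "db01": NOW - timedelta(days=4)},
    "vuln_scanner": {
        "web01": NOW - timedelta(days=2),
        "lab-pc3": NOW - timedelta(days=1),
    },
}

# Assumed check-in cadence per control (illustrative values only).
expected_interval = {"av": timedelta(days=1), "vuln_scanner": timedelta(days=7)}


def coverage_gaps(sources):
    """For every device seen anywhere, list the sources that do not know it."""
    all_devices = set().union(*(seen.keys() for seen in sources.values()))
    gaps = {}
    for device in sorted(all_devices):
        missing = sorted(n for n, seen in sources.items() if device not in seen)
        if missing:
            gaps[device] = missing
    return gaps


def stale_checkins(sources, expected_interval, now):
    """List (source, device) pairs whose last check-in is older than expected."""
    stale = []
    for name, interval in expected_interval.items():
        for device, last_seen in sources.get(name, {}).items():
            if now - last_seen > interval:
                stale.append((name, device))
    return sorted(stale)


gaps = coverage_gaps(sources)
stale = stale_checkins(sources, expected_interval, NOW)
```

In this made-up data, `lab-pc3` is seen by the scanner but is missing from the CMDB, and `db01` has AV installed but has not checked in for four days: one is an inventory gap, the other a consistency gap.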
Once you have this baseline understanding, data cleansing and control remediation in place, you can look for odd things happening that indicate an expected workflow has not been followed. For example, in user provisioning, say you are looking at HR data and correlating it continuously with Active Directory. All of a sudden you see a load of users who have admin privileges but are not admins. And you can see them using a generic user name (Windows 365). It also looks like the group in AD is for interns. Why is this? Is it anything to worry about? What do the logs say about how those privileges are being used?
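As a minimal sketch of that correlation, assume HR data can be exported as user-to-role and the directory as user-to-groups. The user names, role names and the choice of ‘Domain Admins’ as the privileged group are illustrative assumptions; this does not query a real directory.

```python
# Hypothetical HR export: user id -> job role.
hr_roles = {"asmith": "sysadmin", "bjones": "intern", "cwu": "accountant"}

# Hypothetical directory export: user id -> set of group names.
ad_groups = {
    "asmith": {"Domain Admins"},
    "bjones": {"Interns", "Domain Admins"},
    "cwu": {"Finance"},
    "svc-generic": {"Domain Admins"},
}

# Assumptions: which groups count as privileged, which HR roles justify them.
PRIVILEGED_GROUPS = {"Domain Admins"}
ADMIN_ROLES = {"sysadmin"}


def unexpected_admins(hr_roles, ad_groups):
    """Return (user, reason) for privileged accounts HR data cannot explain."""
    findings = []
    for user, groups in sorted(ad_groups.items()):
        if not groups & PRIVILEGED_GROUPS:
            continue  # no privileged membership, nothing to check
        role = hr_roles.get(user)
        if role is None:
            findings.append((user, "no HR record"))
        elif role not in ADMIN_ROLES:
            findings.append((user, f"HR role is {role}"))
    return findings


findings = unexpected_admins(hr_roles, ad_groups)
```

The output is only a list of questions to ask, such as why an intern or a generic account holds admin rights; the logs still have to say how those privileges are being used.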
This kind of ‘risk hunting’ can make sure environments are managed to stop windows of exploitability occurring that threats can exploit, even if they only happen for a short time.
Walking down this path is an effective way for security teams to use data to ‘clean up shop’ on the upstream foundations of cyber-hygiene that are cheaper and less resource intensive than the downstream activities of detecting and responding to potential or actual incidents.
A comprehensive, well-maintained inventory can also feed better context into detection processes, helping to reduce the confusion and effort that arise when incidents occur, which is exactly when a trusted inventory is most useful.
It also gives stakeholders a good understanding of how much trust they can have in a data source at a given point in time. That’s critical, because for people to trust insights to support decisions – especially when those metrics are being reported to Executives – they first have to have confidence in the completeness and accuracy of the picture they’re being shown.
Questions like ‘How sure are you that’s everything?’ or challenges like ‘That bit of data is wrong, and I’m now going to question all your analysis…’ are not uncommon hurdles that security teams have to navigate when pushing data up the chain or across to peers in IT. And while it’s not always the case that you need to be 100% confident you have everything, if you look at recent breaches, it’s increasingly essential to have a strong handle on how much you know you know, and how much you know you don’t.