June 14, 2017
A large fraction of my briefings this year have included a focus on data security. This is the first year in the past 35 that that’s been true.* I believe that reasons for this trend include:
- Security is an important aspect of being “enterprise-grade”. Other important checkboxes have been largely filled in. Now it’s security’s turn.
- A major platform shift, namely to the cloud, is underway or at least being planned for. Security is an important thing to think about as that happens.
- The cloud even aside, technology trends have created new ways to lose data, which security technology needs to address.
- Traditionally paranoid industries are still paranoid.
- Other industries are newly (and rightfully) terrified of exposing customer data.
- My clients at Cloudera thought they had a chance to get significant messaging leverage from emphasizing security. So far, it seems that they were correct.
*Not really an exception: I did once make it a project to learn about classic network security, including firewall appliances and so on.
Certain security requirements, desires or features keep coming up. These include (and as in many of my lists, these overlap):
- Easy, comprehensive access control. More on this below.
- Encryption. If other forms of security were perfect, encryption would never be needed. But they’re not.
- Auditing. Ideally, auditing can alert you to trouble before (much) damage is done. If not, then it can at least help you do proactive damage control in the face of breach.
- Whatever regulators mandate.
- Whatever is generally regarded as best practices. Security “best practices” generally keep enterprises out of legal and regulatory trouble, or at least minimize same. They also keep employees out of legal and career trouble, or minimize same. Hopefully, they even keep data safe.
- Whatever the government is known to use. This is a common proxy for “best practices”.
More specific or extreme requirements include:
I don’t know how widely these latter kinds of requirements will spread.
The most confusing part of all this may be access control.
- Security has a concept called AAA, standing for Authentication, Authorization and Accounting/Auditing/Other things that start with”A”. Yes — even the core acronym in this area is ill-defined.
- The new standard for authentication is Kerberos. Or maybe it’s SAML (Security Assertion Markup Language). But SAML is actually an old, now-fragmented standard. But it’s also particularly popular in new, cloud use cases. And Kerberos is actually even older than SAML.
- Suppose we want to deny somebody authorization to access certain raw data, but let them see certain aggregated or derived information. How can we be sure they can’t really see the forbidden underlying data, except through a case-by-case analysis? And if that case-by-case analysis is needed, how can the authorization rules ever be simple?
Further confusing matters, it is an extremely common analytic practice to extract data from somewhere and put it somewhere else to be analyzed. Such extracts are an obvious vector for data breaches, especially when the target system is managed by an individual or IT-weak department. Excel-on-laptops is probably the worst case, but even fat-client BI — both QlikView and Tableau are commonly used with local in-memory data staging — can present substantial security risks. To limit such risks, IT departments are trying to impose new standards and controls on departmental analytics. But IT has been fighting that war for many decades, and it hasn’t won yet.
And that’s all when data is controlled by a single enterprise. Inter-enterprise data sharing confuses things even more. For example, national security breaches in the US tend to come from government contractors more than government employees. (Ed Snowden is the most famous example. Chelsea Manning is the most famous exception.) And as was already acknowledged above, even putting your data under control of a SaaS vendor opens hard-to-plug security holes.
Data security is a real mess.