I have been reading about the General Data Protection Regulation (GDPR) for more than a year now, and I have gone through various articles about GDPR and its consequences. But I have yet to find an article that is relevant to what I do. As a Data Lake Architect, I am still searching for an answer to what I need to do, so I decided to do some research and find out what GDPR is and what it means for my work. So let’s get started.
What is GDPR?
The EU General Data Protection Regulation (GDPR) replaces the Data Protection Directive 95/46/EC and was designed to harmonise data privacy laws across Europe, to protect and empower all EU citizens’ data privacy, and to reshape the way organisations across the region approach data privacy.
So is it only about European Citizens?
Well, yes! If you are a B2C company and there is a chance that you hold data on customers, consumers, or employees from the European Union, then yes, you need to know about GDPR for sure. If you are a B2B company and have employees from Europe, then you also need to know GDPR.
This applies to all companies processing and holding the personal data of data subjects residing in the European Union, regardless of the company’s location.
But I don’t have any employees or customers from Europe?
If this is the case, then you might not need to worry much about this, but sooner or later you should expect similar rules to be adopted in most other countries.
When is GDPR coming into effect?
According to the official website, it will be enforced from 25 May 2018. That is now!
What are the penalties for non-compliance?
Organizations can be fined up to 4% of annual global turnover or €20 million, whichever is greater, for breaching GDPR.
So is it about ALL data?
Not really. It is about the personal data of data subjects.
So what is personal data?
Personal data is any data with which a data subject can be identified. The following are some examples of personal data:
- Name, Email, Photo
- Bank Details, Credit Card numbers
- Medical records, Lab results, Biometric records
- Posts on social media sites
- Computer IP address
- Other information which can directly or indirectly help identify a person.
How to handle this scenario from a B2B company’s point of view?
If you are a B2B company, then your focus should be only on employee data. Identify the HR and non-HR systems which might contain such personal data.
What to do if you have Data Lakes?
If you have Data Lakes then identify if you are pulling data from any such system/s where personal data exists. If the data is not being used for any analytics/BI then it is better to either drop such data or mask/encrypt it.
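To make the drop-or-mask idea a bit more concrete, here is a minimal Python sketch of cleaning a record before it lands in the lake. The column names, the `sanitize_record` helper and the choice of a keyed SHA-256 hash are all my own assumptions for illustration, not something GDPR or any particular tool prescribes. Keep in mind that keyed hashing is pseudonymisation rather than anonymisation, so the output may still count as personal data.

```python
import hashlib
import hmac

# Hypothetical column names; replace them once you have classified your own data.
DROP_COLUMNS = {"credit_card"}      # not needed for analytics, so never ingest it
MASK_COLUMNS = {"email", "name"}    # kept only as stable, non-readable join keys


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest so the raw value never lands in the lake."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sanitize_record(record: dict, secret_key: bytes) -> dict:
    """Drop unused personal fields and mask the ones that are kept."""
    clean = {}
    for column, value in record.items():
        if column in DROP_COLUMNS:
            continue  # drop the field entirely
        if column in MASK_COLUMNS and value is not None:
            clean[column] = pseudonymize(str(value), secret_key)
        else:
            clean[column] = value
    return clean
```

The same input always maps to the same digest for a given key, so masked columns can still be used for joins and counts. The key itself would need to live outside the lake, for example in a secrets manager.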
If you have been using Hadoop-based lakes, then Apache Ranger can be helpful to get such things done. Here is an interesting article about the same.
What if I need to use this data?
If you still need to use the data and you simply cannot drop or encrypt it, then you need to take a detailed approach, as follows:
- Identify data systems in your organization
- Classify what is personal data and what is not
- Centralise access to this data
- Monitor how this data is being used
- Anonymize or encrypt any such data to avoid any breaches.
- Automate data retention and recovery strategies.
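Two of the steps above, classifying personal data and automating retention, can be sketched roughly as follows. This is only a starting point under my own assumptions: the regular expressions are deliberately simple, the labels and function names are hypothetical, and a real classifier would need far more patterns plus manual review.

```python
import re
from datetime import datetime, timedelta, timezone

# Deliberately simple patterns; they are illustrative, not exhaustive validators.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
IPV4_RE = re.compile(r"^(\d{1,3}\.){3}\d{1,3}$")


def classify_column(samples):
    """Label a column from sampled string values (hypothetical labels)."""
    values = [s for s in samples if s]
    if values and all(EMAIL_RE.match(v) for v in values):
        return "personal:email"
    if values and all(IPV4_RE.match(v) for v in values):
        return "personal:ip_address"
    return "unclassified"


def is_expired(ingested_at, retention_days, now=None):
    """Return True when a record is older than its retention period."""
    now = now or datetime.now(timezone.utc)
    return now - ingested_at > timedelta(days=retention_days)
```

A scheduled job could sample each new column with `classify_column` and flag anything labelled `personal:` for masking, while `is_expired` decides which partitions are due for deletion.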
What are the subject rights under GDPR?
Breach Notification
Under the GDPR, breach notification will become mandatory in all member states where a data breach is likely to “result in a risk for the rights and freedoms of individuals”. This must be done within 72 hours of first having become aware of the breach.
Right to Access
Data subjects have the right to obtain from the data controller confirmation as to whether or not personal data concerning them is being processed, where and for what purpose.
Right to be Forgotten
Also known as Data Erasure, the right to be forgotten entitles the data subject to have the data controller erase his/her personal data, cease further dissemination of the data, and potentially have third parties halt processing of the data.
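For a Data Lake, an erasure request roughly means finding every row that belongs to the data subject and removing it. The sketch below works on an in-memory list of records and assumes a hypothetical `subject_id` field; in a real lake the same filter would have to run against every dataset and backup that holds the subject's data.

```python
def erase_subject(records, subject_id, id_field="subject_id"):
    """Return (kept_records, erased_count) with the subject's rows removed."""
    kept = [r for r in records if r.get(id_field) != subject_id]
    return kept, len(records) - len(kept)
```

Returning the erased count makes it easier to keep an audit trail showing that the request was actually carried out.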
Data Portability
GDPR introduces data portability: the right for a data subject to receive the personal data concerning them, which they have previously provided, in a ‘commonly used and machine-readable format’, and the right to transmit that data to another controller.
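As a rough illustration, a portability request could be served by collecting the subject's records and writing them out as JSON, which is one commonly used machine-readable format. The `subject_id` field and the shape of the export are my own assumptions; GDPR does not mandate a specific format.

```python
import json


def export_subject_data(records, subject_id, id_field="subject_id"):
    """Return the subject's records as a JSON document (hypothetical layout)."""
    rows = [r for r in records if r.get(id_field) == subject_id]
    return json.dumps({"subject_id": subject_id, "records": rows},
                      indent=2, sort_keys=True)
```

The same lookup also covers the Right to Access, since it answers which personal data about the subject is being held.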
Privacy by Design
Privacy by design means that security and data protection should be embedded in systems right from the design stage and should not be treated as an additional thing to do.