Can you comment on the impact of privacy regulation on Big Data & Analytics?

By: Bart Baesens, Seppe vanden Broucke

This QA first appeared in Data Science Briefings, the DataMiningApps newsletter, as a “Free Tweet Sized QA”, where we answer a data science or analytics question of 140 characters maximum. Also want to submit your question? Just Tweet us @DataMiningApps. Want to remain anonymous? Then send us a direct message and we’ll keep all your details private. Subscribe now for free if you want to be the first to receive our articles and stay up to date on data science news, or follow us @DataMiningApps.


You asked: Can you comment on the impact of privacy regulation on Big Data & Analytics?

Our answer:

In recent years, regulatory attention to privacy and data protection has increased dramatically, both in the US and the EU. The emergence of big data and analytics has created many new opportunities to understand patterns in customer behavior, but the ever-increasing thirst to capture and store data has also raised new privacy concerns and prompted calls for ethical frameworks for data scientists. The White House, for instance, recently released a report, “Big Data: A Report on Algorithmic Systems, Opportunity, and Civil Rights”, laying out a national perspective on data science ethics. Other authors have also warned about the ways in which big data can harm minorities and the underprivileged. Finally, increased awareness of cyber security has raised concerns about how data is stored and analyzed.

In the EU, these concerns led to Regulation (EU) 2016/679 (the General Data Protection Regulation, or “GDPR”), published in May 2016 with enforcement starting in May 2018. The GDPR represents a significant step in strengthening privacy protection, and lawmakers predict that almost every organization that is based in the EU or does business in the EU will be affected. The GDPR raises the bar for compliance, openness and transparency. Key rights established by the regulation include: the right to be informed about how your personal data will be used, the right to access and rectify your personal data, the right to have your personal data erased (this replaces the stricter regulation on the right to be forgotten in the Directive, i.e., Directive 95/46/EC, which the GDPR supersedes), and the right to human intervention in automated decision models, such as analytical prediction models. The GDPR will impact a huge range of companies and data processors. Education in data protection and privacy laws will hence become a critical success factor in the years to come.

In the US, data privacy is not highly regulated. Personal data contained in credit reports (provided by Experian, Equifax, TransUnion, etc.), for example, may be retrieved by third parties when an individual seeks employment or medical care, or makes purchases on credit terms. There is no all-encompassing law regulating the acquisition, storage, or use of personal data in the US, although partial regulation exists, such as the Privacy Act of 1974, which establishes a code of fair practice to govern the collection of personal data; the Health Insurance Portability and Accountability Act of 1996 (HIPAA), which protects health information privacy rights; and the Electronic Communications Privacy Act (ECPA) of 1986, which establishes sanctions for the interception of electronic communication.

Unlike the US approach to privacy protection, which relies on industry-specific legislation and self-regulation, the EU relies on comprehensive privacy legislation (see the Directive and the GDPR above). To bridge these different privacy approaches, the US Department of Commerce, in consultation with the European Commission, developed the EU-US Privacy Shield, a framework for transatlantic exchanges of personal data between the EU and the US. With the GDPR now upcoming, however, lawmakers have raised the concern that the newer regulation is incompatible with the EU-US Privacy Shield, as it would no longer permit the processing of EU personal data by US companies. It remains to be seen how the two sets of legal provisions can be harmonized. For example, in the US, the right to erasure is more limited and exists only in case law (i.e., the law as established by the outcome of former cases, also called precedents), whereas the GDPR guarantees that right to any EU data subject. As a result, coming to an agreement will be difficult. The current plan is for EU and US authorities to perform a joint annual review of the Privacy Shield, so changes will likely be made. Although broad EU rules try to unify privacy regulation within the European Union, we conclude that a clear international agreement on privacy is still lacking. There is a strong need for a unified body that regulates cross-border privacy and data protection with a focus on integration and transparency.