Wednesday November 18, 2020
Definitions are important.
How we define words determines how we regulate them. In the United States, the legal definitions of protected private information are changing, and with them the entire scope of information protection. The shift in definitions reflects a desire to protect more of the data that describes how we live.
Data protection in the early digital age in the US relied on very specific definitions. The government first protected the particular types of data that most concerned lawmakers, regulators, and the general public – financial and banking information, health care records, and information related to children. This was the data people believed was the most private and most likely to be misused – the data many people would have been worried about sharing with strangers.
The definitions in these laws reflected the specificity of their intent.
Then came official recognition of identity theft as a growing social problem. Once information was digitized, connected to the Internet, and accessed remotely, Americans saw how that data could be used to impersonate the very people it was supposed to describe and identify. State laws soon followed – eventually spanning all 50 states – requiring that data subjects be notified if their data was exposed to unauthorized individuals.
The terms defined in this first wave of data breach notification laws were based on lists. Each law enumerated categories of information that could facilitate the theft of a citizen's identity. The definitions in these breach notification laws generally paired identifying information – name or address – with data that would allow a criminal to access accounts: bank account numbers, credit card numbers, Social Security numbers, driver's license numbers, and even dates of birth and mothers' maiden names. If it wasn't on the list, it didn't trigger the statute. Different states added or subtracted items from the standard list, but the concept of enumerated trigger data remained the same.
The CCPA destroyed this concept. As the first omnibus data protection act in the US, the California Consumer Privacy Act introduced European thinking into American data protection law. Instead of addressing a limited vertical market like finance or health care, or a narrow legal goal like ending identity theft, the CCPA created new rights for individuals in the data collected about them, and it enforced those rights against companies that had previously acted as if they owned that data. The CCPA never defined anything as basic, or as nebulous, as "ownership" of data, but it provided a new, breathtakingly broad definition of the personal information at the heart of the law.
The CCPA definition was not a list. Demographics experts have known for years that 85% of the US population can be identified by name from only three pieces of information: gender, zip code, and date of birth. The more information about a person in your file, the easier it is to identify them and to learn much more about them. Data protection experts have therefore long understood that relevant personal data is not a list of names and addresses but a matter of mathematics. If your company held seven, eight, or nine facts about a person – even seemingly disparate facts, such as where they were at a given time and what they bought – your company could probably identify that person with the right math. This accretive, mathematical concept yields a definition of personally identifiable information better suited to enforcing a broad range of rights than the standard lists could.
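The linkage logic described above can be sketched in a few lines of Python. This is a purely hypothetical illustration – the records, names, and field layout are invented for the example and do not come from any real dataset or study cited here. It shows how a "de-identified" dataset can be re-identified by joining it against a public dataset on shared quasi-identifiers (gender, zip code, date of birth):

```python
# Hypothetical illustration of a linkage attack using quasi-identifiers.
# All records and names below are invented for this example.

# A "de-identified" dataset: no names, but quasi-identifiers remain.
deidentified = [
    {"gender": "F", "zip": "02138", "dob": "1945-07-29", "purchase": "blood pressure monitor"},
    {"gender": "M", "zip": "94105", "dob": "1982-03-14", "purchase": "inhaler"},
]

# A public dataset (e.g., a voter roll) that includes names
# alongside the same quasi-identifiers.
public_records = [
    {"name": "Jane Roe", "gender": "F", "zip": "02138", "dob": "1945-07-29"},
    {"name": "John Doe", "gender": "M", "zip": "94105", "dob": "1982-03-14"},
    {"name": "Ann Poe", "gender": "F", "zip": "02138", "dob": "1961-11-02"},
]

QUASI_IDENTIFIERS = ("gender", "zip", "dob")

def key(record):
    """Extract the quasi-identifier tuple used for linkage."""
    return tuple(record[k] for k in QUASI_IDENTIFIERS)

# Index the public records by their quasi-identifier tuple.
index = {key(r): r["name"] for r in public_records}

# Link: each "anonymous" record whose quasi-identifiers match a
# public record is re-identified by name.
reidentified = [
    {"name": index[key(r)], "purchase": r["purchase"]}
    for r in deidentified
    if key(r) in index
]
```

In this toy example both "anonymous" records are re-identified, because each quasi-identifier combination is unique in the public dataset – which is exactly the property the three-attribute statistic describes for most of the US population.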
The European Union had already written this concept into law when the GDPR was passed. The GDPR includes safeguards for broadly defined personal data and a stricter regime for sensitive data, which is defined by category. (I expect to discuss the definition and protection of sensitive data in this space next week.) The GDPR defines personal data as "any information relating to an identified or identifiable natural person ('data subject'); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person."
While "information relating to an identifiable person" is broad, the California definition is both broad and vague. The CCPA defines personal information as information that "identifies, relates to, describes, is reasonably capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household." It may take years for courts to review and clarify this definition. Until then, we must operate on the assumption that any information that can reasonably be associated with an individual is regulated data. What about a piece of data that cannot by itself be tied to a person, but that may help describe someone once linked with other data? That seems to fall within the definition. What falls outside it? Given today's machine learning and analytics, almost nothing.
If California interprets and enforces this definition broadly, hardly any behavioral measure or descriptive fact about any person will escape its jurisdiction. Companies that market to consumers are not prepared to meet this standard for preserving, protecting, and limiting the use of data. We have jumped from one extreme to the other in defining personal information.
Copyright © 2020 Womble Bond Dickinson (USA) LLP. All rights reserved. National Law Review, Volume X, Number 323