Data collection for business purposes is at an all-time high, with organizations managing 10 times more data on average than they did five years ago. Enterprises often use data analytics to uncover meaningful insights in these data reservoirs, which supports data-driven decisions that can improve business outcomes.
While data collection is undeniably useful for businesses seeking a competitive advantage, it is not without security risks. Collecting data can open a company up to threats like ransomware, malware, hacking, and data breaches or leakage.
The more data a business collects, the larger its attack surface becomes, because every additional system that stores or moves data is another potentially vulnerable point. For example, in July 2021, attackers breached T-Mobile servers and databases that contained personally identifiable information of millions of current, former, and prospective customers. According to security experts, this was the result of malicious actors exploiting security vulnerabilities in T-Mobile’s expansive digital landscape. Once the attackers had this backdoor access, they were able to locate valuable data and exfiltrate it.
To mitigate these liabilities, companies are employing data minimization principles. These principles limit the scope of personal data collection and retention to only what is necessary for fulfilling a specific purpose.
In this article, we will delve into what data minimization is, its benefits, and how to apply data minimization principles in your organization. We will also explain how solutions such as WinZip Enterprise® enhance data protection and help you satisfy current data minimization standards.
What is data minimization?
Data minimization is one of the essential data protection principles. Instead of collecting and saving every piece of personal data that crosses your company’s system, the data minimization principle requires you to collect and retain only the minimum amount of data needed to provide a product or service.
As set out in the EU General Data Protection Regulation (GDPR), the data minimization principle requires that the personal data a company collects and processes be:
- Adequate to satisfy the stated purpose of data collection.
- Relevant, with a rational link to that purpose.
- Limited to what is necessary for that purpose.
This means that any data collected is to be used for an immediate and necessary purpose. Data cannot be stored on servers or in the cloud on the off chance of future use. As such, organizations need to collect as little data as possible, limit access to the data, and retain the data for only as long as it is needed.
How to apply data minimization principles
Data minimization consists of two primary best practices:
- Collect only data that is relevant to the provision of your goods and services.
- Do not keep the data for longer than is reasonably necessary.
A successful data minimization strategy starts by narrowing the scope of your data collection activities. If a piece of personal data does not directly help you conduct business, it should not be collected.
For example, if your website has a form where visitors can sign up for your mailing list, asking for their date of birth will result in the processing of irrelevant data. However, it would be appropriate to collect personal data such as names and email addresses.
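The mailing-list example can be sketched in code as an allowlist applied before anything is stored. This is a minimal illustration, not a prescribed implementation: the field names, the `ALLOWED_SIGNUP_FIELDS` set, and the `minimize_signup` function are all hypothetical.

```python
# Hypothetical allowlist: the only fields a mailing-list sign-up needs.
ALLOWED_SIGNUP_FIELDS = {"name", "email"}


def minimize_signup(form_data: dict) -> dict:
    """Return a copy of the submission containing only allowlisted fields."""
    return {
        field: value
        for field, value in form_data.items()
        if field in ALLOWED_SIGNUP_FIELDS
    }


submission = {
    "name": "Ada Example",
    "email": "ada@example.com",
    "date_of_birth": "1990-01-01",  # irrelevant to a mailing list, so it is dropped
}
stored = minimize_signup(submission)
print(stored)  # {'name': 'Ada Example', 'email': 'ada@example.com'}
```

Using an allowlist rather than a blocklist means any new field added to the form is discarded by default until someone decides it is needed.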
In addition to refining collection processes, data minimization also requires that organizations reduce the volume of data already in their possession. Start by taking a comprehensive inventory of your existing data. This covers not only the overall volume of data the company holds, but also where it is located, how long it has been stored, and who can access it.
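As a rough sketch, an inventory like this can be modeled as one record per dataset that captures volume, location, age, and access. The `InventoryEntry` class, its field names, and the sample datasets below are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class InventoryEntry:
    """One dataset in a hypothetical data inventory."""
    name: str
    location: str        # where the data is stored
    stored_since: date   # when the data was first stored
    accessible_by: list[str] = field(default_factory=list)  # who can access it
    size_gb: float = 0.0  # contribution to the overall volume


def summarize(entries: list[InventoryEntry], today: date) -> list[dict]:
    """Return one summary row per dataset, oldest first."""
    rows = [
        {
            "name": e.name,
            "location": e.location,
            "age_days": (today - e.stored_since).days,
            "access_count": len(e.accessible_by),
            "size_gb": e.size_gb,
        }
        for e in entries
    ]
    return sorted(rows, key=lambda row: row["age_days"], reverse=True)


inventory = [
    InventoryEntry("newsletter_signups", "crm-db", date(2023, 1, 15), ["marketing"], 0.5),
    InventoryEntry("old_support_tickets", "file-share", date(2018, 6, 1),
                   ["support", "sales", "it"], 120.0),
]
rows = summarize(inventory, today=date(2024, 1, 15))
total_gb = sum(row["size_gb"] for row in rows)
```

Sorting oldest first puts the datasets most likely to have outlived their purpose at the top of the review list.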
Once you’ve assessed your current data inventory, the next step is to identify the purpose for its collection, such as the delivery of goods and services, advertising, refining marketing strategies, or other business functions.
Be specific in defining the purpose of the data, and ensure that business stakeholders and data subjects both understand how and why it is collected, retained, and used.
Data minimization and regulatory compliance
Numerous privacy regulations highlight the importance of data minimization.
For example, data minimization is addressed in Article 5 and Article 25 of the General Data Protection Regulation (GDPR):
- Article 5 describes the principles that govern how personal data is processed.
- Article 25 sets forth requirements for technical and organizational measures to implement data protection, including data minimization.
Since the GDPR took effect in 2018, there have been over 900 fines issued for violating its principles. In October 2020, for example, clothing retailer H&M was fined 35.3 million euros for violating data minimization principles. The company collected and stored sensitive personal data about its employees, and a lack of access controls led to a company-wide exposure of this protected data following a configuration error.
At the federal level in the United States, data minimization principles are seen in the Health Insurance Portability and Accountability Act (HIPAA) and the Gramm-Leach-Bliley Act (GLBA).
The HIPAA Minimum Necessary Standard requires covered entities to make a reasonable effort to limit access to protected health information (PHI) to the minimum needed to accomplish a specific purpose.
Under the GLBA Safeguards Rule, financial institutions must develop, apply, and maintain processes to securely dispose of customer data within two years after the date of the information’s last usage.
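The two-year disposal window lends itself to a simple date check. The sketch below is a simplification that ignores any exceptions the rule may allow, and the function names and `RETENTION_YEARS` constant are hypothetical.

```python
from datetime import date

RETENTION_YEARS = 2  # the two-year window described above


def disposal_deadline(last_used: date) -> date:
    """Date by which a record should be disposed of: two years after last use."""
    target_year = last_used.year + RETENTION_YEARS
    try:
        return last_used.replace(year=target_year)
    except ValueError:
        # last_used was February 29 and the target year is not a leap year.
        return last_used.replace(year=target_year, day=28)


def is_due_for_disposal(last_used: date, today: date) -> bool:
    """True once the disposal deadline has been reached."""
    return today >= disposal_deadline(last_used)


print(disposal_deadline(date(2021, 3, 10)))  # 2023-03-10
```

Running a check like this on a schedule, keyed on each record's last-used date, turns a retention policy into something that can be enforced rather than just documented.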
Another privacy standard that deals with data minimization is the Payment Card Industry Data Security Standard (PCI DSS). PCI DSS is concerned with securing the confidentiality and privacy of cardholder data. The use of data minimization principles can help organizations satisfy PCI DSS Requirement 3 and Requirement 7:
- Under Requirement 3, unless absolutely necessary for business functions, cardholder data should not be stored at all. For cardholder data that must be stored, it is the organization’s responsibility to limit the storage time and purge data that has reached a specified retention period.
- Requirement 7 restricts access to cardholder data to only those who need it for specific business responsibilities.
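Need-to-know access of the kind Requirement 7 describes is often expressed as a deny-by-default lookup. The following is only an illustrative sketch. The role names, data categories, and `can_access` function are hypothetical and are not taken from the PCI DSS text.

```python
# Hypothetical mapping of job roles to the data categories they need.
ROLE_PERMISSIONS = {
    "billing_agent": {"cardholder_data"},
    "support_agent": {"order_history"},
    "marketing_analyst": set(),
}


def can_access(role: str, data_category: str) -> bool:
    """Deny by default: unknown roles and unlisted categories get no access."""
    return data_category in ROLE_PERMISSIONS.get(role, set())


print(can_access("billing_agent", "cardholder_data"))  # True
print(can_access("support_agent", "cardholder_data"))  # False
```

Because access is granted only through an explicit entry, a new role starts with no access to cardholder data until a business need is recorded for it.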
There are also state-level laws that include data minimization principles.
For example, the California Privacy Rights Act (CPRA), approved by voters in 2020, became the first comprehensive US state privacy law to specifically require data minimization. The CPRA requires that data collection be limited to only what is necessary for an explicit purpose and that the data be retained for no longer than absolutely necessary.
Virginia also has a comprehensive privacy law, the Virginia Consumer Data Protection Act (CDPA). Companies subject to the CDPA must limit the collection of personal data to what is necessary for current business purposes, and they cannot use data without prior disclosure to affected individuals.
Like California and Virginia, Colorado has comprehensive consumer privacy legislation. The Colorado Privacy Act (CPA) limits the collection of personal data to what is necessary in relation to its specified purpose. Collected data cannot be used for secondary purposes unless the individual’s consent is obtained first.
Stockpiling data is a business risk
Even if your enterprise is not subject to regulatory provisions that mandate data minimization, the practice of reducing data storage waste is still beneficial. In the age of big data, there is a tendency for companies to collect and store every piece of data they can for potential future use.
Maintaining large stockpiles of unneeded data not only runs afoul of GDPR and other privacy rules but also increases privacy risks and operational costs. Most companies analyze only 12% of the data they have, meaning that the remaining 88% takes up storage space without providing any meaningful value.
About half of organizations today rely on cloud data storage, and the costs can be substantial. For example, storing a single terabyte (TB) of data costs an average of $3,351 per year, and cloud storage spending accounts for 30% of a company’s IT budget. Accordingly, collecting only the data you need reduces the costs associated with data retention and storage.
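To put these figures in perspective, the per-terabyte cost can be combined with the 12% analysis rate cited above. This is a back-of-the-envelope sketch: the 100 TB volume is a hypothetical example, and it assumes the two averages can reasonably be applied to the same organization.

```python
COST_PER_TB_PER_YEAR = 3351  # USD, average figure cited above
ANALYZED_SHARE = 0.12        # share of data that companies analyze, cited above


def annual_storage_cost(terabytes: float) -> float:
    """Yearly cost of storing the given volume at the average rate."""
    return terabytes * COST_PER_TB_PER_YEAR


def unanalyzed_cost(terabytes: float) -> float:
    """Yearly cost attributable to data that is never analyzed."""
    return annual_storage_cost(terabytes) * (1 - ANALYZED_SHARE)


total_tb = 100  # hypothetical data volume
print(f"${annual_storage_cost(total_tb):,.0f}")  # $335,100
print(f"${unanalyzed_cost(total_tb):,.0f}")      # $294,888
```

Under these assumptions, roughly $295,000 of a $335,100 annual storage bill would go toward data that is never analyzed.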
Data minimization also creates a smaller digital landscape that needs to be secured against cyber-crime, theft, and loss. The average data breach involves more than 25,000 records and costs the affected organization between $3.86 million and $3.92 million. In the event of a data breach, data minimization practices limit the number of records that could be affected by the incident.
By protecting sensitive data, companies not only avoid potential penalties, but they can also enhance their reputation and build customer loyalty. In fact, 84% of consumers will refuse to engage with a brand that demands too much of their personal information. Customers are more trusting of companies that take data privacy seriously.
WinZip Enterprise enables comprehensive data protection
Minimizing your company’s data inventory makes it easier to achieve and maintain high levels of information security, and it starts with having the appropriate solutions in place.
WinZip Enterprise is a comprehensive, streamlined solution that protects your organizational data. Thanks to customized access controls, your IT teams can restrict data access based on specific job roles and functions. Your files are kept safe with bank- and military-grade encryption, further reducing the risk of data theft or loss.
To assist in evaluating, managing, and ultimately minimizing your data inventory, WinZip Enterprise finds and flags duplicate files. In addition to reducing the burden on data storage, this process also helps identify and mitigate redundant, obsolete, and trivial (ROT) data.
Redundant data exists in multiple places, whether within a single system or across multiple platforms. On average, around 30% of your storage infrastructure might contain duplicate data. WinZip Enterprise can help companies like yours save thousands of dollars in storage and management fees by eliminating data redundancies.
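For readers curious how duplicate detection can work in general, a common technique is to group files by a hash of their contents. The sketch below illustrates that generic approach with Python's standard library. It is not a description of how WinZip Enterprise finds duplicates, and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file's contents, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group files under root whose contents share the same SHA-256 digest."""
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest[file_digest(path)].append(path)
    # Keep only digests seen more than once: these are the duplicate groups.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates(Path("some/shared/folder"))` returns one list per set of identical files, which can then be reviewed before anything is deleted.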