Insurance – ESN Data (OpenClassrooms)

GDPR‑Compliant Data Collection

Fictional | 31/12/2024 to 16/01/2025


Context

I was hired by Dev’Immediat (fictional), a car insurance broker sanctioned by the CNIL for GDPR non-compliance. Clara Daucour (CEO) and Jean-Luc asked me to:

  • Formulate 5 data management recommendations for the CRM to lift the sanction,
  • Extract and fully anonymize the CRM data using SQL and Power Query,
  • Document each step of the anonymization process in a 10-page report.

Datasets

  • Initial customer database in SQLite format: one table named CRM_2022_complet with 28 columns and 1,157 customer records.
  • Data dictionary: a file describing for each column its name, data type (VARCHAR, INT, DATE, etc.), and business meaning.
  • Raw CSV export: the SQLite table exported to a CSV file with 1,157 rows, used as the working dataset in Power Query.
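The export from SQLite to CSV can be sketched as follows. This is a minimal illustration in Python rather than the tooling actually used in the project; the three columns (id, nom, revenu) and the two sample rows are hypothetical stand-ins for the real 28-column table, and export_table_to_csv is a helper name introduced here.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write every row of `table` to `out` as CSV, header first; return the row count."""
    # Table names cannot be bound as SQL parameters, so quote the identifier by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    cur = conn.execute(f"SELECT * FROM {quoted}")
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])  # column names as header
    count = 0
    for row in cur:
        writer.writerow(row)
        count += 1
    return count


# Tiny in-memory stand-in for the real database; columns and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CRM_2022_complet (id INTEGER, nom TEXT, revenu INTEGER)")
conn.executemany(
    "INSERT INTO CRM_2022_complet VALUES (?, ?, ?)",
    [(1, "Durand", 32000), (2, "Martin", 54500)],
)

buffer = io.StringIO()
exported = export_table_to_csv(conn, "CRM_2022_complet", buffer)
```

Comparing the returned row count with the expected number of customer records (1,157 in the real dataset) is a cheap way to confirm that nothing was lost during the export.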

Workflow

  • Wrote the GDPR recommendations (2-page PDF), following the principles of purpose limitation, data minimization, transparency, consent, and data retention.
  • Extracted the data from SQLite with SQL (SELECT * FROM CRM_2022_complet), then exported it to CSV.
  • Anonymized and preprocessed the data in Microsoft Power Query using M code:
    • Removal of sensitive columns
    • UUID generation
    • Binning of values (income, quotes, property value)
    • Address and date transformations
    • Conversion of attributes into non-identifiable categories
  • Ran post-anonymization checks with SQL queries to confirm UUID uniqueness, the absence of identifying values (incomes rounded, dates reduced to month/year, addresses reduced to regional codes), and the consistency of binned fields.
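The anonymization steps listed above were carried out in Power Query (M). The sketch below restates the same ideas in Python for illustration only; it is not the project's M code. The column names (nom, email, telephone, revenu, date_naissance, code_postal), the 10,000 bin width, and the two-character area prefix are all assumptions chosen for the example.

```python
import uuid
from datetime import date

# Hypothetical names of directly identifying columns to drop.
SENSITIVE = {"nom", "email", "telephone"}


def bin_value(value, width):
    """Replace an exact amount with a range label such as '30000-39999'."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"


def anonymize(record):
    """Return a new dict with identifiers removed and the remaining fields generalized."""
    out = {k: v for k, v in record.items() if k not in SENSITIVE}
    out["uuid"] = str(uuid.uuid4())  # random identifier, not derived from the customer
    out["revenu"] = bin_value(out["revenu"], 10_000)  # assumed bin width
    out["date_naissance"] = out["date_naissance"].strftime("%Y-%m")  # month/year only
    out["zone"] = out.pop("code_postal")[:2]  # coarse area prefix (assumption)
    return out


sample = {
    "nom": "Durand",
    "email": "durand@example.com",
    "telephone": "0600000000",
    "revenu": 32500,
    "date_naissance": date(1985, 7, 14),
    "code_postal": "75011",
}
result = anonymize(sample)
```

A random version-4 UUID is used here so that the new key cannot be traced back to any original attribute; a hash of the name or email would not give that guarantee.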

Insights

  • Five operational recommendations validated and ready for implementation (documented purposes, minimized collection, tracked consent, transparent disclosure, defined retention periods).
  • Anonymized CSV ready for use by the sales performance team (unique UUIDs, 13 non-sensitive columns, standardized bins).
  • Detailed 10-page report explaining the 20 Power Query and SQL steps, to ensure traceability and compliance.
  • SQL tests confirmed that no customer can be re-identified from the exported data, meeting CNIL compliance criteria.
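The kind of SQL test mentioned above can be sketched as follows, again in Python with an in-memory SQLite table. The table name crm_anonymise, its columns, and the run_checks helper are hypothetical; the actual queries used in the project are not reproduced here.

```python
import sqlite3


def run_checks(conn, table, forbidden_columns):
    """Return a dict mapping each check name to True (passed) or False (failed)."""
    quoted = '"' + table.replace('"', '""') + '"'
    # Every row must carry a distinct, non-NULL uuid.
    total, distinct = conn.execute(
        f"SELECT COUNT(*), COUNT(DISTINCT uuid) FROM {quoted}"
    ).fetchone()
    # Column names are the second field of each PRAGMA table_info row.
    columns = {row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")}
    # Dates must be stored as YYYY-MM, with no day component left.
    bad_dates = conn.execute(
        f"SELECT COUNT(*) FROM {quoted} "
        "WHERE date_naissance NOT GLOB '[0-9][0-9][0-9][0-9]-[0-9][0-9]'"
    ).fetchone()[0]
    return {
        "uuid_unique": total == distinct,
        "no_sensitive_columns": columns.isdisjoint(forbidden_columns),
        "dates_month_year_only": bad_dates == 0,
    }


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crm_anonymise (uuid TEXT, revenu TEXT, date_naissance TEXT)")
conn.executemany(
    "INSERT INTO crm_anonymise VALUES (?, ?, ?)",
    [("a1", "30000-39999", "1985-07"), ("b2", "50000-59999", "1990-01")],
)
results = run_checks(conn, "crm_anonymise", {"nom", "email", "telephone"})
```

Checks like these only show that specific identifiers are gone. Whether combinations of the remaining columns could still single out a customer is a separate question that depends on how coarse the bins are.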

Business Impact

Thanks to this project, Dev’Immediat was able to:

  • Have the CNIL sanction lifted by demonstrating GDPR-compliant processes,
  • Continue its business operations using anonymized data,
  • Reuse the protocol for future GDPR-compliant data extraction and collection.

Links

Recommendation

Data

Report

Presentation