Create an anonymous UUID, store interactions against it in a separate table, and ensure PII is removed prior to storing. So instead of “Max Reboo has purchased a subscription to jugs and hooters”, it’s “user 12345678901234576 has purchased jugs and hooters”. How can a future treadmill de-anonymise this? For sure, if the storage is done badly then you can track back to a particular user.
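To make that concrete, here is a minimal sketch of the scheme I mean, using Python's standard `sqlite3` and `uuid` modules. The table names, column names, and helper functions are made up for illustration; it only shows the shape of the idea, not a hardened design.

```python
import sqlite3
import uuid


def make_store():
    """Create an in-memory database with identities and interactions kept apart."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical schema: the only link between the tables is anon_id.
        CREATE TABLE users (name TEXT, anon_id TEXT UNIQUE);
        CREATE TABLE interactions (anon_id TEXT, item TEXT);
        """
    )
    return conn


def add_user(conn, name):
    """Give the user a random UUID that is not derived from any PII."""
    anon_id = str(uuid.uuid4())
    conn.execute("INSERT INTO users (name, anon_id) VALUES (?, ?)", (name, anon_id))
    return anon_id


def record_purchase(conn, anon_id, item):
    """Store the interaction against the anonymous ID only, never the name."""
    conn.execute(
        "INSERT INTO interactions (anon_id, item) VALUES (?, ?)", (anon_id, item)
    )


conn = make_store()
anon_id = add_user(conn, "Max Reboo")
record_purchase(conn, anon_id, "jugs and hooters")
rows = conn.execute("SELECT anon_id, item FROM interactions").fetchall()
```

Anyone holding only the `interactions` table sees an opaque ID and an item. The `users` table is the part that has to be stored carefully, since it is the direct route back to a person.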
Also, once again, can you link to the Netflix issue you quoted above, please? Thanks.
which is more or less exactly what Netflix did -> the whole thing’s not that hard to find on Google
but you need something to distinguish users at least a bit, or the data’s equivalent to sales figures
you combine that “not-quite-PII” with other independent data sources that have similar “not-quite-PII” and build a complete picture
the treadmill effect comes from active research in this exact area: people trying to de-anonymise data sets keep finding new techniques to get around the old protections
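a rough sketch of the combining step, in Python. All the data, the function name `link` and the `min_overlap` threshold are made up for illustration; real attacks are fuzzier and more statistical than this exact-match toy, but the principle is the same: match on the bits that aren't PII by themselves.

```python
from collections import defaultdict


def link(anon_rows, public_rows, min_overlap=2):
    """Map anonymous IDs to names by matching shared (item, day) events.

    anon_rows:   (anon_id, item, day) tuples from the "anonymised" data set
    public_rows: (name, item, day) tuples from an independent, named source
    """
    anon = defaultdict(set)
    for anon_id, item, day in anon_rows:
        anon[anon_id].add((item, day))

    public = defaultdict(set)
    for name, item, day in public_rows:
        public[name].add((item, day))

    matches = {}
    for name, known in public.items():
        # Every anonymous ID that shares enough events with this named person.
        candidates = [
            a for a, events in anon.items() if len(known & events) >= min_overlap
        ]
        # Only claim a re-identification when exactly one candidate is left.
        if len(candidates) == 1:
            matches[candidates[0]] = name
    return matches


# Hypothetical data: no names anywhere in the anonymised set.
anon_rows = [
    ("u1", "film A", "2006-01-03"),
    ("u1", "film B", "2006-01-09"),
    ("u1", "film C", "2006-02-01"),
    ("u2", "film A", "2006-01-03"),
    ("u2", "film D", "2006-01-20"),
]
# Hypothetical public source, e.g. reviews posted under a real name.
public_rows = [
    ("Max Reboo", "film A", "2006-01-03"),
    ("Max Reboo", "film B", "2006-01-09"),
]
result = link(anon_rows, public_rows)
```

here two publicly known events are enough to single out `u1`, and once that link is made everything else stored against `u1` (film C in this toy data) is attached to the name too.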