Taking Personal Data Out Of Social Networks?
As we’ve seen recently, screen scraping Facebook pages violates the Facebook Terms Of Use. Dare suggests that the Facebook Platform APIs can be used to get some (but not all) of a user’s data, but I think he’s forgetting about the conditions governing Storable Information, which do not permit storing friends’ IDs (amongst other things). Also, in order to use the Facebook Platform in the first place, developers have to agree to the Developer Terms Of Service, which clearly indicate that any data gathered while using the Facebook APIs (above and beyond that defined as Storable Information) can only be stored for 24 hours (section 2.A.4). The TOS definition of a data repository is fairly all-encompassing:
any spreadsheet, database, physical document, server, network, or other repository of information, whether centralized or distributed.
Pretty all-encompassing, eh! I’ll bet there are more than a few Facebook applications that are actively breaking this term of service. (Aside: The 24 hour restriction can be avoided if, and only if, the application explicitly asks the user to opt in - see section 2.A.6. I wonder whether the OutSync tool that Dare uses does that?)
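To make the 24 hour rule concrete, here is a minimal sketch of how an application might enforce it on its own storage. This is my reading of the terms quoted above, not anything Facebook publishes: the `ExpiringStore` class, its method names and the `opted_in` flag are all hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the 24 hour limit from section 2.A.4, as described above.
MAX_AGE = timedelta(hours=24)


class ExpiringStore:
    """Hypothetical store that drops platform data older than 24 hours
    unless the user explicitly opted in to longer storage."""

    def __init__(self):
        # key -> (value, time the value was fetched, user opted in?)
        self._items = {}

    def put(self, key, value, fetched_at, opted_in=False):
        self._items[key] = (value, fetched_at, opted_in)

    def get(self, key, now):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, fetched_at, opted_in = entry
        if not opted_in and now - fetched_at > MAX_AGE:
            # Too old and no opt-in: delete it rather than serve it.
            del self._items[key]
            return None
        return value

    def purge(self, now):
        """Delete every expired, non-opted-in entry; return how many went."""
        expired = [
            key
            for key, (_, fetched_at, opted_in) in self._items.items()
            if not opted_in and now - fetched_at > MAX_AGE
        ]
        for key in expired:
            del self._items[key]
        return len(expired)
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the class, just makes the expiry behaviour easy to check. A real application would also need to treat Storable Information separately, which this sketch ignores.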
I am of course picking bones here - I could go on, but enough said about the minutiae of Facebook legal mumbo-jumbo. A much bigger and much more important question is: how did we end up in a situation where we need to take personal data out of social networks? The answer, of course, is that we allow multiple web services and social networks to indefinitely store overlapping subsets of our personal data as they see fit.
Let me put it another way - what would happen if we inverted the location of your personal data? What if social networks had to (periodically) contact your identity provider to get your personal information and social graph? Then this type of problem would not exist and everyone would have far greater data and service portability.
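As a rough illustration of that inversion, here is a toy sketch in which the social network keeps only a refreshable copy of a profile and the identity provider stays the source of truth. Everything in it is made up for the example: `IdentityProvider`, `SocialNetwork`, `grant`, `revoke` and `refresh` are hypothetical names, not part of OpenID, OAuth or any real service.

```python
from dataclasses import dataclass, field


@dataclass
class IdentityProvider:
    """Hypothetical service that owns the user's profile and social graph."""
    profiles: dict = field(default_factory=dict)  # user_id -> profile dict
    grants: dict = field(default_factory=dict)    # user_id -> set of network names

    def grant(self, user_id, network):
        self.grants.setdefault(user_id, set()).add(network)

    def revoke(self, user_id, network):
        self.grants.get(user_id, set()).discard(network)

    def fetch_profile(self, user_id, network):
        if network not in self.grants.get(user_id, set()):
            raise PermissionError(f"{network} may not read {user_id}")
        # Hand out a copy so the network cannot edit the master record.
        return dict(self.profiles[user_id])


@dataclass
class SocialNetwork:
    """Hypothetical network that holds only a refreshable copy of the data."""
    name: str
    provider: IdentityProvider
    cache: dict = field(default_factory=dict)

    def refresh(self, user_id):
        """Periodic call: re-fetch the profile, or drop it if access is gone."""
        try:
            self.cache[user_id] = self.provider.fetch_profile(user_id, self.name)
        except PermissionError:
            self.cache.pop(user_id, None)
```

Under these assumptions, moving to another network is just a matter of granting it access and revoking the old one: the old network's copy disappears on its next refresh, and nothing has to be scraped or exported.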
However, there are several large barriers to this happening:
- We don’t yet have an established global identity scheme for storing the critical personal and social graph information that social network websites need to operate. OpenID and OAuth provide the low-level plumbing for such a scheme, but a higher-level, standardized, portable personal information protocol is required to allow third parties to find out more about a user with an OpenID.
- Assuming the above existed, it would be impossibly difficult for 99% of internet users to manage, use or understand unless it (their identity service) was managed on their behalf by the organization they work for or their broadband provider. I was initially going to say ‘was built into their OS’, but nowadays people use multiple computers that have no fixed public internet address, so that’s not even close to an option.
- No large social network will ever willingly volunteer to support this. Legislation or regulation will be required to force the existing social networks to evolve onto this identity model.
The last point is probably the biggest barrier and is likely the reason why no big player is expending significant effort on developing standards for user-owned identity profiles. Given the relative lack of voice that average internet users, or even groups of users, now have (Scoble aside), legislation and/or regulation is IMHO the only way to compel the incumbents to change how social networking as a whole operates.


