Today is International Data Privacy Day 2010, recognizing the signing of the first international convention on privacy. Many groups around the world are celebrating this day, and it is being recognized by some of the largest companies. Google has highlighted the ways it uses some of the personal data it collects about you to make your life easier, and it has published its five guiding Privacy Principles:
As discussed in my previous post, “Privacy or Positioning?”, and reemphasized here by Google's third principle, there is a wave of activity around making the collection of information transparent and exposing tools that let users control the data collected about them. Most privacy policies assert that the data is owned by the enterprise, as the collector. The reality, of course, is that while the enterprise owns the transactional data it captures, what that data means about a user can only belong to the user.
At Atigeo our technology solutions are designed to:
As someone who builds technologies that derive knowledge from data for a living, I can tell you why Facebook made its decision. Facebook’s revenue depends on its ability to extract and understand broadly not only the connectivity of individuals, but also all of the content they generate. Facebook content provides rich semantic cues that define a person or a particular social group: what they’re interested in, what is strongly in-context for them at the present time, and the recurring themes of their lives. There is unfathomable value derived from syndicated access to this knowledge, for example via Facebook Connect.
Until recently, the technologies acting on these types of information sets focused on simple “attribute match”-style queries against the transactional data. This is why the data itself, rather than the knowledge, is commonly shared. At Atigeo, we have been working on the capability to enable the enterprise to make use of the knowledge a user’s data represents without needing to know the exact details of the transactional data it was derived from. “You don’t need to know the explicit data surrounding X to understand that action Y might be of interest to me right now. You just need the ability to leverage that knowledge to offer me Y.”
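To make the distinction concrete, here is a toy sketch (all names and data invented for illustration; this is not Atigeo's actual system) contrasting an attribute-match query over raw transactions with sharing only the derived knowledge:

```python
# Hypothetical raw transactional data held by the enterprise.
transactions = [
    {"user": "alice", "item": "trail running shoes", "category": "running"},
    {"user": "alice", "item": "energy gels", "category": "running"},
    {"user": "alice", "item": "sci-fi novel", "category": "books"},
]

# Attribute-match style: whoever runs the query sees the raw records.
runners = {t["user"] for t in transactions if t["category"] == "running"}

# Knowledge style: only an aggregate interest signal leaves the system;
# the individual purchase records stay private.
def derive_interests(records):
    counts = {}
    for t in records:
        counts[t["category"]] = counts.get(t["category"], 0) + 1
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

interest_profile = derive_interests(transactions)

# A partner can ask "is running relevant to this user right now?"
# without ever learning what was bought, or when.
is_relevant = interest_profile.get("running", 0.0) > 0.5
```

The first query hands over the data; the second hands over only what the data means, which is the shift described above.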
Here’s hoping for some evolutionary jumps rather than random mutations!
Happy Data Privacy Day.
I’ve been waiting for it, and this week it happened: Facebook pushed me through its new privacy settings wizard. The experience was great; it was easy to use, and it educated me well on the new functionality available to me and how to use it. Just as I was enjoying the experience, it dropped me into the New vs. Old settings view, allowing me to choose between the recommended new settings and my old settings for each category of my Facebook data.
Two things surprised me. First, the display didn’t show you the settings by default; you had to mouse over each one to see them. Second, when I did so, I found that, without exception, all of the recommended settings were more public than my previous ones. Needless to say, I stuck with my old settings and that was that. I expect, however, that many people would click through the whole process blissfully unaware of the significance of the changes being made.
And that, to my mind, is at the heart of the privacy issue: empowering consumers to control their personal data.
The past few days have been full of articles debating Facebook’s changes, but Facebook isn’t alone. There is a big push in the industry toward allowing more user control of personal data, as evidenced by the launch of Google Dashboard at the beginning of November and Yahoo Ad Interest Manager last week, and it is no coincidence at all that the FTC held the first meeting in its roundtable series, “Exploring Privacy”.
There is clearly a significant and positive push in the industry toward putting users in control of the profiles that are aggregated about them and acted upon for advertising purposes. In the case of the big “sign-in” portals, the good news is that opt-ins and opt-outs can persist beyond the life of a single cookie – an issue raised in my last post.
Facebook’s changes this week get at one of the biggest challenges in this debate: is there value in user privacy controls if the default settings don’t implement that control? Google’s Dashboard is another good example: while you can remove items from your search history, you don’t get to see the “insight” Google has derived from that history, which of course is where the true value lies!
Over the past six months, there has been a great deal of activity related to the House subcommittee hearings on targeted advertising. AT&T’s advocacy for “more transparency and consumer control” in its testimony strikes a chord, I think, with today’s marketplace.
I used to work for a small online ad network, and we were more than keen to participate in the privacy best practices of the time: allowing users to “opt out” of cookie-level tracking for ad targeting via a form on the ad network’s website. The problem, of course, is that opting out simply deletes your current tracking cookie from your browser, which doesn’t prevent you from receiving a new cookie almost immediately, the next time you visit a website within the ad network. Another option is a temporary opt-out which, unfortunately, by its very nature interferes with the user’s web experience.
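A toy simulation makes the futility of cookie deletion obvious (this is hypothetical illustration code, not any real ad network's implementation):

```python
import uuid

# The browser's cookie jar for the ad network's domain.
browser_cookies = {}

def serve_ad(cookies):
    """Simulate an ad request: the server sets a tracking cookie
    whenever none is present, then targets against it."""
    if "tracking_id" not in cookies:
        cookies["tracking_id"] = str(uuid.uuid4())
    return cookies["tracking_id"]

# Normal browsing: the user is assigned a tracking identifier.
first_id = serve_ad(browser_cookies)

# "Opting out" by deleting the tracking cookie...
del browser_cookies["tracking_id"]

# ...lasts exactly until the next page that carries an ad from the
# same network, which simply mints a fresh identifier.
second_id = serve_ad(browser_cookies)
```

The user ends up tracked under a new identifier either way; the opt-out only resets the counter.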
Neither option provides the consumer, or the enterprise for that matter, with what they want. At best it provides the consumer with an illusion of control, and maintaining even that comes entirely at the user’s inconvenience.
A new interaction paradigm—one that replaces intrusive web tracking and near-random targeting with personalization and user control—is on the horizon. Companies will either accept this new paradigm or risk being left behind.
The technology we have built at Atigeo has been designed from the ground up on the premise that a consumer’s data should not only be transparent to them, it should also be:
Giving users control of their data is not only an issue for marketers. As it turns out, it’s a necessity for all companies wanting to maintain their customers’ trust and loyalty… and to effectively compete in today’s marketplace.
In yesterday’s MIT Technology Review article, “Computers Can’t Answer Everything”, Damon Horowitz, CTO of Aardvark, states: “the real power of natural language processing can only be unlocked by acknowledging its limitations and filling in the gaps with human intelligence”. Would you agree or disagree?
Aardvark is a startup whose service connects you to people in your social network whom you might deem authoritative and able to answer your question. It’s an interesting service which, assuming significant richness in your profile and the profiles of the users in your network, may work well at getting your questions answered. That assumption of significant richness is the caveat: if you and your friends have shared enough information (Aardvark can leverage your Facebook profile via Facebook Connect), then Aardvark is likely to be able to help you.
I am in the camp that strongly agrees with this position, and in reality it’s the situation that persists across the industry as a whole, although few people describe it as such. The reason “Semantic Web” companies have focused on niche areas, as the article implies, is that typical Semantic Web approaches require the creation of a rich ontology encoding the understanding of a domain, and that understanding must be painstakingly created by hand – the application of the “human intelligence” the article refers to. This makes multi-vertical or general-purpose deployment of these technologies inefficient, because human domain expertise is needed to encode a good understanding of each domain: sports, health, medicine, movies, and so on. Companies like Hunch very explicitly employ “human intelligence” by learning from users’ answers to specific questions, learning both about users as individuals and about the crowd as a whole – supervised learning.
At Atigeo we’ve taken an approach to developing and evolving understanding of a domain that requires no supervised learning: it discovers hierarchy-free semantic ontologies for a domain by reading unstructured content related to it, and it continually updates that understanding as published information about the domain evolves. While completely automated, the technology also learns implicitly in real time as users interact with it, again benefiting from human intelligence.
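The general idea of learning semantic associations from unstructured text, with no hand-built hierarchy, can be sketched in a few lines. (This is a generic co-occurrence illustration with an invented three-sentence corpus, not Atigeo's actual algorithm.)

```python
from collections import Counter
from itertools import combinations

# A tiny unstructured "corpus" about one domain.
corpus = [
    "marathon training requires good running shoes",
    "running shoes cushion long marathon miles",
    "interval training improves marathon pace",
]

# Count how often pairs of terms appear in the same sentence.
cooccur = Counter()
for sentence in corpus:
    words = set(sentence.split())
    for a, b in combinations(sorted(words), 2):
        cooccur[(a, b)] += 1

# Terms that repeatedly co-occur surface as related without any
# hand-encoded ontology: "marathon" and "training" associate
# automatically because they share two of the three sentences.
strength = cooccur[("marathon", "training")]
```

Because the structure is just counts over text, new published content updates the learned associations incrementally, which is the "continually updates as information evolves" property described above.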
I guess my point is that creating genuine artificial intelligence will always require human intelligence, and that there need be no difference between how we as humans learn and the mechanisms through which we enable artificial intelligences to learn: reading and learning from unstructured information, adapting to new information as it is seen, and learning through interaction and feedback. Aardvark is a very human form of this.
I saw this very interesting article by Bruce D’Ambrosio yesterday on the state of recommendation systems, the value of the user’s current context, and the value of their ability to express it to a recommender – a move away from the purely passive approach.
At Atigeo we’ve been working on exciting new technology that enables businesses to go from guessing about consumers’ interests and needs to actually knowing about them. We’ve also figured out how to put users in control of their data to create a win/win situation for businesses and their customers.
Our technology combines explicit and implicit models of preference – stated preferences and consented observation of behavior, respectively – to make recommendations, based on its ability to learn about users’ preferences and their interactions with those recommendations.
In doing so, we’ve developed some very interesting technology around inferring preferences: essentially a new mechanism for cold-start recommendations – the situation where no direct explicit or implicit evidence is available from which to deduce a user’s particular preferences – that works by exploiting associative, dynamically learned relevance and the comprehensive mining and understanding of a domain (e.g. sports, music, movies and entertainment, travel).
This last point helps us address the data sparseness issue Rich MacManus mentions in the article referenced in my last post: how do you determine relevance to a profile with very few attributes? Our technology treats each implicit or explicit piece of data about an individual or an entity as an entry point into a vast interconnected hyperspace of knowledge about a given domain. The connectivity of that space is determined both by a relevance structure learned in an unsupervised fashion from unstructured data, refined through supervised learning via implicit behavioral feedback and interaction, and by evidence generated by explicit preferences, reasoned over using Bayesian statistics.
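To illustrate the flavor of Bayesian reasoning over a sparse profile (all numbers and topic names here are invented, and this is a deliberately minimal sketch, not our production model): a single explicit signal, combined with likelihoods learned from crowd behavior, yields posterior interest in topics the user never mentioned.

```python
# Baseline P(a user likes each topic), estimated from the crowd.
priors = {"trail running": 0.10, "hiking": 0.15, "opera": 0.05}

# Learned likelihoods: P(user likes "trail running" | user likes <topic>).
likelihood_given = {"hiking": 0.40, "opera": 0.06}

def posterior(candidate, evidence="trail running"):
    """Bayes' rule: P(candidate | evidence) =
    P(evidence | candidate) * P(candidate) / P(evidence)."""
    return likelihood_given[candidate] * priors[candidate] / priors[evidence]

# A cold-start user whose only known attribute is "trail running":
p_hiking = posterior("hiking")  # 0.40 * 0.15 / 0.10 = 0.60
p_opera = posterior("opera")    # 0.06 * 0.05 / 0.10 = 0.03
```

One known attribute is enough to rank hiking far above opera for this user, which is exactly the sparse-profile question posed above.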
Our new technology has a broad range of possible applications. The question is, where will you see it first? Please stay tuned….