In yesterday’s MIT Technology Review article “Computers Can’t Answer Everything”, Damon Horowitz, CTO of Aardvark, states that “the real power of natural language processing can only be unlocked by acknowledging its limitations and filling in the gaps with human intelligence”. Would you agree or disagree?
Aardvark is a startup whose service connects you to people in your social network whom you might deem authoritative and able to answer your question. It is an interesting service which, assuming significant richness in your profile and in the profiles of the users in your network, may work well in getting your questions answered. The “assuming significant richness” part is the caveat: if you and your friends have shared enough information (Aardvark can leverage your Facebook profile via Facebook Connect), then Aardvark is likely to be able to help you.
I am in the camp that strongly agrees with Horowitz’s position, and in reality it is the situation that persists across the industry as a whole, although few people describe it as such. The reason “Semantic Web” companies have focused on niche areas, as implied in the article, is that typical Semantic Web approaches require the creation of a rich ontology that encodes the understanding of a domain, and that understanding needs to be painstakingly created by hand: the application of the “human intelligence” the article refers to. This can make multi-vertical or general-purpose deployment of these technologies inefficient, because human domain expertise must be applied to encode a good understanding of each domain (sports, health, medicine, movies, etc.). Companies like Hunch employ “human intelligence” very explicitly by learning from users’ specific answers to questions, learning both about users as individuals and about the crowd as a whole, which is a form of supervised learning.
At Atigeo we’ve taken an approach to developing and evolving understanding in a domain that requires no supervised learning: it discovers hierarchy-free semantic ontologies for a domain by reading unstructured content related to that domain, and it continually updates that understanding as published information about the domain evolves. Although completely automated, the technology also learns implicitly in real time as users interact with it, again benefiting from human intelligence.
I guess my point is that creating genuine artificial intelligence is always going to need human intelligence, and that there need be no difference between how we as humans learn and the mechanisms through which we enable artificial intelligences to learn: reading and learning from unstructured information, adapting to new information as it is seen, and learning through interaction and feedback. Aardvark is a very human form of this.