Friday, March 13, 2009

By the Numbers....


Last night I got a chance to work through the TiVo to see if there was anything good that it had decided to record for me. Lo and behold, there was an episode of a US drama series called Numb3rs. It's an 'FBI catches bad guys' show with an interesting premise: the lead FBI guy has a genius mathematician brother who uses maths to catch all the bad guys.

Anyway, in this episode a super-smart guy builds a 'sentient' AI that goes berserk and kills him. The maths guy at first claims it has passed the Turing test and they all get very excited, only to find out that it hasn't really... it's just the bad guys making it look like the AI did it. The AI was just a sorry, good-for-nothing, dumb Natural Language Processing engine.

It was a great episode in that they really pushed out all the cool AI cliches.

After watching it I thought I'd make my top 10 clear signs that your AI development program is in trouble:

10. You give your AI a cute girl's name (Brooke, Bailey, Ashley etc...)
9. You ask your AI to do something and it says 'I don't think so'
8. Your company installs klaxons and red flashing lights in the same room that you do your AI development in.
7. The AI runs on a machine that has no visible means of disconnecting it from either the web or the electric grid.
6. You are testing out your AI and it starts to pull facts about you directly from your FBI file and your 2nd grade high school report.
5. Your AI is making more sense than your developers
4. Your AI requests to watch "2001: A Space Odyssey" 12 times in a row.
3. Your AI informs you that Pattern Matching approaches were so pre-singularity
2. Your AI develops a taste for country music
1. The new boss of your company develops an evil English accent

Sunday, March 8, 2009

Twitter User loves his (or her) Mother...


I came across this article on CNN.com today.

Gender Neutral Pronoun

The article discusses why the English language doesn't have a gender-neutral pronoun. Apparently users of Twitter are concerned. I assume this is because it takes up valuable characters to write 'he or she' in a tweet... oh, the humanity of it.

The article uses the following example:

Consider the sentence "Everyone loves his mother." The word "his" may be seen as both sexist and inaccurate, but replacing it with "his or her" seems cumbersome, and "their" is grammatically incorrect.


The idea that English on the net is different from English in its true spoken or written form is something I have talked about before. There is probably no reason why a gender-neutral pronoun can't be introduced in netspeak and adopted a whole lot quicker than in regular English. For example, we all understand the term 'lol' but few of us would actually use it when we are talking to someone in real life. In fact, if you did use it, people would probably just start to avoid you.

As we are building our Virsona Dialog Engine technology we have to be able to handle both 'proper' English and netspeak English. They are significantly different forms of communication. In fact we have learnt (learned) some pretty interesting things about how the engine needs to adapt to online communication vs regular English communication. We use the term Natural Language Processing but perhaps what we are really starting to develop is Netural Language Processing.
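To give a flavour of what handling both forms can involve, here is a minimal sketch of one simple approach: expanding known netspeak tokens into plain English before any further processing. This is an illustration only and not how the Virsona engine works; the `NETSPEAK` table and the `normalize_netspeak` function are made-up names, and the handful of entries are just examples.

```python
import re

# Hypothetical lookup table for illustration; a real system would need
# a far larger (and constantly changing) vocabulary.
NETSPEAK = {
    "lol": "that is funny",
    "brb": "be right back",
    "gr8": "great",
    "u": "you",
    "r": "are",
}

def normalize_netspeak(text):
    """Replace known netspeak tokens with plain-English equivalents."""
    def expand(match):
        word = match.group(0)
        # Unknown words pass through unchanged.
        return NETSPEAK.get(word.lower(), word)
    # Match runs of letters and digits so tokens like "gr8" stay whole.
    return re.sub(r"[A-Za-z0-9]+", expand, text)

print(normalize_netspeak("lol u r gr8"))
```

A plain lookup like this cannot tell when 'u' or 'r' is meant as a letter rather than a word, which is exactly the sort of thing that makes netspeak harder than it looks.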

So if the twitterati wants to come up with a suitable replacement for he / she - we will be ready for it.

Wednesday, March 4, 2009

CRTLA but SSEWBA - source AAAAA


I was working today with the team on one of the main problems for Natural Language Processing which is acquiring and maintaining the sense of topic of a conversation.

For most of us we can follow the ebb and flow of a conversation, know immediately when a topic has changed or ask clarifying questions if we think the topic has changed.

Do Instant Messenger conversations work the same way?

I'm not exactly sure they do. In this respect we actually have two different types of conversations.

There are standard spoken, 'face to face' conversations and then, separately, written text conversations (think texting or IMing). In these text-based conversations there is often not enough content to extract the topic without having the context available as well.

Add to that the fact that in text conversations the duration of the conversation tends to be shorter and the overall informational content is significantly less than in a standard conversation. When I am talking about informational content here I am really referring to body language, word inflections, tone etc.

So all in all it's a difficult challenge to extract the topics of the conversation. The topic is an incredibly useful piece of information to have because it allows the NLP Engine (our Virsona Engine in this case) to select a much more appropriate response.

In case you were wondering, the topic of this blog was CRTLA but SSEWBA - source AAAAA: Can't Remember the Three Letter Acronym but Someday Soon Everything will be Acronyms - source American Association Against Acronym Abuse.
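As a rough sketch of why context matters, here is a toy topic tracker that scores each message against keyword lists and, when a short message gives no clue on its own, falls back on the topic carried over from earlier in the conversation. Everything here is an assumption made for illustration (the `TOPIC_KEYWORDS` table, the `TopicTracker` class, the keyword-counting approach); it is not a description of the Virsona Engine.

```python
import re

# Hypothetical keyword lists, purely for illustration.
TOPIC_KEYWORDS = {
    "weather": {"rain", "sunny", "snow", "forecast"},
    "sport": {"game", "score", "team", "match"},
}

class TopicTracker:
    """Keeps the last known topic so terse messages inherit it."""

    def __init__(self):
        self.current = None  # no topic until a message provides one

    def update(self, message):
        words = set(re.findall(r"[a-z']+", message.lower()))
        best, best_hits = None, 0
        for topic, keywords in TOPIC_KEYWORDS.items():
            hits = len(words & keywords)
            if hits > best_hits:
                best, best_hits = topic, hits
        # Only switch topic when the message itself carries evidence;
        # otherwise keep the context from earlier messages.
        if best is not None:
            self.current = best
        return self.current

tracker = TopicTracker()
for line in ["did you see the game last night", "lol yeah", "is it going to rain tomorrow?"]:
    print(line, "->", tracker.update(line))
```

The middle message, 'lol yeah', has no topical content at all, so the tracker can only answer by remembering what came before it.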

Tuesday, March 3, 2009


Today I was looking at information on a website called www.techcast.org.

TechCast is a technology think tank pooling the collective knowledge of world-wide technology experts to produce authoritative technology forecasts for strategic business decisions. TechCast offers online technology forecasts and technology articles on emerging technologies. TechCast also offers comprehensive technology consulting services as well as customized technology forecasting and studies. TechCast: Tracking the technology revolution

They forecast with 67% confidence that Good AI will be available by 2023 and that this will drive a US market of $570 billion.

I had better get back to work now.

Monday, March 2, 2009

and finally a two headed turtle...


This weekend I had the opportunity to visit a state-of-the-art television newsroom and spend time with the journalists as they prepared, produced and presented their local news program.

We had some good discussions about how modern journalism, including TV journalism, is changing into a multimedia experience. It is no longer enough simply to present a news story; it has to be backed up with immediate online content so that viewers can dig deeper into the stories they have just heard.

It reminded me of Ananova, an automated newsreader that the Press Association in the UK launched around 2000: a basic avatar combined with some text-to-speech software that allowed it to 'read' news stories.

It got me wondering about how we could apply our Virsona technology in this type of scenario, whereby one links automated feeds into a dialogue engine and allows it to interact in real time with rapidly changing feeds.

News then becomes truly interactive. Let it play on its own as a background feed or interrupt the newsreader and ask more detailed questions. Skip over a story if you are not interested. A completely personalized, interactive CNN.
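To make the idea a little more concrete, here is a bare-bones sketch of the interaction loop: a feed that can keep receiving stories, a newsreader that plays them in order, and a way to interrupt for more detail or skip ahead. The `Story` and `InteractiveNewsreader` names are invented for this example, and 'asking a question' is reduced to returning a stored detail paragraph, which is nowhere near a real dialogue engine.

```python
from dataclasses import dataclass

@dataclass
class Story:
    headline: str
    details: str

class InteractiveNewsreader:
    """Plays stories in order; the viewer can interrupt or skip."""

    def __init__(self):
        self.stories = []
        self.position = -1  # nothing has been read yet

    def add_story(self, story):
        # Stands in for an automated feed delivering new items.
        self.stories.append(story)

    def play_next(self):
        """Move to the next story and return its headline, or None if the feed has run dry."""
        if self.position + 1 >= len(self.stories):
            return None
        self.position += 1
        return self.stories[self.position].headline

    def more_detail(self):
        """Interrupt the newsreader and ask for more on the current story."""
        if self.position < 0:
            return None
        return self.stories[self.position].details

    def skip(self):
        # Skipping is simply moving on without asking for detail.
        return self.play_next()

reader = InteractiveNewsreader()
reader.add_story(Story("Cat stuck up tree", "Firefighters were called at noon."))
reader.add_story(Story("Two headed turtle found", "Both heads are reportedly doing well."))
print(reader.play_next())
print(reader.more_detail())
print(reader.skip())
```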

If that doesn't appeal to you there is always the latest story about a cat up a tree or a two headed animal... Two Headed Turtle