Leedberg.com

The online home for Greg Leedberg, since 1995.

Friday, January 20, 2006

The Future of Billy and Daisy


Hopefully, you're familiar with my two artificial intelligence / natural language processing-based "chatbot" efforts, Billy and Daisy. If not -- they're basically programs you can download and talk to, and they will be able to learn from talking to you, enabling them to converse better over time.

I've been blessed that these two programs have actually been met with quite a bit of popularity. Billy alone has been downloaded hundreds of thousands of times (and continues to be downloaded about 100 times a day), and these programs have been featured on several websites and so forth -- they were even featured in a segment on the TV channel Tech TV, and I was interviewed for the newspaper the Sydney Morning Herald.

The last new release of either of these programs was Billy 4.1, which was released in January of 2004. I sometimes feel like a perception has grown lately that I am no longer working on these programs. While it is true that I have had much less free time since I was in grad school and started working professionally, I am indeed still working on these systems.

In particular, I've been working on an AI system I call "Nani". To understand the motivation behind Nani, you have to understand how Billy and Daisy work. Within both of those systems, there are basically a few AI subsystems which process your input, and then say, "My response should contain these keywords". That list of keywords is then passed to a natural language generation system, which produces an actual response. Currently, this list of keywords is produced in a rather primitive manner. Indeed, in Daisy this list is actually just produced by picking out keywords from the human's input sentence.
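
As a rough illustration of that two-stage pipeline, here is a minimal Python sketch. It is not the actual Billy or Daisy code: the stop-word list, the function names, and the canned response pattern are all assumptions made up for this example, and the real natural language generation stage is far more involved.

```python
# Hypothetical stop-word list; the real programs' keyword rules are not shown here.
STOP_WORDS = {"i", "a", "an", "the", "is", "are", "to", "my", "and", "of", "you", "do"}


def pick_keywords(sentence):
    """Daisy-style keyword picking: keep the input words that are not stop words."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return [w for w in words if w and w not in STOP_WORDS]


def generate_response(keywords):
    """Stand-in for the natural language generation stage (assumed, very simplified)."""
    if not keywords:
        return "Tell me more."
    return "What do you think about " + " and ".join(keywords) + "?"


def respond(sentence):
    """Full pipeline: input sentence -> keyword list -> response."""
    return generate_response(pick_keywords(sentence))
```

Calling `respond("My dog is the best!")` first reduces the input to the keyword list `["dog", "best"]` and then hands that list to the generation step. The point of the split is that the keyword-picking stage can be swapped out without touching the generator.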

My focus with Nani is entirely on the mechanism for producing this list. I don't want to get into too much detail, but I've been working with interesting AI technologies such as neural networks in order to produce a good list. And the key is, these neural networks will learn over time to produce better lists of keywords, based on observing how you, the human, respond to things. It's pretty interesting.

Now, I've made a major decision recently that Nani would form the basis of either the next version of Daisy, or the next version of Billy. However, I am completely at a loss as to which brand should be continued in this effort.

Maybe it will help to provide some background on how I ended up with two chatbots.

Programming Billy had always been a dream of mine. The idea of a program that could talk to you, and learn, fascinated me. I wrote a very simple program in 6th grade called "Best Friend Billy", which would just randomly respond with one of about 4 sentences. I realized that my goal was beyond me at the time, but I continued to dream of achieving a real AI someday.

I started work in earnest in high school. In the beginning, there was only Billy. Billy 2 was a simple bot which could parse input sentences, and then would mostly follow a script, and would respond with a pre-programmed sentence. These sentences could contain "blanks", in which Billy could fill in words from your input sentence. So, there was some variety, but there was little actual AI.
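
To give a feel for this kind of scripted reply with "blanks", here is a small Python sketch. It is only an illustration of the general idea, not Billy 2's real script format: the trigger words, the templates, and the rule of filling the blank with whatever follows the trigger word are all assumptions.

```python
# Hypothetical script: trigger word -> pre-programmed sentence with a {word} blank.
SCRIPT = {
    "like": "Why do you like {word}?",
    "hate": "What is so bad about {word}?",
}
DEFAULT_REPLY = "Tell me more."


def scripted_reply(sentence):
    """Find a trigger word and fill the template's blank from the input sentence."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    for i, word in enumerate(words):
        # Only use the template if there is something after the trigger to fill in.
        if word in SCRIPT and i + 1 < len(words):
            return SCRIPT[word].format(word=" ".join(words[i + 1:]))
    return DEFAULT_REPLY
```

With this script, `scripted_reply("I like pizza.")` fills the blank with "pizza", while an input with no trigger word falls back to the default sentence. Every possible reply is written ahead of time, which is why there is variety but no real learning.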

After working on Billy 2, I decided I wanted to pursue true AI, by which I mean a system which could start with effectively no knowledge, and then learn by observation. My true interest was in machine learning of natural language generation -- basically, the ability for a program to pick up on patterns in language it observes, and then learn to reproduce intelligent-sounding sentences. I knew this was pretty experimental, so I produced a new bot -- Daisy -- in which to explore this. Daisy had absolutely no pre-programmed language of any kind, and no scripting. Because of this, Daisy could learn any natural language, not just English.
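
One well-known way to learn sentence patterns purely from observation is a bigram (Markov chain) model, sketched below in Python. To be clear, this is a stand-in technique chosen for illustration and not a description of Daisy's actual algorithm; the class name and the `<s>` / `</s>` boundary markers are assumptions of this sketch.

```python
import random
from collections import defaultdict


class BigramLearner:
    """Learns which word tends to follow which, with no built-in vocabulary."""

    def __init__(self):
        # Maps each observed word to the list of words seen right after it.
        self.next_words = defaultdict(list)

    def observe(self, sentence):
        """Record the word-to-word transitions of one observed sentence."""
        words = ["<s>"] + sentence.split() + ["</s>"]
        for current, following in zip(words, words[1:]):
            self.next_words[current].append(following)

    def generate(self, rng=random, max_words=20):
        """Walk the learned transitions from the start marker to build a sentence."""
        word, output = "<s>", []
        while len(output) < max_words:
            choices = self.next_words.get(word)
            if not choices:
                break
            word = rng.choice(choices)
            if word == "</s>":
                break
            output.append(word)
        return " ".join(output)
```

A fresh `BigramLearner` generates nothing at all, and after observing sentences it can only recombine words it has actually seen. Because nothing in it is specific to English, the same code works on input in any language that separates words with spaces.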

I worked on Daisy for a while and refined her in versions 1.1 and 1.2 (1.2 was never publicly released). At which point, I decided the technology was good enough, and rolled Daisy into a new generation of Billy, as a subsystem. Billy 3 was a new bot that combined Daisy's language learning abilities with some new and improved scripting. I think Billy 3 was probably the most popular version of Billy of all time.

After Billy 3, I worked on new language learning algorithms, which were started completely from scratch. I called this "Daisy 2", but there was never actually a standalone Daisy 2.0 program produced. I worked with this technology towards the goal of Billy 4. Billy 4 was designed very differently from my previous programs. I came up with a bunch of AI ideas that I thought were interesting, and worked on them, largely separately. "Daisy 2" was one. There was another, "Feldman", for computing mathematical expressions contained in natural language. And several others as well. In the end, I took these disparate systems, and produced a top-level set of algorithms which attempts to make them all work together. And thus was Billy 4.

Now, 2 years have passed since the last release of Billy. Also, 4 years have passed since design work commenced on Billy 4 (I worked on Billy 4 for 2 years). My current interests in AI don't include scripting at all. I'm more interested in the sort of thing I'm doing with Nani, and the sort of thing I did with Daisy. I continually monitor the web for feedback on my software, and find that increasingly more people seem to favor conversations with Daisy over conversations with Billy, which re-affirms to me that the no-scripting approach is best in the long run.

What I'd like to do is take the keyword-picking algorithms in Nani, combine them with the language-learning algorithms from the original Daisy, and produce a new bot, sans scripting. But should it be a "Daisy" or a "Billy"? There are arguments for both.

In some ways it should be a Daisy, because it's more "philosophically" in line with the goals of Daisy, i.e., it's experimental and contains no scripting. Also, there are a lot of people out there who seem to want to see a new version of Daisy.

However, there's also an argument for putting it under the Billy brand. For one, "Billy" is the name of the program I really wanted to keep working on throughout my life. Daisy was just intended to be a temporary exploration into some topics. I'd rather take new good ideas and continue on with the Billy name. Also, the Billy software and name are much more widespread. Even though Daisy has a very devoted following, Billy gets downloaded probably 10x more than Daisy does. So, Billy is a higher-visibility vehicle for me to get my work out there. Lastly, I hadn't intended on continuing with Daisy as a standalone program, which is why the last couple of releases weren't publicly available. At the time, I considered 1.1 to be the final version of Daisy -- all future versions would be a part of a Billy release.

So, that's where I am. I really would love any feedback or ideas in the forum. For those who are Billy and Daisy fans, you know (I hope!) that this is a big deal. I don't know when exactly "Nani" will be done, but hopefully soon.

But first I need to know what to call it!