Die-Hard Linux Bits-and-Bytes

Thursday, February 09, 2006

GoogleTalk: Google goes head-to-head with Microsoft?

This morning Google activated the GoogleTalk feature in GMail. Now one can send e-mails and chat with people all from the same page.

This sounds familiar somehow... oh yeah, MSN's Web Messenger & Hotmail combo.

So how is this different? Well, for one, we know that GMail is already superior to Hotmail in many ways (please, I don't want to go into this one). Furthermore, GoogleTalk has some big advantages over MSN, mainly the fact that it uses the Jabber protocol, which means that it works with many different messaging clients. Google doesn't even need to make GoogleTalk cross-platform; it already is.

This brings up a fundamental question: is Google really going head-to-head with Microsoft? It is obvious that they have been competitors for a while, but is Google actually trying to match Microsoft, or at least MSN, service-for-service? With the rumors of a Google Browser or even a Google OS floating around, this doesn't seem so unlikely.

Is this a good thing? It depends on how Microsoft responds, and how long Google can maintain its open, users-come-first approach. If Microsoft responds by improving its products, great! But there is also the danger of an all-out war between the two giants, which could lead to some nasty unethical tactics (from both sides).

I guess we'll just wait and see.

Tuesday, February 07, 2006

How to build an AI using the net... today!

Here is an interesting thought that came to me today, and I have to write it down before I can go to sleep.

Disclaimer: The ideas presented here are mere hypothetical speculations; I will not be held responsible for any consequences that arise from the information contained in this article.


Introduction

You have probably seen the movie Terminator 3 where artificial intelligence arises from the complex interconnection of many "learning" or self-modifying systems. The idea of "neural networks" has also been in the news a lot recently; they are being used for everything from simulating the human brain to making very accurate "Related Text" matches on the Safari Bookshelf system.

The idea behind neural networks is that there are many independent nodes, performing relatively simple computations in parallel, and communicating constantly with other nodes. In such a system, learning can be accomplished by modifying the "connections" between the nodes (i.e. what set of outputs a certain set of inputs to a node will produce).
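To make the "learning by modifying connections" idea concrete, here is a minimal sketch of a single node. It is only an illustration: the Node class, the perceptron-style update rule, and the OR-function training data are my own assumptions, not part of any particular neural network package.

```python
class Node:
    """One node: a weighted sum of its inputs followed by a threshold."""

    def __init__(self, n_inputs, rate=0.1):
        self.weights = [0.0] * n_inputs  # strength of each incoming connection
        self.bias = 0.0
        self.rate = rate                 # how strongly each mistake changes the node

    def output(self, inputs):
        total = self.bias + sum(w * x for w, x in zip(self.weights, inputs))
        return 1 if total > 0 else 0

    def learn(self, inputs, target):
        # Learning = adjusting the connections in proportion to the error.
        error = target - self.output(inputs)
        self.weights = [w + self.rate * error * x
                        for w, x in zip(self.weights, inputs)]
        self.bias += self.rate * error


def train_or_node(epochs=20):
    """Train one node to behave like a logical OR (hypothetical example task)."""
    samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
    node = Node(2)
    for _ in range(epochs):
        for inputs, target in samples:
            node.learn(inputs, target)
    return node
```

Calling train_or_node() returns a node whose output() follows the OR table. Nothing about the node's code changes during training; only the connection weights do.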

The internet seems to be the natural choice for the basis of very large neural networks. What I will try to show here is how existing technologies already possess all the essential components necessary to build a neural network which spans the entire internet.

The SMTP Protocol

One of the oldest uses for the internet is e-mail. Most e-mail servers use some sort of routing system; a daemon that will accept all incoming SMTP requests, process the message based on certain parameters, optionally modify it, and then either store it on the local system, or re-transmit to another destination.

Notice the two keywords, "modify" and "re-transmit", in the paragraph above. In theory, two mail servers could keep bouncing messages back and forth, modifying them on each re-transmission. A node could also forward an e-mail message back to itself. The rules supported by current mail servers are very complex (regular expressions, parameter substitution, invoking external processes, etc.). Thus, each node in a network of mail servers could perform computations based on the content (or maybe just the headers) of an e-mail message, and pass the result on to the next node.

Now imagine that one of the rules is to call an external process, which will in turn modify the rules the server uses for the next message, based on the content of the current one. Thus, the system gains the ability to "learn" by altering its responses to specific inputs.
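Here is a rough sketch of what such a self-modifying node might look like, as a pure simulation with no real mail server involved. Everything in it is hypothetical: the MailNode class, the "RULE <regex> -> <destination>" line format, and the destination names are made up for illustration and do not correspond to any actual mail server's configuration syntax.

```python
import re


class MailNode:
    """Toy stand-in for a mail server whose rule table can be rewritten."""

    def __init__(self, default_route="local"):
        self.default_route = default_route
        self.rules = []  # list of (compiled subject regex, destination)

    def route(self, subject):
        # The first matching rule wins; otherwise deliver locally.
        for pattern, destination in self.rules:
            if pattern.search(subject):
                return destination
        return self.default_route

    def external_hook(self, body):
        # Plays the role of the "external process": body lines of the
        # (invented) form "RULE <regex> -> <destination>" install new rules.
        for line in body.splitlines():
            match = re.fullmatch(r"RULE (\S+) -> (\S+)", line.strip())
            if match:
                rule = (re.compile(match.group(1)), match.group(2))
                self.rules.insert(0, rule)

    def process(self, subject, body):
        # Route with the current rules, then let the message change the rules.
        destination = self.route(subject)
        self.external_hook(body)
        return destination
```

A message with the body "RULE ^hello -> node-b" is itself delivered locally, but the next message whose subject starts with "hello" goes to node-b: the node's response to the same input has changed.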

It is not hard to imagine how to implement such a system, program it with a basic set of rules or "instincts", and have it learn and adapt based on the input. It may even be possible to add regular mail servers (i.e. without any special rules) to the system, by exploiting the default routing behaviour. The sheer number of such "dumb" servers would still make them computationally worthwhile.

TCP/IP Packets

The same general idea applies to an even more basic and widespread system, the TCP/IP layer. Just like mail servers, most firewalls and routers now support an ever more complex set of rules. For example, iptables on Linux supports rules based on most header fields and even the content of the IP packet, as well as counters and other environmental parameters. Furthermore, the rules can easily be modified on-the-fly using a command-line utility.

Let's imagine a simple scenario: a packet of size n arrives on port p. Since p is a non-standard port, it invokes a special iptables rule, which retransmits the packet to a pre-determined list of IP addresses on port n*p (mod 65535), with a size equal to the total number of packets processed by the server (mod 1400). One can begin to imagine how complex computational rules could be built from such networks.
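The arithmetic of that rule can be sketched as a small simulation. This is plain Python, not an actual iptables rule; the PacketNode class and the peer addresses (taken from the 192.0.2.0/24 documentation range) are assumptions for illustration, and I am assuming the packet counter includes the packet currently being handled.

```python
PORT_MODULUS = 65535  # modulus used in the scenario above
SIZE_MODULUS = 1400   # keeps the outgoing size below a typical Ethernet payload


class PacketNode:
    """Simulates one node applying the retransmission rule from the scenario."""

    def __init__(self, peers):
        self.peers = list(peers)  # pre-determined destination addresses
        self.processed = 0        # total packets handled by this node so far

    def handle(self, size, port):
        # Assumption: the counter includes the packet being handled now.
        self.processed += 1
        out_port = (size * port) % PORT_MODULUS  # may be 0, which is not a usable port
        out_size = self.processed % SIZE_MODULUS
        return [(peer, out_port, out_size) for peer in self.peers]
```

For example, the first packet of size 100 arriving on port 7000 yields (100 * 7000) mod 65535 = 44650 as the outgoing port and 1 as the outgoing size, sent once to every peer.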

Another good example of a useful rule would be using each server/node as an n-input, m-output logic gate. For example, "if all of the last n inputs had size > 500, transmit a packet of size 1000 to m predetermined IP addresses, else transmit a packet of size 10 to these same addresses"; this implements an AND gate. Again, the sheer number of nodes, and the number of possible links between them, creates staggering possibilities.
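The AND-gate rule quoted above translates almost word for word into a simulation. Again, this is only a sketch: the AndGateNode class and the peer names are hypothetical, and I am assuming the gate outputs "low" until it has actually seen n inputs.

```python
from collections import deque


class AndGateNode:
    """n-input AND gate: 'high' is a packet size above the threshold."""

    def __init__(self, n, peers, threshold=500):
        self.recent = deque(maxlen=n)  # sizes of the last n packets received
        self.peers = list(peers)       # the m pre-determined destinations
        self.threshold = threshold

    def receive(self, size):
        self.recent.append(size)
        # Assumption: with fewer than n inputs seen, the gate stays low.
        window_full = len(self.recent) == self.recent.maxlen
        all_high = window_full and all(s > self.threshold for s in self.recent)
        out_size = 1000 if all_high else 10
        return [(peer, out_size) for peer in self.peers]
```

With n = 2, receiving sizes 600 and then 700 produces a packet of size 1000 for every peer, while a following packet of size 100 drops the output back to 10.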

Once again, there is a possibility of exploiting unmodified systems as part of such a network, but it would be far more difficult, since retransmitting packets is not a very common rule in most firewall configurations. However, one could exploit the TCP/IP handshake for a 1-1 link with a "smart", pre-programmed node.

Conclusion

The main idea of this article is that we don't need dedicated super-computer clusters to build neural networks. The ideas presented here are independent of the operating systems, installed programs, hardware architectures, location, or computing power of the individual nodes. The computing power, communication infrastructure, and standards already exist to allow the creation of almost indefinitely complex neural networks. Furthermore, such a network does not have to interfere with the main purpose of the node systems; the network can use out-of-bounds values to transmit information, without affecting "real" traffic on the network. The main questions are how we learn to use such systems, and whether they will be used for good or evil. -- Anton

Thursday, February 02, 2006

Bayes rules in human mental processes

Here is a truly fascinating article I found recently (it's up on Slashdot too). This one has to do with psychology, but also artificial intelligence, so I thought it was appropriate:

http://economist.com/science/displayStory.cfm?story_id=5354696

On Slashdot: http://science.slashdot.org/article.pl?sid=06/02/02/2343232

Funny quote from the article:
A frequentist way of doing things would reduce the risk of that happening. But by the time the frequentist had enough data to draw a conclusion, he might already be dead.