Multiple GUI modality
A roadmap from human consciousness to artificial intelligence
- R6 -

Humans inhabit a world full of language, symbols and images. In order to bootstrap, an AI will need to inherit as much intellectual capital as humans are able to provide. The vast majority of this knowledge is encoded as serial language, mathematical symbology and 2D imagery. An AI needs to translate this into 3D simulation scripts for subsequent abstraction to memory and integration into the world model.

Humans interact with computers through keyboard, mouse and sometimes speech. These are all far too slow for an AI. To select a graphical screen object, a human needs complex motor control plus accurate visual display and feedback systems. An AI will need direct access to human graphical user interfaces from the conscious 3D simulation process itself, without necessarily parsing through language, and especially without mechanical mouse pointer control. Thus 2D graphical interfaces should ideally enter the AI as a secondary, enhanced visual modality, and the spatial/mechanical pointer control should be replaced by standard virtual 3D object manipulation, using environmental morph targets with the usual script trial animation processes for the various screen options.

Of the 120 million or so rods and cones in a human retina, only about one million channels actually pass through the optic nerve, a substantial reduction in bandwidth. Suppose you set an SVGA (800x600) projector's shutter speed to a quarter second and turn off the color. The experience may not be pleasant, but you could easily follow the movie. If your own vision were similarly restricted, with very slightly out-of-focus glasses and a quarter-second LCD shutter, you could get about quite OK in normal life. This is a dramatic reduction in vision bandwidth, to a mere 2 Mbps uncompressed or 50 Kbps as compressed video. A modern PC could handle many thousands of seconds of such video every single second.
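The arithmetic behind these figures can be checked with a short sketch. The text does not state a bit depth, so the 1 bit per pixel used here is an assumption: it is the reading of "turn off the color" that reproduces the quoted 2 Mbps. An 8-bit grayscale reading is shown for comparison and gives a figure roughly eight times larger. The function and variable names are illustrative only.

```python
def raw_bitrate_bps(width, height, frames_per_second, bits_per_pixel):
    """Uncompressed video bitrate in bits per second."""
    return width * height * frames_per_second * bits_per_pixel


WIDTH, HEIGHT = 800, 600   # resolution quoted in the text
FPS = 4                    # a quarter-second shutter gives 4 frames per second
BITS_PER_PIXEL = 1         # ASSUMPTION: pure black/white, not stated in the text

# Raw bitrate under the 1-bit assumption (about 2 Mbps).
raw_bps = raw_bitrate_bps(WIDTH, HEIGHT, FPS, BITS_PER_PIXEL)

# Same frame size and rate with 8-bit grayscale, for comparison.
gray8_bps = raw_bitrate_bps(WIDTH, HEIGHT, FPS, 8)

# Compression ratio implied by the text's 50 Kbps compressed figure.
COMPRESSED_BPS = 50_000
compression_ratio = raw_bps / COMPRESSED_BPS

# Retina-to-optic-nerve channel reduction quoted in the text.
retina_reduction = 120_000_000 / 1_000_000

if __name__ == "__main__":
    print(f"raw, 1 bit/pixel:   {raw_bps / 1e6:.2f} Mbps")
    print(f"raw, 8 bits/pixel:  {gray8_bps / 1e6:.2f} Mbps")
    print(f"implied compression ratio: {compression_ratio:.1f}:1")
    print(f"retina channel reduction:  {retina_reduction:.0f}:1")
```

Under the 1-bit assumption the raw stream comes to 1.92 Mbps, and the 50 Kbps figure implies a compression ratio of roughly 38:1. If a grayscale image is intended instead, the same compressed target would need a correspondingly higher ratio.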

If you consider all the modality inputs of the human organism, the total data bandwidth is far smaller than is often assumed. There's a lot of parallel processing behind it, but you could almost transmit the whole modality data set (admittedly much degraded) over a compressed 56K modem channel. Even if you don't accept these vision estimates, consider the zero vision input of blind people, who manage great intelligence from the remaining senses of sound, touch, taste and smell. That is yet another vast bandwidth reduction.

The realization that the essence of 3D simulation can be built from touch (spatial animation) and sound alone is quite intriguing. It implies cognition does not need a visual 2D render plane to concretize simulations. How can a 3D animation be emotionally graded without first being rendered to a perspective? More research is needed here.

Is conscious experience always derived from the subconscious simulation layers, primarily rendered back to a 2D perspective at the visual cortex region and perceived as mental imagery, or can the simulations be fully 'experienced' in their native 3D form? What role does rendering thoughts down to 2D vision play?

Are un-rendered (non-visualised) thoughts the origin of background feelings? That would again imply that emotional grading occurs separately from visual imagery: in a sense, feeling what is occurring in the simulation despite the 'curtain' being pulled.
