The sensory scope of virtual reality systems

The sensory scope of virtual reality systems is determined by how many of the human senses are engaged. The count may be weighted by whether the senses included are “high bandwidth” or “low bandwidth” in nature. Vision, hearing, and touch have a high capacity for rapid, complex transmission and can therefore be viewed as high-bandwidth senses for communication between humans and computers. It is not surprising, then, that these three senses have dominated virtual reality systems. By comparison, taste and smell are relatively low-bandwidth senses, and few virtual reality systems engage them. The sensory scale of a virtual reality system, in contrast to its scope, is the degree of sensory bandwidth engaged by communication between humans and computers. This includes both the size of the signal relative to total human perception and the realism of that signal.
Vision is the single most important human sense, and three-dimensional depth perception is central to vision. Three-dimensional perception is therefore critical for immersive virtual reality. Human eyes convert light into electrochemical signals that are transmitted and processed through a series of increasingly complex neural cells. Some cells detect basic image components such as edges, color, and movement. Higher-level cells combine these components and make macro-level interpretations about what is being seen. The cues that humans use for three-dimensional perception are based on this processing system and can be categorized into three general areas: interaction among objects; the geometry of object edges; and the texture and shading of object surfaces.
Many cues for three-dimensional perception come from interaction among objects; key attributes of these interactions are overlap, scale, and parallax. Objects that overlap other objects are perceived as closer. Objects believed to be similar in actual size but appearing larger are perceived as closer, and objects that grow in apparent size are perceived as moving closer. Objects that move a greater distance relative to other objects when the viewer’s head moves are perceived as closer.
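The last of these cues, motion parallax, follows a simple small-angle relation: for a sideways head movement, an object’s apparent angular shift is roughly inversely proportional to its distance. A minimal sketch of that relation in Python (the function and all numbers are illustrative, not taken from any particular system):

```python
import math

def depth_from_parallax(head_shift_m, angular_shift_rad):
    """Estimate an object's distance from motion parallax.

    For small angles, an object at distance d shifts by roughly
    head_shift / d radians when the head moves sideways, so
    d ~ head_shift / angular_shift.
    """
    return head_shift_m / angular_shift_rad

# A 10 cm head movement shifts a near object by 2 degrees of visual
# angle but a far object by only 0.2 degrees:
near = depth_from_parallax(0.10, math.radians(2.0))  # ~2.9 m
far = depth_from_parallax(0.10, math.radians(0.2))   # ~28.6 m
print(f"near object ~{near:.1f} m, far object ~{far:.1f} m")
```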
Parallax vision (or stereoscopic vision) comes from the fact that the two human eyes see real-world objects from slightly different angles. Eye muscles and neural processing in the brain work together to combine these two different images into the perception of a single three-dimensional image. Muscles in each eye change the shape of the lens to focus at the distance of the object viewed. Other muscles change the orientation of the eyes so that the two lines of sight converge at that same distance. In real-world vision, these two muscle functions work in harmony. In virtual reality, they may conflict. When images are displayed very far away, the screen required for immersion is prohibitively large and it is difficult to present different images to the two eyes. When images are displayed very close to the eyes, extremely high image resolution is required and the two muscle functions tend to conflict.
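The geometry of convergence is easy to make concrete. In the toy calculation below (the 63 mm interpupillary distance and both viewing distances are illustrative assumptions), the eyes in a head-mounted display focus at the fixed optical distance of the screen while converging at the simulated depth of the object, and the two angles disagree:

```python
import math

def vergence_angle_deg(ipd_m, distance_m):
    """Angle between the two lines of sight for an object at distance_m."""
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))

IPD = 0.063  # assumed typical adult interpupillary distance, ~63 mm

focus_angle = vergence_angle_deg(IPD, 2.0)     # optics focused as if at 2 m
converge_angle = vergence_angle_deg(IPD, 0.5)  # object rendered at 0.5 m
print(f"eyes converge at {converge_angle:.1f} degrees "
      f"but focus as if at {focus_angle:.1f} degrees")  # ~7.2 vs ~1.8
```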
One method of presenting the eyes with different images on a distant screen is to have them view the screen through differently polarized filters. This is how “3D glasses” work in movies: two images of the scene, recorded from slightly different perspectives, are projected onto the same screen with different polarizations, and each filter passes only the matching image, so each eye sees its own perspective and the brain perceives depth. However, this method has significant limitations, including reduced brightness and “ghosting” when each eye picks up traces of the other eye’s image.
Another method of presenting the eyes with different images is to use “shutter glasses.” Shutter glasses alternately block the image from one eye and then the other, in synchronization with images from two different perspectives shown successively on a single screen. When the alternating images are shown in sufficiently rapid succession, the brain combines the two into a single three-dimensional image. Most head-mounted displays (HMDs) used in virtual reality are some type of helmet that includes: some version of shutter glasses; a relatively close high-resolution screen whose image spans more than 60 degrees of the field of vision and moves with head motion; and a mechanical, optical, magnetic, or other mechanism to track head motion.
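The timing logic behind shutter glasses fits in a few lines. The sketch below is hypothetical: the stub functions stand in for whatever a real display driver and glasses interface would provide, and the refresh rate is an assumed value.

```python
import time

# Hypothetical stand-ins for a real display driver and glasses interface.
def render_view(scene, eye):
    return f"{scene} rendered from the {eye} eye's perspective"

def set_shutters(open_eye):
    closed = "right" if open_eye == "left" else "left"
    print(f"shutter open: {open_eye}, shutter closed: {closed}")

def show(image):
    print(f"displaying: {image}")

REFRESH_HZ = 120  # assumed: 120 alternating frames/s gives 60 per eye

def stereo_frames(scene, n_frames):
    for frame in range(n_frames):
        eye = "left" if frame % 2 == 0 else "right"
        set_shutters(eye)              # only this eye sees the screen
        show(render_view(scene, eye))  # perspective image for this eye
        time.sleep(1.0 / REFRESH_HZ)   # hold for one refresh cycle

stereo_frames("cube", 4)
```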
An object’s edges separate it from its environment, and the geometry of these edges also provides perceptual cues about its three-dimensionality. The outer edges of an object form its outline and are the bridge between interaction among objects (including the overlap, scale, and parallax discussed above) and the internal orientation of the object. An object’s inner edges bridge its outer boundaries and its inner surfaces and textures. Together, the outer and inner edges of an object provide powerful cues about its three-dimensional size, location, orientation, and movement.
Early three-dimensional graphics used the basic geometry of object edges, generally combinations of straight lines, to create moving, transparent “wire-frame” figures. Although three-dimensional graphics are now far more sophisticated, the underlying geometry of object edges remains central to three-dimensional rendering.
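That underlying geometry is a perspective projection: each 3D edge endpoint is mapped onto a 2D image plane, and the projected endpoints are joined by lines. A minimal sketch using a pinhole camera model (focal length and cube placement are arbitrary):

```python
FOCAL = 1.0  # distance from eye to image plane, arbitrary units

def project(point):
    """Perspective-project a 3D point (x, y, z), z > 0, onto the image plane."""
    x, y, z = point
    return (FOCAL * x / z, FOCAL * y / z)

# A 2-unit cube centered 4 units in front of the viewer, as 8 vertices;
# an edge joins two vertices that differ in exactly one coordinate.
verts = [(x, y, z + 4.0) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]
edges = [(a, b) for a in range(8) for b in range(a + 1, 8)
         if sum(u != v for u, v in zip(verts[a], verts[b])) == 1]

for a, b in edges:  # each 3D edge becomes a 2D line of the wire figure
    print(project(verts[a]), "->", project(verts[b]))
```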
An object’s surfaces lie within its edges. In addition to the interaction among objects and the geometry of object edges discussed above, the texture and lighting of an object’s surfaces also provide important cues for three-dimensional perception. One of the most important aspects of perceiving surfaces in three dimensions is how they interact with light. Humans are accustomed to viewing objects illuminated from above by the sun, and thus most readily interpret the three-dimensionality of objects lit from above by a single light source. Nonetheless, illumination from multiple light sources, or from directions other than above, can also convey three-dimensionality if applied consistently.
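The simplest model of this interaction with light is Lambertian (diffuse) shading: a surface’s brightness is proportional to the cosine of the angle between its normal and the direction toward the light, clamped at zero. A small sketch with an overhead light (all vectors are illustrative):

```python
import math

def lambert_intensity(normal, light_dir):
    """Diffuse brightness: cosine of the angle between the surface
    normal and the light direction, clamped at zero for faces
    turned away from the light."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    norms = (math.sqrt(sum(n * n for n in normal))
             * math.sqrt(sum(l * l for l in light_dir)))
    return max(0.0, dot / norms)

overhead = (0.0, 1.0, 0.0)  # light from above, as the eye expects

# A sphere lit from above: bright on top, dimmer on its slopes, dark
# underneath, exactly the gradient the brain reads as depth.
print(lambert_intensity((0.0, 1.0, 0.0), overhead))   # 1.0  (top)
print(lambert_intensity((1.0, 1.0, 0.0), overhead))   # ~0.71 (45-degree slope)
print(lambert_intensity((0.0, -1.0, 0.0), overhead))  # 0.0  (underside)
```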

“Texture mapping” is an efficient method of creating surfaces for three-dimensional virtual objects by overlaying essentially two-dimensional texture gradients on object surfaces. Depth perception of these surfaces can then be refined through shading and reflected light. “Ray tracing” carries light reflection further by tracking individual rays of light as they reflect among objects and ultimately bounce from object surfaces to the viewer. Texture mapping, light shading, and ray tracing are computationally intensive, particularly for complex virtual environments with moving objects. Fortunately for the sake of computational economy, humans do not track as much visual detail in moving objects as in stationary objects. Computational effort in virtual reality can therefore be conserved, without significant loss of perceptual realism, by rendering the surfaces of moving objects in less detail than the surfaces of stationary objects.
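The core operation of a ray tracer is an intersection test between a ray and a surface. A minimal sketch of the classic ray-sphere test (a toy example, not code from any particular renderer):

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Distance along a unit-length ray to its first hit on a sphere,
    or None if the ray misses."""
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * x for d, x in zip(direction, oc))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * c  # quadratic discriminant (a == 1 for a unit ray)
    if disc < 0.0:
        return None  # ray misses the sphere
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0.0 else None

# A ray fired along the z axis hits a unit sphere centered 5 units away:
print(ray_sphere_hit((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))  # 4.0
```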
The essence of virtual reality is fooling the human body into perceiving things that are not real. From this perspective, it is not surprising that the body can respond negatively, particularly when it receives conflicting signals from different senses and is not entirely fooled. With respect to vision, one problem with current VR imaging systems is the conflict between eye focus (adjusting the lens of each eye to the apparent distance of the object viewed) and eye convergence (orienting both eyes so their lines of sight intersect at the apparent distance of the object), often called the vergence-accommodation conflict. This problem is more acute for HMD systems, in which images are displayed relatively close to the eyes.
Another problem is latency: a lag between the motion signals the brain receives from the vestibular system (the semicircular canals of the inner ear) and the visual motion signals it receives from the eyes. When there is a lag in visual image processing, the body receives signals of motion from the vestibular senses in real time but signals of motion from vision only after the lag.
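The size of the resulting sensory mismatch is easy to estimate. A toy calculation, assuming a constant head-rotation rate and a fixed rendering lag (both numbers are illustrative):

```python
HEAD_RATE_DEG_S = 100.0  # assumed: head turning at 100 degrees/second
RENDER_LAG_S = 0.050     # assumed: 50 ms from head motion to updated pixels

def head_angle(t):
    """Head orientation (degrees) at time t under constant rotation."""
    return HEAD_RATE_DEG_S * t

t = 1.0
vestibular = head_angle(t)             # inner ear: where the head is now
visual = head_angle(t - RENDER_LAG_S)  # eyes: where the lagged image says it is
print(f"sensory mismatch: {vestibular - visual:.1f} degrees")  # 5.0
```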
Eye focus conflict and virtual image latency can cause eye strain, disorientation, nausea, and even long-run health problems. These symptoms are collectively called “Simulation Adaptation Syndrome,” or SAS. Females tend to experience more severe SAS than males.
People can adapt to virtual reality to some extent. SAS is also generally less severe when people are exposed to immersive virtual reality gradually, through a series of sessions. The sessions start out only a couple of minutes long and then gradually increase in duration, with real-world intermissions between them.
With current technology these problems are difficult to avoid. However, they may eventually be greatly reduced by evolving technologies such as external imaging systems with variable-distance imaging (for example, domes with multiple layers of translucent screens), holographic imaging (three-dimensional images projected in mid-air), or direct internal body imaging (projecting images directly onto the retinas, or direct neural-coded transmission from a computer to the optic nerve or to neural centers in the brain).

Research in the Digital Garbage Dump

I’m going to start by stating the obvious: for a researcher, and for most others as well, it’s better to read a book than to hear about it, better to see a film than to read about it, and better to play a video game than merely watch it. Sometimes you don’t have a choice, but if you do, you go to the original. This, of course, doesn’t necessarily make your conclusions any more correct, but I do claim that the basis for them will be truer to the medium and to the item you are studying. This is the obvious bit. With regard to my own doctorate project, it clashes rather badly with reality. In the following, I’ll describe the problem, show you a solution, and then discuss a major new problem that arises.
Since I’ll be relating this to my own project, let me take a quick detour to outline it. My background is in film science and software engineering, and I will be using these two angles to look at how video games construct their stories. I’m focusing on what I think is a neglected period: the pre-CD-ROM games. I start with the first video game, “Tennis for Two” from 1958, and follow the aesthetic and interactive developments up to about 1985. By doing this, I hope to unlock some of the storytelling secrets of a medium that today is a multi-billion-dollar cultural industry.
Doing historical research on a medium rooted in technology has some inherent problems. By comparison, studying old texts or books might be difficult, but that is because the language and/or textual symbols are unknown; as long as you are in possession of the text, you at least know where to start. With more technology-intensive media, electronic ones in particular, the problems are different. Recordings of early radio broadcasts very rarely exist, as is also the case for the first years of television. For film the problem isn’t as clear-cut, since a good portion of the early films still exist. Yet film provides a good example of what I’ll eventually come to:
When you are shown an old film, your viewing environment is obviously nowhere near that of its contemporary viewers. But that’s not the only difference. During the hundred years of cinema there have been many technological “standards” for the recording and screening of films, and when new ones were introduced, not many bothered to keep the equipment for the old ones. For the introduction of color or sound this doesn’t matter much, as silent or black-and-white movies can easily be projected on newer equipment. Changes that did create problems, however, were changes in projection frame rate and in the aspect ratio of the image. The speed change is the reason that the old films we see today seem jerky and comically sped up: we no longer have projectors capable of the original, lower frame rate. Similarly, only a few cinemas are still able to project films made in the “old” formats of 1:1.33 and 1:1.75 properly. The result is the same as when screening non-1:1.33 films on TV: a part of the picture is simply cut away. This problem is exacerbated when moving from the opto-mechanical medium of film to the electronic one of videogames.
Videogames are, historically, a by-product of the computer industry, and we all know how fast that moves. I don’t want to get into a lengthy technical discussion, but I feel it will be helpful to outline some basic concepts. Those of you who are familiar with computer architecture can just doze off for a few minutes…
Basically, a computer system consists of two building blocks: a collection of electrical and mechanical components, “The Computer,” and a set of operating instructions, “The Program.” One can very well exist without the other, but they don’t actually do very much on their own. These are the two components involved in any operation on any computer, whether it’s a word-processing task, like Word, a storage facility for data, like dBase, or a videogame. The problem is that not just any program can control any computer; the two have to match, and programs have to be written specifically for a given computer system. For example, you cannot run programs written for an IBM/Windows machine on a Macintosh, or vice versa. Likewise, programs have to be written with certain computer configurations taken into account, even within a single computer family. For a word processor user this is a monetary nuisance, since an update of the program might force a computer update as well; but for someone doing research on old programs, video games, it’s a disaster. To put it plainly: old tech isn’t just old tech, it’s obsolete tech. And obsolete machines are no longer interesting; they get thrown away. The result is that your task doubles. Not only do you have to track down a copy of the program in question, you must also find an operational system to run it on. A film researcher who discovers an old reel will, unless it’s damaged, have few, if any, problems in viewing it. A program for a non-existent computer system is virtually useless.
Muddying the waters even more is the issue of program storage. For example, not only has the physical size of the most common data carrier, the floppy disk, changed several times, but the way data (programs) is read from or written to it is also system-specific. Try inserting a Macintosh floppy into a Windows system for a quick demonstration. The problem of course multiplies over time; finding a reader for punched cards, the floppy’s older sister, is actually quite difficult.
These are the problems facing anyone who wants to look at old computer software. I admit that I have painted a rather bleak picture; if you look long and hard enough, you will always find someone who has kept the old PDP or Altair machine in the garage, but those are exceptions that have to be thoroughly searched for.
Ironically, the fast-moving technology that creates the problem also provides a solution. The above description of a computer system was slightly simplified, and I have to be a little more specific to make my point. When I say that the system consists of physical components and a program, hardware and software, I have to add: several layers of software. In other words, the computer runs a program to run a program. This layering of software goes all the way to the center of the hardware, and even inside it; each lowest-level instruction is carried out by executing even lower-level “microcode.” Given this, any computer should in principle be able to run any program: all you need to do is reprogram one of the levels. There are several reasons why this doesn’t quite work. The most important is that the software line has to be drawn somewhere, so to speak; a computer can’t be all soft, it has to be a little hard too. Some of the software levels have to be hardwired and unchangeable, otherwise the machine simply won’t be buildable. In addition, hardwired software executes faster than non-wired software, especially when it’s put in dedicated components hotrodded for speed.
However, given the leaps in computer technology we see almost monthly, there is a way around this. Since today’s computer is much faster than last week’s, it is possible to replicate the old machine’s hardwired software levels in the present machine’s soft domain, and to reprogram them there. This is called emulating the older system.
What happens, in effect, is that the new system pretends to be the old one, so that when the old program runs, it sees what it expects to see. For example, when the program tries to access a hardwired graphics processor, the new host computer intercepts the access and gives the program the appropriate signals. It also performs the tasks the original unit was supposed to perform, so that the final result for the end user is identical to that of the original system.
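To make the interception concrete, here is a toy fetch-decode-execute loop in Python. The three-instruction machine and its memory-mapped video register are invented for illustration; a real emulator models a specific CPU and board, but the principle of software standing in for absent hardware is the same:

```python
VIDEO_REG = 0xF0  # memory-mapped address of the imaginary graphics chip

def emulate(program):
    """Run a program for an invented 3-instruction machine in software."""
    memory = bytearray(256)
    memory[:len(program)] = program
    acc, pc = 0, 0                  # accumulator and program counter
    while True:
        op, arg = memory[pc], memory[pc + 1]
        pc += 2
        if op == 0x01:              # LOAD immediate value into accumulator
            acc = arg
        elif op == 0x02:            # STORE accumulator to memory
            if arg == VIDEO_REG:    # intercepted hardware access:
                print(f"host draws pixel value {acc}")  # host does the chip's job
            else:
                memory[arg] = acc
        elif op == 0xFF:            # HALT
            return

# LOAD 42; STORE it to the video register; HALT:
emulate(bytes([0x01, 42, 0x02, VIDEO_REG, 0xFF, 0x00]))
```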
This is the ideal emulation situation; in real life it falls somewhere between this and something not resembling the original at all, because of the problems inherent in software trying to mimic hardware. But even with such limitations, emulators have several advantages, especially for research purposes.
The alternative to an emulator (and to the original system) is to rewrite the code for the new system. This is not only extremely time-consuming (and in practice impossible for anyone not in possession of the original source code), but also a problem from a scientific point of view. Some game companies have done this with their old games and released them for, for example, Windows and the PlayStation. These new games will always be new games, not old ones. A horrid example of rewritten games is a company that released some classic arcade games for Windows and in the process managed to debug the original code, that is, to remove some of the programming errors of the original game, errors that enabled experienced players to earn extra lives by doing certain things at certain times. Clearly, the game is no longer the same. Please note that I’m not saying the old one is better than the new; I’m merely pointing out that they are different. For historical research purposes, the old one is the interesting one.
Emulated programs do not have these problems, simply because they are not a rewritten version of the original: they are the original. What is new is an added layer of software-emulated hardware. While playing emulated Space Invaders, if you hit the UFO with shot #27, you do get 5000 points, just like in the arcades in 1978.
Emulators are, for me, great tools. The one I have been using the most is also the best known: the Multiple Arcade Machine Emulator, MAME. At present MAME emulates approximately 1400 different arcade video games. Arcade games are in some respects different from games run on computers and on dedicated game consoles: they are not made to execute on generic hardware, like a computer, but on dedicated chips, often with custom-made cabinets and controls. What MAME does is emulate the hardware each game needs, using the computer keyboard (or mouse, joystick, gamepad, or whatever you might have connected) as a controller. The original game code is hardwired in chips, which are put in the cabinets; MAME takes as its code input “images” of these chips, so-called ROM images. This is where the new problems start.
Arcade machines are not meant for home use, but many enthusiasts buy games when they no longer appeal to ordinary gamers. These enthusiasts have, via the Internet, formed a loose-knit network in which they help each other with the machines; if one of the ROM chips in, say, a Pac-Man machine goes bad, someone might supply the unlucky gamer with an image of the chip so that he or she can burn a new chip, thus making the game playable again. This is perfectly legal: when you have purchased a game (in practice these often consist of a motherboard and a corresponding set of chips), you are entitled to use it, which includes making your own chips to keep it working. With MAME, you no longer need access to the hardware; in effect, if you get hold of a game’s ROM images, you can play the game for free. The original copyright holders do not look upon this favorably.
A year ago, the Internet was brimming with arcade ROMs. Today, you really have to know how and where to look to find any. What happened was that the big games corporations decided to put their legal (and financial) weight behind their demands to rid the net of the ROMs. One of the most popular (and best) websites for both emulators and ROMs, Dave’s Classics, was forced overnight to shut down completely, with a threat of monstrous fines looming over it. A while later the site reappeared, but with the ROM sections removed. The same has happened all over the web; as I said, trying to find ROMs today isn’t easy.
What’s my point? I have two. The first is that I have ended up with a moral dilemma. As a researcher of historical videogames, in addition to analyzing them as cultural objects, my main concern is to get hold of, and also play, as many of them as possible. In this, MAME is an invaluable tool. It is a strawberries-with-chocolate situation: MAME makes playing the games possible, and the existence of the emulator encourages even more ROMs to be put on the net. For me, the situation is ideal. On the other hand, what I’m doing is in principle illegal and can be considered theft.
My second point is the serious one. The reason the game companies have any legal leverage is that they are big and rich. The balance between overlooking and prosecuting copyright violations is fragile, and when prosecution happens, it is severe. The clampdown on the ROM sites came when it did simply because they got too popular. A company like Namco might tolerate 10,000 people playing MAME Pac-Man, but when they see 100,000 doing it, they start thinking: “Hey, we could be making money on this!” That will be the death of emulation, “retro-gaming,” and game research as we know it. Which company will be able to resist the temptation to re-engineer the games and remove things like unwanted racism or sexism? This will also, in effect, kill off all the bootleg versions of the games, like the French bootleg of Pac-Man where the dots are replaced with hearts, not to mention all the official territorial versions, in which the games are tailored to suit different audiences; it is illegal to sell and play these games outside their designated areas.
I don’t want to sound like some sort of doomsday prophet, but the end result will be that a portion of our recent cultural history is lost. It does seem that some cultural artifacts, at least those that are digitally copyable and spreadable, are in the hands of the enthusiasts. I can say with a great degree of certainty that I will continue as a criminal in the future.