Hardware Basics

This document was taken from part of the Hardware Basics chapter of the introductory book The Way Computer Graphics Works, by Olin Lathrop, published by John Wiley and Sons, 1997, ISBN 0-471-13040-0.

Copyright Notice: This document is copyright 1997 by Olin Lathrop. No part of this document, including the images, may be re-published, re-transmitted, saved to disk, or otherwise copied except as part of the necessary and normal operation of the software used to view this document.

In this chapter, we'll talk about how to get all those neat pictures out of the computer so that you can see them. It's useful to understand some of this, since it affects what we might want to do, and how we go about it. There will also be some buzz-words that are worth being exposed to.

Display (CRT) Basics

This section is about the cathode ray tube (CRT), probably the most common means for making computer graphics visible. You may be used to calling this device a monitor. A monitor is really the whole unit that includes the outer shell, internal electronics, on/off switch, front panel twiddle knobs, etc. The CRT is the "screen" part where you see the picture.

What is a Cathode Ray Tube?

The CRT is one of the few types of vacuum tubes still in common use today. It works on the principle that some materials, called phosphors, emit light if you crash enough electrons into them. The face of a black and white CRT (we'll get to color shortly) is made of transparent glass coated on the inside with a continuous layer of this phosphor material. The rest of the CRT's job is to allow control over how many electrons hit the phosphor, and where they hit it.

A spot on the phosphor lights up brighter as more electrons hit it. It starts getting dimmer when electrons stop hitting it. It stops emitting light only a short time later.

The electron gun produces a thin stream of electrons. The electron flow rate, also called the beam current, is controlled by the monitor's electronics. The stronger the beam current, the brighter the phosphor will light up. The deflection yoke magnetically steers the electron stream so that it hits the phosphor at the desired spot. This is also under control of the monitor's electronics.

*Figure 4 - Cathode Ray Tube Diagram*
A thin stream of electrons is produced by the electron gun, which is aimed at a particular spot on the screen by the deflection yoke. The inside of the screen is coated with phosphors that light up when hit by the electron beam.

So, all a CRT really does is to cause a selectable spot on its screen to shine with a selectable brightness.

Raster Scan

But, when you look at a CRT monitor, you don't just see a few lit dots. You see whole areas lit. How is that done?

The dot that can be lit up is rapidly swept across the CRT face (by re-aiming the electron beam) in a pattern called a raster scan. The dot typically starts at the top left corner. It's then swept across to the top right corner. The electron beam is temporarily shut off while it's re-directed to the left side of the screen. The beam then scans across the screen just a little below the previous scan line. This process continues until the beam finally sweeps the bottom scan line, then the whole process is repeated.

An image is formed on the monitor screen by modulating the beam current as the lit dot is moved around. High beam currents make bright areas, and low beam currents make dim areas. The raster scan pattern maps nicely to the pixels in an image. Each horizontal traverse of the beam displays one horizontal row of pixels. The pixel values within each scan line are used to control the beam current as the beam sweeps across the screen.
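
To make the mapping from pixels to beam current concrete, here is a minimal Python sketch of one pass of a raster scan. The name `raster_scan`, the 0 to 255 pixel range, and the 0.0 to 1.0 beam current scale are my own assumptions for illustration. A real monitor does all this with analog electronics, not software.

```python
def raster_scan(image):
    """Yield (row, column, beam_current) in raster order.

    `image` is a list of rows, top row first, each a list of pixel values
    from 0 (dark) to 255 (bright). The beam current is modeled as a fraction
    from 0.0 to 1.0, which is an assumption of this sketch.
    """
    for row, scan_line in enumerate(image):
        # One horizontal traverse of the beam displays one row of pixels.
        for column, pixel in enumerate(scan_line):
            yield row, column, pixel / 255
        # Here the beam would be shut off while it is re-aimed at the left side.


# A tiny 2 x 2 "image" just to show the order in which spots are visited.
image = [
    [0, 255],
    [51, 0],
]
sweep = list(raster_scan(image))
for row, column, current in sweep:
    print(f"row {row}, column {column}: beam current {current:.2f}")
```

The spots come out left to right within a scan line, and the scan lines come out top to bottom, which is the same order the pixels are stored in the image.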

Figure 4 shows six whole scan lines, with the seventh still being drawn. The point of a raster scan is to eventually hit every spot on the screen, so the scan lines overlap a little. Figure 4 shows spaces between the scan lines only to help you understand a raster scan.

So why don't we just see a dot flying around the screen instead of whole areas lit? There are two reasons for this. First, your eyes continue to perceive light for a short time after the light has gone away. This is called persistence of vision. Second, a phosphor spot keeps glowing a little while after the beam goes away. This is called phosphor persistence. Therefore, as long as the spot is swept over the whole screen fast enough, it will appear as a steady area of light, instead of a flying spot of light.

So how fast is fast? Today's monitors typically scan the entire screen 60 to 80 times per second. If they went much slower, the image would appear to flicker.

Color

I can hear you thinking "I can almost believe this, but how is color possible? After all, electron beams don't come in different colors." No, they don't. What I've described so far is how black and white CRTs work. The basic process doesn't really support color. To get color, people have come up with some strange kludges and hacks. It's amazing to me color CRTs work at all.

While electron beams don't come in different colors, phosphors do. To make a color CRT, little phosphor dots for each of the three primary colors (red, green, blue) are arranged on the monitor screen. Then, three electron guns are used, one for each phosphor color.

The tricky part is to make sure each electron gun can only hit the phosphor dots for its color. This is done by arranging three dots of different colors in groups, called triads. Since there are now three electron beams coming from three separate guns, each hits the phosphors from a slightly different angle. A thin sheet called the shadow mask is suspended in front of the phosphors. The shadow mask has one hole for each triad, and is arranged so that each beam can only "see" the phosphor dots for its color. Take a look at Figure 5.

*Figure 5 - CRT Shadow Mask and Phosphor Dots*
The shadow mask is a thin sheet, shown here as semi-transparent, suspended in front of the phosphor dots. There is one hole in the shadow mask for each phosphor triad, even though it may not appear that way in this picture due to the perspective. Since the three electron beams go thru the shadow mask holes from slightly different angles, each beam can only light up the dots for its color. The three white lines represent the electron beams passing thru a shadow mask hole to light up the center triad.

In case that just sounds too flaky to be true, see Figure 6.

*Figure 6 - CRT Photographs*
These pictures are actual photographs of a color CRT face. The left picture shows the whole screen, whereas the right picture was taken from a small region near the center that was displaying the characters "1d6." Note how the image is really lots and lots of red, green, and blue dots. You see a continuous shade instead of dots, because individual dots are too small to see at the normal viewing distance. Your eyes blend the dots together, much like in the left picture. Try looking closely at a color CRT with a magnifying lens or a jeweler's loupe.

It's important not to confuse phosphor triads with pixels. Pixels are the individual color values that make up the image stored in the computer. Phosphor triads happen to be a hack to get CRTs to display colors. The whole mechanism of three beams, a shadow mask, and phosphor triads only exists to provide separate red, green, and blue color control over the lit spot on the CRT face. As long as there are enough triads so you can't see individual ones, you can think of the CRT face as being continuous. The triads are arranged in a hexagonal pattern, while pixels are in a rectangular pattern. A CRT could be made where the shadow mask and triads are rotated a bit from horizontal with little overall effect.

Does the whole mechanism of three beams, a shadow mask, and phosphor triads sound a bit flaky? It is. How do you make sure the three beams hit exactly the same spot as they are swept across the screen? What if they don't? What if the shadow mask is off a little and the beams don't just hit the spots for their colors? Well, these are real problems.

The degree to which all three beams line up to converge to one spot is called convergence. A monitor that is poorly converged looks blurry, and shows color fringes around the edges of objects.

The degree to which each beam only hits the phosphor dots for its color is called color purity. If a CRT has poor color purity, colors will look less vivid, and there may be patches of tint here and there. For example, the top right corner may be more reddish, whereas the top left corner more greenish.

All this is very sensitive to magnetic fields, since they can affect the electron beam paths. To prevent metal in the monitor chassis from becoming magnetic, any built up magnetism must be periodically removed. This process is called de-Gaussing, and is usually done automatically every time the monitor is turned on. Listen for a low hum lasting about a second right after a monitor is switched on.

Why Do We Care?

A basic understanding of what's going on inside a CRT can be a big help when you are buying color CRT monitors. In this section I'll briefly go over the CRT buzzwords you may see in catalogs or hear from salesmen.

Keep in mind that while many salesmen are quite knowledgeable, there are also far too many that don't really understand what they're talking about. Unfortunately, it's up to you to know what you want and to sort fact from fiction.

In general, any specification is measured to appear as favorable as possible, as long as there is some remote justification for doing so. It would be nice if specifications were for what end users really get. Yeah, right.

Size:

Monitor sizes are measured in screen diagonal length. Computer monitor sizes range from about 12 to 21 inches. You might think that on a 12 inch monitor the upper right image corner would be 12 inches from the lower left image corner. Unfortunately, it's not that simple.

I'm writing this on a 17 inch monitor, but I measure only 16 1/4 inches between opposite inside corners of the bezel surrounding the screen. That's because 17 inches is the size of the bare CRT, not the final displayable area that I get to see. The monitor manufacturer buys 17 inch CRTs, and therefore claims to sell 17 inch monitors.

Worse yet, I can't even use all 16 1/4 inches of the visible CRT face. Most computer displays have a 5:4 aspect ratio, meaning they are 4/5 as tall as they are wide. Example resolutions at or very near 5:4 are 640x512, 1024x800, 1280x1024, and 1600x1280. After adjusting the image to the largest possible 5:4 area, I am left with only a 15 1/4 inch diagonal. I guess this means 15 1/4 = 17 in marketing math.
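
If you want to see what the marketing math costs you, the arithmetic is easy to sketch in Python. The function `visible_size` is a name I made up for this example. Treating the nominal 17 inch figure as if it were a 5:4 image diagonal is an assumption made purely for comparison.

```python
import math


def visible_size(diagonal, aspect_w=5, aspect_h=4):
    """Return (width, height) of a rectangle with the given diagonal and aspect ratio."""
    unit = diagonal / math.hypot(aspect_w, aspect_h)
    return aspect_w * unit, aspect_h * unit


# The 15 1/4 inch diagonal I actually get to use.
usable_w, usable_h = visible_size(15.25)

# Hypothetical comparison: an image that really had a 17 inch diagonal.
nominal_w, nominal_h = visible_size(17.0)

area_fraction = (usable_w * usable_h) / (nominal_w * nominal_h)
print(f"usable image: {usable_w:.1f} x {usable_h:.1f} inches")
print(f"fraction of the nominal 17 inch area: {area_fraction:.0%}")
```

Under those assumptions the usable image is about 11.9 by 9.5 inches, or roughly 80% of the area the "17 inch" label might lead you to expect.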

Dot Pitch:

Dot pitch indicates how closely spaced the individual phosphor triads are, which relates to the smallest detail the CRT can possibly show, or resolve. (In practice, monitor resolution is dependent on many parameters.) Remember that there is one shadow mask hole for every color triad (see Figure 5 for a reminder). The CRT can't resolve anything smaller than the distance between adjacent shadow mask holes. This distance is what's called the "dot pitch", even though it refers to triads, not dots.

The triads are arranged in a hexagonal pattern so that each triad has six neighbors. The distance to each of these neighbors is the dot pitch. See Figure 7 for a diagram of all this.

Typical dot pitch values are in the .2 to .3 millimeter range.

*Figure 7 - Monitor Dot Pitch Measurement*
This diagram shows how a monitor's dot pitch is measured. The monitor in this example has a dot pitch of .28 millimeters. The gray triangles mark triads so that you can see them more easily. The "cartwheel" in the upper left shows how each triad is the same distance from all six of its neighbors.

Triads Versus Pixels:

There seems to be much confusion among computer users about the distinction between triads and pixels, and how they relate to each other. Remember that an image is a rectangular array of pixels. A quick look at Figure 7 should show you that pixels can't have a one-to-one relationship with triads, since triads are arranged in a hexagonal pattern.

Phosphor triads are just a means of making a CRT display colors. But, you can pretty much forget triads exist because they are deliberately so small as to present the illusion of a continuous color screen.

Pixels, on the other hand, are the digital values that are used to drive the monitor's analog controls. Each horizontal row of pixels in the display hardware is used to modulate the electron beam currents for one horizontal sweep across the screen. The exact vertical placement of the beam sweeps, and the horizontal placement of individual pixels within each sweep is without any regard to the placement, spacing, and orientation of the phosphor triads. In other words, pixels just fall where they fall, and it works because the screen can be thought of as continuous, even though it happens to be made of lots of little dots. After all, most monitors have some diddle knobs that allow you to move the image up, down, left, and right, change its size, and sometimes even rotate it slightly. Clearly, pixels can be moved around without regard to the phosphor triads.

The spacing between phosphor triads, the "dot pitch", does affect the monitor's maximum possible resolution. In other words, the visual detail in a color CRT image stops increasing as more pixels are used, once the pixel spacing is about the same as the triad spacing. For example, let's consider a monitor with a dot pitch of .28 millimeters. This means the closest spacing between triads is .28 millimeters, which means there are no more than 36 triads per centimeter, or 91 triads per inch. If you've got a 1,280 x 1,024 display, that means the displayed image needs to be at least 14 inches wide for the monitor to truly resolve all the pixels. This comes out to a diagonal display area size of 18 inches, which would require at least a "19 inch" monitor to achieve. In practice, the advantage of denser pixels doesn't just suddenly stop when the dot pitch is reached, but it does start to seriously fall off. You will probably still perceive a better image at 1,280 x 1,024 pixels on a 17 inch monitor with a .28 millimeter dot pitch than at 1,024 x 800 pixels. But, it's unlikely this would be true on a 12 inch monitor with a .28 millimeter dot pitch.
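
The arithmetic in that example can be sketched in a few lines of Python. The function name `min_display_size` is my own invention. The sketch follows the text in treating the dot pitch as the horizontal spacing between triads, and it assumes square pixels when working out the height.

```python
import math

MM_PER_INCH = 25.4


def min_display_size(dot_pitch_mm, pixels_wide, pixels_high):
    """Return (width, height, diagonal) in inches for one triad per pixel across."""
    triads_per_inch = MM_PER_INCH / dot_pitch_mm
    width = pixels_wide / triads_per_inch
    height = width * pixels_high / pixels_wide  # assumes square pixels
    return width, height, math.hypot(width, height)


triads_per_inch = MM_PER_INCH / 0.28
width, height, diagonal = min_display_size(0.28, 1280, 1024)
print(f"{triads_per_inch:.0f} triads per inch")
print(f"minimum width {width:.1f} inches, diagonal {diagonal:.1f} inches")
```

This gives about 91 triads per inch, a width of about 14.1 inches, and a diagonal of about 18.1 inches, which agrees with the rounded figures above.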

Scan (or Refresh) Rate:

Scan rate, or refresh rate, refers to how fast the electron beams are swept across the screen. There are really two scan rates, horizontal and vertical. The horizontal scan rate is the rate at which individual scan lines are drawn, and is of little interest to end users. Users are more concerned with how many scan lines there are and how often the whole screen is refreshed. Typical horizontal scan rate values are in the 15 to 100 kilohertz range (15,000 to 100,000 times per second).

The vertical scan rate indicates how often the entire image is refreshed. This directly affects how much the image will appear to flicker. Most people will perceive a monitor image to be "flicker free" at vertical scan rates of 60 to 70 hertz (60 to 70 times per second) and higher.
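
The two scan rates are tied together by the number of scan lines. Here is a rough Python sketch of that relationship for a non-interlaced display. The function name and the default of 40 extra line times for the beam to get back to the top of the screen are assumptions of mine. The real allowance depends on the video timing in use.

```python
def horizontal_scan_rate(visible_lines, vertical_rate_hz, retrace_lines=40):
    """Estimate the horizontal scan rate in hertz.

    retrace_lines is an assumed allowance, counted in scan line times, for
    the beam to return from the bottom of the screen to the top.
    """
    total_lines = visible_lines + retrace_lines
    return total_lines * vertical_rate_hz


rate = horizontal_scan_rate(800, 70)
print(f"{rate / 1000:.1f} kilohertz")
```

With 800 visible scan lines refreshed 70 times per second, this comes to 58.8 kilohertz, comfortably inside the typical horizontal range.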

You may be able to see a monitor flicker by looking at it out of the corner of your eyes. Humans are more sensitive to flicker at the periphery of vision than at the spot they are looking directly at. Try this with a regular (non-digital) television. These flicker at 60 hertz in North America and Japan, and 50 hertz most everywhere else. 50 hertz is so low that many people can see the flicker even when looking directly at the screen.

Interlacing:

A monitor running in interlaced mode only refreshes every other scan line each vertical pass. This means the entire image is refreshed every two vertical passes.

Why go thru all that trouble? The apparent flicker you see comes from the vertical refresh rate, whether all the scan lines or only half are displayed each pass. For the same apparent flicker, interlacing only requires half as many scan lines to be drawn. This reduces the data rate and relaxes requirements on the monitor and graphics hardware electronics, saving money. Each vertical pass during which half the scan lines are drawn is called a field. The field containing the top scan line is called the even field, and the other is called the odd field. Both fields together, meaning all the scan lines are drawn once, is called a frame.

*Figure 8 - Scan Line Interlacing*
This diagram shows a frame being drawn part way thru the second, or odd, field. The first, or even, field scan lines are shown in gray, while the odd field scan lines are shown in yellow. Note how each scan line is drawn half way between two scan lines of the previous field.

So, briefly, interlacing is a means of reducing monitor and display hardware cost, while displaying the same image with the same apparent flicker. However, there's no free lunch. Thin horizontal lines in the image will flicker quite noticeably. Since they are on only one scan line, they only get refreshed at half the vertical refresh rate. Also, interlacing introduces one more thing that can go wrong. The image will be of poor quality if the beam sweeps of each vertical pass aren't exactly half way between each other.
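
The order in which an interlaced frame gets drawn can be sketched in Python like this. Numbering the scan lines from 0 at the top is my own convention for the example, chosen so that the even field is the one containing the top scan line, as described above. The function names are made up too.

```python
def field_lines(total_lines, field):
    """Scan line numbers drawn in one field (0 = even field, 1 = odd field)."""
    return list(range(field, total_lines, 2))


def frame_order(total_lines):
    """Order in which scan lines are drawn over one complete interlaced frame."""
    return field_lines(total_lines, 0) + field_lines(total_lines, 1)


def line_refresh_rate(vertical_rate_hz):
    """How often any single scan line is redrawn when interlaced."""
    return vertical_rate_hz / 2


print(frame_order(7))         # [0, 2, 4, 6, 1, 3, 5]
print(line_refresh_rate(60))  # 30.0
```

The second function call shows why thin horizontal lines flicker: at a 60 hertz vertical rate, any one scan line is only redrawn 30 times per second.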

Display Controller Basics

A display controller is a piece of computer hardware that takes drawing commands from the processor and drives the display. This is often called the video card or graphics card. See Figure 9. The output of a display controller is the set of video signals for the monitor.

*Figure 9 - Display Controller Block Diagram*
The main display controller components are the drawing engine, bitmap, and video generator. These are described in the following sections.

The bitmap is the heart of any display controller. This is where the pixels are kept. The bitmap divides the remaining display controller into the drawing "front end", and the video "back end." We'll talk about those first, then get back to some more bitmap details.

Drawing "Front End"

The front end, or drawing engine, receives drawing commands from the processor. The front end figures out which pixels are being drawn, and what color, or value, they should be. The pixels are "drawn" by writing the new values into the bitmap.

The drawing command set can vary greatly from one display controller to another. To give you some idea, a typical command sequence for drawing a red rectangle from 15,10 to 34,24 might be:

1 - Set current fill color to red.
2 - Set current point to 15,10.
3 - Draw rectangle, width = 20, height = 15.

Most display controllers also allow the processor to directly read and write the pixels in the bitmap. The processor could have directly written red into all the pixels from 15,10 to 34,24 to write the same rectangle as before. However, the purpose of the drawing engine is to off-load this kind of work from the processor. Not only can the drawing engine do this task faster, the processor can go do something else once the drawing engine gets started on the rectangle.
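
Here is a small Python model of the two approaches, just to show they end up with the same pixels in the bitmap. Everything in it is hypothetical: the `DisplayController` class, its method names, and the pixel value `RED` are stand-ins I invented, not the command set of any real display controller.

```python
RED = 1  # hypothetical pixel value standing for red


class DisplayController:
    """Toy model of a bitmap with a very small drawing front end."""

    def __init__(self, width, height):
        self.bitmap = [[0] * width for _ in range(height)]
        self.fill_color = 0
        self.current_point = (0, 0)

    def set_fill_color(self, color):
        self.fill_color = color

    def set_current_point(self, x, y):
        self.current_point = (x, y)

    def draw_rectangle(self, width, height):
        # The front end figures out which pixels are covered and writes them.
        x0, y0 = self.current_point
        for y in range(y0, y0 + height):
            for x in range(x0, x0 + width):
                self.bitmap[y][x] = self.fill_color


# Approach 1: the processor sends three commands to the drawing engine.
engine = DisplayController(64, 48)
engine.set_fill_color(RED)
engine.set_current_point(15, 10)
engine.draw_rectangle(width=20, height=15)

# Approach 2: the processor writes every pixel from 15,10 to 34,24 itself.
direct = DisplayController(64, 48)
for y in range(10, 24 + 1):
    for x in range(15, 34 + 1):
        direct.bitmap[y][x] = RED

red_pixels = sum(row.count(RED) for row in engine.bitmap)
print(engine.bitmap == direct.bitmap, red_pixels)
```

Note that a rectangle from 15,10 to 34,24 inclusive is 20 pixels wide and 15 tall, so 300 pixels get written either way. The difference is who does the 300 writes.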

Video "Back End"

The job of the video back end is to interpret the bitmap pixel values into their colors, and to create the video signals that drive the monitor so you can see the colors. The bitmap values are re-read each time the monitor image is refreshed. Since this typically happens 60-80 times per second, the bitmap is effectively displayed "live."
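
To get a feel for how busy this keeps the back end, here is a quick Python estimate of how many pixels it has to read each second. The 1024x800 resolution is just an example, the function name is my own, and the sketch ignores the time the beam spends re-aiming between scan lines.

```python
def pixels_read_per_second(width, height, refresh_hz):
    """Pixels the video back end must fetch from the bitmap each second."""
    return width * height * refresh_hz


for refresh_hz in (60, 80):
    rate = pixels_read_per_second(1024, 800, refresh_hz)
    print(f"{refresh_hz} refreshes per second: {rate / 1e6:.1f} million pixels per second")
```

That works out to roughly 49 to 66 million pixel reads per second for this example, all of which happen whether or not anything in the image changed.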

Color Lookup Tables (LUTs)

I mentioned before that one of the video back end's jobs is to interpret the bitmap pixel values into their resulting colors. This sounds a little silly. Why aren't the colors just stored in the bitmap directly? There are two reasons for this. The main reason is to require less bitmap memory. A secondary reason is to allow some correction for the weird things monitors can do to colors and brightness levels.

So, how does going thru an interpretation step save memory? Well, let's look at what it would take to store color values directly. As I mentioned before, it takes three numbers to describe a color. The standards for video signals that drive computer monitors use the RGB color space, so the three numbers would need to be the red, green, and blue color components. In computer graphics, we think of RGB color components as being "continuous" (you can't distinguish individual levels anymore) when there are at least 256 levels per RGB component. Since 256 levels requires 8 bits (2**8 = 256), or one byte, a full color requires three bytes. If your bitmap has a resolution of 1024x800 pixels, that would require about 2.5 megabytes for the bitmap. Memory usually comes in standard sizes, so you'd probably end up with four megabytes in your bitmap. (No, this isn't stupidity. There are good reasons for this, but they're beyond the scope of this book).

The cost of low end graphics boards is usually dominated by the cost of the bitmap memory, so we'd like to reduce the amount of this memory. Three bytes per pixel lets us store any color in any pixel, but do we really need this? Unless you are doing imaging, the answer is usually "no." Look at a typical screen with a few windows, text, menus, etc. How many different colors do you see? Probably not more than 16. Suppose we numbered each of these colors from 0 to 15. We would then need only four bits per pixel in the bitmap, but we'd have to interpret the color numbers into their real colors to generate the final RGB video signals.

In practice, we usually use eight bits per pixel instead of the four in the example. Eight bits allows up to 256 different colors on the screen at the same time. That's more than enough for the basic user interface, but also allows some way to see images, supports games, etc. 256 simultaneous colors requires one byte per pixel. The entire 1024x800 bitmap would then fit into just one megabyte with room to spare. Note that we've reduced the bitmap memory from four to one megabyte at a price. First we can only display 256 colors simultaneously, and second, we now have to interpret the color numbers into real RGB colors.
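
The memory figures in this section are easy to check. This Python sketch uses a function name of my own choosing and takes a megabyte to be 1024 x 1024 bytes, which is an assumption about how the memory is counted.

```python
MEGABYTE = 1024 * 1024  # assumed definition of a megabyte


def bitmap_bytes(width, height, bits_per_pixel):
    """Bytes of bitmap memory needed at the given resolution and pixel depth."""
    return width * height * bits_per_pixel // 8


sizes = {
    "true color, 24 bits per pixel": bitmap_bytes(1024, 800, 24),
    "pseudo color, 8 bits per pixel": bitmap_bytes(1024, 800, 8),
    "pseudo color, 4 bits per pixel": bitmap_bytes(1024, 800, 4),
}
for label, size in sizes.items():
    print(f"{label}: {size:,} bytes ({size / MEGABYTE:.2f} megabytes)")
```

True color at 1024x800 needs 2,457,600 bytes, or about 2.5 million bytes, while eight bits per pixel needs 819,200 bytes, which fits in one megabyte with room to spare.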

The interpretation job is done in the color lookup table, often just called the LUT. The LUT converts the color numbers, usually called the color index values or pseudo colors, from the bitmap into their assigned RGB colors. In our example, the LUT has 256 entries, since that's how many possible color index values there are. Each entry holds a 24 bit (8 bit per color component) RGB value.

True Color, Pseudo Color:

A system that stores RGB values directly in the bitmap is called a true color system, and one that stores color index values is called a pseudo color system. Figure 10 and Figure 11 show how the final displayed color value is determined for each pixel.

*Figure 10 - True Color Interpretation*
True color is the conceptually simple color configuration. The actual, or "true", pixel color is stored directly in each pixel. The color lookup table, or LUT, is not necessary in a true color system. It is usually present because most true color systems also support pseudo color where the LUT is needed. In true color mode, the LUT is usually loaded with values so that it has no net effect. It is sometimes used to compensate for artifacts introduced by monitors, and for special effects.

*Figure 11 - Pseudo Color Interpretation*
In a pseudo color configuration, each pixel holds an index into the color lookup table instead of a true color value. The color lookup table is required, and converts the color index values into the true RGB color values.

While a lookup table (LUT) is required in a pseudo color system, many true color systems also use them. In that case, they can be used to compensate for some artifacts introduced by the monitor, or for special effects. In practice, most true color lookup tables are just loaded with "straight thru" data, and you can usually forget them.

Let's do some examples to make sure the true color versus pseudo color distinction makes sense.

*Figure 12 - True Color Interpretation Example*
In this example we are trying to determine what the final visible color is for the circled bitmap pixel on the left. Since this is a true color example, each pixel contains separate red, green, and blue values. In this case there are eight bits per color component per pixel, so color components range from 0 to 255. The selected pixel contains red 38, green 41, and blue 40. The corresponding LUT entries are found (the left column on the light blue background) separately for each of the color components. The resulting final color values from the LUT are circled and shown on the right.
Note that in this example the LUT values are such that the final color is the same as the pixel values. This is usually the case in true color because there's usually no need for an additional interpretation step between the pixel values and the final color values.

*Figure 13 - Pseudo Color Interpretation Example*
In this example, a pseudo color is converted to a final color value. The pseudo color value from the selected pixel is 66. Therefore, the final color value is taken from LUT entry 66 for all color components, as shown.
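
Both kinds of lookup can be sketched in a few lines of Python. The function names are mine, and the color I put in LUT entry 66 is a made-up value, since the actual numbers appear only in the figure. The LUT here is modeled as a list of 256 (red, green, blue) entries.

```python
def true_color_lookup(pixel, lut):
    """Look up each color component of a true color pixel separately."""
    red, green, blue = pixel
    return (lut[red][0], lut[green][1], lut[blue][2])


def pseudo_color_lookup(index, lut):
    """Use the pixel value as a single index to fetch a whole RGB color."""
    return lut[index]


# A "straight thru" LUT, as usually loaded in true color mode.
straight_thru_lut = [(level, level, level) for level in range(256)]

# A pseudo color LUT; the color in entry 66 is hypothetical.
palette = [(0, 0, 0)] * 256
palette[66] = (200, 120, 30)

true_result = true_color_lookup((38, 41, 40), straight_thru_lut)
pseudo_result = pseudo_color_lookup(66, palette)
print(true_result)    # (38, 41, 40)
print(pseudo_result)  # (200, 120, 30)
```

In the true color case the pixel's three values each select their own LUT entry, and with straight thru data they come out unchanged. In the pseudo color case the one pixel value selects a single entry that supplies all three components.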

Bitmap

The bitmap is the two-dimensional array of pixels that the drawing front end writes into, and the video back end reads from. Because frequent and high speed access to the bitmap is required, it is always (except for some rare specialty systems) implemented as a separate memory right in the display controller. You have no reason to care how the bitmap is implemented. What matters to you is the price and performance of the overall display controller.

DRAM Versus VRAM:

Although you shouldn't have to care, you can sometimes choose between DRAM and VRAM. What are those, and what do they do for you?

DRAM stands for Dynamic Random Access Memory. I won't go into what that means, except that DRAM is the "normal" kind of memory that is also used to make the main memory in your computer. VRAM stands for Video Random Access Memory, and is specifically designed to function as bitmap memory.

The drawing front end can independently write into a VRAM bitmap while the video back end is reading the pixels. In a DRAM bitmap, the front and back ends have to share. In some DRAM configurations, the back end can hog the bitmap up to 80% of the time. This doesn't leave much for the front end, and slows down drawing operations. Of course there's always a tradeoff, which in this case is price. VRAMs cost about twice what DRAMs cost for the same amount of memory.

What do I recommend getting? That's kinda like asking whether I recommend a station wagon or a sports car. In general, though, I wouldn't shell out the extra moola for VRAM unless I knew I'd be running drawing-limited applications where performance is important. If you're not sure about that, just get DRAM.

What's in Hardware and What's in Software?

Take another look at Figure 9. Note that the drawing engine isn't absolutely necessary, as long as the processor has direct access to the bitmap. Such a system wouldn't need to lack in features. It would be low cost but slow. At the other extreme, a system might have full hardware support for everything from simple lines to fancy 3D operations and drawing commands. This would be faster but more expensive.

In practice, even low end systems usually have hardware support for simple 2D drawing. The incremental cost of adding such a drawing engine is small compared to the bitmap and the video back end cost. Such a system is sometimes referred to as a 2D display controller or graphics board, or GUI engine. GUI stands for "graphical user interface" and refers to these kinds of operations.

There are systems available with just about any imaginable tradeoff between what's in hardware and what the software must do. Marketing types, however, like fancy labels to make their product sound more sophisticated than the next one. Some "standard" names have emerged for some configurations. I'll make you aware of them, but keep in mind this is a moving target since companies can (and often do) make up new names, and use old names in new ways.

I've already mentioned 2D or GUI engine. This usually means a minimal drawing engine that's good at simple 2D lines, points, rectangles, pixel copies, and maybe some polygons (we'll get into what these are in the next chapter). That's all that's needed by most window systems for menus, text, popups, etc.

A 2 1/2 D display controller is intended for drawing 3D objects, but doesn't have true 3D capability. It provides the 2D support needed for 3D drawing. This usually includes allowing the color to vary across the object being drawn, dithering, and Z buffering.

A full 3D display controller understands true 3D commands. It must do transformations, lighting, and other advanced effects that don't make sense to talk about until you've read the Rendering chapter.

Technology keeps marching on. In the current trend, the cost of logic for implementing drawing engines is falling faster than the cost of the bitmap and the video back end. If this continues, we will see ever more capable "low end" systems. Who knows what tomorrow brings?
