How Data and Programs Are Represented in the Computer
It’s probably best that most people feel this way. However, for those of you with a thirst for knowledge and the desire to see how things work, this is what you are looking for. Before we study the inner workings of the processor, we need to expand on an earlier discussion of data representation in the computer—how the processor “understands” data. We started with a simple fact: electricity can be either on or off. Other kinds of technology also use this two-state on/off arrangement. An electrical circuit may be open or closed. The magnetic pulses on a disk or tape may be present or absent.
Voltage may be high or low. A punched card or tape may have a hole or not have a hole. This two-state situation allows computers to use the binary system to represent data and programs. The decimal system that we are accustomed to has 10 digits (0, 1, 2, 3, 4, 5, 6, 7, 8, and 9). By contrast, the binary system has only two digits: 0 and 1. (Bi- means “two.”) Thus, in the computer the 0 can be represented by the electrical current being off (or at low voltage) and the 1 by the current being on (or at high voltage).
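To make the two number systems concrete, here is a minimal Python sketch that writes a decimal number in binary by repeated division by 2 and then reads it back. The function names `to_binary` and `from_binary` are illustrative choices, not part of any standard.

```python
def to_binary(n: int) -> str:
    """Return the binary digits of a non-negative integer as a string."""
    if n < 0:
        raise ValueError("only non-negative integers are handled in this sketch")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # the remainder is the next digit, right to left
        n //= 2
    return "".join(reversed(digits))


def from_binary(bits: str) -> int:
    """Return the integer value of a string of 0s and 1s."""
    value = 0
    for bit in bits:
        if bit not in ("0", "1"):
            raise ValueError("binary strings contain only 0 and 1")
        value = value * 2 + int(bit)  # shift one place left, then add the new digit
    return value


if __name__ == "__main__":
    for n in (0, 1, 2, 9, 10):
        print(n, "->", to_binary(n))
```

Each binary place is worth twice the place to its right, just as each decimal place is worth ten times the place to its right.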
All data and programs that go into the computer are represented in terms of these two digits. For example, the letter H is represented by the electronic signal 01001000, or off-on-off-off-on-off-off-off. When you press the key for H on the computer keyboard, the character is automatically converted into the series of electronic impulses that the computer recognizes. All the amazing things that computers do are based on binary numbers made up of 0s and 1s. Fortunately, we don’t have to enter data into the computer using groupings of 0s and 1s.
Rather, data is encoded, or arranged, by means of binary, or digital, coding schemes to represent letters, numbers, and special characters. There are many coding schemes. Two common ones are EBCDIC and ASCII. EBCDIC uses 8 bits and ASCII uses 7 or 8 bits to form each byte, providing up to 256 combinations with which to form letters, numbers, and special characters, such as math symbols and Greek letters. One newer coding scheme uses 16 bits, enabling it to represent 65,536 unique characters.
EBCDIC: Pronounced “eb-see-dick,” EBCDIC, which stands for Extended Binary Coded Decimal Interchange Code, is commonly used in IBM mainframes.
EBCDIC is an 8-bit coding scheme, meaning that it can represent 256 characters.
ASCII: Pronounced “as-key,” ASCII, which stands for American Standard Code for Information Interchange, is the most widely used binary code with non-IBM mainframes and microcomputers. Whereas standard ASCII originally used 7 bits for each character, limiting its character set to 128, the more common extended ASCII uses 8 bits.
Unicode: Although ASCII can handle English and European languages well, it cannot handle all the characters of some other languages, such as Chinese and Japanese.
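The character counts quoted for each scheme follow from the number of bits per character: n bits give 2^n combinations. The Python sketch below prints those counts, shows the bit pattern for the letter H, and checks whether a character fits in ASCII. It relies on Python's built-in `ascii` codec, which is the 7-bit version; the helper names and the sample Chinese character are illustrative choices.

```python
def combinations(bits: int) -> int:
    """Number of distinct values that `bits` binary digits can represent."""
    return 2 ** bits


def bit_pattern(char: str) -> str:
    """Return the 8-bit pattern of a single ASCII character."""
    code = char.encode("ascii")[0]  # raises UnicodeEncodeError if not ASCII
    return format(code, "08b")


def fits_in_ascii(text: str) -> bool:
    """True if every character of `text` can be encoded as 7-bit ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for bits in (7, 8, 16):
        print(bits, "bits ->", combinations(bits), "combinations")
    print("H ->", bit_pattern("H"))
    print("ASCII can encode 'H':", fits_in_ascii("H"))
    print("ASCII can encode '中':", fits_in_ascii("中"))
```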
Unicode, which was developed to deal with such languages, was originally designed to use 2 bytes (16 bits) for each character, instead of 1 byte (8 bits), enabling it to handle 65,536 character combinations rather than just 256. Although each 16-bit Unicode character takes up twice as much memory space and disk space as each ASCII character, conversion to the Unicode standard seems likely. However, because most existing software applications and databases use the 8-bit standard, the conversion will take time.
The Parity Bit: Checking for Errors
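The even-parity scheme covered in this section can be sketched in a few lines of Python: a ninth bit is chosen so that the total number of 1s is even, and a received group with an odd number of 1s is flagged as damaged. The function names are illustrative assumptions, and the sketch models bits as text strings purely for readability.

```python
def even_parity_bit(byte_bits: str) -> str:
    """Return the parity bit that makes the total count of 1s even."""
    ones = byte_bits.count("1")
    return "0" if ones % 2 == 0 else "1"


def add_parity(byte_bits: str) -> str:
    """Append the even-parity bit to an 8-bit string, giving 9 bits."""
    if len(byte_bits) != 8 or set(byte_bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 characters, each 0 or 1")
    return byte_bits + even_parity_bit(byte_bits)


def passes_parity(nine_bits: str) -> bool:
    """True if a 9-bit string still has an even number of 1s."""
    return nine_bits.count("1") % 2 == 0


if __name__ == "__main__":
    h = add_parity("01001000")  # letter H, two 1s
    o = add_parity("01001111")  # letter O, five 1s
    print(h, passes_parity(h))
    print(o, passes_parity(o))

    # Simulate interference flipping the first bit of H in transit.
    damaged = ("1" if h[0] == "0" else "0") + h[1:]
    print(damaged, passes_parity(damaged))
```

A single parity bit detects any odd number of flipped bits but cannot tell which bit changed, and an even number of flips goes unnoticed.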
Dust, electrical disturbance, weather conditions, and other factors can cause interference in a circuit or communications line that is transmitting a byte. How does the computer know if an error has occurred? Detection is accomplished by use of a parity bit. A parity bit, also called a check bit, is an extra bit attached to the end of a byte for purposes of checking for accuracy. Parity schemes may be even parity or odd parity. In an even-parity scheme, for example, the ASCII letter H (01001000) contains two 1s. Thus, the ninth bit, the parity bit, would be 0 in order to keep the number of set bits even. Likewise, with the letter O (01001111), which has five 1s, the ninth bit would be 1 to make the number of set bits even. The system software in the computer automatically and continually checks the parity scheme for accuracy.
Machine Language: Your Brand of Computer’s Very Own Language
So far, we have been discussing how data is represented in the computer, for example, via ASCII code in microcomputers. But if data is represented this way in all microcomputers, why won’t word processing software that runs on an Apple Macintosh run (without special arrangements) on an IBM PC?
In other words, why are these two microcomputer platforms incompatible? It’s because each hardware platform, or processor model family, has a unique machine language. Machine language is a binary programming language that the computer can run directly. To most people an instruction written in machine language is incomprehensible, consisting only of 0s and 1s. However, it is what the computer itself can understand, and the 0s and 1s represent precise storage locations and operations. Many people are initially confused by the difference between the 0 and 1 ASCII code used for data representation and the 0 and 1 code used in machine language.
What’s the difference? ASCII is used for data files, that is, files containing only data in the form of ASCII code. Data files cannot be opened and worked on without execution programs, the software instructions that tell the computer what to do with the data files. These execution programs are run by the computer in the form of machine language. But wouldn’t it be horrendously difficult for programmers to write complex applications programs in seemingly endless series of machine-language groups of 0s and 1s? Indeed it would, so they don’t.
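One way to see the distinction between ASCII data and machine language is that the same bytes can be treated either as characters or as instructions, depending on what reads them. The Python sketch below invents a tiny two-instruction machine purely for illustration; its opcodes (0x48 for "add" and 0x49 for "subtract") are hypothetical and do not correspond to any real processor's machine language.

```python
# Hypothetical opcode table for a made-up machine (not a real instruction set).
TOY_OPCODES = {
    0x48: ("ADD", lambda acc, operand: acc + operand),
    0x49: ("SUB", lambda acc, operand: acc - operand),
}


def read_as_text(data: bytes) -> str:
    """Interpret the bytes as ASCII characters, as in a data file."""
    return data.decode("ascii")


def run_as_program(data: bytes) -> int:
    """Interpret the bytes as (opcode, operand) pairs and execute them."""
    if len(data) % 2 != 0:
        raise ValueError("toy programs are made of 2-byte instructions")
    accumulator = 0
    for i in range(0, len(data), 2):
        opcode, operand = data[i], data[i + 1]
        if opcode not in TOY_OPCODES:
            raise ValueError(f"unknown opcode {opcode:#04x}")
        _name, operation = TOY_OPCODES[opcode]
        accumulator = operation(accumulator, operand)
    return accumulator


if __name__ == "__main__":
    same_bytes = bytes([0x48, 0x49, 0x49, 0x48])  # 01001000 01001001 01001001 01001000
    print(read_as_text(same_bytes))    # the bytes viewed as characters
    print(run_as_program(same_bytes))  # the bytes viewed as toy instructions
```

A second made-up machine with a different opcode table would give a different result for the same bytes, which is the sense in which a program built for one platform does not run on another.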
The Processor, Main Memory, and Registers
How is the information in “information processing” processed? As we mentioned earlier, this is the job of the circuitry known as the processor. In large computers such as mainframes, this device, along with main memory and some other basic circuitry, is also called the central processing unit (CPU); in microcomputers, it is often called the microprocessor. The processor works hand in hand with other circuits known as main memory and registers to carry out processing. Together these circuits form a closed world, which is opened only by connection to input/output devices.
The Processor: In Charge
The main processor follows the instructions of the software to manipulate data into information. The processor consists of two parts: (1) the control unit and (2) the arithmetic/logic unit. The two components are connected by a kind of electronic roadway called a bus. (A bus also connects these components with other parts of the microcomputer, as we will discuss.)
Control unit: The control unit tells the rest of the computer system how to carry out a program’s instructions. It directs the movement of electronic signals between main memory and the arithmetic/logic unit. It also directs these electronic signals between main memory and the input and output devices.
Arithmetic/logic unit: The arithmetic/logic unit, or ALU, performs arithmetic operations and logical operations and controls the speed of those operations. As you might guess, arithmetic operations are the fundamental math operations: addition, subtraction, multiplication, and division. Logical operations are comparisons. That is, the ALU compares two pieces of data to see whether one is equal to (=), greater than (>), or less than (