A fundamental theorem of automata theory is that any computer with a a certain limited set of capabilities (known as a “universal Turing machine”) can, in principle, run any program for any other computer which is also a universal Turing machine. This property of computers to impersonate one another has been of great practical value during the evolution of computer technology. When a new computer architecture is found to be better than machines of the previous generation, it can be adopted without having to discard prior investment in programs for older machines. Instead, a program can be written for the new machine which emulates the old one. This emulator program simulates the machine language of the old computer and permits its programs to be run without modification. Emulation is not without cost—an emulator introduces additional computational overhead which slows down execution of emulated programs compared to those written in the native machine language of the new computer. But, with computer speed doubling every couple of years, it's often the case that a new machine can run its predecessor's programs in emulation faster than the old machine could directly. Over time, of course, frequently used programs, and those where speed is of the essence, will migrate from running in emulation to using the new machine in native mode, but emulation avoids the painful and expensive “speed bump” which would otherwise occur if every program for an old machine had to be rebuilt for its replacement before the new machine could be used at all.
In order to be useful, an emulator program must be authentic—it must faithfully replicate the behaviour of the machine it is emulating. In principle this seems straightforward; you sit down with the detailed specification of the machine you wish to emulate, then write a program which does precisely the same things for each machine language instruction as the emulated machine. In reality, it's frequently not that simple. Many computers have “undocumented features” which more or less fell out of the low-level design of the hardware, and even detailed machine-language programming specifications are often silent on how the machine behaves when given “ridiculous” instructions, for example when asked to shift a value by more bits than the length of the register containing it. Programmers being programmers, you'll occasionally encounter software which exploits one or more of these “features”, which will fail when run on the emulator until it, too, is made to behave just like the original hardware, however silly that may have been.
In practice, it's pretty easy to determine whether a given emulation is authentic—if it runs a wide variety of software for the emulated machine and no discrepancies are observed in the operation of any of those programs, the emulator can be judged very authentic. That doesn't mean you won't eventually encounter a program which will fail, but then programs which exploit undocumented features of a given processor frequently fail when run on its next-generation hardware successor, so the emulator should not be judged too harshly if and when a discrepancy is found.
In addition to providing a bridge from one computer generation to the next, emulation of historic computers allows modern-day programmers to explore the architecture and programs of long-dismantled machines which marked milestones in the evolution of computer technology. An emulator for a UNIVAC I, an IBM 650, or a Burroughs 220 is the next best thing to having a working museum piece in your basement; you can look at computing through the eyes of the early programmers, and understand the challenges, both practical and theoretical, which they had to overcome before hardware and software technology could attain contemporary levels of development.
Developing an emulator for a machine with no existing operating exemplar can be challenging. First you have to find manuals for the machine, many of which may have been lost over time. When emulating an existing machine, you can always resolve any ambiguity in the specifications by running an experiment on the machine in question. But if you find yourself stumped by an enigmatic statement in a document describing the instruction set of Whirlwind I, wiring up a replica out of 5000 vacuum tubes to see what it really did is out of the question, even if you could locate accurate schematics almost 50 years after the machine went into service.
Still, if it's possible to obtain actual programs which ran on an historical machine, one can be reasonably confident the emulation is authentic as long as all the programs work as intended. Since the principal motivation for emulating historical machines is to permit historic software to live on after the demise of the hardware it was written for, if an emulator properly runs all of the software still in existence, it can be deemed “authentic enough” for the purpose.
In developing an emulator for Babbage's Analytical Engine, we are faced with a greater challenge than emulating an historical computer of the twentieth century. The problem is not a lack of source material; much was written about the Engine by a variety of authors over a period of more than seventy years. The principal obstacle to an authentic emulation of the Engine is that, in modern colloquial parlance, it was mankind's first bold venture into the domain of vapourware. The Engine was designed, drafted, described, discussed, and debated, decade after decade, but it was never actually built. Babbage's own description of the engine evolved substantially between his 1837 memoir, On the Mathematical Powers of the Calculating Engine and the chapter describing the Engine in his 1864 autobiography, Passages from the Life of a Philosopher. Further discrepancies appear in the 1842 Sketch of the Analytical Engine by L. F. Menabrea and the accompanying Translator's Notes by the Countess of Lovelace, as well as suggestions for modifications to the design of the Engine which might have been incorporated had it actually been constructed. The small portion of the Mill and Printing Apparatus built by Charles Babbage's son, Henry P. Babbage, between 1880 and 1888, and now in the Science Museum in London, provides no insight on questions relating to the operation of the Operation, Variable, and Number cards. Changes in the specification of the engine between Babbage's 1837 and 1864 descriptions include:
Component | 1837 Description | 1864 Description |
---|---|---|
Mill and Store | 40 digits | 50 digits |
Store Capacity | Unspecified | 1000 columns |
Combinatorial Cards | Affect Operation and Variable cards | Affect Operations cards only (?) |
It is evident from reading the available historical documents that, notwithstanding the fact that (according to Henry Babbage's paper) more than two hundred drawings of the Engine and its components existed, had construction of the Engine been undertaken, the machinery described in those drawings would have undergone modification based on experience in fabricating components and testing subassemblies, as well as refinements in the design resulting from additional insights during that process. Such is the case when proceeding from the original specification to a working prototype of an electronic computer today, and it is extremely unlikely the first computer ever built would escape this process. Henry Babbage himself, in Item 67 of his paper, noted that “Very few machines indeed are invented which do not undergo modification.”
Consequently, since The Analytical Engine was not built, we cannot be certain of what would have been built had its construction been attempted and resulted in a functional machine. Therefore, in emulating a machine which was never actually manufactured, it is necessary to apply a different definition of authenticity than applies to emulation of machines where a physical hardware realisation, past or present, provides an absolute standard against which the emulator can be tested. The following guidelines were used in developing the Analytical Engine Emulator:
- The emulator should be able to perform any calculation Babbage intended The Analytical Engine to be capable of.
- The emulator should contain no capability not attributed to The Analytical Engine in contemporary historical documents, nor be inconsistent with Babbage's last published description of it in his 1864 autobiography.
- The notation used for expressing programs for the emulator should follow directly from that used in the examples in Sketch of the Analytical Engine and the notes thereto. Transforming a calculation written in that notation into a program for the emulator should be as straightforward and mechanical a process as preparing Jacquard cards for the Engine was envisaged to be.
- The operation of the emulator should be comprehensible in the same way observing an operating Engine would have been. In particular one should be able to see numbers change in the Store, cards succeed one another on the prism of the reader, and results appear on the printer.
- Babbage conceived the Engine as part of a system for mechanising numerical computation. The emulator should provide for all parts of the system, including the machine's human attendant, the clerks who prepare pasteboard cards to control the machine, and a library of previously-prepared and verified cards for frequently performed calculations.
How successful the emulator presented in these pages has been in achieving these goals is up to you to judge based on comparison of the historical documents describing the architecture of the Engine and the Programming Cards manual for the emulator. The following sections discuss some potentially controversial design choices in the emulator, and can best be understood after consulting the aforementioned sources.
Babbage's original 1837 description of the Engine specified 40 digit capacity for columns in the Store, with the Mill capable of multiplying two 40 digit numbers yielding an 80 digit product, and dividing an 80 digit number by a 40 digit divisor with a 40 digit quotient and remainder. The 1842 Sketch of the Analytical Engine is silent on this issue, but in Babbage's 1864 description of the Engine he describes the Store having 50 digit capacity and Mill able to add and subtract 50 digit numbers and accommodate 100 digit products and dividends. The emulator follows Babbage's last word on the subject, and adopts 50 digit capacity.
(From the standpoint of contemporary computer architecture, this “word length” seems colossal. A 50 digit decimal number is equivalent to a 167 bit binary word length, while most contemporary computers cannot operate on integers longer than 64 bits without resorting to software multiple precision packages. The first commercial computer, UNIVAC I, which like The Analytical Engine used decimal arithmetic, operated on numbers of 11 digits plus sign. Why did Babbage choose such a large a number of digits, especially since every additional digit not only increased the complexity of the Mill but required an additional disc in every single column of the Store? In the 1837 document, he observes that “The result of my reflection has been that numbers containing more than thirty places of figures will not be required for a long time to come. I have, however, made the drawings of the Engine for forty places of figures.”, and then goes on to describe how the Engine can perform multiple-precision arithmetic if greater precision is required. I have found no document describing the motivation for subsequently increasing the capacity of the Engine to 50 places. In the 1864 document, he simply states, “It seems to me probable that a long period must elapse before the demands of science will exceed this limit” and then, as more than 20 years before, explains how multiple precision arithmetic can increase the capacity of the Engine to whatever number of digits may be required.)
To those acquainted with early electronic digital computers, the provision of 1000 separate columns, or storage locations, in the Store appears extravagant. Nonetheless, that was Babbage's intent, and in the 1864 description he described this specification as: “only a thousand constants, because I think it will be more than sufficient.”
By comparison, the UNIVAC I had 1000 memory locations, but each could store only 12 characters (11 digits plus sign), so its memory was less than a quarter of that Babbage specified for his Engine. The ubiquitous IBM 1401 transistorised computer of the 1960's had storage capacity ranging from 1400 to 16,000 characters, or between 3% and 32% of that of The Analytical Engine. Further, these electronic computers were stored program machines, in which the storage contained both program instructions and the data operated upon. In The Analytical Engine, the program did not reside in the Store but was punched onto the Operation, Variable, and Number cards. Taking this into account, and keeping in mind that every column of the store was a tower of 51 brass discs, carefully machined to mesh with the ingress and egress axes of the Mill, one wonders what kind of problem Babbage had in mind which would have required 1000 intermediate values to solve. That notwithstanding, 1000 columns of Store was clearly Babbage's intent, and that's what the emulator implements. May you use them well.
Note: the following several paragraphs provide a brief tutorial on fixed point decimal arithmetic as would have been used on the Engine. In this age of ubiquitous floating point, many programmers are rusty on or entirely unacquainted with fixed point computation. If you are familiar with it, please skip ahead to the discussion of specific issues related to the Engine.
The arithmetical operations of the Mill work exclusively with signed integer quantities: no mechanism is provided to handle real numbers or fractions. Since most analytical formulæ involve real numbers and numerical computations involving them are done with decimal fractions with an appropriate number of places, the Engine's being restricted to integer arithmetic might seem a serious shortcoming at first glance. In fact, by employing a technique called (in modern terminology) fixed point arithmetic, the Engine can employ decimal fractions with precision limited only by the magnitude of the numbers involved and the 50-digit capacity of the Store and Mill.
Suppose we wish to perform a calculation using seven decimal places of precision. To do so, we place numbers in the Store with the seven rightmost digits assigned as decimal places, and those to the left reserved for the integral part of the quantity. We imagine a decimal point being written between the seven rightmost digits and the 43 remaining digits. We've fixed the decimal point at a given location in order to do arithmetic, hence “fixed point arithmetic”. Suppose we want to add 2.75 to 1.5. These quantities would appear in the Store as:
27500000 15000000
Or, writing out the imaginary decimal point:
2.7500000 1.5000000
If we then bring these quantities into the Mill, add them, and return the sum to another column in the Store, it will then contain:
42500000
Mentally inserting the decimal point in this quantity, we obtain 4.2500000, or 4.25. The Mill performed this calculation without any knowledge of the position of the decimal point—that was entirely a matter of mental bookkeeping on the part of the analyst, and yet the Mill produced the correct result. We're simply calculating with numbers ten million times larger than those in the real problem, and remembering to write the decimal point at the correct location when the answer appears. A moment's reflection will show that this technique works equally well for subtraction as addition.
Multiplication and division are somewhat more sticky. Let's consider multiplication first. Suppose we wish to multiply 2.75 and 1.5, again using seven decimal places. Proceeding as we did above for addition, we write these quantities as 27500000 and 15000000. But if we send these quantities to the Mill and multiply them, what we get back is 412500000000000, which is 10,000,000 times the correct answer. What's happened is that in multiplying these two quantities, the Mill also multiplied the scale factors of ten million in each of the numbers, resulting in a product in which the decimal point will appear fourteen digits from the rightmost. Since we wish to continue to compute with seven digits of precision, an additional step is required—we must divide the product computed in the Mill by 10,000,000 to restore the scale factor to the original value. Doing so yields 41250000, or 4.1250000, which is the correct product, 4.125.
A complementary situation obtains in the case of division. Since the divisor contains a scale factor of 10,000,000, it is necessary to multiply the dividend by this same factor prior to performing the division. Consider dividing 2.75 by 1.5. We write the quantities as above, but before they can be divided, we must multiply the dividend, 27500000, by 10000000 yielding 275000000000000. When this quantity is divided by 15000000 the Mill returns 18333333 which, written with a decimal point, is 1.8333333, the quotient of 2.75 divided by 1.5 expressed to seven decimal places. Note that by restricting our computations to seven decimal places, any division which results in a continuing fraction or decimal expansion longer than seven places will lose precision. It is therefore necessary to perform calculations with sufficient additional decimal places to avoid errors due to truncation, but this is also the case in manual calculation.
As discussed later in this document and in contemporary descriptions of the Engine, multiplication and division would have been very time-consuming operations, as much as sixty times slower than addition or subtraction, so the burden of having to double the number of such operations when computing decimal fractions would be onerous. In his original 1837 description of the Engine, Babbage envisioned a process of “stepping down” at the end of a multiplication and “stepping up” prior to division, each controlled by a Counting apparatus. Note that the divisions and multiplications needed to adjust the position of the decimal point are all by 10n, where n is the number of decimal places used in the calculation. Multiplying by 10n is just adding n zeroes to the end of a number, and dividing by 10n is simply a matter of discarding the rightmost n digits of a quantity.
The mechanism Babbage designed to perform multiplication and division already incorporated the ability to rapidly shift quantities to the left and right by a given number of digits. By “stepping up or down”, he employed this existing machinery to adjust the decimal point after a multiplication and before a division. In Babbage's own words, from the 1837 document:
The termination of Multiplication arises from the action of the Counting apparatus which at a certain time directs the barrels to order the product thus obtained to be stepped down so the decimal point may be in its proper place,….
. . .
The dividend R next enters. Several processes are gone through in order to ascertain whether it will be necessary to step the dividend up or down, and how many steppings there ought to be of either kind.
Unfortunately, this is about the extent of Babbage's writing on this topic. It is clear he intended the multiplication and division machinery to incorporate a shift left or right (step up or down) to adjust the decimal point for division and multiplication, but how this was to be specified to the Mill (presumably by an Operation or Variable card) was never spelled out. The emulator includes explicit stepping up and down cards to specify these operations. No suggestion was found in any description of the Engine that stepping up and down might be used as a shortcut for multiplying and dividing by powers of ten, so the emulator does not permit these functions to be used as a general decimal shift operation.
From the first description of The Analytical Engine in 1837, Babbage realised the need to mechanise repetitive sequences of calculation such as the evaluation of terms in a series, and for the machine to be able to perform different sequences of calculations based on results obtained in the course of a computation, but unknown when the problem was prepared for the Engine. The 1837 paper refers to “Combinatorial Cards” and “Index Cards” which govern a “Repeating Apparatus” capable of returning the Number and Variable cards to a given place in the sequence and thereby causing them to be repeated a fixed or variable number of times. Whether these cards are repeated together, or independently is not spelled out explicitly.
By 1842, in the Sketch of the Analytical Engine and Translator's Notes, the means by which backing and advancing cards could implement what in present day terminology we call “loops” and “conditionals” were clearly spelled out, though one gets the impression, with the exception of a brief discussion in Translator's Note F to the document, that only the Operation cards are repeated. Further, in Babbage's 1864 description, he speaks only of backing and advancing Operation cards, though of course this doesn't rule out a similar capability for the Variable cards which did not merit mention in a chapter of a wide-ranging autobiography.
From the available documents, we conclude the following:
- A mechanism for backing and advancing the Operation and Variable cards was envisioned.
- The Number cards were never mentioned in conjunction with backing and advancing.
- The means by which the three separate card readers for the Operation, Variable, and Number cards were co-ordinated in their respective turning of successive cards in its own chain was never clearly specified.
Based on this incomplete information, the emulator adopts the following plausible approach, especially in light of the mechanical simplicity it would confer upon the design.
Operation, Variable, and Number cards are turned in lockstep. Backing or advancing always affects all three readers, adjusting each to the corresponding point in its chain of cards.
Any other design would require additional machinery by which one card stream could control the backing or advancing of another. Since such a mechanism was never clearly spelled out, we assume it wasn't intended, and opt for the simplest possible scheme, where all the cards advance or back together. (Well, not exactly, since the source documents are clear that one need only supply a new Operation card when the arithmetic operation of the Mill changes. But we can imagine the Combinatorial cards advancing or backing to marks in the respective streams, independent of the number of intervening cards.)
Given that the historical record provides no guidance on this issue, this decision can be justified in terms of the previously-stated guidelines for the emulator. Clearly, requiring all three card readers to advance or back together is more restrictive than positing a mechanism which allows them to move independently. Thus, imposing this restriction does not expand the capability of the Engine beyond Babbage's intent, whatever it may have been. But equally as clearly, anything which can be accomplished by backing and advancing the various card readers independently can be performed just as well by adding additional cards to the various streams which implement the individual cases. This decision, then, neither expands nor restricts the capability of the Engine, and is thus consistent with the emulator design guidelines.
Babbage's Engine had three separate readers of Jacquard cards: one for the Operation cards, one for the Variables, and a third for the Numbers. From a mechanical standpoint, this was essential—the relationship of the holes on the cards to the functioning of the mechanism was entirely different for each variety of card, and the number of holes and their interpretation had nothing in common among the varieties.
A humble Victorian coder seated on a stool before a high desk, patiently stringing together chains of Operation, Variable, and Number cards based on the handwritten instructions of a noble or prosperous merchant analyst makes for a Dickens of an image, but few if any present-day programmers would undertake such a tedious endeavour. Indeed, no program for The Analytical Engine was ever encoded in cards. Instead, they were written in a notation not unlike that used in modern-day programming languages, assuming that the details of punching them onto cards suitable for the Engine would be performed by its expert attendants.
The emulator adopts a notation intermediate in level between the analytical level of contemporary discussions of the Engine and the concrete level of holes in pasteboard cards. We use a stream of Programming Cards, which represent the three independent streams of Operation, Variable, and Number cards which the Engine would process. Rather than requiring the programmer to prepare these streams independently, we allow them to be interleaved. If you like, you can imagine the Engine's attendant going through the deck and stringing each type of card onto a separate chain for the reader of its kind.
To justify the interleaved card notation, we again appeal to the guidelines for the emulator. Interleaved Operation, Variable, and Number cards can be mechanically (or, in Victorian terms, by manual labour) separated into chains for their respective readers. Hence, interleaving does not in any way add a capability to the Engine not envisioned by Babbage. Conversely, even if one assumes a speculative mechanism for independently advancing and backing the separate card readers (as discussed in the previous section), it adds no capability to the Engine which can't be achieved by card readers which advance and back together. Additional cards may be required, but any calculation possible with independent readers can be equally accomplished with the synchronised readers implied by the interleaved cards of the emulator.
One decidedly non-authentic aspect of the emulator is its speed of calculation; even the most humble electronic computer can shuffle numbers orders of magnitude more quickly than brass gears and cams. In his 1864 description of the Engine, Babbage estimated its speed of calculation as about one addition or subtraction per second and one minute per multiplication of two 50 digit numbers or division of a 100 digit number by a 50 digit divisor. Most present-day computers take the same time to complete a given arithmetic operation regardless of the value of the operands: dividing 4 by 2 takes just as long as dividing 837638765 by 23711. Thanks to Babbage's Hoarding Carriage (the equivalent technique in electronic computers is called “look-ahead carry propagation”), the Mill could add and subtract numbers at the rate of about one per second regardless of the number of digits they contained. This was not, however, the case for multiplication and division, which were performed by a process of shifting and adding or subtracting repeatedly. While multiplication of two fifty-digit numbers might take a minute, two three digit numbers could be multiplied in a matter of seconds.
While I doubt that adding artificial delays to the emulator so it performed its calculation at the same speed as the Engine would be met with public acclaim, it it is useful for the emulator to provide an estimate of how long the actual Engine would have taken to perform a given calculation. The Web Emulator (but not the command-line or Java applet emulators) provides this feature, which can be invoked either by a programming card or a button on the emulator control panel. Please see the timing section in the Web Emulator document for details of how calculation time is estimated.
Click on titles to order books on-line from
|