Cellular Automata Laboratory


JC and CelLab File Formats

Note: this information is believed accurate as of the time of the release of CelLab. It describes internal file formats read and written by the program which are not normally manipulated by user programs. These file formats are subject to change in future releases of CelLab. If you develop programs that manipulate CelLab files based on this information, please be aware that you may have to modify your programs to work with future releases of CelLab.

This information is for experienced systems programmers; as long as you use CelLab and its associated tools, you don't need to worry about its file formats. This information is provided in case you want to interface CelLab to other software via its files. We can't help you debug programs you develop based on this information, but dumping files written in these formats with your debugger should help resolve any problems. That said, onward to the gory details!

PATTERN FILES

Pattern files can be written either in ASCII or binary format. Each format can be written either compressed or uncompressed. All pattern files are written with an extension of .jcp--the format is determined by examining the bytes at the beginning of the file. All formats represent a pattern map consisting of 200 lines, each containing 322 bytes. Within each line the first and last bytes are used internally within CelLab, do not appear on the screen, and can be ignored (they're saved in the pattern file because it's faster to dump uncompressed binary files with them included). Pattern files represent a total of 64,400 bytes (322×200), stored in various forms. The top line on the screen appears first in the pattern file and the bottom line last. Bytes within each line are written with the leftmost pixel on the screen first and the rightmost pixel last (surrounded, of course, by the two extra bytes).
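The layout just described implies a simple mapping from screen coordinates to a position in the 64,400 byte map. The following is a minimal sketch of that arithmetic; the function name map_offset and the constant names are made up for illustration and do not appear in CelLab.

```c
#include <stddef.h>

#define MAP_LINES      200   /* lines in the pattern map */
#define MAP_LINE_BYTES 322   /* 320 visible bytes plus one hidden byte at each end */

/* Offset within the 64,400 byte map of the cell at screen column x (0..319)
   and screen line y (0..199).  The "+ 1" skips the hidden byte that starts
   each line. */
size_t map_offset(int x, int y)
{
    return (size_t) y * MAP_LINE_BYTES + 1 + (size_t) x;
}
```

For example, map_offset(0, 0) is 1, since byte 0 of the map is the hidden byte at the start of the top line.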

ASCII Pattern Files

ASCII pattern files are primarily intended for interchange with programming languages which cannot readily process binary files. ASCII files always consume more disk space than their binary equivalent and take longer to load and save.

Uncompressed ASCII Pattern Files

An uncompressed ASCII pattern file consists simply of a total of 64,400 hexadecimal numbers, separated by spaces. Line breaks are inserted to limit the length of lines to less than 80 characters and an extra line break is inserted at the end of each of the 200 screen lines. The hexadecimal numbers are written without leading zeroes or other number format flags. Numbers from uncompressed pattern files can be read directly with a C format string of "%x". Uncompressed ASCII pattern files are huge and take forever to read and write, but they're the easiest format to decode. Here are the first two lines of an uncompressed ASCII pattern file representing a random map:

0 A1 92 F5 C7 F9 DC D 36 8 1B C6 AB 78 42 A BA CE 5E EE FA F7 3F 4E C8
B6 6F A2 7D AA 7E BE AB 61 94 2 46 8E 23 8B 1 6 B3 32 A3 F0 A5 7 34 81
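
Reading such a file with the "%x" format string might look like the sketch below. The function name read_hex_values is hypothetical, and the caller is assumed to have opened the file already and to supply a buffer of at least count bytes (64,400 for a complete map).

```c
#include <stdio.h>

/* Read up to `count` whitespace-separated hexadecimal numbers from `fp`
   into `map`.  Returns the number of values actually stored, which is
   less than `count` if the file ends early or contains a non-hex token. */
size_t read_hex_values(FILE *fp, unsigned char *map, size_t count)
{
    size_t n = 0;
    unsigned int value;

    /* "%x" skips spaces and line breaks before each number. */
    while (n < count && fscanf(fp, "%x", &value) == 1) {
        map[n++] = (unsigned char) (value & 0xFF);
    }
    return n;
}
```

A return value other than 64,400 when reading a whole map suggests the file is truncated or is not in uncompressed ASCII format.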

Compressed ASCII Pattern Files

Compressed ASCII pattern files are run length encoded: they consist of a sequence of space-separated pairs of numbers, with the two numbers of each pair separated by a comma. The first number gives, in decimal, the number of consecutive occurrences of the value given, in hexadecimal, by the second number. New line characters are inserted to keep line lengths below 80 characters. Data in the pattern map may be compressed across lines. The first character of a compressed ASCII pattern file is always an asterisk; this identifies it as a compressed pattern file. The following is a compressed pattern file representing a single dot with state 237 (which in hexadecimal is ED):

*32684,0 1,ED 31715,0

Note that the numbers total to 64,400, the total number of bytes in the map.
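
Expanding the count,value pairs is straightforward. The sketch below is one way to do it, not CelLab's own loader; the function name read_rle_ascii and its error convention (returning -1) are assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Expand a compressed ASCII pattern from `fp` into `map`, which can hold
   `cap` bytes.  Returns the number of bytes stored, or -1 if the leading
   asterisk is missing or a run would overflow the buffer. */
long read_rle_ascii(FILE *fp, unsigned char *map, long cap)
{
    long n = 0;
    unsigned long count;   /* run length, written in decimal */
    unsigned int value;    /* cell value, written in hexadecimal */

    if (fgetc(fp) != '*')
        return -1;
    while (fscanf(fp, "%lu,%x", &count, &value) == 2) {
        if (count > (unsigned long) (cap - n))
            return -1;
        memset(map + n, (int) (value & 0xFF), (size_t) count);
        n += (long) count;
    }
    return n;
}
```

Applied to the example above with a 64,400 byte buffer, this stores 0xED at offset 32684 and zeroes everywhere else.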

Binary Pattern Files

Data bytes in binary pattern files are written in the internal format used within CelLab. Bytes within CelLab are rotated one bit to the right so that the bit representing Plane 0 appears as the most significant bit of each byte. When you read or write a binary pattern map, you'll have to compensate for this by rotating the data one bit to the left before interpreting it as a normal state code.
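
The two rotations can be sketched as follows. The function names are invented for this example; only the one-bit circular rotation itself comes from the description above.

```c
/* Convert a byte in CelLab's internal format (Plane 0 in the most
   significant bit) to a normal state code: rotate left one bit. */
unsigned char cellab_to_state(unsigned char internal)
{
    return (unsigned char) (((internal << 1) | (internal >> 7)) & 0xFF);
}

/* Convert a normal state code to the internal format: rotate right
   one bit, so that bit 0 (Plane 0) becomes the most significant bit. */
unsigned char state_to_cellab(unsigned char state)
{
    return (unsigned char) (((state >> 1) | (state << 7)) & 0xFF);
}
```

A state of 1 (only Plane 0 set) is therefore stored as 80 hex, and state 237 (ED hex) is stored as F6 hex.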

All binary pattern files have an ASCII colon ":" (hexadecimal code 3A) as the first byte. The colon is followed by one or more instructions as defined in the following table. The end of the binary pattern file is marked by the appearance of an RLEND instruction (byte with value 6). The instruction codes and count bytes that follow the instruction are not rotated--only value bytes representing cell states are rotated one bit.

        RLUNCOMP = 1      64K of uncompressed data follows
        RLRUN    = 2      2-257 byte run of value follows
        RLONEB   = 3      Single byte of specified value follows
        RLUNCS   = 4      Uncompressed stream follows
        RLEND    = 6      End of pattern
        RLLRUN   = 7      Long run > 256 bytes
        RLLUNCS  = 8      Long uncompressed stream > 256 bytes

Uncompressed Binary Pattern Files

Uncompressed binary pattern files consist of the initial colon, an RLUNCOMP instruction (code 1), followed by 64,400 bytes of pattern map data (each byte rotated one bit to the right, as noted above), followed by an RLEND (code 6) instruction. Since pattern maps can be loaded and dumped directly from the map with one I/O call, uncompressed binary pattern files can be read and written very quickly.

Compressed Binary Pattern Files

Compressed binary pattern files begin with the colon and end with an RLEND instruction. The data between these two bytes is a series of instructions representing the data in run length compressed form. The interpretation of each instruction and the bytes that follow it is as given below:
RLRUN Count Value
Store Count + 1 bytes of Value in the next consecutive cells of the map buffer.

RLLRUN CountHi CountLow Value
Compute Count = ((CountHi × 256) + CountLow), then store Count + 1 bytes of Value in the next consecutive cells of the map buffer.

RLONEB Value
Store Value in the next cell of the map buffer.

RLUNCS Count Value[1] Value[2] ... Value[Count+1]
Store the Count + 1 bytes specified by the Value[n] bytes in the next cells of the map buffer.

RLLUNCS CountHi CountLow Value[1] Value[2] ... Value[Count+1]
Compute Count = ((CountHi × 256) + CountLow), then store the Count + 1 bytes specified by the Value[n] bytes in the next cells of the map buffer.
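
Putting the instructions together, a decoder for a binary pattern file held in memory might be sketched as below. This is an illustration written from the descriptions above, not CelLab's code; the function name, the -1 error convention, and the bounds checks are assumptions. The decoded bytes are left in the rotated internal format.

```c
#include <stddef.h>
#include <string.h>

enum { RLUNCOMP = 1, RLRUN = 2, RLONEB = 3, RLUNCS = 4,
       RLEND = 6, RLLRUN = 7, RLLUNCS = 8 };

#define MAP_BYTES 64400L

/* Decode the binary pattern file image in[0..len-1] into map, which must
   hold MAP_BYTES bytes.  Returns the number of bytes stored when RLEND is
   reached, or -1 if the data is malformed or ends without RLEND. */
long decode_binary_pattern(const unsigned char *in, size_t len,
                           unsigned char *map)
{
    size_t i = 0;
    long n = 0, count;

    if (len == 0 || in[i++] != ':')
        return -1;
    while (i < len) {
        unsigned char op = in[i++];
        int is_run;

        switch (op) {
        case RLEND:
            return n;
        case RLUNCOMP:                 /* whole map follows literally */
            count = MAP_BYTES; is_run = 0;
            break;
        case RLONEB:                   /* one literal byte */
            count = 1; is_run = 0;
            break;
        case RLRUN: case RLUNCS:       /* one count byte */
            if (i >= len) return -1;
            count = in[i++] + 1L;
            is_run = (op == RLRUN);
            break;
        case RLLRUN: case RLLUNCS:     /* CountHi, then CountLow */
            if (len - i < 2) return -1;
            count = in[i] * 256L + in[i + 1] + 1L;
            i += 2;
            is_run = (op == RLLRUN);
            break;
        default:
            return -1;
        }
        if (count > MAP_BYTES - n) return -1;
        if (is_run) {                  /* repeat a single value */
            if (i >= len) return -1;
            memset(map + n, in[i++], (size_t) count);
        } else {                       /* copy literal bytes */
            if (len - i < (size_t) count) return -1;
            memcpy(map + n, in + i, (size_t) count);
            i += (size_t) count;
        }
        n += count;
    }
    return -1;
}
```

A complete file should yield a return value of 64,400; anything else indicates a damaged or partial pattern.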

COLOR PALETTE FILES

Color palette files can be written in ASCII or binary form. All have the default extension of .jcc; the format is determined from the contents of the file.

ASCII Color Palette Files

ASCII color palette files have a very simple format; they are easy to generate manually or programmatically for custom representation of states. The format is as follows.

The first line contains a single number, the format indicator, which ranges from 1 to 3. Values signify:

        1     CGA Palette file
        2     VGA Palette file
        3     Composite (CGA and VGA) palette file

This line is followed by up to 256 lines specifying color assignments for states. If fewer than 256 lines follow the format indicator, the color assignments for those states will be left as before. For a CGA palette file, each line contains a single number which specifies the CGA color for that state (the same color indices you specify with the Alt-F9 key in JC). Each line of a VGA palette file specifies the red, green, and blue intensities from 0 to 63, separated by spaces, for each state. A composite file supplies both CGA and VGA values for each state with the VGA red, green, and blue intensities followed by the CGA color index for the state. JC and CelLab always generate composite palette files when a palette is dumped with Ctrl-F5, even if only one display has ever been used. Material that follows the required specifications is ignored; CelLab takes advantage of this to include comments in the palette files it writes to aid in their interpretation.

Binary Color Palette Files

Binary color palette files are all precisely 771 bytes long. The first three bytes are an ASCII "4" (code 34 hex), a carriage return (0D hex), and a line feed (code 0A hex). This prologue, which identifies the file as being binary format, is followed by 768 bytes, with consecutive triples of bytes specifying the color assignments for the 256 state codes from 0 to 255 (768 = 3 × 256). The three bytes in each triple specify the red, green, and blue intensities for the VGA, from 0 through 63. The first byte of each triple also contains, in its two most significant bits, the CGA color index assignment for the state, from 0 to 3.
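
Unpacking a binary palette held in memory can be sketched as follows. The structure and function names are hypothetical; the layout (three byte prologue, then 256 triples with the CGA index in the top two bits of the first byte) is taken from the paragraph above.

```c
#include <stddef.h>

struct palette_entry {
    unsigned char red, green, blue;  /* VGA intensities, 0 through 63 */
    unsigned char cga;               /* CGA color index, 0 through 3 */
};

/* Unpack the 771 byte binary palette image in[0..len-1] into pal.
   Returns 0 on success, or -1 if the length or prologue is wrong. */
int decode_binary_palette(const unsigned char *in, size_t len,
                          struct palette_entry pal[256])
{
    int s;

    if (len != 771 || in[0] != '4' || in[1] != '\r' || in[2] != '\n')
        return -1;
    for (s = 0; s < 256; s++) {
        const unsigned char *t = in + 3 + 3 * s;  /* triple for state s */

        pal[s].red   = t[0] & 0x3F;   /* low six bits: red intensity */
        pal[s].cga   = t[0] >> 6;     /* top two bits: CGA index */
        pal[s].green = t[1];
        pal[s].blue  = t[2];
    }
    return 0;
}
```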

RULE DEFINITION FILES

CelLab evaluates cellular automata rules by table look-up. The basic content of a rule definition (.jc) file is the values to be loaded into the lookup table to define the rule. Rule definitions may also set numerous modes that affect the operation of CelLab, request the loading of patterns, color palettes, and the like. CelLab also provides a Save Experiment operation. A saved experiment is actually just a rule definition which contains embedded pattern and color palette definitions.

Rule definition files are always written in binary mode. They consist of a sequence of instruction codes, each followed by data specific to that instruction. The methods and instruction codes used in rule definition files are very similar to those used in compressed binary pattern files. The instruction codes used in rule definition files are as follows:

Rule lookup table compression instructions:

        RLUNCOMP = 1      64K of uncompressed rule follows
        RLRUN    = 2      2-257 byte run of value follows
        RLONEB   = 3      Single byte of specified value follows
        RLUNCS   = 4      Uncompressed stream follows
        RLCOPYB  = 5      Copy previously specified bank
        RLEND    = 6      End of rule definition, parameters

Rule mode request instructions:

        RSHTEXT  = 64     Horizontal texture request
        RSVTEXT  = 65     Vertical texture request
        RSRAND   = 66     Random stimulus request
        RSPAT    = 67     Pattern load request
        RSPAL    = 68     Palette load request
        RSEPAT   = 69     Embedded pattern address
        RSEPAL   = 70     Embedded palette address
        RSRSEED  = 71     Initial random seed
        RSOCODE  = 72     Own code load request
        RSEOCODE = 73     Embedded own code address

The lookup table always consists of 65,536 bytes of data (even though many rules do not need or use the entire table), and a rule definition file always loads every byte of the table. Each byte in the table represents the new state for a cell when its state and that of its neighbors select that cell in the rule lookup table. The cell states in the lookup tables are stored in the internal format with Plane 0 as the most significant bit and Planes 1 through 7 as the least significant 7 bits (in other words, the states are rotated circularly one bit to the right). The relationship between the values of the neighbors and the lookup table indices for the various settings of worldtype can be seen by examining the Java, C, or Pascal rule generation code in the source code supplied with CelLab.

The compression algorithms used keep the size of rule files commensurate with the actual data needed by the rule. First, let's examine the instructions used to compress the rule table itself.

RLUNCOMP Value[0] Value[1] ... Value[65535]
The entire contents of the lookup table are specified by the 65,536 bytes that immediately follow this instruction. CelLab never writes rules in this format, but it will load rules written with this instruction; it's provided as a convenience to external programs that want to generate rules without all the complexity of compressing them. You can always load a rule in this format and then save it from CelLab to save space.

RLRUN Count Value
Store Count + 1 bytes of Value in the next consecutive cells of the lookup table.

RLONEB Value
Store Value in the next cell of the lookup table.

RLUNCS Count Value[1] Value[2] ... Value[Count+1]
Store the Count + 1 bytes specified by the Value[n] bytes in the next cells of the rule table.

RLCOPYB Pageno
The rule table is arbitrarily subdivided into 256 byte pages, numbered 0 through 255. Due to the structure of rules, often many pages will be identical. This instruction copies a prior page, Pageno, and stores it in the next 256 bytes of the rule table.
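
The page copy can be sketched as below. The function name apply_rlcopyb is made up, and the check that the source page lies wholly before the current write position is an assumption drawn from the phrase "a prior page"; CelLab's loader may be more permissive.

```c
#include <string.h>

#define RULE_TABLE_BYTES 65536L
#define PAGE_BYTES       256L

/* Apply an RLCOPYB instruction: copy page `pageno` of `table` to the
   current write position `pos`.  Returns the new write position, or -1
   if the copy would run past the end of the table or the source page
   has not been completely loaded yet. */
long apply_rlcopyb(unsigned char *table, long pos, unsigned char pageno)
{
    long src = (long) pageno * PAGE_BYTES;

    if (pos + PAGE_BYTES > RULE_TABLE_BYTES || src + PAGE_BYTES > pos)
        return -1;
    memcpy(table + pos, table + src, (size_t) PAGE_BYTES);
    return pos + PAGE_BYTES;
}
```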

There isn't one uniquely correct way to compress a rule table--rules written by a BASIC rule generation program are encoded differently than those written from Java, Pascal, or C, but the contents of the lookup table will be identical after the rule is loaded.

The following instructions specify modes which the rule can set. The RLEND instruction both specifies modes and marks the end of the rule definition; it is required. All of the other mode request instructions are optional--if they are not specified the default values are used. These instructions convey the rule requests made by invoking the various set... methods in the jcruleModes method of a Java rule program, or by setting the variables with the corresponding names in C or Pascal (in BASIC the same functions exist but the variable names are different to comply with BASIC's short identifier restrictions). Please refer to the documentation of these variables for details on their interpretation.

RLEND worldtype randdens auxplane
This instruction marks the end of the rule definition and conveys in the three bytes that follow the instruction code the settings the rule definition function stored in worldtype, randdens, and auxplane. The process of loading a rule is ended after this instruction is processed (although if the file is a saved experiment, pattern and palette information may be present in the file following the RLEND instruction).

RSHTEXT texthb texthn
Requests horizontal texture of texthn bits, starting at plane number texthb.

RSVTEXT textvb textvn
Requests vertical texture of textvn bits, starting at plane number textvb.

RSRAND randb randn
Requests random input each generation. Randn bits of random input are stored with the least significant bit in plane randb.

RSRSEED rseedb rseedn rseedp
Requests that an initial random seed be stored when the rule is loaded or a new pattern is loaded while this rule is in effect. The least significant bit of the random seed is stored in plane rseedb and rseedn bits of seed are stored. The setting of rseedp controls the density of the seed--a value of 255 results in half the bits being zeroes, 0 generates all zeroes, and intermediate values vary the density between these limits.

RSPAT nlen name[0] name[1] ... name[nlen-1] 0
The value of nlen gives the length, including the terminating zero, of a file name which CelLab will attempt to load as a pattern file after loading the rule.

RSPAL nlen name[0] name[1] ... name[nlen-1] 0
The value of nlen gives the length, including the terminating zero, of a file name which CelLab will attempt to load as a color palette file after loading the rule.

RSOCODE nlen name[0] name[1] ... name[nlen-1] 0
The value of nlen gives the length, including the terminating zero, of a file name which CelLab will attempt to load as an own code file after loading the rule.

RSEPAT addr[0] addr[1] addr[2] addr[3]
The four bytes that follow give the address, written with the least significant byte first and the most significant byte last, of a compressed binary format pattern file embedded within this .jc file somewhere after the RLEND instruction. After the rule table is loaded, CelLab will load that embedded pattern. This is used to encode patterns within saved experiments.

RSEPAL addr[0] addr[1] addr[2] addr[3]
The four bytes that follow give the address, written with the least significant byte first and the most significant byte last, of a binary color palette file embedded within this .jc file somewhere after the RLEND instruction. After the rule table is loaded, CelLab will load that embedded color palette. This is used to encode color palettes within saved experiments.

RSEOCODE addr[0] addr[1] addr[2] addr[3]
The four bytes that follow give the address, written with the least significant byte first and the most significant byte last, of user own code embedded within this .jc file somewhere after the RLEND instruction. The own code is written with two bytes which give its length (least significant byte first, most significant byte last), then the contents of the .jco file from which the own code was loaded. Note that own code is embedded within both rules and experiments saved from CelLab when own code was loaded. User evaluators implemented as Windows DLLs for use with CelLab for Windows are not embedded in experiment files; instead an RSOCODE request naming the evaluator file is included. Windows isn't able to load a DLL embedded in another file, so it isn't possible to embed such evaluators.
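
The four byte addresses used by RSEPAT, RSEPAL, and RSEOCODE, and the two byte own code length, can be assembled without depending on the byte order of the machine running your program. The helper names below are hypothetical. How the resulting address is used (presumably as a byte offset from the start of the .jc file) is an assumption you should check against a file written by CelLab.

```c
/* Assemble a 32 bit value from four bytes stored least significant
   byte first. */
unsigned long read_le32(const unsigned char *p)
{
    return (unsigned long) p[0]
         | ((unsigned long) p[1] << 8)
         | ((unsigned long) p[2] << 16)
         | ((unsigned long) p[3] << 24);
}

/* Assemble a 16 bit value from two bytes stored least significant
   byte first, as used for the embedded own code length. */
unsigned int read_le16(const unsigned char *p)
{
    return (unsigned int) p[0] | ((unsigned int) p[1] << 8);
}
```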

Rule definition files are the most complex of the files used by CelLab, but if you get confused about their format, you can always examine the source code of the Java, C, Pascal, or BASIC rule maker to resolve any questions. Note that the BASIC rule maker does not compress the rule definitions it writes.

POPULATION HISTORY LOG FILES

Population history log (.jch) files consist of a header followed by zero or more records indicating the population of cells in each state at the time the log entry was made, whether manually or by an automatic dump every so many generations. The file header is 24 bytes in length and consists of the string, defined using the C language convention as:

    "CelLab population log\r\n\032"

The "\032" is a Ctrl-Z which serves as the end of file marker in a DOS text file, and is intended to keep the binary data which follow from appearing if the user inadvertently types the file.

Each population log record is described by the C language structure:

struct populationLogItem {
    char ruleName[16];                 /* Rule currently executing */
    long generationCount;              /* Generation count at time of dump */
    unsigned short cellHistogram[256]; /* Number of cells in each state */
};

This structure is written in the file with no padding and with the long and short values in Intel (least-significant first) byte order. The ruleName field is a zero-terminated ASCII string which gives the name of the rule executing at the time the population was dumped, with generationCount specifying the number of generations which had executed since a rule or pattern was last loaded, or the generation count was manually reset by the user. The balance of the record is the cell population histogram array, cellHistogram, which gives, for each possible 8-bit cell state from 0 to 255, the number of cells in that state when the log entry was made. The sum of all the cellHistogram entries will always be 64000, the total number of cells in the 320×200 cell map.
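
On a modern compiler a long may be 8 bytes and structures may be padded, so reading the record directly into the structure above is not portable. The sketch below instead unpacks one 532 byte record (16 + 4 + 512) from a byte buffer. The names population_record, parse_population_record, and has_log_header are invented for this example, and the generation count is assumed to be non-negative.

```c
#include <stddef.h>
#include <string.h>

#define LOG_HEADER_BYTES 24
#define LOG_RECORD_BYTES (16 + 4 + 256 * 2)   /* 532 bytes per record */

struct population_record {
    char rule_name[17];               /* 16 bytes from the file plus a NUL */
    long generation_count;
    unsigned int cell_histogram[256];
};

/* Nonzero if buf begins with the 24 byte population log header. */
int has_log_header(const unsigned char *buf, size_t len)
{
    return len >= LOG_HEADER_BYTES &&
           memcmp(buf, "CelLab population log\r\n\032", LOG_HEADER_BYTES) == 0;
}

/* Unpack one record of LOG_RECORD_BYTES bytes, stored in Intel byte order. */
void parse_population_record(const unsigned char *rec,
                             struct population_record *out)
{
    unsigned long gen;
    int s;

    memcpy(out->rule_name, rec, 16);
    out->rule_name[16] = '\0';        /* guard against a missing terminator */
    gen = (unsigned long) rec[16]
        | ((unsigned long) rec[17] << 8)
        | ((unsigned long) rec[18] << 16)
        | ((unsigned long) rec[19] << 24);
    out->generation_count = (long) gen;
    for (s = 0; s < 256; s++) {
        out->cell_histogram[s] = (unsigned int) rec[20 + 2 * s]
                               | ((unsigned int) rec[21 + 2 * s] << 8);
    }
}
```

Record k of a log file (counting from 0) would then start at byte 24 + 532 × k.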

MOVIE FILES

The following movie file format pertains only to movies created by JC. CelLab for Windows writes movies in Microsoft .avi format, about which the less said the better.

Movie files consist of a header indicating whether the movie was recorded from a CGA screen or a VGA, a copy of the color palette in effect when the movie was recorded, and a sequence of compressed screen images taken directly from the CGA or VGA screen. The header is three bytes, simply "CGA" or "VGA" in ASCII. Immediately following the header is a binary dump of the color palette in effect when the movie was made, written in the format described above for binary color palette files. After the three byte header and 771 bytes of color palette, a sequence of saved frames appears.

Each saved frame consists of the contents of the CGA or VGA frame buffer. For the VGA, the dump is of the 64,000 bytes on the screen, one byte per pixel. For a CGA, the dump consists of the 16,000 bytes that encode the pixels on the screen, four pixels per byte. On the CGA, the even numbered lines are dumped first, then the odd numbered lines. Each compressed frame stands by itself--no inter-frame delta compression is performed. Each compressed frame consists of a series of instruction codes and arguments that follow them. The instructions are as follows; instruction codes are given in decimal.

248
End of CGA bank. Denotes end of the even lines on the CGA. Resets to load the odd lines.

249 Llow Lhigh
Long zero run. (Lhigh × 256 + Llow) specifies the number of zero bytes to store in the screen map.

250 Length
Short zero run. Store Length bytes of zero in the screen map.

251 Llow Lhigh Pixel[0] Pixel[1] ... Pixel[n]
Long uncompressed sequence. Store the (Lhigh × 256 + Llow) Pixel[x] values that follow into the screen map.

252 Llow Lhigh Pixel
Long nonzero run. Store (Lhigh × 256 + Llow) bytes with value Pixel.

253 Length Pixel[0] Pixel[1] ... Pixel[n]
Short uncompressed sequence. Store the Length Pixel[x] values that follow into the screen map.

254 Length Pixel
Short nonzero run. Store Length bytes with value Pixel.

255
End of frame. This appears at the end of the compressed frame.
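
As an illustration of the instructions above, here is a sketch of a decoder for one compressed VGA frame held in memory. It is written from the descriptions only: the function name, the -1 error convention, and the decision to reject code 248 (which applies only to CGA movies) and any byte below 248 are assumptions.

```c
#include <stddef.h>
#include <string.h>

#define VGA_FRAME_BYTES 64000L

/* Decode one compressed VGA frame from in[0..len-1] into screen, which
   must hold VGA_FRAME_BYTES bytes.  Returns the number of bytes stored
   and sets *consumed to the input bytes used (including the 255 end
   code), or returns -1 if the data is malformed. */
long decode_vga_frame(const unsigned char *in, size_t len,
                      unsigned char *screen, size_t *consumed)
{
    size_t i = 0;
    long n = 0, count;

    while (i < len) {
        unsigned char op = in[i++];

        if (op == 255) {               /* end of frame */
            *consumed = i;
            return n;
        }
        switch (op) {
        case 249: case 251: case 252:  /* long forms: Llow, then Lhigh */
            if (len - i < 2) return -1;
            count = in[i] + in[i + 1] * 256L;
            i += 2;
            break;
        case 250: case 253: case 254:  /* short forms: one Length byte */
            if (i >= len) return -1;
            count = in[i++];
            break;
        default:                       /* 248 or an unknown code */
            return -1;
        }
        if (count > VGA_FRAME_BYTES - n) return -1;
        if (op == 249 || op == 250) {          /* zero run */
            memset(screen + n, 0, (size_t) count);
        } else if (op == 252 || op == 254) {   /* nonzero run */
            if (i >= len) return -1;
            memset(screen + n, in[i++], (size_t) count);
        } else {                               /* uncompressed sequence */
            if (len - i < (size_t) count) return -1;
            memcpy(screen + n, in + i, (size_t) count);
            i += (size_t) count;
        }
        n += count;
    }
    return -1;   /* input ended before the end of frame code */
}
```

Since the header and palette occupy 3 + 771 bytes, the first frame of a movie starts at offset 774; each later frame starts where the previous one's consumed count left off.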

