Cellular Automata Laboratory
Note: this information is believed accurate as of the time of the release of CelLab. It describes internal file formats read and written by the program which are not normally manipulated by user programs. These file formats are subject to change in future releases of CelLab. If you develop programs that manipulate CelLab files based on this information, please be aware that you may have to modify your programs to work with future releases of CelLab.
This information is for experienced systems programmers; as long as you use CelLab and its associated tools, you don't need to worry about its file formats. This information is provided in case you want to interface CelLab to other software via its files. We can't help you debug programs you develop based on this information, but dumping files written in these formats with your debugger should help resolve any problems.
That said, onward to the gory details!
Pattern files can be written in either ASCII or binary format, and each format can be written either compressed or uncompressed. All pattern files are written with an extension of .jcp; the format is determined by examining the bytes at the beginning of the file. All formats represent a pattern map consisting of 200 lines, each containing 322 bytes. Within each line the first and last bytes are used internally within CelLab, do not appear on the screen, and can be ignored (they're saved in the pattern file because it's faster to dump uncompressed binary files with them included). Pattern files represent a total of 64,400 bytes (322×200), stored in various forms. The top line on the screen appears first in the pattern file and the bottom line last. Bytes within each line are written with the leftmost pixel on the screen first and the rightmost pixel last (surrounded, of course, by the two extra bytes).
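The layout described above can be summarized as an offset calculation. The following is a minimal sketch, not code from CelLab; the name map_offset and the 0-based (x, y) convention are assumptions made for illustration.

```python
LINE_BYTES = 322   # bytes stored per line, including the two hidden edge bytes
LINES = 200        # lines in the pattern map


def map_offset(x, y):
    """Offset within the 64,400-byte map of visible pixel (x, y).

    x and y are 0-based, counted from the top left corner of the screen
    (an assumption of this sketch).
    """
    if not (0 <= x < LINE_BYTES - 2 and 0 <= y < LINES):
        raise ValueError("pixel outside the visible 320x200 area")
    return y * LINE_BYTES + 1 + x   # +1 skips the hidden first byte of the line
```

With this convention the top left visible pixel is at offset 1 and the bottom right visible pixel at offset 64,398, leaving offsets 0 and 64,399 for hidden edge bytes.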
ASCII pattern files are primarily intended for interchange with programming languages which cannot readily process binary files. ASCII files always consume more disk space than their binary equivalent and take longer to load and save.
An uncompressed ASCII pattern file consists simply of 64,400 hexadecimal numbers, separated by spaces. Line breaks are inserted to limit the length of lines to less than 80 characters, and an extra line break is inserted at the end of each of the 200 screen lines. The hexadecimal numbers are written without leading zeroes or other number format flags. Numbers from uncompressed pattern files can be read directly with a C format string of “%x”. Uncompressed ASCII pattern files are huge and take forever to read and write, but they're the easiest format to decode. Here are the first two lines of an uncompressed ASCII pattern file representing a random map:
0 A1 92 F5 C7 F9 DC D 36 8 1B C6 AB 78 42 A BA CE 5E EE FA F7 3F 4E C8 B6 6F A2 7D AA 7E BE AB 61 94 2 46 8E 23 8B 1 6 B3 32 A3 F0 A5 7 34 81
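Because the numbers are separated only by whitespace, a reader does not need to care where the line breaks fall. The following is a minimal sketch of such a reader under that assumption; the function name parse_uncompressed_ascii is hypothetical and the code is not taken from CelLab.

```python
ROWS, ROW_BYTES = 200, 322   # map geometry given in the text


def parse_uncompressed_ascii(text):
    """Parse an uncompressed ASCII pattern into 200 rows of 320 visible bytes."""
    # Line breaks and spaces are treated alike; each token is a hex number.
    values = [int(token, 16) for token in text.split()]
    if len(values) != ROWS * ROW_BYTES:
        raise ValueError(f"expected {ROWS * ROW_BYTES} values, got {len(values)}")
    rows = [values[r * ROW_BYTES:(r + 1) * ROW_BYTES] for r in range(ROWS)]
    # Drop the first and last byte of each line, which CelLab uses internally.
    return [row[1:-1] for row in rows]
```

The function returns a list of rows, top line first, so rows[y][x] is the value read for the visible pixel at column x of line y.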
Compressed ASCII pattern files are run-length encoded. They consist of a sequence of space-separated pairs of numbers, with the two numbers in each pair separated by a comma. The first number gives, in decimal, the number of consecutive occurrences of the value given, in hexadecimal, by the second number. Newline characters are inserted to keep line lengths below 80 characters. Data in the pattern map may be compressed across lines. The first character of a compressed ASCII pattern file is always an asterisk; this identifies it as a compressed pattern file. The following is a compressed pattern file representing a single dot with state 237 (which in hexadecimal is 0xED):
*32684,0 1,ED 31715,0
Note that the numbers total to 64,400, the total number of bytes in the map.
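A decoder for this format can be quite short. The sketch below expands the count,value pairs into a flat list of 64,400 values; the function name decode_compressed_ascii is an assumption for illustration and the code is not part of CelLab.

```python
MAP_BYTES = 322 * 200   # 64,400 bytes in the pattern map


def decode_compressed_ascii(text):
    """Expand a compressed ASCII pattern into a flat list of MAP_BYTES values."""
    text = text.strip()
    if not text.startswith("*"):
        raise ValueError("compressed ASCII patterns begin with an asterisk")
    cells = []
    for pair in text[1:].split():
        count, value = pair.split(",")
        # The count is decimal, the value hexadecimal.
        cells.extend([int(value, 16)] * int(count))
    if len(cells) != MAP_BYTES:
        raise ValueError(f"runs total {len(cells)}, expected {MAP_BYTES}")
    return cells
```

Applied to the single-dot example above, the decoder yields 64,400 values of which only the one at offset 32,684 is nonzero. That is byte 162 of line 101, counting both from zero.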
Data bytes in binary pattern files are written in the internal format used within CelLab. Bytes within CelLab are rotated one bit to the right so that the bit representing Plane 0 appears as the most significant bit of each byte. When you read or write a binary pattern map, you'll have to compensate for this by rotating the data one bit to the left before interpreting it as a normal state code.
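The rotation can be expressed as a pair of one-line helpers. This is a sketch assuming that Plane 0 is the least significant bit of a normal state code, which is how we read the description above; the function names are hypothetical.

```python
def to_internal(state):
    """Rotate a normal state code one bit right (Plane 0 becomes the top bit)."""
    return ((state >> 1) | (state << 7)) & 0xFF


def from_internal(byte):
    """Rotate an internal-format byte one bit left to recover the state code."""
    return ((byte << 1) | (byte >> 7)) & 0xFF
```

For example, state 1 (only Plane 0 set) is stored as 0x80, and state 237 (0xED) is stored as 0xF6.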
All binary pattern files have an ASCII colon “:” (hexadecimal code 3A) as the first byte. The colon is followed by one or more instructions as defined in the following table. The end of the binary pattern file is marked by the appearance of an RLEND instruction (byte with value 6). The instruction codes and count bytes that follow the instruction are not rotated—only value bytes representing cell states are rotated one bit.
    RLUNCOMP = 1    64K of uncompressed data follows
    RLRUN    = 2    2-257 byte run of value follows
    RLONEB   = 3    Single byte of specified value follows
    RLUNCS   = 4    Uncompressed stream follows
    RLEND    = 6    End of pattern
    RLLRUN   = 7    Long run > 256 bytes
    RLLUNCS  = 8    Long uncompressed stream > 256 bytes
Uncompressed binary pattern files consist of the initial colon, an RLUNCOMP instruction (code 1), followed by 64,400 bytes of pattern map data (each byte rotated one bit to the right, as noted above), followed by an RLEND (code 6) instruction. Since pattern maps can be loaded and dumped directly from the map with one I/O call, uncompressed binary pattern files can be read and written very quickly.
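The uncompressed binary layout is simple enough to read and write in a few lines. The following sketch assumes the file contains exactly the colon, the RLUNCOMP code, the 64,400 data bytes, and the RLEND code, with nothing after it; the function names are hypothetical and the code is not taken from CelLab.

```python
MAP_BYTES = 322 * 200      # 64,400 bytes in the pattern map
RLUNCOMP, RLEND = 1, 6     # instruction codes from the table above


def write_uncompressed_binary(states):
    """Build the bytes of an uncompressed binary pattern from normal state codes."""
    if len(states) != MAP_BYTES:
        raise ValueError(f"expected {MAP_BYTES} states, got {len(states)}")
    # Value bytes are rotated one bit right; the instruction codes are not.
    body = bytes(((s >> 1) | (s << 7)) & 0xFF for s in states)
    return b":" + bytes([RLUNCOMP]) + body + bytes([RLEND])


def read_uncompressed_binary(data):
    """Return the normal state codes stored in an uncompressed binary pattern."""
    if len(data) != MAP_BYTES + 3 or data[0:1] != b":":
        raise ValueError("not an uncompressed binary pattern file")
    if data[1] != RLUNCOMP or data[-1] != RLEND:
        raise ValueError("missing RLUNCOMP or RLEND instruction")
    # Rotate each value byte one bit left to undo the internal format.
    return [((b << 1) | (b >> 7)) & 0xFF for b in data[2:-1]]
```

The resulting file is 64,403 bytes long: one colon, one instruction byte, the map, and the closing RLEND.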
Compressed binary pattern files begin with the colon and end with an RLEND instruction. The data between these two bytes is a series of instructions representing the data in run length compressed form. The interpretation of each instruction and the bytes that follow it is as given below:
Color palette files can be written in ASCII or binary form. All have the default extension of .jcc; the format is determined from the contents of the file. When working with color palettes, the standard pattern chroma.jcp can be useful; it shows you all of the states in the palette, from 0 to 255, in a 16×16 grid of squares, with states reading from left to right, then top to bottom. Blocks of 8 squares are separated by extra space for easier interpretation. You can load this pattern from the drop-down list in the Pattern URL line of the WebCA control panel.
ASCII color palette files have a very simple format; they are easy to generate manually or programmatically for custom representation of states. The format is as follows.
The first line contains a single number, the format indicator, which can be 2, 3, or 5. Values signify:
    2    VGA Palette file
    3    Composite (CGA and VGA) palette file
    5    RGB Palette file (0–255 intensity)
This line is followed by up to 256 lines specifying color assignments for states, starting with state 0. If fewer than 256 lines follow the format indicator, the color assignments for the remaining states are left as they were before the file was loaded. Each line of a VGA palette file specifies the red, green, and blue intensities from 0 to 63, separated by spaces, for each state. A composite file supplies both CGA and VGA values for each state with the VGA red, green, and blue intensities followed by the CGA color index for the state. The CGA color index was used by earlier versions of CelLab and is now ignored. RGB palette files are the same as VGA palette files except that intensities of the red, green, and blue components run from 0 to 255.
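As an illustration of the ASCII palette layout, here is a minimal reader sketch. It assumes that a blank line ends the list of assignments and that any fields after the first three on a line may be discarded; the function name parse_ascii_palette is hypothetical and the code is not taken from CelLab.

```python
def parse_ascii_palette(text):
    """Return (format indicator, list of (r, g, b)) for states 0, 1, 2, ..."""
    lines = text.splitlines()
    fmt = int(lines[0].strip())
    if fmt not in (2, 3, 5):
        raise ValueError(f"unsupported format indicator {fmt}")
    limit = 255 if fmt == 5 else 63      # RGB files use 0-255, VGA files 0-63
    colors = []
    for line in lines[1:257]:            # at most 256 state lines
        fields = line.split()
        if not fields:
            break                        # assumption: a blank line ends the list
        # A fourth field (the CGA index in a composite file) is ignored.
        r, g, b = (int(f) for f in fields[:3])
        if not all(0 <= c <= limit for c in (r, g, b)):
            raise ValueError(f"intensity out of range in line {line!r}")
        colors.append((r, g, b))
    return fmt, colors
```

The returned list may hold fewer than 256 entries; states beyond its end are the ones whose colors the file leaves unchanged.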
Procedural color palette files allow a compact textual specification of complex color palettes for rules which employ “housekeeping bits” in their states which you don't want to affect the display of the map. For example, our Perfume rules use two bit planes to represent the gas and containers, but six other planes for internal information. Crafting a palette which displays only the relevant information while excluding the other planes can be tedious and prone to error. A procedural palette can get the job done in just a few lines.
A procedural palette file begins with a line containing the number “7”; this indicates it is in procedural form. Subsequent lines are statements as follows. Blank lines and any material after two slashes (“//”) are ignored; the latter may be used for comments.
- mask number
- Mask the physical states by ANDing with number. The number is assumed to be in decimal, or hexadecimal if it begins with “0x”. If no mask is specified, 0xFF is assumed. Only one mask should be specified; if the palette contains more than one, the last mask will be used.
- state statelist color[-color]
- The CSS color specification is assigned to the states in statelist, which is a comma separated list of states, each of which can be a number (again, hexadecimal if preceded by “0x”), or a range separated by a hyphen, with an optional increment also preceded by a hyphen. “state” may be abbreviated to “s”. If multiple assignments are made to the same state, the last will be used. If a second color is given, separated by a hyphen from the first, the states in statelist will be filled with a linear gradient starting with the first color and ending with the second.
For example, the following is a complete procedural palette file for the Perfume rules.
    7
    // Procedural palette for the Perfume rules
    mask 6
    state 0 black               // Color name
    s 2 #0000FF                 // Hex RGB specification
    state 4,6 rgb(255, 255, 0)  // Decimal RGB
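To see why the mask keeps the palette short, it can help to expand it by hand. The sketch below reflects our reading of the mask statement, namely that the color shown for a physical state is the one assigned to that state ANDed with the mask. The function name build_color_table is hypothetical, and colors are kept as plain CSS strings rather than being parsed.

```python
def build_color_table(mask, assignments):
    """Expand masked color assignments into one entry per physical state.

    assignments maps a masked state to a CSS color string; physical states
    whose masked value has no assignment get None in this sketch.
    """
    return [assignments.get(state & mask) for state in range(256)]


# The Perfume palette from the example: mask 6 and four masked states.
perfume = build_color_table(6, {
    0: "black",
    2: "#0000FF",
    4: "rgb(255, 255, 0)",
    6: "rgb(255, 255, 0)",
})
```

Under this reading all 256 physical states receive one of just three colors, regardless of what the six housekeeping planes contain.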
You can specify any number of states and ranges on a “state” declaration:
state 2,3,7-12,0x19-0x21,128-252-4 hsl(120, 60%, 70%)
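The statelist syntax can be expanded mechanically. The following sketch assumes that ranges include both end points and that the optional third number is the increment, which is how we read the description above; the function name expand_statelist is hypothetical and the code is not taken from CelLab.

```python
def parse_number(token):
    """Parse a decimal number, or a hexadecimal one if it begins with 0x."""
    token = token.strip()
    return int(token, 16) if token.lower().startswith("0x") else int(token, 10)


def expand_statelist(statelist):
    """Expand a statelist such as '2,3,7-12,128-252-4' into a list of states."""
    states = []
    for item in statelist.split(","):
        parts = [parse_number(p) for p in item.split("-")]
        if len(parts) == 1:
            states.append(parts[0])
        elif len(parts) in (2, 3):
            start, end = parts[0], parts[1]
            step = parts[2] if len(parts) == 3 else 1
            # Assumption: the end of a range is included.
            states.extend(range(start, end + 1, step))
        else:
            raise ValueError(f"malformed statelist item {item!r}")
    return states
```

Under these assumptions the statelist in the example above expands to 49 states: 2 and 3, then 7 through 12, 25 through 33, and every fourth state from 128 through 252.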
It is permissible to replace the color assignment to a state in a subsequent statement. This is handy in cases where you want to specify exceptions to a large range. The following sets all states to grey, then assigns different colors to a few states in the range.
    state 0-255 grey
    state 1 red
    state 17 skyblue
    state 63-95 yellow-blue
Binary color palette files are all precisely 771 bytes long. The first three bytes are an ASCII format code of “4” (code 34 hex) or “6” (code 36 hex), a carriage return (0D hex), and a line feed (code 0A hex). This prologue, which identifies the file as being binary format, is followed by 768 bytes, with consecutive triples of bytes specifying the color assignments for the 256 state codes from 0 to 255 (768 = 3 × 256). If the format code is “4”, the three bytes in each triple specify the red, green, and blue intensities from 0 through 63. The first byte of each triple also may contain, in its two most significant bits, a legacy color index assignment for the state, from 0 to 3, which is ignored by WebCA. If the format code is “6”, the three bytes of the triple give red, green, and blue intensities from 0 to 255.
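The fixed 771-byte layout makes binary palettes easy to unpack. This is a minimal sketch rather than CelLab's own reader; the function name parse_binary_palette is an assumption for illustration.

```python
def parse_binary_palette(data):
    """Return (format code, list of 256 (r, g, b) triples) from a binary palette."""
    if len(data) != 771:
        raise ValueError(f"binary palettes are 771 bytes, got {len(data)}")
    code = chr(data[0])
    if code not in ("4", "6") or data[1:3] != b"\r\n":
        raise ValueError("bad prologue: expected '4' or '6', then CR LF")
    colors = []
    for state in range(256):
        r, g, b = data[3 + 3 * state:6 + 3 * state]
        if code == "4":
            r &= 0x3F   # discard the legacy color index in the top two bits
        colors.append((r, g, b))
    return code, colors
```

For format “4” the intensities returned are in the range 0 to 63, and for format “6” in the range 0 to 255; the sketch does not rescale them.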
CelLab evaluates cellular automata rules by table look-up. The basic content of a rule definition (.jc) file is the values to be loaded into the lookup table to define the rule. Rule definitions may also set numerous modes that affect the operation of CelLab, request the loading of patterns, color palettes, and the like.
Rule definition files are always binary files. They consist of a sequence of instruction codes, each followed by data specific to that instruction. The methods and instruction codes used in rule definition files are very similar to those used in compressed binary pattern files. The instruction codes used in rule definition files are as follows:
Rule lookup table compression instructions:
    RLUNCOMP = 1    64K of uncompressed rule follows
    RLRUN    = 2    2-257 byte run of value follows
    RLONEB   = 3    Single byte of specified value follows
    RLUNCS   = 4    Uncompressed stream follows
    RLCOPYB  = 5    Copy previously specified bank
    RLEND    = 6    End of rule definition, parameters
Rule mode request instructions:
    RSHTEXT  = 64    Horizontal texture request
    RSVTEXT  = 65    Vertical texture request
    RSRAND   = 66    Random stimulus request
    RSPAT    = 67    Pattern load request
    RSPAL    = 68    Palette load request
    RSEPAT   = 69    Embedded palette address
    RSEPAL   = 70    Embedded pattern address
    RSRSEED  = 71    Initial random seed
    RSOCODE  = 72    Own code load request
    RSEOCODE = 73    Embedded own code address
The lookup table always consists of 65,536 bytes of data (even though many rules do not need or use the entire table), and a rule definition file always loads every byte of the table. Each byte in the table represents the new state for a cell when its state and that of its neighbors select that entry in the rule lookup table. The cell states in the lookup tables are stored in the internal format with Plane 0 as the most significant bit and Planes 1 through 7 as the least significant 7 bits (in other words, the states are rotated circularly one bit to the right). The relationship between the values of the neighbors and the lookup table indices for the various settings of worldtype can be seen by examining the JavaScript or Java rule generation code in the source code supplied with CelLab.
The compression algorithms used keep the size of rule files commensurate with the actual data needed by the rule. First, let's examine the instructions used to compress the rule table itself.
There isn't one uniquely correct way to compress a rule table—rules may be encoded in various ways, but the contents of the lookup table will be identical after the rule is loaded.
The following instructions specify modes which the rule can set. The RLEND instruction both specifies modes and marks the end of the rule definition; it is required. All of the other mode request instructions are optional; if they are not specified, the default values are used. These instructions convey the rule requests made by calls to the various set… methods in the jcruleModes method of a Java rule program, or by setting the variables with the corresponding names in a JavaScript rule program. Please refer to the documentation of these variables for details on their interpretation.
Rule definition files are the most complex of the files used by CelLab, but if you get confused about their format, you can always examine the source code of the Java rule maker to resolve any questions.
Population census log files consist of ASCII records in CSV format indicating the population of cells in each state at the time the log entry was made, whether manually or by an automatic dump every so many generations. Each record occupies a line in the file and each record is of the form:
generation,state1,count1,…
Here generation is the generation number at which the population census was taken, and the staten and countn pairs give the number of cells in each state. States which have a count of zero are omitted. All numbers are decimal.
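A census record can be split into its generation number and a state-to-count mapping with a few lines of code. The sketch below is an illustration only; the function name parse_census_record and the sample record in the usage note are hypothetical.

```python
def parse_census_record(line):
    """Return (generation, {state: count}) for one line of a census log."""
    fields = [int(f) for f in line.strip().split(",")]
    generation, rest = fields[0], fields[1:]
    if len(rest) % 2 != 0:
        raise ValueError("state and count fields must come in pairs")
    # Even positions hold states, odd positions the matching counts.
    counts = dict(zip(rest[0::2], rest[1::2]))
    return generation, counts
```

For a made-up record such as 100,0,63000,1,1000 this returns generation 100 with counts for states 0 and 1. Since states with a count of zero are omitted from the log, counts.get(state, 0) is a convenient way to look up any state.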