Cellular Automata Laboratory
JC and CelLab File Formats
Note: this information is believed accurate as of the time of the
release of CelLab. It describes internal file formats read and
written by the program which are not normally manipulated by user
programs. These file formats are subject to change in future releases
of CelLab. If you develop programs that manipulate CelLab files based
on this information, please be aware that you may have to modify your
programs to work with future releases of CelLab.
This information is for experienced systems programmers; as long as
you use CelLab and its associated tools, you don't need to worry about
its file formats. This information is provided in case you want to
interface CelLab to other software via its files. We can't help you
debug programs you develop based on this information, but dumping
files written in these formats with your debugger should help resolve
any problems.
That said, onward to the gory details!
PATTERN FILES
Pattern files can be written either in ASCII or binary format. Each
format can be written either compressed or uncompressed. All pattern
files are written with an extension of .jcp--the format is determined
by examining the bytes at the beginning of the file. All formats
represent a pattern map consisting of 200 lines, each containing 322
bytes. Within each line the first and last bytes are used internally
within CelLab, do not appear on the screen, and can be ignored
(they're saved in the pattern file because it's faster to dump
uncompressed binary files with them included). Pattern files
represent a total of 64,400 bytes (322×200), stored in various
forms. The top line on the screen appears first in the pattern file
and the bottom line last. Bytes within each line are written with
the leftmost pixel on the screen first and the rightmost pixel last
(surrounded, of course, by the two extra bytes).
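As a small illustration of this layout, the following C sketch computes where a screen pixel lands in the 64,400-byte map. The names map_offset, MAP_LINE_BYTES, MAP_LINES, and MAP_BYTES are hypothetical (they are not taken from the CelLab source), and the sketch assumes x runs from 0 to 319 and y from 0 to 199.

```c
#include <stddef.h>

/* Hypothetical names for the map dimensions described above. */
#define MAP_LINE_BYTES 322                      /* 320 pixels + 2 internal bytes */
#define MAP_LINES      200
#define MAP_BYTES      ((long)MAP_LINE_BYTES * MAP_LINES)   /* 64,400 */

/* Offset within the pattern map of screen pixel (x, y), where (0, 0)
   is the top left corner of the screen. */
static size_t map_offset(int x, int y)
{
    /* The + 1 skips the internal byte at the start of each line. */
    return (size_t)y * MAP_LINE_BYTES + (size_t)x + 1;
}
```

With these definitions, map_offset(0, 0) is 1 and map_offset(319, 199) is 64,398, leaving offsets 0 and 64,399 for the first and last internal bytes.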
ASCII pattern files are primarily intended for interchange
with programming languages which cannot readily process binary files.
ASCII files always consume more disk space than their binary
equivalent and take longer to load and save.
Uncompressed ASCII Pattern Files
An uncompressed ASCII pattern file consists simply of a total of
64,400 hexadecimal numbers, separated by spaces. Line breaks are
inserted to limit the length of lines to less than 80 characters and
an extra line break is inserted at the end of each of the 200 screen
lines. The hexadecimal numbers are written without leading zeroes
or other number format flags. Numbers from uncompressed pattern files
can be read directly with a C format string of "%x". Uncompressed
ASCII pattern files are huge and take forever to read and write, but
they're the easiest format to decode. Here are the first two lines of
an uncompressed ASCII pattern file representing a random map:
0 A1 92 F5 C7 F9 DC D 36 8 1B C6 AB 78 42 A BA CE 5E EE FA F7 3F 4E C8
B6 6F A2 7D AA 7E BE AB 61 94 2 46 8E 23 8B 1 6 B3 32 A3 F0 A5 7 34 81
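The "%x" approach can be sketched as follows. This is a minimal example, not code from CelLab: the function name parse_hex_values is hypothetical, and it parses from a string already in memory (the same format string works with fscanf on an open file).

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: parse up to max hexadecimal numbers from text
   into out. Returns the number of values stored. */
static size_t parse_hex_values(const char *text, unsigned char *out, size_t max)
{
    size_t n = 0;
    unsigned int value;
    int used;

    /* "%x" skips leading white space (including line breaks); "%n"
       reports how many characters were consumed so the scan can
       resume from that point. */
    while (n < max && sscanf(text, "%x%n", &value, &used) == 1) {
        out[n++] = (unsigned char)value;
        text += used;
    }
    return n;
}
```

A complete map would be read by calling this with max set to 64,400 and checking that the returned count matches.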
Compressed ASCII Pattern Files
Compressed ASCII pattern files are run length encoded--they consist of
a sequence of space-separated pairs of numbers, with the two numbers of
each pair separated by a comma. The first number gives, in decimal, the number of
consecutive occurrences of the value given, in hexadecimal, by the
second number. New line characters are inserted to keep line lengths
below 80 characters. Data in the pattern map may be compressed across
lines. The first character of a compressed ASCII pattern file is
always an asterisk; this identifies it as a pattern file. The
following is a compressed pattern file representing a single dot with
state 237 (which in hexadecimal is 0ED):
*32684,0 1,ED 31715,0
Note that the numbers total to 64,400, the total number of bytes in
the map.
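A decoder for this format might look like the sketch below. It is an illustration under stated assumptions rather than CelLab's own loader: the name decode_ascii_rle is hypothetical, the whole file is assumed to be in memory as a zero-terminated string, and the return value convention (bytes stored, or -1 on error) is invented for the example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: expand a compressed ASCII pattern held in text
   into map (capacity size bytes). Returns the number of bytes stored,
   or -1 if the leading asterisk is missing or the runs overflow map. */
static long decode_ascii_rle(const char *text, unsigned char *map, size_t size)
{
    size_t pos = 0;
    unsigned long count;
    unsigned int value;
    int used;

    if (*text != '*')
        return -1;
    text++;
    /* Each pair is "count,value": count in decimal, value in hex. */
    while (sscanf(text, "%lu,%x%n", &count, &value, &used) == 2) {
        if (count > size - pos)
            return -1;
        memset(map + pos, (int)value, (size_t)count);
        pos += (size_t)count;
        text += used;
    }
    return (long)pos;
}
```

Applied to the single-dot example above, this stores 64,400 bytes with the value ED (hexadecimal) at offset 32,684 and zeroes everywhere else. The caller should check that the result equals 64,400.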
Binary Pattern Files
Data bytes in binary pattern files are written in the internal format
used within CelLab. Bytes within CelLab are rotated one bit to the right
so that the bit representing Plane 0 appears as the most significant
bit of each byte. When you read or write a binary pattern map, you'll
have to compensate for this by rotating the data one bit to the left
before interpreting it as a normal state code.
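The rotation can be expressed in C as a pair of one-line helpers. The names rotl8 and rotr8 are hypothetical; rotl8 converts a byte read from a binary file into a normal state code, and rotr8 converts a state code into the internal form before writing.

```c
/* Rotate an 8-bit value circularly one bit to the left
   (internal format -> normal state code). */
static unsigned char rotl8(unsigned char b)
{
    return (unsigned char)(((b << 1) | (b >> 7)) & 0xFF);
}

/* Rotate an 8-bit value circularly one bit to the right
   (normal state code -> internal format). */
static unsigned char rotr8(unsigned char b)
{
    return (unsigned char)(((b >> 1) | (b << 7)) & 0xFF);
}
```

For example, state 1 (only Plane 0 set) is stored as hexadecimal 80, and state 237 (hexadecimal ED) is stored as F6.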
All binary pattern files have an ASCII colon ":" (hexadecimal code 3A)
as the first byte. The colon is followed by one or more instructions
as defined in the following table. The end of the binary pattern file
is marked by the appearance of an RLEND instruction (byte with value
6). The instruction codes and count bytes that follow the instruction
are not rotated--only value bytes representing cell states are rotated
one bit.
RLUNCOMP = 1 64K of uncompressed data follows
RLRUN = 2 2-257 byte run of value follows
RLONEB = 3 Single byte of specified value follows
RLUNCS = 4 Uncompressed stream follows
RLEND = 6 End of pattern
RLLRUN = 7 Long run > 256 bytes
RLLUNCS = 8 Long uncompressed stream > 256 bytes
Uncompressed Binary Pattern Files
Uncompressed binary pattern files consist of the initial colon, an
RLUNCOMP instruction (code 1), followed by 64,400 bytes of pattern map
data (each byte rotated one bit to the right, as noted above),
followed by an RLEND (code 6) instruction. Since pattern maps can be
loaded and dumped directly from the map with one I/O call,
uncompressed binary pattern files can be read and written very
quickly.
Compressed Binary Pattern Files
Compressed binary pattern files begin with the colon and end with an
RLEND instruction. The data between these two bytes is a series of
instructions representing the data in run length compressed form. The
interpretation of each instruction and the bytes that follow it is as
given below:
- RLRUN Count Value
- Store Count + 1 bytes of Value in the next consecutive
cells of the map buffer.
- RLLRUN CountHi CountLow Value
- Compute Count = ((CountHi × 256) + CountLow), then store
Count + 1 bytes of Value in the next consecutive cells of
the map buffer.
- RLONEB Value
- Store Value in the next cell of the map buffer.
- RLUNCS Count Value[1] Value[2] ... Value[Count+1]
- Store the Count + 1 bytes specified by the Value[n] bytes
in the next cells of the map buffer.
- RLLUNCS CountHi CountLow Value[1] Value[2] ... Value[Count+1]
- Compute Count = ((CountHi × 256) + CountLow), then store
the Count + 1 bytes specified by the Value[n] bytes in the
next cells of the map buffer.
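The instructions above can be interpreted with a loop along the following lines. This is a sketch, not CelLab's loader: decode_binary_pattern and rotl8 are hypothetical names, the whole file is assumed to be in memory, the decoder also accepts RLUNCOMP so the same routine handles uncompressed files, and value bytes are rotated left so the output holds normal state codes.

```c
#include <stddef.h>
#include <string.h>

enum { RLUNCOMP = 1, RLRUN = 2, RLONEB = 3, RLUNCS = 4,
       RLEND = 6, RLLRUN = 7, RLLUNCS = 8 };
#define MAP_BYTES 64400UL

static unsigned char rotl8(unsigned char b)
{
    return (unsigned char)(((b << 1) | (b >> 7)) & 0xFF);
}

/* Hypothetical helper: decode a binary pattern file image into map
   (MAP_BYTES long). Returns the number of bytes stored when RLEND is
   reached, or -1 on malformed or truncated input. */
static long decode_binary_pattern(const unsigned char *in, size_t len,
                                  unsigned char *map)
{
    size_t i = 0, pos = 0, n, need, k;
    int literal;   /* nonzero: n value bytes follow; zero: one byte repeated */

    if (len == 0 || in[i++] != ':')
        return -1;
    while (i < len) {
        unsigned char op = in[i++];
        switch (op) {
        case RLEND:
            return (long)pos;
        case RLUNCOMP:
            n = MAP_BYTES; literal = 1; break;
        case RLONEB:
            n = 1; literal = 1; break;
        case RLRUN: case RLUNCS:
            if (len - i < 1) return -1;
            n = in[i++] + 1u;                     /* Count + 1 */
            literal = (op == RLUNCS);
            break;
        case RLLRUN: case RLLUNCS:
            if (len - i < 2) return -1;
            n = in[i] * 256u + in[i + 1] + 1u;    /* CountHi, CountLow */
            i += 2;
            literal = (op == RLLUNCS);
            break;
        default:
            return -1;
        }
        need = literal ? n : 1;
        if (n > MAP_BYTES - pos || need > len - i)
            return -1;
        for (k = 0; k < n; k++)
            map[pos + k] = rotl8(in[i + (literal ? k : 0)]);
        pos += n;
        i += need;
    }
    return -1;   /* ran off the end without seeing RLEND */
}
```

A caller would normally verify that the result is 64,400 before using the map.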
COLOR PALETTE FILES
Color palette files can be written in ASCII or binary form. All have
the default extension of .jcc; the format is determined from the
contents of the file.
ASCII color palette files have a very simple format; they are easy to
generate manually or programmatically for custom representation of
states. The format is as follows.
The first line contains a single number, the format indicator, which
ranges from 1 to 3. Values signify:
1 CGA Palette file
2 VGA Palette file
3 Composite (CGA and VGA) palette file
This line is followed by up to 256 lines specifying color assignments
for states. If fewer than 256 lines follow the format indicator, the
color assignments for those states will be left as before. For a CGA
palette file, each line contains a single number which specifies the
CGA color for that state (the same color indices you specify with the
Alt-F9 key in JC). Each line of a VGA palette file specifies the red,
green, and blue intensities from 0 to 63, separated by spaces, for
each state. A composite file supplies both CGA and VGA values for
each state with the VGA red, green, and blue intensities followed by
the CGA color index for the state. JC and CelLab always generate composite
palette files when a palette is dumped with Ctrl-F5, even if only one
display has ever been used. Material that follows the
required specifications is ignored; CelLab takes advantage of this to
include comments in the palette files it writes to aid in their
interpretation.
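Parsing one color assignment line can be sketched with sscanf, which conveniently ignores anything after the fields it was asked to read. The structure PaletteEntry and the function parse_palette_line are hypothetical names for this example; the format argument is the indicator read from the first line of the file.

```c
#include <stdio.h>

/* Hypothetical container for one state's color assignment. */
struct PaletteEntry {
    int red, green, blue;   /* VGA intensities, 0 to 63 */
    int cga;                /* CGA color index */
};

/* Hypothetical helper: parse one color assignment line according to
   the format indicator (1 = CGA, 2 = VGA, 3 = composite). Fields not
   present in the given format are left unchanged. Returns 0 on
   success, -1 if the line lacks the required numbers. */
static int parse_palette_line(int format, const char *line,
                              struct PaletteEntry *e)
{
    switch (format) {
    case 1:
        return sscanf(line, "%d", &e->cga) == 1 ? 0 : -1;
    case 2:
        return sscanf(line, "%d %d %d",
                      &e->red, &e->green, &e->blue) == 3 ? 0 : -1;
    case 3:
        return sscanf(line, "%d %d %d %d",
                      &e->red, &e->green, &e->blue, &e->cga) == 4 ? 0 : -1;
    default:
        return -1;
    }
}
```

Trailing material, such as the comments CelLab writes, is simply never examined.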
Binary color palette files are all precisely 771 bytes long. The
first three bytes are an ASCII "4" (code 34 hex), a carriage return
(0D hex), and a line feed (code 0A hex). This prologue, which
identifies the file as being binary format, is followed by 768 bytes,
with consecutive triples of bytes specifying the color assignments for
the 256 state codes from 0 to 255 (768 = 3 × 256). The three bytes in
each triple specify the red, green, and blue intensities for the VGA,
from 0 through 63. The first byte of each triple also contains, in
its two most significant bits, the CGA color index assignment for the
state, from 0 to 3.
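Unpacking a binary palette is mostly a matter of separating the CGA index from the red byte. The sketch below assumes the whole 771-byte file is in memory; decode_binary_palette and JCC_BINARY_BYTES are hypothetical names, and the green and blue bytes are copied as they appear in the file.

```c
#include <stddef.h>

#define JCC_BINARY_BYTES 771   /* 3-byte prologue + 256 triples */

/* Hypothetical helper: unpack a binary palette file image into VGA
   intensities (rgb) and CGA color indices (cga) for states 0 to 255.
   Returns 0 on success, -1 if the length or prologue is wrong. */
static int decode_binary_palette(const unsigned char *file, size_t len,
                                 unsigned char rgb[256][3],
                                 unsigned char cga[256])
{
    int s;

    if (len != JCC_BINARY_BYTES ||
        file[0] != '4' || file[1] != 0x0D || file[2] != 0x0A)
        return -1;
    for (s = 0; s < 256; s++) {
        const unsigned char *t = file + 3 + 3 * s;
        cga[s]    = (unsigned char)(t[0] >> 6);     /* top two bits */
        rgb[s][0] = (unsigned char)(t[0] & 0x3F);   /* red, 0 to 63 */
        rgb[s][1] = t[1];                           /* green */
        rgb[s][2] = t[2];                           /* blue */
    }
    return 0;
}
```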
RULE DEFINITION FILES
CelLab evaluates cellular automata rules by table look-up. The basic
content of a rule definition (.jc) file is the values to be loaded
into the lookup table to define the rule. Rule definitions may also
set numerous modes that affect the operation of CelLab, request the
loading of patterns, color palettes, and the like. CelLab also
provides a Save Experiment operation. A saved experiment is
actually just a rule definition which contains embedded pattern and
color palette definitions.
Rule definition files are always written in binary mode. They consist
of a sequence of instruction codes, each followed by data specific to
that instruction. The methods and instruction codes used in rule
definition files are very similar to those used in compressed binary
pattern files. The instruction codes used in rule definition files
are as follows:
Rule lookup table compression instructions:
RLUNCOMP = 1 64K of uncompressed rule follows
RLRUN = 2 2-257 byte run of value follows
RLONEB = 3 Single byte of specified value follows
RLUNCS = 4 Uncompressed stream follows
RLCOPYB = 5 Copy previously specified bank
RLEND = 6 End of rule definition, parameters
Rule mode request instructions:
RSHTEXT = 64 Horizontal texture request
RSVTEXT = 65 Vertical texture request
RSRAND = 66 Random stimulus request
RSPAT = 67 Pattern load request
RSPAL = 68 Palette load request
RSEPAT = 69 Embedded pattern address
RSEPAL = 70 Embedded palette address
RSRSEED = 71 Initial random seed
RSOCODE = 72 Own code load request
RSEOCODE = 73 Embedded own code address
The lookup table always consists of 65,536 bytes of data (even though
many rules do not need or use the entire table), and a rule definition
file always loads every byte of the table. Each byte in the table
represents the new state for a cell when its state and that of its
neighbors select that cell in the rule lookup table. The cell states
in the lookup tables are stored in the internal format with Plane 0 as
the most significant bit and Planes 1 through 7 as the least
significant 7 bits (in other words, the states are rotated circularly
one bit to the right). The relationship between the values of the
neighbors and the lookup table indices for the various settings of
worldtype can be seen by examining the Java, C, or Pascal rule generation
code in the source code supplied with CelLab.
The compression algorithms used keep the size of rule files
commensurate with the actual data needed by the rule. First, let's
examine the instructions used to compress the rule table itself.
- RLUNCOMP Value[0] Value[1] ... Value[65535]
- The entire contents of the lookup table is specified by the
65536 bytes that immediately follow this instruction.
CelLab never writes rules in this format, but it will load
rules written with this instruction; it's provided as a
convenience to external programs that want to generate
rules without all the complexity of compressing them. You
can always load a rule in this format and then save it
from CelLab to save space.
- RLRUN Count Value
- Store Count + 1 bytes of Value in the next consecutive
cells of the lookup table.
- RLONEB Value
- Store Value in the next cell of the lookup table.
- RLUNCS Count Value[1] Value[2] ... Value[Count+1]
- Store the Count + 1 bytes specified by the Value[n] bytes
in the next cells of the rule table.
- RLCOPYB Pageno
- The rule table is arbitrarily subdivided into 256 byte
pages, numbered 0 through 255. Due to the structure of
rules, often many pages will be identical. This
instruction copies a prior page, Pageno, and stores it in
the next 256 bytes of the rule table.
There isn't one uniquely correct way to compress a rule table--rules
written by a BASIC rule generation program are encoded differently
than those written from Java, Pascal, or C, but the contents of the lookup
table will be identical after the rule is loaded.
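Expanding the lookup table instructions can be sketched as below. Several things here are assumptions made for the example and not statements about CelLab: the name decode_rule_table is hypothetical, the input pointer is taken to start at the first table instruction, decoding stops as soon as 65,536 bytes have been stored, and RLCOPYB is only accepted when the source page has already been completely filled. Table bytes are left in the internal (rotated) format.

```c
#include <stddef.h>
#include <string.h>

enum { RLUNCOMP = 1, RLRUN = 2, RLONEB = 3, RLUNCS = 4, RLCOPYB = 5 };
#define RULE_TABLE_BYTES 65536UL

/* Hypothetical helper: expand lookup table instructions from in into
   table (RULE_TABLE_BYTES long). Returns the number of input bytes
   consumed once the table is full, or -1 on malformed input. */
static long decode_rule_table(const unsigned char *in, size_t len,
                              unsigned char *table)
{
    size_t i = 0, pos = 0, n;

    while (pos < RULE_TABLE_BYTES && i < len) {
        unsigned char op = in[i++];
        switch (op) {
        case RLUNCOMP:
            if (pos != 0 || len - i < RULE_TABLE_BYTES) return -1;
            memcpy(table, in + i, RULE_TABLE_BYTES);
            i += RULE_TABLE_BYTES;
            pos = RULE_TABLE_BYTES;
            break;
        case RLRUN:                       /* Count Value */
            if (len - i < 2) return -1;
            n = in[i] + 1u;
            if (n > RULE_TABLE_BYTES - pos) return -1;
            memset(table + pos, in[i + 1], n);
            i += 2;
            pos += n;
            break;
        case RLONEB:                      /* Value */
            if (len - i < 1) return -1;
            table[pos++] = in[i++];
            break;
        case RLUNCS:                      /* Count Value[1] ... */
            if (len - i < 1) return -1;
            n = in[i++] + 1u;
            if (n > len - i || n > RULE_TABLE_BYTES - pos) return -1;
            memcpy(table + pos, in + i, n);
            i += n;
            pos += n;
            break;
        case RLCOPYB:                     /* Pageno */
            if (len - i < 1 || RULE_TABLE_BYTES - pos < 256) return -1;
            /* Assumption: the source page must lie wholly before pos. */
            if ((size_t)in[i] * 256 + 256 > pos) return -1;
            memcpy(table + pos, table + (size_t)in[i] * 256, 256);
            i++;
            pos += 256;
            break;
        default:
            return -1;
        }
    }
    return pos == RULE_TABLE_BYTES ? (long)i : -1;
}
```

For instance, a table in which every page is identical can be written as one RLRUN covering page 0 followed by 255 RLCOPYB instructions naming page 0, for a total of 513 bytes.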
The following instructions specify modes which the rule can set. The
RLEND instruction both specifies modes and marks the end of the rule
definition; it is required. All of the other mode request
instructions are optional--if they are not specified the default
values are used. These instructions convey the rule requests made by
calling the various set... methods in the jcruleModes method of a Java
rule program, or by setting the variables with the corresponding names
in C or Pascal (in BASIC the same functions exist but the variable names
are different to comply with BASIC's short identifier restrictions).
Please refer to the documentation of these variables for details on
their interpretation.
- RLEND worldtype randdens auxplane
- This instruction marks the end of the rule definition and
conveys in the three bytes that follow the instruction code
the settings the rule definition function stored in
worldtype, randdens, and auxplane. The process of loading
a rule is ended after this instruction is processed
(although if the file is a saved experiment, pattern and
palette information may be present in the file following
the RLEND instruction).
- RSHTEXT texthb texthn
- Requests horizontal texture of texthn bits, starting at
plane number texthb.
- RSVTEXT textvb textvn
- Requests vertical texture of textvn bits, starting at plane
number textvb.
- RSRAND randb randn
- Requests random input each generation. Randn bits of
random input are stored with the least significant bit in
plane randb.
- RSRSEED rseedb rseedn rseedp
- Requests that an initial random seed be stored when the
rule is loaded or a new pattern is loaded while this rule
is in effect. The least significant bit of the random seed
is stored in plane rseedb and rseedn bits of seed are
stored. The setting of rseedp controls the density of the
seed--a value of 255 results in half the bits being zeroes,
0 generates all zeroes, and intermediate values vary the
density between these limits.
- RSPAT nlen name[0] name[1] ... name[nlen-1] 0
- The value of nlen gives the length, including the
terminating zero, of a file name which CelLab will attempt
to load as a pattern file after loading the rule.
- RSPAL nlen name[0] name[1] ... name[nlen-1] 0
- The value of nlen gives the length, including the
terminating zero, of a file name which CelLab will attempt
to load as a color palette file after loading the rule.
- RSOCODE nlen name[0] name[1] ... name[nlen-1] 0
- The value of nlen gives the length, including the
terminating zero, of a file name which CelLab will attempt
to load as an own code file after loading the rule.
- RSEPAT addr[0] addr[1] addr[2] addr[3]
- The four bytes that follow give the address, written with
the least significant byte first and the most significant
byte last, of a compressed binary format pattern file
embedded within this .jc file somewhere after the RLEND
instruction. After the rule table is loaded, CelLab will
load that embedded pattern. This is used to encode
patterns within saved experiments.
- RSEPAL addr[0] addr[1] addr[2] addr[3]
- The four bytes that follow give the address, written with
the least significant byte first and the most significant
byte last, of a binary color palette file embedded within
this .jc file somewhere after the RLEND instruction. After
the rule table is loaded, CelLab will load that embedded
color palette. This is used to encode color palettes
within saved experiments.
- RSEOCODE addr[0] addr[1] addr[2] addr[3]
- The four bytes that follow give the address, written with
the least significant byte first and the most significant
byte last, of user own code embedded within this .jc file
somewhere after the RLEND instruction. The own code is
written with two bytes which give its length (least
significant byte first, most significant byte last), then
the contents of the .jco file from which the own code was
loaded. Note that own code is embedded within both rules
and experiments saved from CelLab when own code was loaded.
User evaluators implemented as Windows DLLs for use
with CelLab for Windows are not embedded
in experiment files; instead an RSOCODE request naming
the evaluator file is included. Windows isn't able
to load a DLL embedded in another file, so it isn't
possible to embed such evaluators.
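The embedded address instructions all share the same four byte, least significant byte first encoding, which can be decoded as sketched here. The names read_le32 and embedded_offset are hypothetical, and the sketch assumes (the text above does not say so explicitly) that the address is a byte offset from the start of the .jc file.

```c
#include <stddef.h>

enum { RSEPAT = 69, RSEPAL = 70, RSEOCODE = 73 };

/* Assemble a 32-bit value stored least significant byte first. */
static unsigned long read_le32(const unsigned char *p)
{
    return (unsigned long)p[0]
         | ((unsigned long)p[1] << 8)
         | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[3] << 24);
}

/* Hypothetical helper: given the position (at) of an RSEPAT, RSEPAL,
   or RSEOCODE instruction code within a file image, return the
   embedded item's address, or -1 if the instruction is not one of
   those three, is truncated, or points outside the file.
   Assumption: the address counts bytes from the start of the file. */
static long embedded_offset(const unsigned char *file, size_t len, size_t at)
{
    unsigned long addr;

    if (at >= len || len - at < 5)
        return -1;
    if (file[at] != RSEPAT && file[at] != RSEPAL && file[at] != RSEOCODE)
        return -1;
    addr = read_le32(file + at + 1);
    return addr < len ? (long)addr : -1;
}
```

Under that assumption, the byte at the returned offset for an RSEPAT instruction should be the colon that begins a binary pattern file.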
Rule definition files are the most complex of the files used by
CelLab, but if you get confused about their format, you can always
examine the source code of the Java, C, Pascal, or BASIC rule maker to
resolve any questions. Note that the BASIC rule maker does not
compress the rule definitions it writes.
POPULATION HISTORY LOG FILES
Population history log (.jch) files consist of a header
followed by zero or more records indicating the population of
cells in each state at the time the log entry was made, whether manually
or by an automatic dump every so many generations. The file
header is 24 bytes in length and consists of the string,
defined using the C language convention as:
"CelLab population log\r\n\032"
The "\032" is a Ctrl-Z, which serves as the end of
file marker in a DOS text file; it is intended to keep
the binary data which follow from appearing if the user
inadvertently displays the file with the DOS TYPE command.
Each population log record is described by the C language
structure:
struct populationLogItem {
    char ruleName[16];                 /* Rule currently executing */
    long generationCount;              /* Generation count at time of dump */
    unsigned short cellHistogram[256]; /* Number of cells in each state */
};
This structure is written in the file with no padding and with
the long and short values in Intel (least-significant
first) byte order. The ruleName field is a zero-terminated
ASCII string which gives the name of the rule executing at the time
the population was dumped, with generationCount specifying
the number of generations which had executed since a rule or pattern
was last loaded, or the generation count was manually reset by
the user. The balance of the record is the cell population histogram
array, cellHistogram, which gives, for each possible 8-bit
cell state from 0 to 255, the number of cells
in that state when the log entry was made. The sum of all the
cellHistogram entries will always be 64000, the total number
of cells in the 320×200 cell map.
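Because the sizes of long and short and the padding of structures vary between compilers, a portable reader is better off decoding each record byte by byte than reading the structure directly. The sketch below assumes the long occupies 4 bytes and each short 2 bytes in the file, giving a 532-byte record; that size is an inference, not something stated above. The names PopulationLogItem, decode_log_record, jch_header_ok, and the JCH_ constants are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define JCH_HEADER       "CelLab population log\r\n\032"
#define JCH_HEADER_BYTES 24
#define JCH_RECORD_BYTES (16 + 4 + 256 * 2)   /* assumed on-disk size */

struct PopulationLogItem {
    char ruleName[16];
    long generationCount;
    unsigned short cellHistogram[256];
};

/* Returns nonzero if the file image begins with the 24-byte header. */
static int jch_header_ok(const unsigned char *file, size_t len)
{
    return len >= JCH_HEADER_BYTES &&
           memcmp(file, JCH_HEADER, JCH_HEADER_BYTES) == 0;
}

/* Decode one JCH_RECORD_BYTES-long record starting at p. */
static void decode_log_record(const unsigned char *p,
                              struct PopulationLogItem *item)
{
    int s;

    memcpy(item->ruleName, p, 16);
    item->ruleName[15] = '\0';            /* guard against a missing zero */
    item->generationCount = (long)((unsigned long)p[16]
                                 | ((unsigned long)p[17] << 8)
                                 | ((unsigned long)p[18] << 16)
                                 | ((unsigned long)p[19] << 24));
    for (s = 0; s < 256; s++)
        item->cellHistogram[s] =
            (unsigned short)(p[20 + 2 * s] | (p[21 + 2 * s] << 8));
}
```

Under these assumptions, record number n (counting from zero) starts at file offset 24 + 532 × n.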
MOVIE FILES
The following movie file format pertains only to movies created
by JC. CelLab for Windows writes movies in Microsoft .avi
format, about which the less said the better.
Movie files consist of a header indicating whether the movie was
recorded from a CGA screen or a VGA, a copy of the color palette in
effect when the movie was recorded, and a sequence of compressed
screen images taken directly from the CGA or VGA screen. The header
is three bytes, simply "CGA" or "VGA" in ASCII. Immediately following
the header is a binary dump of the color palette in effect when the
movie was made, written in the format described above for binary color
palette files. After the three byte header and 771 bytes of color
palette, a sequence of saved frames appears.
Each saved frame consists of the contents of the CGA or VGA frame
buffer. For the VGA, the dump is of the 64,000 bytes on the screen,
one byte per pixel. For a CGA, the dump consists of the 16,000 bytes
that encode the pixels on the screen, four pixels per byte. On the
CGA, the even numbered lines are dumped first, then the odd numbered
lines. Each compressed frame stands by itself--no inter-frame delta
compression is performed. Each compressed frame consists of a series
of instruction codes and arguments that follow them. The instructions
are as follows; instruction codes are given in decimal.
- 248
- End of CGA bank. Denotes end of the even lines on the
CGA. Resets to load the odd lines.
- 249 Llow Lhigh
- Long zero run. (Lhigh × 256 + Llow) specifies the
number of zero bytes to store in the screen map.
- 250 Length
- Short zero run. Store Length bytes of zero in the
screen map.
- 251 Llow Lhigh Pixel[0] Pixel[1] ... Pixel[n]
- Long uncompressed sequence. Store the (Lhigh × 256 +
Llow) Pixel[x] values that follow into the screen map.
- 252 Llow Lhigh Pixel
- Long nonzero run. Store (Lhigh × 256 + Llow) bytes with
value Pixel.
- 253 Length Pixel[0] Pixel[1] ... Pixel[n]
- Short uncompressed sequence. Store the Length Pixel[x]
values that follow into the screen map.
- 254 Length Pixel
- Short nonzero run. Store Length bytes with value Pixel.
- 255
- End of frame. This appears at the end of the compressed
frame.
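The frame instructions above can be decoded as in the following sketch, which handles VGA frames only: instruction 248 is specific to the CGA's two banks and is simply rejected here. The name decode_vga_frame and the return convention are hypothetical, and the compressed frame is assumed to be in memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: expand one compressed VGA frame from in into
   screen (capacity size bytes, normally 64,000). Returns the number
   of bytes stored when the end of frame code (255) is reached, or -1
   on malformed input or on the CGA-only code 248. */
static long decode_vga_frame(const unsigned char *in, size_t len,
                             unsigned char *screen, size_t size)
{
    size_t i = 0, pos = 0;

    while (i < len) {
        unsigned char op = in[i++];
        unsigned char value = 0;          /* zero runs store zero bytes */
        int literal = (op == 251 || op == 253);
        size_t nargs, n;

        if (op == 255)
            return (long)pos;
        switch (op) {
        case 249: case 251: case 254: nargs = 2; break;
        case 250: case 253:           nargs = 1; break;
        case 252:                     nargs = 3; break;
        default:                      return -1;
        }
        if (nargs > len - i)
            return -1;
        if (op == 249 || op == 251 || op == 252)
            n = in[i] + 256u * in[i + 1];     /* Llow, then Lhigh */
        else
            n = in[i];                        /* Length */
        if (op == 252) value = in[i + 2];
        if (op == 254) value = in[i + 1];
        i += nargs;
        if (n > size - pos)
            return -1;
        if (literal) {
            if (n > len - i)
                return -1;
            memcpy(screen + pos, in + i, n);
            i += n;
        } else {
            memset(screen + pos, value, n);
        }
        pos += n;
    }
    return -1;   /* no end of frame code */
}
```

Note that, unlike the pattern and rule instructions, the long lengths here are stored low byte first and give the byte count directly, with no + 1.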