Interactive Genome Browser
by John Walker
Introduction
If your Web browser supports Java, you can use the applet embedded
in this page to interactively browse the genomes of various
organisms, using a variety of methods to decode potentially
hidden information within, displaying the results as an image.
To launch the browser, choose the organism you'd like to
browse initially (you can select a different organism from
within the browser, later on) and the location of the
closest server from which the browser will obtain the
sequence data, then press the "Launch Browser" button to
start the Java browser application. If all goes well, a window
will appear in which the browser is running.
Web-Based vs. Stand-Alone Browsers
With most Web browsers, the genome browser window will be labeled with a prominent
warning that it is an "Application Window", or "Untrusted", or
somesuch. This is nothing to worry about. Whenever a Java applet running
within a Web page creates a window independent of the Web page,
a warning usually appears so the user can't be fooled by the Java
program displaying, for example, a copy of a system login window
asking the user to enter their password. The genome browser creates
a separate window rather than running inside a box on the Web
page because that's the only way it can allow the user to adjust
the size and position of the browser window to something suitable for
the screen size and the resolution at which the genome is being
viewed.
Other security restrictions imposed (for excellent reasons) on
Java programs run from Web pages prohibit the genome browser
from reading genomes stored as files on your local machine.
If you're planning to do extensive exploration with the
genome browser, it's best to download copies of the genomes
you plan to explore, along with the stand-alone genome
browser application, which you run as a regular application
on your computer under a Java virtual machine implementation
(usually supplied as part of a Java development environment
but, as time goes on, increasingly a standard
component of system software). A stand-alone application, where
you must take the initiative to install it and start it, is
permitted to read files on your machine, and thus can access
genomes downloaded to your hard drive rather than chunk by
chunk over the Internet. This makes the browser respond much
more quickly, especially when the Internet is heavily loaded,
which is almost always.
The main reason for providing a version of the browser that
you can launch from this Web page is that starting it up is
a matter of a single click. The instructions for downloading,
installing, and launching a stand-alone Java application differ
from computer to computer, so it's up to you to figure out
these steps--no universal set of instructions can be provided.
Launching the browser from this page is a perfectly viable way
to get started; if you end up using the browser a lot and
are irritated by delays due to transferring information over
the Internet, you might consider downloading the stand-alone
version.
Other than the difference in speed accessing genome data from
your hard drive as opposed to obtaining it on the fly across
the Internet, the Web-launched and stand-alone versions of
the genome browser are absolutely identical. Thus,
the following description of how to use the browser applies equally
to both.
Browser Operation
When you launch the browser, you'll see a window which resembles the
one above, with the obligatory warning that the window was created by
an applet. (The appearance of the warning will vary from system to
system.) At the top is the graphical display of the information in the
genome produced with the settings of the various controls below. The
following sections discuss these controls in detail.
Current Genome URL
This text field gives the URL (location on the World-Wide Web) of
the genome you're currently browsing. If the genome is located on a
Web server, this will begin with "http:"; if the genome database resides
in a file on your own computer, the field will begin with "file:".
You can edit the contents of this field to switch to a different
genome--when you press Enter, the genome in the new URL
will be loaded. In most cases, however, it's more convenient
to use the genome selection choice box below, since you don't have
to remember the exact file name for the genome.
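As a rough illustration of the "http:" versus "file:" distinction, the following Java sketch classifies a genome URL by its scheme. The class and method names and the sample URLs are hypothetical and are not taken from the browser's own code; treating "https:" as remote is likewise an assumption added here.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper, not part of the genome browser itself.
final class GenomeUrlKind {
    // Returns "remote" for http/https URLs, "local" for file URLs,
    // and "unknown" for anything else, including unparsable input.
    static String classify(String url) {
        try {
            String scheme = new URI(url).getScheme();
            if (scheme == null) {
                return "unknown";
            }
            switch (scheme.toLowerCase(Locale.ROOT)) {
                case "http":
                case "https": // assumption: the text only mentions "http:"
                    return "remote";
                case "file":
                    return "local";
                default:
                    return "unknown";
            }
        } catch (URISyntaxException e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        // Both URLs are made-up examples.
        System.out.println(classify("http://www.example.com/genomes/sample.gen"));
        System.out.println(classify("file:///home/user/genomes/sample.gen"));
    }
}
```

A genome classified as "local" here is the kind that only the stand-alone application is permitted to read.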
Genome Selector
When browsing a genome, this control will read "URL above", which
indicates the Current Genome URL field contains the source of the
genome being explored. Pressing the button displays a list of all
organisms, real and artificial, available on this server. If you've
created files for additional organisms, you'll have to manually
enter the file name in the URL box, as the Genome Selector
will not display them.
Start Position and Scroll Bar
The portion of the genome visible in the window is set by the Start Position
box and buttons and the scroll bar at the right of the window. The number
in the Start box is the number of the nucleotide which appears at the
upper left of the display. Nucleotides are numbered with 0 being
the start of the published sequence. (Note that for bacteria and
viruses with circular DNA strands, the choice of a starting point
is arbitrary.)
To change the starting nucleotide, enter the new value in the
Start text field and press either the Enter key or the
"Update" button. The small buttons to the left and right of the
Start field increment and decrement the Start value by 1
("+" and "-" buttons), or by half the current display width
("»" and "«" buttons).
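The arithmetic behind the four small buttons can be sketched as follows. This is only a model of the behaviour described above: the names are invented for illustration, and both the clamping of the result to the genome's extent and the use of integer division for "half the display width" are assumptions.

```java
// Hypothetical model of the Start Position buttons; names are illustrative.
final class StartButtons {
    // PLUS and MINUS stand for the "+" and "-" buttons; FORWARD_HALF and
    // BACK_HALF stand for the right- and left-pointing double-arrow buttons.
    enum Button { PLUS, MINUS, FORWARD_HALF, BACK_HALF }

    // Returns the new Start value after one press of the given button.
    static long press(Button button, long start, long displayWidth, long genomeLength) {
        long delta;
        switch (button) {
            case PLUS:
                delta = 1;
                break;
            case MINUS:
                delta = -1;
                break;
            case FORWARD_HALF:
                delta = displayWidth / 2;
                break;
            default: // BACK_HALF
                delta = -(displayWidth / 2);
                break;
        }
        // Assumption: keep the result within [0, genomeLength - 1].
        return Math.max(0, Math.min(start + delta, genomeLength - 1));
    }

    public static void main(String[] args) {
        long start = 100;
        start = press(Button.FORWARD_HALF, start, 256, 1000);
        System.out.println(start);
    }
}
```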
The scroll bar adjusts the start position with the usual semantics
for scroll bars. The line down and line up arrows increment and
decrement the start position by the current line width. Clicking
below or above the scroll box moves a full screen forward or
back in the genome, and dragging the scroll box adjusts the
start position to the corresponding position in the genome. As you
drag the scroll box, the Start field will show the nucleotide
number corresponding to the current position of the box. The
screen is not repainted until you release the scroll box; this avoids
jerky response due to the time required to repaint
the screen, particularly when data must be obtained from a remote
Web site.
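The scroll bar behaviour, and in particular the deferral of repainting until the scroll box is released, might be modelled like this. The class is a sketch with hypothetical names, not the browser's implementation; it assumes that the scroll box position is available as a fraction between 0 and 1, and that a "full screen" is the line width times the number of visible lines.

```java
// Hypothetical model of the scroll bar behaviour described above.
final class ScrollTracker {
    private final long genomeLength;
    private long startField;   // value shown in the Start field
    private long paintedStart; // start position of the image last painted
    private int repaints;      // number of (possibly slow) repaints so far

    ScrollTracker(long genomeLength) {
        this.genomeLength = genomeLength;
    }

    private long clamp(long position) {
        return Math.max(0, Math.min(position, genomeLength - 1));
    }

    private void repaint() {
        paintedStart = startField;
        repaints++;
    }

    // While dragging: update the Start field only, without repainting.
    void drag(double fraction) {
        startField = clamp(Math.round(fraction * (genomeLength - 1)));
    }

    // On release of the scroll box: repaint once at the final position.
    void release() {
        repaint();
    }

    // Line arrows: move by one line width and repaint.
    void line(long lineWidth, boolean forward) {
        startField = clamp(startField + (forward ? lineWidth : -lineWidth));
        repaint();
    }

    // Clicks beside the scroll box: move by a full screen and repaint.
    void page(long lineWidth, long visibleLines, boolean forward) {
        long screen = lineWidth * visibleLines;
        startField = clamp(startField + (forward ? screen : -screen));
        repaint();
    }

    long getStartField() { return startField; }
    long getPaintedStart() { return paintedStart; }
    int getRepaints() { return repaints; }
}
```

However many intermediate positions a drag passes through, only the final one costs a repaint, which matters most when each repaint requires fetching data from a remote server.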
Bits per Base and Variant
These fields determine how sequences of nucleotides are decoded
into the binary sequence which is, in turn, displayed in the image
panel. The nomenclature used in these fields is as given
in the document Storing Data in DNA.
"Bits/base" determines how many bits are encoded by each nucleotide
(base pair) in the genome, and may be set to 1 or 2. "Variant"
chooses which among the possible encodings of one or two bits
per nucleotide should be used. Detailed tables of how one bit per base
and two bits per base encodings transform nucleotide sequences into
streams of binary data are given in the aforementioned document.
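To make the idea of decoding concrete, here is a minimal Java sketch that turns a nucleotide string into a string of bits. The mapping it uses is purely hypothetical and is NOT one of the numbered variants tabulated in Storing Data in DNA: a "variant" is modelled here as an ordering of the four bases, with a base's index in that ordering giving its two-bit value, and with only the high bit kept in the one-bit case.

```java
// Hypothetical decoder; the mapping is invented for illustration and does
// not reproduce the variant tables in "Storing Data in DNA".
final class BaseDecoder {
    // order: the four bases in the order that assigns them values 0..3,
    // for example "ACGT" means A=00, C=01, G=10, T=11 at 2 bits/base.
    static String decode(String bases, int bitsPerBase, String order) {
        if (bitsPerBase != 1 && bitsPerBase != 2) {
            throw new IllegalArgumentException("bitsPerBase must be 1 or 2");
        }
        if (order.length() != 4) {
            throw new IllegalArgumentException("order must list four bases");
        }
        StringBuilder bits = new StringBuilder(bases.length() * bitsPerBase);
        for (int i = 0; i < bases.length(); i++) {
            char base = Character.toUpperCase(bases.charAt(i));
            int index = order.indexOf(base);
            if (index < 0) {
                throw new IllegalArgumentException("not a nucleotide: " + base);
            }
            if (bitsPerBase == 2) {
                // Emit both bits, most significant first.
                bits.append((index >> 1) & 1).append(index & 1);
            } else {
                // Assumption: at 1 bit/base, keep only the high bit.
                bits.append(index >> 1);
            }
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("GATTACA", 2, "ACGT"));
        System.out.println(decode("GATTACA", 1, "ACGT"));
    }
}
```

Changing the `order` argument plays the role of choosing a different variant: the same genome yields a different bit stream, and hence a different image.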
Width
A two-dimensional pattern is generally comprehensible only when
viewed with the same line length as used to encode it. The
Width box (into which a value can be entered) and the associated
buttons, which increment and decrement the value in the Width
field, set the line length used for the display.
Please see
Hide and Seek: An Image in the Genome
for an example of how adjusting the width can reveal a message
difficult to discern until the correct width is discovered.
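The effect of the width setting can be seen in a small sketch that breaks a bit stream into display lines. The class name and the sample data are made up for illustration; the browser draws pixels rather than printing characters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of wrapping a bit stream at a given width.
final class WidthWrap {
    // Splits the bit string into lines of 'width' bits; the last line
    // may be shorter if the length is not a multiple of the width.
    static List<String> rows(String bits, int width) {
        if (width < 1) {
            throw new IllegalArgumentException("width must be positive");
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < bits.length(); i += width) {
            out.add(bits.substring(i, Math.min(i + width, bits.length())));
        }
        return out;
    }

    public static void main(String[] args) {
        // A vertical bar laid out 6 bits per line (made-up data).
        String bits = "001000" + "001000" + "001000" + "001000";
        for (int width : new int[] {6, 5}) {
            System.out.println("width " + width + ":");
            for (String row : rows(bits, width)) {
                System.out.println(row.replace('0', '.').replace('1', '#'));
            }
        }
    }
}
```

At width 6 the set bits line up in a single column; at width 5 the same data shears into a diagonal, which is why a pattern can be hard to recognise until the width matches the one used to encode it.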
Zoom Factor
Modern computer graphics displays have such high resolution that
individual pixels are difficult to distinguish. Setting the zoom factor to a value greater
than 1 expands each pixel into a square whose edge length equals the zoom
factor. If the product of the zoom factor and line width exceeds
the width of the window in pixels, the image will be truncated at the
right--resize the browser window until you see the blue margin which
indicates you're seeing the entire width of each line. If your screen
isn't wide enough to show an entire line, reduce the zoom factor
until it does.
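The truncation rule amounts to a single comparison, sketched below in Java. The names are hypothetical, and the sketch ignores the space taken by the blue margin and any window borders, so it is an approximation of what the browser actually checks.

```java
// Hypothetical check of whether a zoomed line fits the window width.
final class ZoomFit {
    // True when zoom * lineWidth pixels fit within windowWidth pixels.
    static boolean fits(int zoom, int lineWidth, int windowWidth) {
        // Widen to long so that large values cannot overflow.
        return (long) zoom * lineWidth <= windowWidth;
    }

    // Largest zoom factor at which a whole line is visible, or 0 if the
    // line does not fit even at a zoom factor of 1.
    static int largestFittingZoom(int lineWidth, int windowWidth) {
        if (lineWidth < 1) {
            throw new IllegalArgumentException("lineWidth must be positive");
        }
        return Math.max(0, windowWidth / lineWidth);
    }

    public static void main(String[] args) {
        System.out.println(fits(3, 256, 800));
        System.out.println(largestFittingZoom(256, 800));
    }
}
```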
Update and Quit Buttons
Pressing Enter in any of the text entry fields causes an immediate update
of the image display panel. Depending on the speed of your computer and
whether the data requested must be obtained across the Internet from
a distant site, this may take some time. If you wish to change a number
of parameters at once, you can avoid waiting for intermediate updates
by entering each value without pressing Enter, then pressing Enter
(or the Update button) only after entering the last.
The Quit button exits the genome browser. If you started the
browser from a Web page, you can open a new browser by pressing
the "Launch Browser" button. The standalone browser application
will exit to the command line; to restart it, use the same command
you used to invoke it in the first place.