
Data City – A short story (2001)
The story explains how we can invent the future of biology by using
our imagination to build a data city. Thus far, the story includes
four chapters. Chapter 1 introduces Data City and explains how we plan
to use it. Chapter 2 takes us into the
quantitative core of biology by peeling away layers of complexity.
It then explains how to design the buildings of the city. Chapter 3
uses these buildings to explore the underlying principles of
biology. Chapter 4 relates these principles to laws of nature. In
effect, the story shares an extremely well-hidden secret with the
reader. It shows that the biology literature contains a vast
reservoir of new data sitting just beyond our reach. The secret of
the short story is that it explains how to extend our technology far
enough to capture some of this prodigious wealth.

Course 1: Human Biology (2001)
The course is configured for undergraduate biology students.
However, it can be readily reconfigured into other biology courses -
such as high school biology, advanced histology (for graduate,
medical, and dental students), human pathology, etc. The HuBio
course includes thirty-eight lectures, assignments, exercises,
simulators, and quizzes. Here technology is used extensively
to create a new environment for students – one that offers
experiences quite unlike those they currently enjoy with either
textbooks or lectures.

Course 2: Mathematics - Stereology (2001)
Fundamental to an enterprise approach is
the importance of being connected – well connected. The
mathematics course looks at research data in biology with the view
of connecting them across a hierarchy of size – extending from
individual molecules to complete organisms. The course
identifies data collected with unbiased methods as the best for
making these connections and explains how to recognize and collect
such data.

Course 3: Technology - Enterprise Biology (2001)
The course begins by defining enterprise biology
as the conjunction of three models: qualitative, quantitative, and
relational. The point of the course is to define a mathematical
framework for the biology literature, one that can serve as a
springboard into the unknown. Using this new framework, the student
quickly sees how discovery depends decisively on creating new
data from old. The course continues from this observation with
illustrations of how an enterprise approach to biology can be
applied to challenging problems, such as explaining gene function.

Appendix (2001)
The appendix includes directions, tools, and data entry screens for
managing the software. In effect, it offers a realistic view of
what it takes to build and maintain an enterprise system in an
academic setting. It also demonstrates the effectiveness of an
enterprise approach by creating a host of new opportunities. Each
release of the Enterprise Biology Software package includes new and
upgraded programs.

BIOLOGYtabs 2002
BIOLOGYtabs 2002 includes selected EBS programs that have research
data stored therein. As such, they do not require the client-server
configuration of the original package. New features include
abstracts online, methods for minimizing bias, strategies for
unfolding complexity, and methods for generating organs from seed
values with biological algorithms.

BIOLOGYtabs 2003
BIOLOGYtabs 2003 includes selected EBS programs that have research
data stored therein. As such, they do not require the client-server
configuration of the original package. New features include design
codes (simple and complex), ladder equations, change equations, and
a strategy for dealing with complexity by moving to a higher
dimension.

BIOLOGYtabs 2004
BIOLOGYtabs 2004 includes selected EBS programs that have research
data stored therein. New features include
four equation libraries (repertoire, analogy, drill-down, and ladder),
graphs (citations and methods), rule-based connections, structural
networks, and new strategies for finding
mathematical order in biology.

BIOLOGYtabs 2005
BIOLOGYtabs 2005 includes selected EBS programs that have research
data stored therein. New features include
a decimal equation library, puzzles (counting molecules, unfolding
and refolding the hippocampus, and building a universal biology
database), and examples of interpreting the results of an experiment
within the framework of a data-driven biology.

Puzzle 1 (2005): COUNTING MOLECULES - for students in molecular,
cellular, and systems biology... The program introduces the student to the
process of (1) designing
experiments as equations, (2) running experiments and interpreting
results - with and without complexity, and (3) evaluating research
publications. The program will be of special interest to anyone
reporting research results as optical densities, concentrations, or
stereological densities - it shows how to increase the reliability of
these data types.

Puzzle 2 (2005): HIPPOCAMPUS - unfolded & refolded. The program
introduces the process of (1) writing equations for organs - using
published data, (2) predicting the structure of a hippocampus from a
single value (a volume or cell count), and (3) identifying unique
phenotypic patterns in species with similar genotypes, namely the human,
monkey, mouse, rat, and shrew. The program will be of special
interest to investigators wishing to unravel complexity - across species
- as a way of exploring the relationship of the genome to phenotypic
expression.

Puzzle 3 (2005): UNIVERSAL BIOLOGY DATABASE 1.0
The program introduces a modern digital library
for the basic and clinical sciences. The data of published
research papers are stored in a relational database, standardized to
a universal format, hardened by minimizing bias and variability,
integrated across disciplines, transformed into equations, and
equipped with a user-friendly interface. It offers the
investigator a working model of a data-driven biology, one designed
specifically for exploring biology in novel ways.

UNIVERSAL BIOLOGY DATABASE 2.0 (2006)
The database was upgraded to include both control and experimental
data, connected to the original stereology literature database, and
refitted with a "query by example" interface. In turn, the new
database was used to translate research papers into stacks of
equations, to summarize biological rules for assembling parts
into larger structures, and to explain the process of reverse engineering.

UNIVERSAL BIOLOGY DATABASE 3.0 (2007)
The software package was updated by adding
new data and tools - including a concentration trap, cluster
analysis module, and Rule Book. The solution to Puzzle 4,
which began with a careful look at semiquantitative data, led to
hybrid hierarchy equations, gold standards, and a strategy for
capturing the data of molecules and genes.
In effect, we can now access, integrate, and interpret data from all
sixteen levels of the biology hierarchy - seamlessly. One
finding was most surprising. Biology turns out to be unique
among the basic sciences in that its data are inextricably
bound to their locations.
Such a relationship defines a
key element of biological complexity. This means that the
interpretation of research data requires a consideration of two
elemental factors, a numerical quantity and a location, both of
which are embedded in the equation of the experiment. In other
words, biological complexity becomes manageable when our research
data become fully quantitative and integrated.

UNIVERSAL BIOLOGY DATABASE 4.0 (2008)
The software package challenges the reader by increasing the
difficulty of the puzzles to the level of an information science.
At this level, we play biology’s game according to biology’s rules.
The process is surprisingly straightforward. Find out what biology
does and then do exactly the same thing. When biology changes its
phenotype, mirror the changes with a phenotype of our own. When
biology behaves quantitatively, we behave quantitatively. The
reward for our efforts? We can design a quantitative model for
systems biology - one that can be readily translated into software
and shared with contributing authors.

SYSTEMS BIOLOGY TWO 1.0 (2009)
The software package includes an Information Infrastructure
and its first offspring, Systems Biology Two (SB2).
Within the information
infrastructure, research publications are translated into tables of
digitized data that, when allowed to interact and connect, emerge as
a robust platform for exploring biological complexity. In addition to
creating new opportunities for diagnosis and prediction, we also
increase our chances of routinely detecting biological changes by at
least an order of magnitude. The report explores ways in which this new
information infrastructure can serve as an engine for innovation,
discovery, and productivity.

ORGANISM CODES (2010)
The software package includes an Information Infrastructure
with its second offspring, a collection of Organism Codes.
These codes, which are based on data triplets, offer an unrestricted
view of phenotypes defining themselves quantitatively in terms of
nodes and connections. The codes show us exactly how
phenotypes change and predict the existence of template codes for
phenotypes. The report examines the relationship of phenotype to
complexity and considers how this association impacts all segments of
the biology enterprise.

MATHEMATICAL MAPPING (2011)
The software package includes an Information Infrastructure
with its third offspring, Mathematical Mapping.
These maps, which are based on data triplets, allow us to extract
biological rules of structural design from its parts and to interpret
large and complex data sets with equations and cutting-edge graphics.
The report examines the relationship of triplets to complexity and
considers how this association advances the development of theory
structure in biology. An example, taken from the literature, explains
how we can unravel a complex disease (schizophrenia) by extracting the
rules and using them to play the complexity game.