EBS 2001 to 2007/8

The Enterprise Biology Software (EBS) Project takes a somewhat unconventional approach to biology.  Instead of exploring biology with existing technologies, it first identifies a problem and then assembles a new technology to solve it.  Example problems include the following.

How can a student learn more in less time and get a better grade?  What types of data are needed to explain gene function?   How can we access the organizing principles of biology and then use them to predict biological events?  How does biology construct n-dimensional networks?  How can technology accelerate productivity gains in research and education?  How can the largely independent disciplines of biochemistry, molecular biology, immunocytochemistry, and stereology be integrated mathematically? 

In effect, the enterprise software serves as a new enabling technology for the biology community.  The software includes a short story, courses, databases, an appendix, puzzles, software tools, blueprints, a collection of rules, and yearly reports.

 


Data City – A short story (2001)  The story explains how we can invent the future of biology by using our imagination to build a data city.  Thus far, the story includes four chapters.  Chapter 1 introduces Data City and explains how we plan to use it.  Chapter 2 takes us into the quantitative core of biology by peeling away layers of complexity and then explains how to design the buildings of the city.  Chapter 3 uses these buildings to explore the underlying principles of biology.  Chapter 4 relates these principles to laws of nature.  In effect, the story shares an extremely well-hidden secret with the reader: the biology literature contains a vast reservoir of new data sitting just beyond our reach.  The story then explains how to extend our technology far enough to capture some of this wealth.


Course 1: Human Biology (2001)  The course is configured for undergraduate biology students.  However, it can be readily reconfigured into other biology courses, such as high school biology, advanced histology (for graduate, medical, and dental students), and human pathology.  The HuBio course includes thirty-eight lectures, assignments, exercises, simulators, and quizzes.  Here technology is used extensively to create a new environment for students – one that offers experiences quite unlike those they currently have with either textbooks or lectures.


Course 2: Mathematics - Stereology (2001)  Fundamental to an enterprise approach is the importance of being connected – well connected.  The mathematics course looks at research data in biology with the view of connecting them across a hierarchy of size – extending from individual molecules to complete organisms.  The course identifies data collected with unbiased methods as the best for making these connections and explains how to recognize and collect such data.


Course 3: Technology - Enterprise Biology (2001)  The course begins by defining enterprise biology as the conjunction of three models: qualitative, quantitative, and relational.  The point of the course is to define a mathematical framework for the biology literature, one that can serve as a springboard into the unknown.  Using this new framework, the student quickly sees how discovery depends decisively on creating new data from old.  The course continues from this observation with illustrations of how an enterprise approach to biology can be applied to challenging problems, such as explaining gene function.


Appendix (2001)  The appendix includes directions, tools, and data entry screens for managing the software.  In effect, it offers a realistic view of what it takes to build and maintain an enterprise system in an academic setting.  It also demonstrates the effectiveness of an enterprise approach by creating a host of new opportunities.  Each release of the Enterprise Biology Software package includes new and upgraded programs.  

 


BIOLOGYtabs 2002  BIOLOGYtabs 2002 includes selected EBS programs that have research data stored therein.  As such, they do not require the client-server configuration of the original package.  New features include abstracts online, methods for minimizing bias, strategies for unfolding complexity, and methods for generating organs from seed values with biological algorithms.

 


BIOLOGYtabs 2003  BIOLOGYtabs 2003 includes selected EBS programs that have research data stored therein.  As such, they do not require the client-server configuration of the original package.  New features include design codes (simple and complex), ladder equations, change equations, and a strategy for dealing with complexity by moving to a higher dimension.  

 

 


BIOLOGYtabs 2004  BIOLOGYtabs 2004 includes selected EBS programs that have research data stored therein.  New features include four equation libraries (repertoire, analogy, drill-down, and ladder), graphs (citations and methods), rule-based connections, structural networks, and new strategies for finding mathematical order in biology.  

 

 


BIOLOGYtabs 2005  BIOLOGYtabs 2005 includes selected EBS programs that have research data stored therein.  New features include a decimal equation library, puzzles (counting molecules, unfolding and refolding the hippocampus, and building a universal biology database), and examples of interpreting the results of an experiment within the framework of a data-driven biology.  

 


Puzzle 1 (2005): COUNTING MOLECULES - for students in molecular, cellular, and systems biology...  The program introduces the student to the process of (1) designing experiments as equations, (2) running experiments and interpreting results - with and without complexity, and (3) evaluating research publications.  The program will be of special interest to anyone reporting research results as optical densities, concentrations, or stereological densities - it shows how to increase the reliability of these data types. 


Puzzle 2 (2005): HIPPOCAMPUS - unfolded & refolded.  The program introduces the process of (1) writing equations for organs - using published data, (2) predicting the structure of a hippocampus from a single value (a volume or cell count), and (3) identifying unique phenotypic patterns in species with similar genotypes, namely the human, monkey, mouse, rat, and shrew.  The program will be of special interest to investigators wishing to unravel complexity - across species - as a way of exploring the relationship of the genome to phenotypic expression.      
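The idea of predicting a structure from a single value can be pictured with a small sketch.  It assumes the simplest possible rule, namely that each part keeps a fixed ratio to the seed value; the EBS programs may well use different equations.  The function name, the part names, and the ratios below are all hypothetical placeholders and are not published measurements.

```python
from typing import Dict


def predict_parts(seed_value: float, ratios_to_seed: Dict[str, float]) -> Dict[str, float]:
    """Scale every part from one seed value using fixed part-to-seed ratios.

    Assumption: simple proportionality between each part and the seed.
    """
    if seed_value <= 0:
        raise ValueError("seed_value must be positive")
    return {part: seed_value * ratio for part, ratio in ratios_to_seed.items()}


# Placeholder ratios for illustration only; NOT taken from any publication.
HYPOTHETICAL_RATIOS = {
    "region_a_volume": 0.5,
    "region_b_volume": 0.25,
    "region_c_volume": 0.25,
}

if __name__ == "__main__":
    # A seed value of 100 (arbitrary units) expands into three part estimates.
    print(predict_parts(100.0, HYPOTHETICAL_RATIOS))
```

In this sketch a different species would simply supply a different table of ratios, which is one way to picture comparing phenotypic patterns across species.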


Puzzle 3 (2005): UNIVERSAL BIOLOGY DATABASE 1.0  The program introduces a modern digital library for the basic and clinical sciences.  The data of published research papers are stored in a relational database, standardized to a universal format, hardened by minimizing bias and variability, integrated across disciplines, transformed into equations, and equipped with a user-friendly interface.  It offers the investigator a working model of a data-driven biology, one designed specifically for exploring biology in novel ways.                      

 


UNIVERSAL BIOLOGY DATABASE 2.0 (2006)  The database was upgraded to include both control and experimental data, connected to the original stereology literature database, and refitted with a "query by example" interface. In turn, the new database was used to translate research papers into stacks of equations, to summarize biological rules for assembling parts into larger structures, and to explain the process of reverse engineering.            

 


UNIVERSAL BIOLOGY DATABASE 3.0 (2007)  The software package was updated by adding new data and tools, including a concentration trap, a cluster analysis module, and a Rule Book.  The solution to Puzzle 4, which began with a careful look at semiquantitative data, led to hybrid hierarchy equations, gold standards, and a strategy for capturing the data of molecules and genes.  In effect, we can now access, integrate, and interpret data from all sixteen levels of the biology hierarchy - seamlessly.  One finding was most surprising.  Biology turns out to be unique among the basic sciences in that its data are inextricably bound to their locations.  Such a relationship defines a key element of biological complexity.  This means that the interpretation of research data requires a consideration of two elemental factors, a numerical quantity and a location, both of which are embedded in the equation of the experiment.  In other words, biological complexity becomes manageable when our research data become fully quantitative and integrated.
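The pairing of a numerical quantity with a location can be made concrete with a short sketch.  It is only an illustration of the idea, not the database's actual schema: the `Measurement` class, its fields, and the `ratio` helper are hypothetical names, and the rule that two values are compared only when they share a location is an assumption made here for clarity.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Measurement:
    """A research value that carries its location along with its number."""
    quantity: float
    unit: str
    location: Tuple[str, ...]  # path from larger to smaller structure


def ratio(experimental: Measurement, control: Measurement) -> float:
    """Return experimental/control, but only for matching location and unit."""
    if experimental.location != control.location:
        raise ValueError("measurements come from different locations")
    if experimental.unit != control.unit:
        raise ValueError("measurements use different units")
    return experimental.quantity / control.quantity


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    control = Measurement(200.0, "count", ("organ", "cell"))
    experimental = Measurement(300.0, "count", ("organ", "cell"))
    print(ratio(experimental, control))
```

Keeping the location inside the record means a comparison across mismatched locations fails loudly instead of producing a number that looks valid.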