View Catalogue: Control Data
View Catalogue: Experimental Data
Standardized Paper: Control
Standardized Paper: Experimental
Query Data: By Citation - Control
Query Data: By Citation - Experimental
Enter Data Pairs: Control
Enter Data Pairs: Experimental
Query Data Pairs: Control
Query Data Pairs: Experimental
Query Data Pairs: Control & Experimental
Steady State Change: Enter Data
Steady State Change: Query Data
Transitional Change: Enter Data
Transitional Change: Query Data
Unfolding Complexity: Experiments as equations
Engineering: Reverse and Forward
Concentration Trap: Unintended Consequences
Systems Biology Two: Phenotypes that change
The software tree translates a collection of programs and documents into a working information infrastructure for the basic and clinical sciences. By reorganizing the literature with technology, it allows biology to behave as a quantitative science, one that operates effectively within the realm of biological complexity. The gallery, which contains 131 screens, offers a glimpse into what it takes to build such a resource.
The first step in organizing the data in a paper consists of constructing data tables for entering data into a relational database. The hardest part of the job consists of extracting numerical data from data plots and graphs. This tool greatly simplifies the task and is self-explanatory.
Data entry requires a playbook, one that tells me exactly where all of the parts go. It takes about five years of actively entering data to get one that works consistently. Every numerical data point carries a name and a hierarchical location. This program collects such information from published papers until they merge into a standard data entry protocol for each biological part. Such an approach works quite successfully because it enters data according to the wishes of the original authors.
Let’s try an example. How do I find the template for entering data from the amygdala? I type <amyg> into the white data entry field, press Enter, and amyg appears in the twelve headings with yellow backgrounds. By clicking on each of these headings, I find the amygdala hierarchies already entered. Finally, I select the one appropriate for the data currently being entered.
When I click on the yellow fields, they turn white and automatically search for amyg. If I want to review what I know about the amygdala, the glossary can be helpful. Citations for articles go into two tables - citation and authors. In practice, I use a third screen (Notepad or Wordpad) to copy and paste citation data downloaded from the Internet. This approach helps to minimize data entry errors. Note that a programmable key pad (e.g., XKeys) speeds data entry wonderfully.
I can click on the List button to see the results. If there is a mistake, I return to the data entry screen, make corrections, and click on the Update button to save the results. For each new paper, I look first at the methods and the data - in that order. I look for evidence of unbiased sampling and data collected with the best methods currently available.
Using drop down data lists for the data entry fields speeds data entry and introduces standardization. Since a given paper can use more than one immunological label, a separate table is included. Data entry represents a multistep process. I begin by assembling a structural hierarchy and then map numerical values to it - first for the control and then for the experimental data.
I use the left side of the screen to build the structural hierarchy (using the hierarchy template) and the right side to enter numerical data. I work my way through the tabs entering and storing information. A major part of data entry consists of moving data from one screen to the next - making connections according to the rules of relational databases. Pressing a single button on a programmable key pad automates most of the data entry. The drop down list box labeled utilities contains an item called lookup. It appears in the next screen. We will use it to consider functional data.
To be universally useful, a database for the biology literature must accept data coming from a wide range of disciplines. Since the design of the database is based on structure, any data type related to a structure becomes a candidate for data entry. Recall that there exist three basic data types: concentrations (densities), amounts (absolute data), and mean values. For structures, these data types include volumes, surfaces, lengths, and numbers; for functions, the data types become derivatives thereof. When I click on the Functional Units button, a screen displays at the left. It allows me to store most data types in this database and to map structure to function (biochemistry, molecular biology, physiology). Standardization is introduced by determining the contents of these tables that appear as drop down lists in the major data entry screens.
Making data entry mistakes is surprisingly easy and occurs more often than one might imagine. The main data entry screen (right side) has two rows of radio buttons marked D (Data) and H (Hierarchy). I use the D option to correct numerical values and H to correct the hierarchy template. This screen illustrates the D option for cells. This screen illustrates the H option for cells. Experimental data are entered as described for the controls, except that each data point is also connected to its control. As you can see, creating a database for biology requires making the right connections. The data entry screen for the experimental data resembles that of the control. Moving published data into a relational database creates a digital data catalogue. Such catalogues promote data access and become the starting point for creating new forms of information.
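The connection between each experimental data point and its control can be sketched as a foreign key in a relational database. What follows is only a minimal illustration using Python's built-in sqlite3 module. The table names, most column names, and the values are my assumptions (only cit_nu appears in the screens), and the actual database may be organized quite differently.

```python
import sqlite3

# Hypothetical schema: table names, columns other than cit_nu, and all
# values are illustrative assumptions, not the project's actual design.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the link
conn.executescript("""
CREATE TABLE control_data (
    control_id INTEGER PRIMARY KEY,
    cit_nu     INTEGER NOT NULL,   -- citation number of the paper
    structure  TEXT NOT NULL,      -- position in the structural hierarchy
    value      REAL NOT NULL
);
CREATE TABLE experimental_data (
    experimental_id INTEGER PRIMARY KEY,
    control_id      INTEGER NOT NULL REFERENCES control_data(control_id),
    treatment       TEXT NOT NULL,
    value           REAL NOT NULL
);
""")
conn.execute("INSERT INTO control_data VALUES (1, 3333, 'cell', 10.0)")
conn.execute("INSERT INTO experimental_data VALUES (1, 1, 'treated', 12.0)")


def paired_rows(conn):
    """Return (structure, control value, experimental value) per linked pair."""
    return conn.execute("""
        SELECT c.structure, c.value, e.value
        FROM experimental_data AS e
        JOIN control_data AS c ON c.control_id = e.control_id
        ORDER BY e.experimental_id
    """).fetchall()


print(paired_rows(conn))
```

Storing the control's key with every experimental row is one way to make sure an experimental value can always be read next to its own control.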
This citation screen comes with two data windows: by all authors and by only first authors. The radio buttons perform sorting and filtering tasks, whereas the white data entry fields can be used to type in numbers or text and run by pressing the Enter key.
If you find an interesting paper and want to read it online, click on the Abstract button and follow the directions. One of the great advantages of using a relational database to store our published data is that we can find them quickly and easily. By connecting text (hierarchy) and data (numerical) tables, we get a set of summary tables, one for each level of the hierarchy. When I click on a hierarchy button, it turns yellow and displays its contents. In turn, it can be sorted and filtered using the screen controls provided. Notice the little black rectangle in the lower left corner of the screen. By clicking and dragging, you can use it to create a split screen. When I want to find the data of a given paper, I enter its citation number and press Enter. In this example, entering 3333 and pressing Enter produced a blank screen because there were no data in organ subcompartment 3 (OSc3). I continue my search by clicking on all the remaining hierarchy buttons. There are some in organ (O) and more in organ subcompartment 5 (OSc5). A similar catalogue exists for experimental data. Hierarchy buttons provide access to the data summaries. Scroll the screen to the right. Notice that numerical values significantly different from their controls are highlighted in red. Data can be displayed hierarchically - paper by paper. I enter the citation number (cit_nu) and press Enter. As an example, I can enter the citation number <3333> and press Enter. We have two options. We can find things by sorting and filtering a given database table or we can construct a table from scratch by sending a set of instructions (SQL script) to the database engine. Notice that the SQL script at the bottom of the screen retrieves all the available data. I modify this global search by selecting or adding specific instructions.
If I want papers dealing with the Golgi, for example, typing <like %golgi%> in the Structure Y field will modify the script accordingly. I click on the Retrieve button to run the search. Next, I click on the View Data button to display the results in a table. A similar screen is used for experimental data. In this example, I am looking for all papers studying <stress> in the <ca1> region of the hippocampus. Notice how these instructions are added to the SQL script below. I click on the Retrieve button to run the search. Clicking on the View Data button displays the results. Methods contribute importantly to experimental complexity. When looking for papers that used specific methods, this screen can help. I enter the items of interest, and click on the Retrieve button. The query finds several references that can be displayed by clicking on the View Data button. I can use the citation number to view the standardized paper or the accession number to read the paper online. A data catalogue becomes a springboard to innovation and discovery by forming new databases. The universal biology database, for example, creates new information by connecting data in novel ways. These connections form quantitative patterns that we can capture with equations. By pursuing a data-driven biology, we begin to profit from the advantages of being a quantitative science. Notice that this screen has three data windows. The one in the middle, which is a version of the familiar data catalogue (stereology literature database), will be used to connect these data by populating the ones above and below (universal biology database).
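To make the idea of modifying a global SQL script more concrete, here is a minimal sketch of how a filter such as <like %golgi%> might be appended to a query that otherwise retrieves everything. The catalogue table, its columns, the sample rows, and the retrieve function are all hypothetical. The screens generate their own SQL, which is not reproduced here.

```python
import sqlite3

# Hypothetical catalogue table with made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalogue (cit_nu INTEGER, structure_y TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO catalogue VALUES (?, ?, ?)",
    [
        (1, "golgi apparatus", 0.5),
        (2, "mitochondrion", 1.5),
        (3, "Golgi cisterna", 0.25),
    ],
)


def retrieve(conn, structure_pattern=None):
    """Run the global search, optionally narrowed by a LIKE pattern."""
    sql = "SELECT cit_nu, structure_y, value FROM catalogue"
    params = []
    if structure_pattern is not None:
        # The pattern is passed as a parameter rather than pasted into the SQL.
        sql += " WHERE structure_y LIKE ?"
        params.append(structure_pattern)
    sql += " ORDER BY cit_nu"
    return conn.execute(sql, params).fetchall()


all_rows = retrieve(conn)               # global search: every row
golgi_rows = retrieve(conn, "%golgi%")  # narrowed search
print(len(all_rows), [row[0] for row in golgi_rows])
```

In SQLite, LIKE ignores case for ASCII letters by default, so the pattern matches both "golgi" and "Golgi" in this sketch. Other database engines may behave differently.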
Notice that four parts taken two at a time give twelve data pairs. When the sinusoid becomes the X Structure, it generates three data pairs as shown in the screen at the top. Clicking on the DP Library button displays these results in the universal biology database. Notice that each data pair becomes associated with a decimal repertoire equation that characterizes the connection quantitatively as a ratio. The decimal repertoire equation predicts the ratio and can be evaluated by comparing the predicted and observed values (note the column with the yellow background). Clicking on the Key button displays a list of terms and definitions. Forming data pairs from experimental data duplicates the process just described for control data. Experimental data pairs are formed from the data catalogue (center data entry screen). Control and experimental data pairs exist together in the same table. Data entry fields along with radio and command buttons can be used to sort and filter the table. Since the table is wider than the screen, the split screen tool is included. I use this sort command to order the table by the decimal repertoire equations. This is the result of such a sort. Query by example (QBE) screens simplify the task of running complex searches. Let's look for follicular cells in the mouse. The screen shows how to set up such a search and displays the instructions that will be sent to the database engine. The results appear after clicking on the Retrieve button. The View Data button displays the results as a scrolling table. Experimental data can also be retrieved with a query by example (QBE) screen. This query looks for volumes and mean volumes of parts in the aging rat. The query returns 52 rows of data. Control and experimental data pairs can be searched at the same time. This query wants all the data pairs that belong to decimal repertoire equation 10. The query returns 847 rows of data. Clicking on the View Data button displays the scrolling table. 
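The count of twelve data pairs follows from treating the pairs as ordered (X Structure, Y Structure): four parts taken two at a time give 4 x 3 = 12. A short sketch makes the arithmetic explicit. Only the sinusoid is named in the screens, so the other three part names are placeholders.

```python
from itertools import permutations

# "sinusoid" comes from the text; the other three names are placeholders.
parts = ["sinusoid", "part_b", "part_c", "part_d"]

# Ordered pairs (X structure, Y structure): order matters, no self-pairs.
data_pairs = list(permutations(parts, 2))

# Pairs in which the sinusoid plays the role of the X structure.
sinusoid_as_x = [pair for pair in data_pairs if pair[0] == "sinusoid"]

print(len(data_pairs), len(sinusoid_as_x))
```

Each part serves as the X Structure for three pairs, one for each of the remaining parts, which is why the sinusoid generates three.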
Notice that data pairs throughout the hierarchy can share a similar ratio (i.e., decimal repertoire equation). When I scroll to the bottom of the screen, a data plot shows that parts large, small, and in between can share similar ratios. This tells us that order - defined as specific connections - can scale from small to large. Change can be quantified as it occurs (transitional) or after it reaches a steady state. A steady state change comparing an experimental data pair ratio (DRE) to its control data pair ratio (DRE-C) can also be expressed as a % Ratio Change (%RC-S). Notice that by simply moving the data pairs into a new configuration, a new form of information appears. Finding patterns of change reduces to querying the database. During aging, what parts change in number by more than 25%? This is the question being asked by the query screen. The question has 337 answers. Data pairs - adjacent in time - detect changes as they occur. These data pair ratios (DPRs) can be plotted as phenotypes or used to calculate % change ratios (%CR-T). Fields highlighted in yellow identify changes at the 15% level. Here is the right side of the previous screen. Query by example (QBE) screens simplify the task of running complex searches. What happens to pyramidal cells when exposed to corticosterone? The query retrieves 45 rows of data with little evidence of change (fields highlighted in yellow). Biology - by rule and necessity - uses concentrations, amounts, and proportions to run its business. Take these variables out of their biological settings (expressed in this module as equations) and they often - unwittingly - lead us astray. Put them back where they belong, and they can tell us what we want to know. In biology, change represents a complex event because it involves many interacting parts (variables).
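As a rough sketch of a steady state comparison, a percent change of an experimental ratio relative to its control ratio can be computed and then filtered by a threshold such as 25%. The formula below, 100 x (experimental - control) / control, is my assumption of how such a percent change might be defined. The screens may calculate %RC-S differently, and the pair names and ratios are made up.

```python
def percent_ratio_change(experimental_ratio, control_ratio):
    """Assumed definition: percent change of a ratio relative to its control."""
    return 100.0 * (experimental_ratio - control_ratio) / control_ratio


# Made-up rows: (data pair name, control ratio, experimental ratio).
rows = [
    ("pair_1", 1.0, 1.5),
    ("pair_2", 2.0, 2.2),
    ("pair_3", 2.0, 1.0),
]

THRESHOLD = 25.0  # percent, in either direction

changed = [
    name
    for name, control, experimental in rows
    if abs(percent_ratio_change(experimental, control)) > THRESHOLD
]
print(changed)
```

Taking the absolute value catches both increases and decreases, which seems to match a question such as "what parts change by more than 25%?"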
This means that to detect a biological change correctly, we improve our chances of success importantly by writing equations that include the necessary variables and relationships. This, of course, defines the strategy of a quantitative science. One collects the variables as data and solves the experiment by evaluating the equation. A worked example will be helpful. If I change the volume of the cell only very slightly, then the concentration of the molecules appears to decrease by 15% - even when the real number of molecules remains constant. Ignore complexity and the concentration data becomes very mischievous indeed. Alternatively, I can increase the concentration of molecules by changing the cell volume in the opposite direction. Notice the protective effect of the equations. Ignoring key variables welcomes failure, whereas including them in equations leads to success. Did the molecules increase? No. Did the molecules decrease? No. Even a cursory read of the literature reveals that vast amounts of biological data are collected as optical densities (or related techniques) and then used as such for detecting biological changes. Tab 8 introduces the problem by explaining the relationship of optical densities to concentrations. Tab 9 puts optical densities, which are concentrations, into the equation of the experiment.
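The worked example can be reduced to a few lines of arithmetic, assuming the simplest possible equation: concentration = number of molecules / cell volume. The numbers below are illustrative and are not taken from the screens, whose equations may include more variables.

```python
def concentration(molecules, volume):
    """Simplest assumed model: concentration = amount / volume."""
    return molecules / volume


def percent_change(before, after):
    return 100.0 * (after - before) / before


molecules = 1000.0     # illustrative count, held constant
volume_before = 85.0   # illustrative cell volumes (arbitrary units)
volume_after = 100.0

conc_before = concentration(molecules, volume_before)
conc_after = concentration(molecules, volume_after)

# Concentration alone suggests a loss of molecules ...
apparent = percent_change(conc_before, conc_after)

# ... but putting the volume back into the equation recovers the amount.
amount_before = conc_before * volume_before
amount_after = conc_after * volume_after
real = percent_change(amount_before, amount_after)

print(round(apparent, 1), round(real, 1))
```

In this sketch the concentration falls by 15% while the number of molecules does not change at all. Swapping the two volumes produces an apparent increase in the same way.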
We can unfold biological complexity with decimal repertoire equations (DREs) and refold it by reconnecting the DREs. In effect, such equations allow us to reverse and forward engineer biology.
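One way to picture forward engineering is as the propagation of a single seed value through a set of connecting ratios. The sketch below assumes each connection is stored as a ratio Y/X for a data pair (X, Y). The part names and ratios are placeholders, and the function is my own illustration rather than the method used by the software.

```python
from collections import deque


def forward_engineer(seed_part, seed_value, ratios):
    """Propagate one seed value through ratios keyed by (x, y) with ratio = y / x."""
    values = {seed_part: seed_value}
    queue = deque([seed_part])
    while queue:
        known = queue.popleft()
        for (x, y), ratio in ratios.items():
            if x == known and y not in values:
                values[y] = values[x] * ratio   # forward along the pair
                queue.append(y)
            elif y == known and x not in values:
                values[x] = values[y] / ratio   # backward along the pair
                queue.append(x)
    return values


# Placeholder parts and ratios, for illustration only.
ratios = {("part_a", "part_b"): 2.0, ("part_b", "part_c"): 0.5}

from_a = forward_engineer("part_a", 10.0, ratios)
from_c = forward_engineer("part_c", 10.0, ratios)
print(from_a, from_c)
```

Because every ratio can be read in both directions, any part can serve as the seed, provided the ratios connect all of the parts.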
Consider the hippocampus. If I know the proportions of the parts and can express all such proportions as equations, then I can forward engineer a hippocampus from a single seed value. This screen does it for humans, monkeys, mice, rats, and shrews. Biology is in the business of building and maintaining phenotypes by rule. The blueprint summarizes the results of this effort and shows us exactly what genomes do to create complexity downstream. This screen collects proportions. To work our way upstream to the genome quantitatively, the blueprint offers us a place to start. What patterns occur and how, when, and where do they change? What do these counts and sums tell us? They tell us that different people running different experiments on similar and different animals routinely find the same proportions. Will ready access to the stoichiometry of the parts be fundamental to working out the details of gene action? Question. What parts occur in the proportion one to three (1:3)? Answer = 9. Click on the View Screen button. Here are the parts and data pairs. Notice that the same data pairs can produce several different proportions. This is a clue. By comparing changes in absolute amounts to those of concentrations, one soon discovers that concentration data - when used in isolation - quickly diminish the credibility of a study. Notice that the concentration data (VV; volume density) show decreases (blue), but the absolute data (V; volume) show increases (red), decreases (blue), and no changes (green). The software package includes a folder of worked examples for development, disease, and exposure. PDF files include installation, reports, rule book, and forms/templates. The document includes directions related to the software package and a listing of the software and documents. Reports (2001 to 2009) chronicle the progress of the project. They explain how quantitative patterns were found and reveal just how well-ordered biology seems to be.
A brief primer offers guidelines to a quantitative biology. It starts by defining data for structure, looks at the problem of counting molecules, and explores the roots of biological complexity. Standard forms speed the task of moving data from the pages of journal articles into the tables of relational databases. Templates serve as guides for data analysis. I use this form to construct the structural hierarchies for data entry. This template collects the data summaries that generate connection phenotypes.