Difference between revisions of "SEED Viewer Manual/Evidence"

Revision as of 07:16, 25 November 2008

The Evidence Page is divided into two parts via a TabView: The Visual Protein Evidence and the Tabular Protein Evidence.

Visual Protein Evidence

When the Evidence Page loads, the first tab of the TabView is selected. It displays pre-computed tool results for the given feature graphically. In this view, you can see evidence for the Location of the gene product in the cell, evidence for protein Domains, and evidence for Similarities to other features.

Location

Location stands for the location of the feature's product in the cell. This section presents output from tools that look for transmembrane helices (TM) or signal peptides (SP) in the feature. In the example, the Phobius tool identified five transmembrane helices in the protein. They are drawn as small boxes, and their positions on the line correspond to the positions of the transmembrane helices in the protein.

Domains

This section shows pre-computed domains for the selected feature. In the example, you can find a CDD domain and a Pfam domain for the feature. The blue bar marks the location of the domain found in the protein (the line depicts the whole length of the protein).

Additional tools can be accessed via the Feature Tools Menu in the menu bar.

Similarities

This section graphically lists evidence for similarities to other features in the SEED database (and, optionally, in other databases). The E-Value Key shown at the top defines the colors used to display the different E-value ranges of the similarities to the hit features. Hovering over the E-Value Key shows the value range for each color.
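As a rough illustration of how a color key of this kind can work, the sketch below maps an E-value to a color by looking up the range it falls into. The thresholds and color names are invented for the example and are not the values used by the SEED Viewer; the real ranges are the ones shown when hovering over the E-Value Key.

```python
from bisect import bisect_left

# Hypothetical upper bounds of the E-value ranges (ascending) and one
# hypothetical color per range. These are NOT the SEED Viewer's values.
THRESHOLDS = [1e-100, 1e-50, 1e-20, 1e-5]
COLORS = ["red", "orange", "yellow", "green", "blue"]


def evalue_color(evalue: float) -> str:
    """Return the color of the range that contains the given E-value."""
    # bisect_left finds the first threshold >= evalue, so an E-value that
    # equals a threshold still belongs to the range ending at that threshold.
    return COLORS[bisect_left(THRESHOLDS, evalue)]
```

Smaller E-values (stronger hits) land in the earlier ranges, and anything above the last threshold falls into the final color.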

Each similarity is represented by two bars that show the alignment. The first bar is the query feature, the second is the hit feature. The abbreviation in front of the second bar indicates the organism that contains the hit feature. Hover over the abbreviation to see the complete organism name. The functional role of the hit feature is shown after the bar.

The length of the outer box shows the complete length of the respective sequence. The color of the outer box represents the E-value range of the similarity according to the E-Value Key. The length of the inner (white) box depicts the region of the sequence that is covered by the similarity to the other feature. Hovering over the box shows information about the hit feature (see tooltip graphics below), including the functional role, the subsystems and some values describing the hit region.
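The relation between sequence coordinates and the two boxes can be sketched as follows. This is only an illustration of the geometry described above, not the viewer's drawing code; the function name, the pixels_per_residue scale and the use of 1-based inclusive coordinates are assumptions.

```python
def similarity_boxes(seq_length, region_start, region_end, pixels_per_residue=1.0):
    """Return (outer_width, inner_offset, inner_width) in pixels.

    Coordinates are assumed to be 1-based and inclusive; the scale factor
    is a hypothetical drawing parameter.
    """
    if not (1 <= region_start <= region_end <= seq_length):
        raise ValueError("region must lie within the sequence")
    # The outer box spans the whole sequence.
    outer_width = seq_length * pixels_per_residue
    # The inner box starts where the aligned region starts ...
    inner_offset = (region_start - 1) * pixels_per_residue
    # ... and is as wide as the aligned region is long.
    inner_width = (region_end - region_start + 1) * pixels_per_residue
    return outer_width, inner_offset, inner_width
```

For a 400-residue sequence whose hit region covers residues 101 to 200, drawn at 0.5 pixels per residue, the outer box is 200 pixels wide and the inner box starts 50 pixels in and is 50 pixels wide.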

If you check some of the checkboxes in front of the functional role descriptions of the hit genes, you can access two functions via the buttons above the similarity graphics. The button Align Selected leads to an alignment page showing a TCoffee alignment of the selected features. FASTA Download Selected lets you download the selected sequences in amino acid FASTA format.

A small control box above the similarity graphics lets you change how the hits are sorted and filtered. Max Sims is the maximum number of similarities listed on the page. Max E-Value filters out all similarities with an E-value higher than the value stated here. In the combo box below these two values, you can choose to see only hits against the SEED database (Just FIG IDs), or also hits against other databases (Show all Databases). You can Sort the Results By Score, Percent Identity (default) or Score per position. As with BLAST hits, these values are local to the hit region, so a high percent identity over a very small hit region can make a similarity show up as one of the first hits, as shown in the example. Checking Group by Genome aggregates all hits to features in the same genome; a blue box marks hits that belong to the same genome. After choosing the values, press the Resubmit button to update the evidence view.
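The effect of these controls can be sketched in a few lines. The code below is an illustration only and not the SEED Viewer's implementation: the Hit record, the default values, the order in which filtering, sorting and truncation are applied, and the reading of Score per position as score divided by the length of the hit region are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    # Hypothetical record for one similarity; field names are illustrative.
    hit_id: str
    genome: str
    evalue: float
    percent_identity: float
    score: float
    aligned_length: int


SORT_KEYS = {
    "score": lambda h: h.score,
    "percent_identity": lambda h: h.percent_identity,
    # Assumption: score divided by the length of the hit region.
    "score_per_position": lambda h: h.score / h.aligned_length,
}


def select_hits(hits, max_sims=50, max_evalue=1e-5,
                sort_by="percent_identity", group_by_genome=False):
    """Filter by E-value, sort best-first, truncate, optionally group."""
    kept = [h for h in hits if h.evalue <= max_evalue]
    kept.sort(key=SORT_KEYS[sort_by], reverse=True)
    kept = kept[:max_sims]
    if group_by_genome:
        # Genomes appear in the order of their best-ranked hit; the stable
        # sort keeps the ranking of the hits within each genome.
        first_seen = {}
        for h in kept:
            first_seen.setdefault(h.genome, len(first_seen))
        kept.sort(key=lambda h: first_seen[h.genome])
    return kept
```

With the default sort by percent identity, a short hit region with 95% identity ranks ahead of a much longer region with 60% identity, which is the effect described above. Sorting by score puts the longer, higher-scoring alignment first.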

Tabular Protein Evidence

Activate the second tab of the page-spanning TabView to see the tabular view of the evidence. It contains most of the information shown in the visual view, presented differently and enriched with some additional information. The Identical Proteins and Functionally Coupled sections are added, while Location information is not presented in this tab.

Similarities

The similarity table lists hits to similar features in the SEED database (and, optionally, in other databases), as described for the Visual Protein Evidence. Each row in the table represents a hit.

The first column provides a checkbox to select a hit feature. Again, the buttons Align Selected and FASTA Download Selected are present and can be used to open a TCoffee alignment page or to download the protein sequences of the selected features in FASTA format. The two buttons in the column header allow mass selection of the features. All selects all features visible in the table; check to last checked lets you select all features up to a selected feature in the table.

The IDs of the hit features, as well as links to the annotation page, are displayed in the column Similar FIG Sequence.

The next four columns describe the hit regions of the query and hit features (E-value, Percent Identity, Region in Query peg and Region in Similar Sequence).

The remaining columns are Organism, Function, Associated Subsystem and Evidence Code. Several column headers carry a question mark [?] that opens a help text:

Region in Query peg and Region in Similar Sequence (Alignment Color Help): Cell colors represent the amount and the region of similarity between the query and hit sequence. Click the question mark for more information.

Organism (Organism Color Help): Organism cells are colored according to their taxonomic family.

Function (Function Color Help): Colors in the function cell relate to the similarity of the function to that of the query sequence. Click the question mark for the meaning of the colors.

Evidence Code (Evidence Code Help): The evidence code reflects significant factors that go into making assignments of function. Click the question mark for more information.

Domains

This section shows pre-computed domains for the selected feature. In the example, you can find a CDD domain and a Pfam domain for the feature. The blue bar marks the location of the domain found in the protein (the line depicts the whole length of the protein).

The table lists the Domain DB (the database for the domain that was hit), the ID in the domain database, the Name of the domain, the Location of the hit in the selected feature, the Score for the hit against the domain, as well as the Function of the domain.

The table can be exported using the export table button.

Additional tools can be accessed via the Feature Tools Menu in the menu bar.

Identical Proteins

Essentially Identical Proteins are proteins that share a common sequence, but whose start positions may vary a little. This definition was introduced because a protein is often present in different databases or in closely related strains of an organism with a start position that was shifted during the gene-finding step. In essence, this table shows aliases of the feature that are based on protein identity.

The first column of the table shows the Database the alias can be found in, while the second column (ID) gives the alias name and a link to the protein in the respective database. The following two columns give the Organism and the Assignment recorded for the alias.
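One simple way to picture this definition is as a suffix comparison: two protein sequences end identically, and one merely starts a few residues earlier than the other. The sketch below is a hedged illustration of that idea and not the rule the SEED actually applies; the function name and the max_start_shift tolerance are assumptions. The first residue of the shorter sequence is skipped because a called start codon is translated as methionine even where the longer protein has a different residue at that position.

```python
def essentially_identical(seq_a: str, seq_b: str, max_start_shift: int = 30) -> bool:
    """Return True if the sequences differ only by a small shift of the start.

    max_start_shift is a hypothetical tolerance in residues.
    """
    longer, shorter = (seq_a, seq_b) if len(seq_a) >= len(seq_b) else (seq_b, seq_a)
    if not shorter:
        return False
    if len(longer) - len(shorter) > max_start_shift:
        return False
    # Compare everything after the start residue of the shorter sequence
    # against the end of the longer sequence.
    return longer.endswith(shorter[1:])
```

For example, MKTAYIAKQR and MAYIAKQR would count as essentially identical under this sketch, because the second sequence matches the end of the first apart from its start residue.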

Functionally Coupled

This table lists all functionally coupled genes in the organism. You can see the Score, the ID of the feature and the Function of the feature.

@@ Line 45: / Line 45: @@
 The similarity [[WebComponents/Table|table]] lists hits to similar features in the SEED database (or also other databases), like described for the [[SEED_Viewer_Manual/Visual Protein Evidence|Visual Protein Evidence]]. Each row in the table represents a hit.
-The first column provides a checkbox to select a hit feature. Again, the buttons '''Align Selected''' and '''FASTA Download Selected''' are present and can be used to get to a TCoffee [[SEED_Viewer_Manual/AlignSeqs|alignment page]] or download the protein sequences of the selected features in FASTA format. The two buttons in the column header allow mass selection of the features. '''All''' will select all features visible in the table. '''check to last checked''' lets you select all features up to a selected feature in the [[WebComponents/Table|table]].
+The first column provides a checkbox to select a hit feature. Again, the buttons '''Align Selected''' and '''FASTA Download Selected''' are present and can be used to get to a TCoffee [[SEED_Viewer_Manual/AlignSeqs|alignment page]] or download the protein sequences of the selected features in FASTA format. The two buttons in the column header allow mass selection of the features. '''All''' will select all features visible in the table, '''check to last checked''' lets you select all features up to a selected feature in the [[WebComponents/Table|table]].
+The ID of the hit features, as well as a link to the [[SEED_Viewer_Manual/Annotation|annotation page]] is displayed in the column '''Similar FIG Sequence'''.
+The next four columns describe information to the hit regions of the query and hit features ('''E-value''', '''Percent Identity''',	'''Region in Query peg''' and '''Region in Similar Sequence''').
-Organism Color Help
-Organism cells are colored according to their taxonomy family.
-[?]
-	Similar FIG Sequence	E-value	Percent Identity	Region in Query peg
-Alignment Color Help
-Cell colors represent the amount and the region of similarity between the query and hit sequence. Click question mark for more information.
-[?]	Region in Similar Sequence
-Alignment Color Help
 Cell colors represent the amount and the region of similarity between the query and hit sequence. Click question mark for more information.
 [?]	Organism


