Difference between revisions of "Keyword Box"

From TheSeed
(Changed to a more complicated example gene)
m (safety save)
Line 1: Line 1:
The NMPDR keyword search works like a typical search engine. You type in the appropriate words, and a list of genes will come back.
+
The NMPDR keyword search works like a typical search engine. You type in the appropriate words, and a list of genes comes back. Our keyword database contains millions of words, including ''vitamins'', ''aldolase'', and ''pyrophosphokinase''. The NMPDR looks at nine specific data items when computing the keywords for a gene. The table below shows each of the nine items along with the keywords derived from it for the gene '''fig|171101.1.peg.269''', a dual-role protein-encoding gene for ''Streptococcus pneumoniae R6'' that has 40 keywords.
 
 
The keywords for a specified gene are as follows. The examples shown are for the gene '''fig|171101.1.peg.269''', which is sulD for ''Streptococcus pneumoniae R6''.
 
  
 
{|
 
{|
Line 24: Line 22:
  
  
Notes
+
====Notes====
  
 +
* Some keywords appear twice.
 
* In the functional role, hyphenated words are stored in their full form (''2-amino-4-hydroxy-6-hydroxymethyldihydropteridine'') as well as broken up on the hyphen boundaries (''amino'', ''hydroxy'', ''hydroxymethyldihydropteridine'').
 
* Keywords are case-insensitive.
* Special keywords indicate attributes of the gene. The list of special keywords currently supported appears below.
+
* Special keywords indicate attributes of the gene. Most of these are incomplete: for example, we know certain genes are virulence-associated, but for most of the genes we have no virulence data.
** ''virulence'', which indicates the gene participates in the process of helping the organism to damage its host
+
** ''virulence'', which indicates the gene participates in the process of helping the organism to damage its host. This attribute is incomplete.
** ''essential'', which indicates that the gene is essential to the survival of the organism
+
** ''essential'', which indicates that the gene is essential to the survival of the organism. This attribute is incomplete.
 
** ''iedb'', which indicates that the gene is listed in the [http://www.immuneepitope.org/home.do Immune Epitope Database].
 +
 +
===Advanced Keyword Searching===
 +
 +
Normally, the search process selects the genes relevant to all the words in the keyword box. You can modify the default behavior by prefixing control characters to the keywords.
 +
 +
{|
 +
!Char!!Meaning!!Example!!Explanation of Example
 +
|-
 +
|'''-'''||negation||'''2.7.6.3 -firmicutes'''||search for all genes with EC number 2.7.6.3 that are not in firmicutes
 +
|-
 +
|'''()'''||optional||

Revision as of 14:05, 28 July 2007

The NMPDR keyword search works like a typical search engine. You type in the appropriate words, and a list of genes comes back. Our keyword database contains millions of words, including vitamins, aldolase, and pyrophosphokinase. The NMPDR looks at nine specific data items when computing the keywords for a gene. The table below shows each of the nine items along with the keywords derived from it for the gene fig|171101.1.peg.269, a dual-role protein-encoding gene for Streptococcus pneumoniae R6 that has 40 keywords.

Data item|Keywords
FIG gene identifier|171101.1.peg.269
The aliases|15902313, kegg|spd:SPD_0272, kegg|spr:spr0269, NP_357863.1, sp|P59657, spr0269, sulD, tr|Q04MF8, uni|P59657, uni|Q04MF8
All words in the functional role|Dihydroneopterin, aldolase, 2-amino-4-hydroxy-6-hydroxymethyldihydropteridine, pyrophosphokinase, amino, hydroxy, hydroxymethyldihydropteridine
The genome ID|171101.1
All words in the taxonomy|bacteria, firmicutes, lactobacillales, streptococcaceae, streptococcus, pneumoniae, r6
The subsystem names and classifications|folate, biosynthesis, cofactors, vitamins, prosthetic, groups, pigments, folates, pterines
The EC number|2.7.6.3, 4.1.2.25
The subsystem role|2-amino-4-hydroxy-6-hydroxymethyldihydropteridine, pyrophosphokinase, amino, hydroxy, hydroxymethyldihydropteridine
Special Keywords|essential


Notes

  • Some keywords appear twice because more than one data item yields them (for example, amino and hydroxy come from both the functional role and the subsystem role).
  • In the functional role, hyphenated words are stored in their full form (2-amino-4-hydroxy-6-hydroxymethyldihydropteridine) as well as broken up on the hyphen boundaries (amino, hydroxy, hydroxymethyldihydropteridine).
  • Keywords are case-insensitive.
  • Special keywords indicate attributes of the gene. Most of these are incomplete: for example, we know certain genes are virulence-associated, but for most of the genes we have no virulence data.
    • virulence, which indicates the gene participates in the process of helping the organism to damage its host. This attribute is incomplete.
    • essential, which indicates that the gene is essential to the survival of the organism. This attribute is incomplete.
    • iedb, which indicates that the gene is listed in the Immune Epitope Database.
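
The hyphen and case rules in the notes above can be sketched in a few lines of Python. This is only an illustration, not the NMPDR code: the function name role_keywords is hypothetical, and dropping the purely numeric fragments (2, 4, 6) is an assumption inferred from the example table, where those fragments do not appear as keywords.

```python
from typing import List


def role_keywords(role: str) -> List[str]:
    """Derive keywords from a role string (illustrative sketch only)."""
    full_forms: List[str] = []
    pieces: List[str] = []
    # Keywords are case-insensitive, so everything is lowercased first.
    for word in role.lower().split():
        word = word.strip(",;()")
        if not word:
            continue
        full_forms.append(word)  # full form, hyphens intact
        if "-" in word:
            # Assumption: numeric fragments such as 2, 4, 6 are not kept.
            pieces.extend(p for p in word.split("-") if p and not p.isdigit())
    return full_forms + pieces


print(role_keywords(
    "2-amino-4-hydroxy-6-hydroxymethyldihydropteridine pyrophosphokinase"
))
```

For the subsystem role of the example gene, this yields the full hyphenated name, pyrophosphokinase, and then amino, hydroxy, and hydroxymethyldihydropteridine, which matches the corresponding table row.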

Advanced Keyword Searching

Normally, the search process selects the genes relevant to all the words in the keyword box. You can modify the default behavior by prefixing control characters to the keywords.

Char|Meaning|Example|Explanation of Example
-|negation|2.7.6.3 -firmicutes|search for all genes with EC number 2.7.6.3 that are not in firmicutes
()|optional|
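
As a rough sketch of the default behavior and the negation prefix, the following Python treats a gene as a match when its keyword set contains every plain word in the query and none of the words prefixed with "-". The function name matches and the exact set-membership test are assumptions made for illustration; the real search engine may rank or tokenize differently, and the "()" prefix is not modeled because its behavior is not described here.

```python
from typing import Iterable


def matches(query: str, gene_keywords: Iterable[str]) -> bool:
    """Return True if a gene's keywords satisfy the query (illustrative sketch)."""
    keywords = {k.lower() for k in gene_keywords}  # keywords are case-insensitive
    required, excluded = [], []
    for term in query.lower().split():
        if term.startswith("-") and len(term) > 1:
            excluded.append(term[1:])  # "-word" means the gene must NOT have it
        else:
            required.append(term)
    return (all(t in keywords for t in required)
            and not any(t in keywords for t in excluded))


# A few keywords of the example gene, taken from the table above.
GENE = {"171101.1.peg.269", "suld", "2.7.6.3", "4.1.2.25",
        "firmicutes", "streptococcus", "pneumoniae", "essential"}

print(matches("2.7.6.3 streptococcus", GENE))  # all words present
print(matches("2.7.6.3 -firmicutes", GENE))    # excluded: the gene is in firmicutes
```

Under these assumptions the first query selects the example gene and the second does not, since fig|171101.1.peg.269 belongs to the firmicutes.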