Scansite

Scansite 4 Web Service

  1. Introduction
  2. Definitions
  3. Usage-Example
  4. Basic Functions
    1. Get valid stringency values
    2. Get valid dataSources
    3. Get valid motif classes
    4. Get organism classes
    5. Get valid motifs from a given class
    6. Check if proteins exist in Scansite's database mirrors
  5. Scansite Functions
    1. Scan a protein for motifs using a protein accession
    2. Scan a protein for motifs using a protein sequence
    3. Search a sequence database for a Scansite motif
    4. Search a sequence database for a sequence pattern
    5. Scan a protein for evolutionarily conserved phosphorylation sites
    6. Predict cellular localization of a protein
  6. Errors

1. Introduction

Welcome to the Scansite 4 web service. This page explains how to access the most important Scansite 4 features programmatically.

The RESTful interface lets you run most features through a URI in which you specify the desired parameters. Many parameters are restricted, meaning that only specific values are allowed; please read the Definitions section for details. All services return well-formed, valid XML.


2. Definitions

In the following sections some abbreviations and quantifiers will be used for defining parameters. The meaning of these is explained here:

[ANY] Any value
[DEC] Numbers with decimal point are allowed
[DS] Only dataSource nicknames as returned by the dataSources-service are permitted.
[M] Only one motif nickname as returned by the motifDefinitions-service is permitted.
[MC] Only a motif class as returned by the motifClasses-service is permitted.
[MS] Only motif nicknames as returned by the motifDefinitions-service are permitted. If you choose to enter more than one motif nickname, separate the nicks by a tilde ('~').
[NP] Only a number in the range [0-3] is allowed.
[NUM] Integer value
[OC] Only organism classes as returned by the organismClasses-service are permitted.
[P] A valid protein accession is needed here. You can use the proteinExists-service to find out whether a protein exists in our database.
[SEQ] Only a protein sequence is permitted.
[ST] Only stringency values as returned by the stringencyValues-service are permitted.
? Optional parameter. If a parameter is marked with this quantifier, its value (i.e. the right-hand side of the equals sign '=') can be left blank.
In general, if no quantifier is given, the parameter is mandatory!
.. This web page's base URI. All services are extensions of this page's URI (without any index.html). For example, if this page's URI is https://scansite4.mit.edu/webservice/, then ".." is equivalent to this URI and the organismClasses-service can be accessed using this URI:
https://scansite4.mit.edu/webservice/organismclasses
This convention was chosen so that the service URIs described below fit more easily on your screen.
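
The ".." convention can be mirrored in client code. The following Python sketch is illustrative only: the helper name service_uri is hypothetical, the base URI is the one used in the example above, and parameters are assumed to be appended as name=value path segments exactly as they appear in the service definitions on this page. Values are not percent-encoded here; whether the server expects encoding for special characters is not stated on this page.

```python
# The ".." of this page, taken from the example above.
BASE_URI = "https://scansite4.mit.edu/webservice/"


def service_uri(service, params=None, base=BASE_URI):
    """Expand '../<service>/<name>=<value>/...' into a full URI.

    `params` is a list of (name, value) pairs. A list is used instead of
    a dict to make the order of the path segments explicit.
    """
    segments = [service]
    for name, value in (params or []):
        segments.append(f"{name}={value}")
    return base.rstrip("/") + "/" + "/".join(segments)


print(service_uri("organismclasses"))
print(service_uri("proteinexists",
                  [("identifier", "vav_mouse"), ("dsshortname", "swissprot")]))
```

The first call reproduces the organismclasses URI given above; the second builds a proteinexists URI of the form described under Basic functions.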

3. Usage-Example

To make it easier to understand how valid Scansite 4 web service URIs are put together, this section walks you through the steps required to run a Scansite feature, using the example of scanning a protein (VAV_HUMAN) from a public protein database (SwissProt) for mammalian motifs with the highest possible stringency.

  1. First of all, decide what you want to do and take a look at the instructions provided below. In the case of this example, the proteinScan-feature is what we want.
  2. The instructions for how to prepare a service-URI contain a couple of highlighted hyperlinks that refer to definitions above. These variable parts of the URI have to be defined in the next steps. Here, this includes [P], [DS], [MC], [MS]?, and [ST].
  3. The links you find in the service definitions have different meanings: some accept any input, while others accept only input from a specific set of options, which you can usually obtain by using one of the basic web service functions.
    To illustrate, [P] means that you have to enter a protein accession that exists in Scansite 4's database, and [DS] requires a valid Scansite 4 dataSource nickname. But how do you know which proteins and dataSources are valid? Use the dataSources- and proteinExists-functions for dataSources and proteins, respectively.
    The dataSources-service at ../datasources returns an XML file that looks something like this:
    <dataSources>
        <!-- ... -->
        <dataSource>
            <name>SwissProt</name>
            <nickName>swissprot</nickName>
            <!-- ... -->
        </dataSource>
        <!-- ... -->
    </dataSources>
    
    Knowing that the protein accession you are interested in is from SwissProt, you can go to the next step and check whether it is available in Scansite 4's mirror of SwissProt. To do so, run the proteinExists-service with the nickname defined for SwissProt (swissprot) and the accession you are interested in (VAV_HUMAN) by accessing ../proteinexists/identifier=vav_human/dsshortname=swissprot . You will receive a boolean result containing the value true if the protein exists and false if it does not. Luckily, the protein exists in Scansite 4's mirror of SwissProt, and you see:
    <booleanResult>
        <isSuccess>true</isSuccess>
    </booleanResult> 
    
    The motifClasses- (for [MC]), stringencyValues- (for [ST]), and motifDefinitions-services (for [MS]) all work in a similar manner. Please note that the question mark after [MS] indicates that this parameter is optional. This means that if you want to look for all motifs of a given motif class, simply leave the value of [MS] empty. If you want to check for a set of specific motifs, use the motif nicknames you are interested in (as returned by the motifDefinitions-service), separated by tildes (~), e.g. Lck_Kin~Shc_SH2~Cdc2_Kin .
    Following through with these instructions, we end up with the following values for our wildcards:
    Wildcard  Value
    [P]       VAV_HUMAN
    [DS]      swissprot
    [MC]      MAMMALIAN
    [MS]?     (nothing, we want to look for all mammalian motifs)
    [ST]      High
  4. Having gathered all the information in the preceding step, you are now ready to run the actual proteinScan-service. Putting these pieces of information together, we create the URI ../proteinscan/identifier=VAV_HUMAN/dsshortname=swissprot/motifclass=MAMMALIAN/motifshortnames=/stringency=High .
  5. The result is an XML-file like this:
    <proteinScanResult>
        <predictedSite>
            <motifName>Casein Kinase 2</motifName>
            <motifNickName>Casn_Kin2</motifNickName>
            <score>0.3786</score>
            <site>S135</site>
            <siteSequence>PFPTEEEsVGDEDIY</siteSequence>
        </predictedSite>
        <!-- ... -->
        <proteinName>VAV_HUMAN</proteinName>
        <proteinSequence>
            MELWRQCTHWLIQCRV<!-- ... -->
        </proteinSequence>
    </proteinScanResult>
    
    It includes a list of all the predicted sites, along with the protein sequence, the sites' positions, scores, sequences and matching motifs.

The workflow described here can be applied to all other Scansite 4 features as well, and will hopefully help you use this and other Scansite 4 web services.
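
Once a result like the one in step 5 has been retrieved, its predicted sites can be read with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree and assumes that the response has exactly the element names shown in the sample above; the function name parse_protein_scan and the dictionary keys are hypothetical, and the embedded sample is abridged to a single predicted site.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample result from step 5 (one predicted site only).
SAMPLE = """<proteinScanResult>
    <predictedSite>
        <motifName>Casein Kinase 2</motifName>
        <motifNickName>Casn_Kin2</motifNickName>
        <score>0.3786</score>
        <site>S135</site>
        <siteSequence>PFPTEEEsVGDEDIY</siteSequence>
    </predictedSite>
    <proteinName>VAV_HUMAN</proteinName>
</proteinScanResult>"""


def parse_protein_scan(xml_text):
    """Return (protein name, list of predicted sites) from a scan result."""
    root = ET.fromstring(xml_text)
    sites = []
    for node in root.findall("predictedSite"):
        sites.append({
            "motif": node.findtext("motifName"),
            "nickname": node.findtext("motifNickName"),
            "score": float(node.findtext("score")),
            "site": node.findtext("site"),
            "sequence": node.findtext("siteSequence"),
        })
    return root.findtext("proteinName"), sites


name, sites = parse_protein_scan(SAMPLE)
for s in sites:
    print(name, s["site"], s["nickname"], s["score"])
```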


4. Basic functions

  1. Query valid stringency values:
    URL:
    ../stringency
    
    Demonstrate:
    ../stringency
    
  2. Query valid dataSources and their nicknames:
    URL:
    ../datasources
    
    Demonstrate:
    ../datasources
    
  3. Query valid motif classes:
    URL:
    ../motifclasses
    
    Demonstrate:
    ../motifclasses
    
  4. Query valid organism classes:
    URL:
    ../organismclasses
    
    Demonstrate:
    ../organismclasses
    
  5. Query valid motifs, their group, classes and nicknames:
    URL:
    ../motifdefinitions/motifclass=[MC]
    
    Example:
    ../motifdefinitions/motifclass=YEAST
    ../motifdefinitions/motifclass=MAMMALIAN
    
  6. Check if a protein exists in one of our database mirrors:
    URL:
    ../proteinexists/identifier=[ANY]/dsshortname=[DS]
    
    Example:
    ../proteinexists/identifier=vav_mouse/dsshortname=swissprot
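
As a rough illustration of how the proteinexists check might be called from a script, the following Python sketch uses only the standard library. It assumes the base URI https://scansite4.mit.edu/webservice/ and a response of the form <booleanResult><isSuccess>true</isSuccess></booleanResult>, as shown in the usage example; the function names are hypothetical, and the network call is defined but not executed here.

```python
import urllib.request
import xml.etree.ElementTree as ET

BASE_URI = "https://scansite4.mit.edu/webservice/"  # assumed base URI ("..")


def parse_boolean_result(xml_text):
    """Return True if the <isSuccess> element contains 'true'."""
    root = ET.fromstring(xml_text)
    value = root.findtext("isSuccess", default="")
    return value.strip().lower() == "true"


def protein_exists(identifier, dsshortname, base=BASE_URI, timeout=30):
    """Query the proteinexists service (performs a network request)."""
    uri = f"{base}proteinexists/identifier={identifier}/dsshortname={dsshortname}"
    with urllib.request.urlopen(uri, timeout=timeout) as response:
        return parse_boolean_result(response.read().decode("utf-8"))


# Offline check against the response format shown in the usage example:
print(parse_boolean_result(
    "<booleanResult><isSuccess>true</isSuccess></booleanResult>"))
```

A call such as protein_exists("vav_mouse", "swissprot") would then correspond to the example URI above.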
    

5. Scansite functions

Please note that in the following definitions line breaks are used to format long URIs in a prettier way. Line breaks are indicated using a backslash ('\'). Wherever there is a backslash, just imagine there is no whitespace at all and the URI is just continued on the same line.
  1. Scan a protein for motifs using a protein identifier:
    URL:
    ../proteinscan/identifier=[P]/dsshortname=[DS]?/motifclass=[MC]?/motifshortnames=[MS]?/stringency=[ST]?/referenceproteome=[Vertebrata | Yeast]?
    
    Optional parameters:
    • motifshortnames
    • referenceproteome

    Examples:
    - ../proteinscan/identifier=vav_human/dsshortname=swissprot/motifclass=YEAST/stringency=High
    - ../proteinscan/identifier=vav_human/dsshortname=swissprot/motifclass=YEAST/motifshortnames=/stringency=High
    - ../proteinscan/identifier=vav_human/dsshortname=swissprot/motifclass=MAMMALIAN/motifshortnames=Lck_Kin~Shc_SH2/stringency=High
    - ../proteinscan/identifier=vav_human/dsshortname=swissprot/motifclass=MAMMALIAN/motifshortnames=Lck_Kin~Shc_SH2/stringency=High/referenceproteome=Vertebrata
    
  2. Scan a protein for motifs using a protein sequence:
    URL:
    ../proteinscan/identifier=[ANY]?/sequence=[SEQ]/motifclass=[MC]/motifshortnames=[MS]?/stringency=[ST]?/referenceproteome=[Vertebrata | Yeast]?
    
    Optional parameters:
    • motifshortnames
    • referenceproteome

    Examples:
    - ../proteinscan/identifier=MY_PROTEIN/sequence=RDGVLLCQLLNNLLPHAINLREVNLRPQMSQFLCLKNIRTFLSTCCEKFGLKRSELFEAFDLFDVQDFGKVIYTLSALSWTPIAQNRGIMPFPTEEESVGDEDIYSGLSDQIDDTVEEDEDLYDCVENEEAEGDEIYEDLMRSEPVSMPPKMTEYDKRCCCLREIQQTE/motifclass=YEAST/stringency=High
    - ../proteinscan/identifier=MY_PROTEIN/sequence=RDGVLLCQLLNNLLPHAINLREVNLRPQMSQFLCLKNIRTFLSTCCEKFGLKRSELFEAFDLFDVQDFGKVIYTLSALSWTPIAQNRGIMPFPTEEESVGDEDIYSGLSDQIDDTVEEDEDLYDCVENEEAEGDEIYEDLMRSEPVSMPPKMTEYDKRCCCLREIQQTE/motifclass=YEAST/motifshortnames=/stringency=High
    - ../proteinscan/identifier=MY_PROTEIN/sequence=RDGVLLCQLLNNLLPHAINLREVNLRPQMSQFLCLKNIRTFLSTCCEKFGLKRSELFEAFDLFDVQDFGKVIYTLSALSWTPIAQNRGIMPFPTEEESVGDEDIYSGLSDQIDDTVEEDEDLYDCVENEEAEGDEIYEDLMRSEPVSMPPKMTEYDKRCCCLREIQQTE/motifclass=MAMMALIAN/motifshortnames=Lck_Kin/stringency=Low
    - ../proteinscan/identifier=MY_PROTEIN/sequence=RDGVLLCQLLNNLLPHAINLREVNLRPQMSQFLCLKNIRTFLSTCCEKFGLKRSELFEAFDLFDVQDFGKVIYTLSALSWTPIAQNRGIMPFPTEEESVGDEDIYSGLSDQIDDTVEEDEDLYDCVENEEAEGDEIYEDLMRSEPVSMPPKMTEYDKRCCCLREIQQTE/motifclass=MAMMALIAN/motifshortnames=Lck_Kin/stringency=Low/referenceproteome=Vertebrata
    
  3. Search a sequence database for a motif:
    URL:
    ../databasesearch/motifshortname=[M]/dsshortname=[DS]/organismclass=[OC]/speciesrestriction=[ANY]?/numberofphosphorylations=[NP]/molweightfrom=[NUM]?/molweightto=[NUM]?/isoelectricpointfrom=[DEC]?/isoelectricpointto=[DEC]?/keywordrestriction=[ANY]?/sequencerestriction=[ANY]?
    
    Optional parameters:
    • speciesrestriction
    • numberofphosphorylations
    • molweightfrom
    • molweightto
    • isoelectricpointfrom
    • isoelectricpointto
    • keywordrestriction
    • sequencerestriction
    Please use at least the species restriction to reduce runtime!
    Examples:
     - ../databasesearch/motifshortname=Shc_SH2/dsshortname=swissprot/organismclass=Mammals/speciesrestriction=rattus/numberofphosphorylations=1/molweightfrom=5000/molweightto=20000/isoelectricpointfrom=4/keywordrestriction=cell/sequencerestriction=PPP
     - ../databasesearch/motifshortname=Shc_SH2/dsshortname=swissprot/organismclass=Mammals/numberofphosphorylations=0
    
  4. Search a protein database for a sequence pattern:
    URL:
    ../sequencematch/sequencematchregex=[ANY]/dsshortname=[DS]/organismclass=[OC]/speciesrestriction=[ANY]?/numberofphosphorylations=[NP]/molweightfrom=[NUM]?/molweightto=[NUM]?/isoelectricpointfrom=[DEC]?/isoelectricpointto=[DEC]?/keywordrestriction=[ANY]?
    
    Examples:
    - ../sequencematch/sequencematchregex=A+VCA/dsshortname=swissprot/organismclass=Mammals/speciesrestriction=human/numberofphosphorylations=0/keywordrestriction=cell
    - ../sequencematch/sequencematchregex=A+VCA/dsshortname=swissprot/organismclass=Mammals/speciesrestriction=human/numberofphosphorylations=0/molweightfrom=2000/molweightto=9000/isoelectricpointfrom=2.2/isoelectricpointto=6
    
  5. Scan a protein for evolutionarily conserved phosphorylation sites:
    URL:
    ../scanorthologs/identifier=[P]/dsshortname=[DS]/orthologydsshortname=[DS]/alignmentradius=[10|20|40|80]/stringency=[ST]/motifgroup=[motifgroup short name]/siteposition=[NUM]
    
    Example:
    - ../scanorthologs/identifier=BRCA2_HUMAN/dsshortname=swissprot/orthologydsshortname=swissprotorthology/alignmentradius=40/stringency=High/motifgroup=Acid_ST_kin/siteposition=306
    
  6. Predict cellular localization of a protein:
    URL:
    ../predictlocation/localizationdsshortname=[DS]/identifier=[P]/dsshortname=[DS]
    
    Example:
    ../predictlocation/localizationdsshortname=loctree/identifier=BRCA2_HUMAN/dsshortname=swissprot
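
Returning to the first function in this list (scanning a protein by identifier), the handling of optional parameters can be sketched in Python as follows. This is only one possible approach: the function name protein_scan_uri is hypothetical, the base URI is assumed to be https://scansite4.mit.edu/webservice/, and optional parameters that are not supplied are simply omitted from the path, as in the first proteinscan example above.

```python
BASE_URI = "https://scansite4.mit.edu/webservice/"  # assumed base URI ("..")


def protein_scan_uri(identifier, dsshortname=None, motifclass=None,
                     motifshortnames=None, stringency=None,
                     referenceproteome=None, base=BASE_URI):
    """Build a proteinscan URI, skipping parameters that were not given.

    `motifshortnames` is a list of motif nicknames; they are joined with
    a tilde ('~') as required for [MS].
    """
    params = [
        ("identifier", identifier),
        ("dsshortname", dsshortname),
        ("motifclass", motifclass),
        ("motifshortnames",
         "~".join(motifshortnames) if motifshortnames else None),
        ("stringency", stringency),
        ("referenceproteome", referenceproteome),
    ]
    segments = [f"{name}={value}" for name, value in params
                if value is not None]
    return base + "proteinscan/" + "/".join(segments)


print(protein_scan_uri("vav_human", "swissprot", "MAMMALIAN",
                       motifshortnames=["Lck_Kin", "Shc_SH2"],
                       stringency="High"))
```

Keeping the parameters in a fixed list preserves the order used in the URL definition; whether the service accepts a different order is not stated on this page.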
    

6. Errors

If you do not get the expected result there are usually two ways this may be displayed to you:

If you are sure you did everything right and your query still does not work, please let us know! Just email us. Please include what you wanted to do, what URI you used, and what happened.
