Scansite 4 Web Service
- Introduction
- Definitions
- Usage-Example
- Basic Functions
- Scansite Functions
- Scan a protein for motifs using a protein accession
- Scan a protein for motifs using a protein sequence
- Search a sequence database for a Scansite motif
- Search a sequence database for a sequence pattern
- Scan a protein for evolutionarily conserved phosphorylation sites
- Predict cellular localization of a protein
- Errors
1. Introduction
Welcome to the Scansite 4 web service. This page explains how to access the most important Scansite 4 features programmatically.
The RESTful interface allows you to run most features using a URI in which you specify the desired parameters. Many of the required parameters are restricted, which means that only specific values are allowed. Please read the Definitions section for further information. All services return well-formed, valid XML.
2. Definitions
In the following sections some abbreviations and quantifiers will be used for defining parameters. The meaning of these is explained here:
[ANY] | Any value |
[DEC] | Numbers with decimal point are allowed |
[DS] | Only dataSource's nicknames as returned by the dataSources-service are permitted. |
[M] | Only one motif nickname as returned by the motifDefinitions-service is permitted. |
[MC] | Only a motif class as returned by the motifClasses-service is permitted. |
[MS] | Only motif nicknames as returned by the motifDefinitions-service are permitted. If you choose to enter more than one motif nickname, separate the nicks by a tilde ('~'). |
[NP] | Only a number in the range [0-3] is allowed. |
[NUM] | Integer value |
[OC] | Only organism classes as returned by the organismClasses-service are permitted. |
[P] | A valid protein accession is needed here. You can use the proteinExists-service to find out whether a protein exists in our database. |
[SEQ] | Only a protein sequence is permitted. |
[ST] | Only stringency values as returned by the stringencyValues-service are permitted. |
? | Optional parameter. If you find this quantifier, the value of the parameter (i.e. the right-hand side of the equals sign '=') can be left blank. In general, if no quantifier is given, the parameter is mandatory! |
.. | This web page's base URI. All services are extensions of this page's URI (without any index.html). For example, if this page's URI is https://scansite4.mit.edu/webservice/, then ".." is equivalent to this URI and the organismClasses-service can be accessed using this URI: https://scansite4.mit.edu/webservice/organismclasses. This convention was chosen so that the service URIs described below fit more easily on your screen. |
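As a minimal sketch of the ".." convention, the following Python helper expands the shorthand into a full URI. The function name expand_service_uri is hypothetical and not part of the web service; the base URI is the one given in the table above.

```python
BASE_URI = "https://scansite4.mit.edu/webservice/"


def expand_service_uri(shorthand: str, base_uri: str = BASE_URI) -> str:
    """Expand a '../service/...' shorthand into a full web service URI."""
    prefix = "../"
    if not shorthand.startswith(prefix):
        raise ValueError("expected a shorthand starting with '../'")
    # Drop the '../' prefix and append the rest to the base URI.
    return base_uri.rstrip("/") + "/" + shorthand[len(prefix):]


print(expand_service_uri("../organismclasses"))
# prints: https://scansite4.mit.edu/webservice/organismclasses
```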
3. Usage-Example
To make it easier to understand how valid Scansite 4 web service URIs are put together, this section walks you through the steps required to run a Scansite feature. The example scans a protein (VAV_HUMAN) from a public protein database (SwissProt) for mammalian motifs at the highest possible stringency.
- First of all, decide what you want to do and take a look at the instructions provided below. In the case of this example, the proteinScan-feature is what we want.
- The instructions for how to prepare a service-URI contain a couple of highlighted hyperlinks that refer to definitions above. These variable parts of the URI have to be defined in the next steps. Here, this includes [P], [DS], [MC], [MS]?, and [ST].
- Each of the links you find in the service definitions has a different meaning: some allow you to enter whatever you want, while others accept only input from a specific set of options, which you can usually obtain by using one of the basic web service functions.
Illustrating this, [P] means that you have to enter a protein accession that exists in the Scansite 4 database, and [DS] requires a valid Scansite 4 dataSource nickname. But how do you know which proteins and dataSources are valid? Use the dataSources- and proteinExists-functions for dataSources and proteins, respectively.
The dataSources-service at ../datasources returns an XML file that looks something like this:
<dataSources>
  <!-- ... -->
  <dataSource>
    <name>SwissProt</name>
    <nickName>swissprot</nickName>
    <!-- ... -->
  </dataSource>
  <!-- ... -->
</dataSources>
The proteinExists-service returns a simple boolean result like this:
<booleanResult>
  <isSuccess>true</isSuccess>
</booleanResult>
The motifClasses- (for [MC]), stringencyValues- (for [ST]), and motifDefinitions-services (for [MS]) all work in a similar manner. Please note that the question mark after [MS] indicates that this parameter is optional. This means that if you want to look for all motifs of a given motif class, you simply do not enter any value for [MS]. If you want to check for a set of specific motifs, use the motif nicknames (as returned by the motifDefinitions-service) you are interested in, separated by tildes (~), e.g. Lck_Kin~Shc_SH2~Cdc2_Kin.
Following through with these instructions, we end up with the following values for our wildcards:
WildCard | Value |
[P] | VAV_HUMAN |
[DS] | swissprot |
[MC] | MAMMALIAN |
[MS]? | (nothing, we want to look for all mammalian motifs) |
[ST] | High |
- Having gathered all the information in the preceding step, you are now ready to run the actual proteinScan-service. Putting these pieces of information together, we create the URI ../proteinscan/identifier=VAV_HUMAN/dsshortname=swissprot/motifclass=MAMMALIAN/motifshortnames=/stringency=High (using the parameter names from the proteinscan definition in section 5).
- The result is an XML-file like this:
<proteinScanResult>
  <predictedSite>
    <motifName>Casein Kinase 2</motifName>
    <motifNickName>Casn_Kin2</motifNickName>
    <score>0.3786</score>
    <site>S135</site>
    <siteSequence>PFPTEEEsVGDEDIY</siteSequence>
  </predictedSite>
  <!-- ... -->
  <proteinName>VAV_HUMAN</proteinName>
  <proteinSequence>
    MELWRQCTHWLIQCRV<!-- ... -->
  </proteinSequence>
</proteinScanResult>
It includes a list of all the predicted sites, along with the protein sequence, the sites' positions, scores, sequences and matching motifs.
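A result of this shape can be read with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree and relies only on the element names visible in the sample above; it assumes the response carries no XML namespace, and a real response may contain further elements. The function name predicted_sites and the abridged SAMPLE document are illustrative only.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the sample response shown above (illustration only).
SAMPLE = """\
<proteinScanResult>
  <predictedSite>
    <motifName>Casein Kinase 2</motifName>
    <motifNickName>Casn_Kin2</motifNickName>
    <score>0.3786</score>
    <site>S135</site>
    <siteSequence>PFPTEEEsVGDEDIY</siteSequence>
  </predictedSite>
  <proteinName>VAV_HUMAN</proteinName>
</proteinScanResult>
"""


def predicted_sites(xml_text: str) -> list:
    """Return one dict per <predictedSite> element in a proteinScanResult."""
    root = ET.fromstring(xml_text)
    sites = []
    for node in root.iter("predictedSite"):
        sites.append({
            "motif": node.findtext("motifNickName"),
            "site": node.findtext("site"),
            "score": float(node.findtext("score")),
            "sequence": node.findtext("siteSequence"),
        })
    return sites


for entry in predicted_sites(SAMPLE):
    print(entry["motif"], entry["site"], entry["score"])
```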
The workflow pattern described here can easily be applied to all other Scansite 4 features and will hopefully help you use this and other Scansite 4 web services.
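The assembly step of this walk-through can be sketched in Python as follows. The helper protein_scan_uri is hypothetical; it uses the parameter names from the proteinscan definition in section 5 and joins several motif nicknames with tildes as described above. It only builds the URI and does not contact the server.

```python
BASE_URI = "https://scansite4.mit.edu/webservice"


def protein_scan_uri(identifier, ds_shortname, motif_class, stringency,
                     motif_shortnames=(), base_uri=BASE_URI):
    """Build a proteinscan URI; an empty motif list means 'all motifs of the class'."""
    segments = [
        ("identifier", identifier),
        ("dsshortname", ds_shortname),
        ("motifclass", motif_class),
        # Several motif nicknames are separated by a tilde ('~').
        ("motifshortnames", "~".join(motif_shortnames)),
        ("stringency", stringency),
    ]
    path = "/".join(f"{name}={value}" for name, value in segments)
    return f"{base_uri}/proteinscan/{path}"


# Values gathered in the walk-through: [P], [DS], [MC], [ST]; [MS] left blank.
print(protein_scan_uri("VAV_HUMAN", "swissprot", "MAMMALIAN", "High"))
```

Passing motif_shortnames=("Lck_Kin", "Shc_SH2") instead restricts the scan to those two motifs.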
4. Basic functions
- Query valid stringency values:
URL:../stringency
Demonstrate:../stringency
- Query valid dataSources and their nicknames:
URL:../datasources
Demonstrate:../datasources
- Query valid motif classes:
URL:../motifclasses
Demonstrate:../motifclasses
- Query valid organism classes:
URL:../organismclasses
Demonstrate:../organismclasses
- Query valid motifs, their groups, classes and nicknames:
URL:../motifdefinitions/motifclass=[MC]
Examples:
  - ../motifdefinitions/motifclass=YEAST
  - ../motifdefinitions/motifclass=MAMMALIAN
- Check if a protein exists in one of our database mirrors:
URL:../proteinexists/identifier=[ANY]/dsshortname=[DS]
Example:../proteinexists/identifier=vav_mouse/dsshortname=swissprot
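The basic functions can be called from a script in the same way as from a browser. The following sketch, using only the Python standard library, downloads a response and extracts the values needed for the restricted parameters. The element names (dataSource, name, nickName, isSuccess) are taken from the samples in the usage example; the function names, the timeout, and the assumption that responses carry no XML namespace are illustrative.

```python
import urllib.request
import xml.etree.ElementTree as ET

BASE_URI = "https://scansite4.mit.edu/webservice"


def fetch_xml(service_path: str, base_uri: str = BASE_URI,
              timeout: float = 30.0) -> ET.Element:
    """Download a service response and return its parsed XML root element."""
    url = f"{base_uri}/{service_path}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return ET.fromstring(response.read())


def datasource_nicknames(root: ET.Element) -> dict:
    """Map each dataSource name to its nickname (the value expected for [DS])."""
    return {ds.findtext("name"): ds.findtext("nickName")
            for ds in root.iter("dataSource")}


def protein_exists(root: ET.Element) -> bool:
    """Interpret a booleanResult document as returned by the proteinExists-service."""
    return (root.findtext("isSuccess") or "").strip().lower() == "true"
```

For example, datasource_nicknames(fetch_xml("datasources")) would list the available nicknames, and protein_exists(fetch_xml("proteinexists/identifier=vav_mouse/dsshortname=swissprot")) would check a single identifier.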
5. Scansite functions
Please note that in the following definitions line breaks are used to format long URIs in a prettier way. Line breaks are indicated using a backslash ('\'). Wherever there is a backslash, just imagine there is no whitespace at all and the URI simply continues on the same line.
- Scan a protein for motifs using a protein identifier:
URL:../proteinscan/identifier=[P]/dsshortname=[DS]?/motifclass=[MC]?/motifshortnames=[MS]?/stringency=[ST]?/referenceproteome=[Vertebrata | Yeast]?
Optional parameters:
  - motifshortnames
  - referenceproteome
Examples:
  - ../proteinscan/identifier=vav_human/dsshortname=swissprot/motifclass=YEAST/stringency=High
  - ../proteinscan/identifier=vav_human/dsshortname=swissprot/motifclass=YEAST/motifshortnames=/stringency=High
  - ../proteinscan/identifier=vav_human/dsshortname=swissprot/motifclass=MAMMALIAN/motifshortnames=Lck_Kin~Shc_SH2/stringency=High
  - ../proteinscan/identifier=vav_human/dsshortname=swissprot/motifclass=MAMMALIAN/motifshortnames=Lck_Kin~Shc_SH2/stringency=High/referenceproteome=Vertebrata
- Scan a protein for motifs using a protein sequence:
URL:../proteinscan/identifier=[ANY]?/sequence=[SEQ]/motifclass=[MC]/motifshortnames=[MS]?/stringency=[ST]?/referenceproteome=[Vertebrata | Yeast]?
Optional parameters:
  - motifshortnames
  - referenceproteome
Examples:
  - ../proteinscan/identifier=MY_PROTEIN/sequence=RDGVLLCQLLNNLLPHAINLREVNLRPQMSQFLCLKNIRTFLSTCCEKFGLKRSELFEAFDLFDVQDFGKVIYTLSALSWTPIAQNRGIMPFPTEEESVGDEDIYSGLSDQIDDTVEEDEDLYDCVENEEAEGDEIYEDLMRSEPVSMPPKMTEYDKRCCCLREIQQTE/motifclass=YEAST/stringency=High
  - ../proteinscan/identifier=MY_PROTEIN/sequence=RDGVLLCQLLNNLLPHAINLREVNLRPQMSQFLCLKNIRTFLSTCCEKFGLKRSELFEAFDLFDVQDFGKVIYTLSALSWTPIAQNRGIMPFPTEEESVGDEDIYSGLSDQIDDTVEEDEDLYDCVENEEAEGDEIYEDLMRSEPVSMPPKMTEYDKRCCCLREIQQTE/motifclass=YEAST/motifshortnames=/stringency=High
  - ../proteinscan/identifier=MY_PROTEIN/sequence=RDGVLLCQLLNNLLPHAINLREVNLRPQMSQFLCLKNIRTFLSTCCEKFGLKRSELFEAFDLFDVQDFGKVIYTLSALSWTPIAQNRGIMPFPTEEESVGDEDIYSGLSDQIDDTVEEDEDLYDCVENEEAEGDEIYEDLMRSEPVSMPPKMTEYDKRCCCLREIQQTE/motifclass=MAMMALIAN/motifshortnames=Lck_Kin/stringency=Low
  - ../proteinscan/identifier=MY_PROTEIN/sequence=RDGVLLCQLLNNLLPHAINLREVNLRPQMSQFLCLKNIRTFLSTCCEKFGLKRSELFEAFDLFDVQDFGKVIYTLSALSWTPIAQNRGIMPFPTEEESVGDEDIYSGLSDQIDDTVEEDEDLYDCVENEEAEGDEIYEDLMRSEPVSMPPKMTEYDKRCCCLREIQQTE/motifclass=MAMMALIAN/motifshortnames=Lck_Kin/stringency=Low/referenceproteome=Vertebrata
- Search a sequence database for a motif:
URL:../databasesearch/motifshortname=[M]/dsshortname=[DS]/organismclass=[OC]/speciesrestriction=[ANY]?/numberofphosphorylations=[NP]/molweightfrom=[NUM]?/molweightto=[NUM]?/isoelectricpointfrom=[DEC]?/isoelectricpointto=[DEC]?/keywordrestriction=[ANY]?/sequencerestriction=[ANY]?
Optional parameters:
  - speciesrestriction
  - numberofphosphorylations
  - molweightfrom
  - molweightto
  - isoelectricpointfrom
  - isoelectricpointto
  - keywordrestriction
  - sequencerestriction
Examples:
  - ../databasesearch/motifshortname=Shc_SH2/dsshortname=swissprot/organismclass=Mammals/speciesrestriction=rattus/numberofphosphorylations=1/molweightfrom=5000/molweightto=20000/isoelectricpointfrom=4/keywordrestriction=cell/sequencerestriction=PPP
  - ../databasesearch/motifshortname=Shc_SH2/dsshortname=swissprot/organismclass=Mammals/numberofphosphorylations=0
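Because most databasesearch parameters are optional, a small builder can help keep the segments in the documented order. The sketch below is an illustration only: the function name is hypothetical, and it assumes (as the two examples above suggest) that an unused optional parameter is simply omitted from the URI. It treats only motifshortname, dsshortname and organismclass as mandatory, following the list of optional parameters, and checks the [NP] range of 0 to 3.

```python
# Segment order as given in the URL definition above.
SEGMENT_ORDER = (
    "motifshortname", "dsshortname", "organismclass", "speciesrestriction",
    "numberofphosphorylations", "molweightfrom", "molweightto",
    "isoelectricpointfrom", "isoelectricpointto",
    "keywordrestriction", "sequencerestriction",
)
# Assumption: everything not listed under "Optional parameters" is mandatory.
REQUIRED = ("motifshortname", "dsshortname", "organismclass")


def database_search_uri(**params):
    """Build a '../databasesearch/...' URI, skipping parameters that are None."""
    unknown = set(params) - set(SEGMENT_ORDER)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    missing = [name for name in REQUIRED if params.get(name) is None]
    if missing:
        raise ValueError(f"missing mandatory parameters: {missing}")
    phospho = params.get("numberofphosphorylations")
    if phospho is not None and phospho not in range(0, 4):
        raise ValueError("numberofphosphorylations must be an integer in [0-3]")
    segments = [f"{name}={params[name]}" for name in SEGMENT_ORDER
                if params.get(name) is not None]
    return "../databasesearch/" + "/".join(segments)


print(database_search_uri(motifshortname="Shc_SH2", dsshortname="swissprot",
                          organismclass="Mammals", numberofphosphorylations=0))
```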
- Search a protein database for a sequence pattern:
URL:../sequencematch/sequencematchregex=[ANY]/dsshortname=[DS]/organismclass=[OC]/speciesrestriction=[ANY]?/numberofphosphorylations=[NP]/molweightfrom=[NUM]?/molweightto=[NUM]?/isoelectricpointfrom=[DEC]?/isoelectricpointto=[DEC]?/keywordrestriction=[ANY]?
Examples:
  - ../sequencematch/sequencematchregex=A+VCA/dsshortname=swissprot/organismclass=Mammals/speciesrestriction=human/numberofphosphorylations=0/keywordrestriction=cell
  - ../sequencematch/sequencematchregex=A+VCA/dsshortname=swissprot/organismclass=Mammals/speciesrestriction=human/numberofphosphorylations=0/molweightfrom=2000/molweightto=9000/isoelectricpointfrom=2.2/isoelectricpointto=6
- Scan a protein for evolutionarily conserved phosphorylation sites:
URL:../scanorthologs/identifier=[P]/dsshortname=[DS]/orthologydsshortname=[DS]/alignmentradius=[10|20|40|80]/stringency=[ST]/motifgroup=[motifgroup short name]/siteposition=[NUM]
Example:
  - ../scanorthologs/identifier=BRCA2_HUMAN/dsshortname=swissprot/orthologydsshortname=swissprotorthology/alignmentradius=40/stringency=High/motifgroup=Acid_ST_kin/siteposition=306
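The scanorthologs URI restricts alignmentradius to four fixed values, which is easy to check before sending a request. The following is a hedged sketch with a hypothetical function name; it follows the segment order of the example above and does not validate the other parameters.

```python
ALLOWED_RADII = (10, 20, 40, 80)


def scan_orthologs_uri(identifier, ds_shortname, orthology_ds_shortname,
                       alignment_radius, stringency, motif_group, site_position):
    """Build a '../scanorthologs/...' URI after checking the alignment radius."""
    if alignment_radius not in ALLOWED_RADII:
        raise ValueError(f"alignmentradius must be one of {ALLOWED_RADII}")
    segments = [
        ("identifier", identifier),
        ("dsshortname", ds_shortname),
        ("orthologydsshortname", orthology_ds_shortname),
        ("alignmentradius", alignment_radius),
        ("stringency", stringency),
        ("motifgroup", motif_group),
        ("siteposition", site_position),
    ]
    return "../scanorthologs/" + "/".join(f"{k}={v}" for k, v in segments)


print(scan_orthologs_uri("BRCA2_HUMAN", "swissprot", "swissprotorthology",
                         40, "High", "Acid_ST_kin", 306))
```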
- Predict cellular localization of a protein:
URL:../predictlocation/localizationdsshortname=[DS]/identifier=[P]/dsshortname=[DS]
Example:../predictlocation/localizationdsshortname=loctree/identifier=BRCA2_HUMAN/dsshortname=swissprot
6. Errors
If you do not get the expected result, the problem is usually reported in one of two ways:
- 404 Error Page: This happens if the URI you entered does not match the expected pattern. Make sure that your URI conforms to one of the patterns described above.
- XML document with error tag: You will see a document like this if something went wrong while processing your query string. Usually the error tag encloses a message that describes the problem (for example, invalid input parameters).
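A client can distinguish these two cases before using a response. The sketch below is illustrative: the function name is hypothetical, and it assumes the error element is literally named "error", which this page does not specify, so adjust the tag name to what the service actually returns.

```python
import xml.etree.ElementTree as ET


def check_response(status_code: int, body: str) -> ET.Element:
    """Return the parsed XML root, or raise if the response signals a problem."""
    if status_code == 404:
        # The URI did not match any of the documented patterns.
        raise ValueError("404: the URI does not match a service pattern")
    root = ET.fromstring(body)
    # Assumption: the error element is called 'error' (name not confirmed here).
    error = root if root.tag == "error" else root.find(".//error")
    if error is not None:
        raise RuntimeError((error.text or "").strip() or "unspecified error")
    return root
```

Catching ValueError points to a malformed URI, while RuntimeError carries the message the service enclosed in the error tag.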
If you are sure you did everything right and your query still does not work, please let us know! Just email us. Please include what you wanted to do, what URI you used, and what happened.