This server provides semantic links to variant analysis, annotations, the multiple sequence alignment HTML page for a variant, and the 3D structure page for a variant.
The server accepts a list of variants, one variant per line, plus optional text describing each variant,
in genomic coordinates ("+" strand assumed):
<genome build>,<chromosome>,<position>,<reference allele>,<substituted allele>
Genome build is optional (hg19 is assumed if omitted); accepted values: 'hg19' and 'hg38'
or in protein space:
<protein ID> <variant> <text>,
where <protein ID> can be:
- Uniprot protein accession (e.g. EGFR_HUMAN)
- NCBI Refseq protein ID (e.g. NP_005219)
EGFR_HUMAN R98Q Polymorphism
EGFR_HUMAN G719D disease
NP_000537 G360A dbSNP:rs35993958
NP_000537 S46A Abolishes phosphorylation
ID types can be mixed in one list in any way.
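The two input formats described above can be told apart line by line. The following is a minimal sketch, not the server's own parser: the class and function names are hypothetical, and it assumes that the optional text follows the variant after whitespace in both formats and that protein variants are single-residue substitutions such as R98Q.

```python
import re
from dataclasses import dataclass


@dataclass
class GenomicVariant:
    build: str
    chromosome: str
    position: int
    ref: str
    alt: str
    text: str = ""


@dataclass
class ProteinVariant:
    protein_id: str
    ref: str
    position: int
    alt: str
    text: str = ""


# Assumption: a protein variant is <ref residue><position><new residue>, e.g. R98Q.
PROTEIN_CHANGE = re.compile(r"^([A-Z])(\d+)([A-Z])$")
KNOWN_BUILDS = ("hg19", "hg38")


def parse_variant_line(line):
    """Parse one input line into a GenomicVariant or a ProteinVariant."""
    line = line.strip()
    if not line:
        raise ValueError("empty line")
    head = line.split(None, 1)[0]
    if "," in head:
        # Genomic form: [<build>,]<chromosome>,<position>,<ref>,<alt>
        rest = line.split(None, 1)
        text = rest[1] if len(rest) > 1 else ""
        parts = head.split(",")
        if len(parts) == 4:
            parts = ["hg19"] + parts  # build omitted, hg19 assumed
        if len(parts) != 5 or parts[0] not in KNOWN_BUILDS:
            raise ValueError("bad genomic variant: " + head)
        build, chrom, pos, ref, alt = parts
        return GenomicVariant(build, chrom, int(pos), ref, alt, text)
    # Protein form: <protein ID> <variant> <text>
    tokens = line.split(None, 2)
    if len(tokens) < 2:
        raise ValueError("missing variant: " + line)
    match = PROTEIN_CHANGE.match(tokens[1])
    if match is None:
        raise ValueError("bad protein variant: " + tokens[1])
    text = tokens[2] if len(tokens) > 2 else ""
    ref, pos, alt = match.groups()
    return ProteinVariant(tokens[0], ref, int(pos), alt, text)
```

For example, parse_variant_line("NP_000537 S46A Abolishes phosphorylation") yields a ProteinVariant with position 46, while a line such as "7,100,G,A" (made-up coordinates) yields a GenomicVariant with build 'hg19'.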
The server maps each variant to both Uniprot and Refseq protein sequences (if possible).
If the reference residue in the Uniprot protein sequence is
different from the one indicated in your variant, the analysis will not be performed.
For non-human variants, please use Uniprot IDs, as mapping to Refseq is not supported.
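The reference-residue rule above amounts to a simple comparison, which can be checked before submitting a list. The following is a rough sketch only: the function name is hypothetical, the sequence is a toy string rather than a real protein, and positions are assumed to be 1-based.

```python
def reference_matches(sequence, ref, position):
    """Return True if residue `ref` is found at 1-based `position` in `sequence`."""
    if position < 1 or position > len(sequence):
        return False  # position falls outside the protein
    return sequence[position - 1] == ref


# Toy sequence for illustration only, not a real protein.
toy_sequence = "MKRQ"
print(reference_matches(toy_sequence, "R", 3))   # residue 3 is R
print(reference_matches(toy_sequence, "G", 3))   # mismatch: such a variant would not be analysed
```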
Uniprot IDs are used to extract information about domain boundaries (Pfam, Uniprot), annotated functional regions (Uniprot),
and protein-protein interactions (Piana). Refseq protein IDs are used to extract known alterations in cancer (COSMIC),
SNPs (dbSNP), and the gene's known role in cancer (CancerGenes).
The server determines domain boundaries (using Pfam or Uniprot) for the region containing the variant and builds a multiple
sequence alignment using all Uniprot protein sequences, or uses an existing one from the repository. To obtain the list
of existing alignments in the repository for a given protein, please see the WEBAPI section below.
For each variant the server provides the following annotations (this description is also available as a tooltip in the main table):
|Mutations as given by the user
|Variant based on reference genome (for variants submitted in genomic coordinates)
|RG variant type
|Variant type based on reference genome: missense, silent, stop loss, nonsense (for variants submitted in genomic coordinates)
|Optional user data
|Link to multiple sequence alignment browser
|Link to 3D structure browser
|Functional impact of a variant: predicted functional (high, medium), predicted non-functional (low, neutral). Please see the paper for details.
|Functional impact combined score
|Variant conservation score
|Variant specificity score
|Issue with variant/protein mapping
|Chromosomal location of a gene
|Uniprot protein accession ID
|Refseq protein ID
|gaps in MSA
|Fraction of gaps at the variant position in the multiple sequence alignment
|Number of diverse sequences in multiple sequence alignment (identical or highly similar sequences filtered out)
|Codon start position
|Start of a codon
|Variant position in Uniprot protein, can be different from the one in Refseq
|Reference residue in Uniprot protein, can be different from the one in Refseq
|Variant position in Refseq protein, can be different from the one in Uniprot
|Reference residue in Refseq protein, can be different from the one in Uniprot
|Variant position is within region annotated by Uniprot as one of the following: ACT_SITE, BINDING, CARBOHYD, CA_BIND, CROSSLNK, DISULFID, DNA_BIND, METAL, MOD_RES, MOTIF, NON_STD, NP_BIND, SITE, ZN_FING
|Number of mutations in COSMIC for this protein
|Number of SNPs in dbSNP for this protein
|Variant position maps to PDB residue which is in a binding site with another protein
|Variant position maps to PDB residue which is in a binding site with DNA/RNA molecule
|Variant position maps to a PDB residue that is in a binding site with a small molecule. Only the first 4 are shown in the main table - browse through mapped PDB structures to see all small molecules. The following small molecules are ignored: PO4,PI,SO4,SUL,CL,BR,NO3,SCN,NH4,K,NA,LI,MG,DOD,NAG,MAN,GOL,SO4,CL,CO3,FS4 (source:Polyphen)
|COSMIC alterations in Refseq ±1 position
|SNPs from dbSNP in Refseq ±1 position
|gene's known role in cancer
|Gene annotations by CancerGenes database
|Known functional regions annotated by Uniprot at the variant position
|Nearby Pfam domains in Uniprot position
|All Pfam domains in a protein
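The functional impact categories listed in the table above fall into two groups: high and medium are predicted functional, low and neutral are predicted non-functional. When post-processing results, this grouping can be applied as in the short sketch below. The function name is hypothetical, and the sketch assumes the category is available as a plain text label; it does not reproduce how the server computes the score.

```python
# Grouping taken from the annotation description: high/medium vs. low/neutral.
FUNCTIONAL = {"high", "medium"}
NON_FUNCTIONAL = {"low", "neutral"}


def impact_group(label):
    """Map a functional impact category to its predicted group."""
    key = label.strip().lower()
    if key in FUNCTIONAL:
        return "predicted functional"
    if key in NON_FUNCTIONAL:
        return "predicted non-functional"
    raise ValueError("unknown functional impact category: " + label)


for category in ("high", "medium", "low", "neutral"):
    print(category, "->", impact_group(category))
```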