Graph-Triplets: Detecting (Non-Admixed) Population Substructure
This webserver classifies individuals into two sub-populations and can also report the statistical significance (P-value) of the resulting clusters.


Input Format: The webserver uses the same input format as the software package STRUCTURE. Each row holds the genotype information for one individual: it begins with an individual ID, followed by a space-delimited binary representation of the genotype. A homozygous site is represented as 0 0 or 1 1, and a heterozygous site as 0 1 or 1 0. Missing data is represented by the digit 9. Therefore, if there are n individuals and m SNPs, the input should consist of n rows and 2m+1 columns. The first column is alphanumeric and every subsequent column is binary, except where it holds the missing-data code 9. For instance, an individual with ID ind1 typed at three SNPs could be written as ind1 0 0 0 1 1 1. Note that because of the webserver's buffer size restrictions, very large inputs may exceed the buffer limit, in which case the program may not be executed. If this happens, please send us an email and we will send you the executable.
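Before uploading, it can help to check that a file matches the layout described above. The following Python sketch is not part of the webserver; the function name parse_genotype_rows is hypothetical, and it assumes that a missing site is coded as 9 in each of its two allele columns and that blank lines can be ignored.

```python
def parse_genotype_rows(text):
    """Parse rows of the form '<id> a1 a2 a1 a2 ...'.

    Returns a list of (individual_id, [(allele1, allele2), ...]) tuples.
    Raises ValueError if a row does not have 2m+1 columns or contains
    a value other than 0, 1, or 9.
    """
    allowed = {"0", "1", "9"}  # assumption: 9 marks a missing allele
    individuals = []
    expected_columns = None

    for line_no, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are skipped

        ind_id, alleles = fields[0], fields[1:]

        # Two allele columns per SNP, so the count must be even and non-zero.
        if len(alleles) == 0 or len(alleles) % 2 != 0:
            raise ValueError(f"line {line_no}: expected 2m allele columns")

        # Every row must have the same number of columns (2m+1).
        if expected_columns is None:
            expected_columns = len(fields)
        elif len(fields) != expected_columns:
            raise ValueError(f"line {line_no}: inconsistent column count")

        bad = [a for a in alleles if a not in allowed]
        if bad:
            raise ValueError(f"line {line_no}: unexpected values {bad}")

        # Pair consecutive columns into one (allele1, allele2) tuple per SNP.
        sites = [
            (int(alleles[i]), int(alleles[i + 1]))
            for i in range(0, len(alleles), 2)
        ]
        individuals.append((ind_id, sites))

    return individuals


# Two made-up individuals typed at three SNPs (illustrative data only).
sample = "ind1 0 0 0 1 1 1\nind2 1 1 9 9 0 1\n"
rows = parse_genotype_rows(sample)
```

Here rows holds one entry per individual, so len(rows) gives n and the length of each site list gives m.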

Output Format: The output consists of a set of lines, each of which contains an individual ID followed by a label, either 0 or 1. Two individuals with the same label belong to the same sub-population, and individuals with different labels belong to different sub-populations. If the P-value option is used, the webserver also prints the significance of the clustering.
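As a rough illustration of how the labelled output might be consumed, the Python sketch below groups individual IDs by label. The function name group_by_label is hypothetical. Because the exact layout of the significance line is not documented here, the sketch simply skips any line that is not an ID followed by 0 or 1; that behaviour is an assumption and may need adjusting against real output.

```python
def group_by_label(output_text):
    """Group individual IDs by their sub-population label (0 or 1)."""
    groups = {0: [], 1: []}
    for line in output_text.splitlines():
        fields = line.split()
        # Assumption: anything that is not '<id> <0|1>' (for example a
        # significance line) is ignored rather than treated as an error.
        if len(fields) != 2 or fields[1] not in ("0", "1"):
            continue
        groups[int(fields[1])].append(fields[0])
    return groups


# Made-up output for three individuals (illustrative data only).
example_output = "ind1 0\nind2 1\nind3 0\n"
groups = group_by_label(example_output)
```

In this example ind1 and ind3 share a label and so fall in the same sub-population, while ind2 falls in the other.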


Please email us if you encounter bugs or have suggestions.


This material is based upon work supported by the National Science Foundation under Grant No. 0612099. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
