Gene-select compares the genotypes of an affected person or group of persons with the genotypes of healthy individuals from the 1000 Genomes Project. The variation patterns of each gene are compared by the maximum likelihood method, and a significance value is calculated with a permutation test. Genes that differ significantly between the affected and unaffected populations are output with brief descriptions, sorted by P-value. For each output gene, the variations that contribute to the difference are listed.
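The general idea of a permutation test can be sketched as follows. This is only an illustration, not Gene-select's actual implementation: the statistic Gene-select uses is not described here, so the sketch substitutes a simple difference in means for the maximum likelihood comparison. The function name `permutation_p_value` and the per-person scores are hypothetical.

```python
import random


def permutation_p_value(cases, controls, n_permutations=999, seed=0):
    """One-sided permutation test on the difference in group means.

    cases, controls: per-person numeric scores (hypothetical, e.g. a count
    of variants in one gene). The difference in means is a stand-in
    statistic chosen for illustration only.
    """
    rng = random.Random(seed)

    def mean_difference(a, b):
        return sum(a) / len(a) - sum(b) / len(b)

    observed = mean_difference(cases, controls)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)

    hits = 0
    for _ in range(n_permutations):
        # Reassign the "affected" label at random by shuffling the pool.
        rng.shuffle(pooled)
        permuted = mean_difference(pooled[:n_cases], pooled[n_cases:])
        if permuted >= observed:
            hits += 1

    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    affected = [1, 1, 1, 1, 1]
    unaffected = [0] * 50
    print(permutation_p_value(affected, unaffected))
```

With this estimator the smallest attainable P-value is 1 / (n_permutations + 1), so the number of permutations bounds the resolution of the result.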
The user can specify the inheritance mode: dominant or recessive. In recessive mode, a gene is reported as different between the affected and unaffected populations only if it carries at least two heterozygous variants or at least one homozygous variant. In dominant mode, one heterozygous variant is enough.
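The inheritance rule can be written as a small predicate. This is a minimal sketch, not the tool's own code: the function name `gene_passes` is hypothetical, the labels `hetero` and `homo` follow the sample input in this document, and the sketch assumes that a homozygous variant also satisfies dominant mode.

```python
def gene_passes(zygosities, mode):
    """Return True if a gene's variants satisfy the inheritance mode.

    zygosities: list of "hetero" / "homo" labels, one per qualifying
                variant found in the gene.
    mode: "dominant" or "recessive".
    """
    n_hetero = zygosities.count("hetero")
    n_homo = zygosities.count("homo")

    if mode == "dominant":
        # One variant of either zygosity is enough (assumption for "homo").
        return n_hetero + n_homo >= 1
    if mode == "recessive":
        # Two heterozygous variants, or a single homozygous one.
        return n_hetero >= 2 or n_homo >= 1
    raise ValueError(f"unknown inheritance mode: {mode!r}")


if __name__ == "__main__":
    for variants in (["hetero", "hetero"], ["hetero"], ["homo"]):
        print(variants,
              gene_passes(variants, "dominant"),
              gene_passes(variants, "recessive"))
```

A gene with a single heterozygous variant passes in dominant mode but is rejected in recessive mode, which is the only case where the two modes disagree.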
Only missense and nonsense SNPs are processed in the current version. Non-point polymorphisms and any variations in non-coding regions are ignored.
All these formats are tab-delimited, but any sequence of tabs or spaces is treated as a column separator. The reference nucleotide for a variation is taken from the reference genome sequence. Please note that all positions must correspond to the GRCh37/hg19 genome assembly and that base numbering starts from 1.
ANNOVAR and 23andMe files can store the genotype of only one person; a VCF file can contain genotype information for one or several persons.
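Splitting on any run of tabs or spaces can be sketched as below. The six-column layout (chromosome, start, end, reference allele, alternate allele, zygosity) is an assumption based on the sample input in this document, and the function name `parse_variant_line` and the dictionary keys are hypothetical.

```python
def parse_variant_line(line):
    """Parse one whitespace-separated variant record into a dict.

    Assumed columns: chromosome, start, end, reference allele,
    alternate allele, zygosity ("hetero" or "homo").
    """
    # str.split() with no argument treats any run of tabs or spaces
    # as a single separator and ignores leading/trailing whitespace.
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}: {line!r}")

    chrom, start, end, ref, alt, zygosity = fields[:6]
    start, end = int(start), int(end)
    if start < 1 or end < start:
        # Positions are 1-based, so 0 or negative values are invalid.
        raise ValueError(f"invalid 1-based position in line: {line!r}")

    return {
        "chrom": chrom,
        "start": start,
        "end": end,
        "ref": ref,
        "alt": alt,
        "zygosity": zygosity,
    }


if __name__ == "__main__":
    print(parse_variant_line("1\t9323916  9323916 G\tA hetero"))
```

Because the separator is any whitespace run, a file that mixes tabs and spaces parses the same way as a strictly tab-delimited one.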
This is a short example of input data:
1 9323916 9323916 G A hetero
1 9323991 9323991 T C hetero
1 9009352 9009352 A T hetero
1 9640291 9640291 T A hetero
1 11087677 11087677 G A homo
1 11766425 11766425 C A homo
Output with dominant inheritance mode:
P-value = 9.139049e-04
Gene ID: uc001apt.3
Description: Homo sapiens hexose-6-phosphate dehydrogenase (glucose 1-dehydrogenase) (H6PD), mRNA.
Type of gene: Protein coding
Clinical significance: polycystic ovary syndrome; Defects in H6PD are a cause of cortisone reductase deficiency (CRD);
Key variations:
chromosome 1, position 9323916
chromosome 1, position 9323991

P-value = 9.139049e-04
Gene ID: uc001apw.3
Description: Homo sapiens solute carrier family 25, member 33 (SLC25A33), nuclear gene encoding mitochondrial protein, mRNA.
Type of gene: Protein coding
Clinical significance: unknown
Key variations:
chromosome 1, position 9640291

P-value = 9.139049e-04
Gene ID: uc001asr.1
Description: Homo sapiens chromosome 1 open reading frame 187 (C1orf187), mRNA.
Type of gene: Protein coding
Clinical significance: unknown
Key variations:
chromosome 1, position 11766425
Output with recessive inheritance mode:
P-value = 9.139049e-04
Gene ID: uc001apt.3
Description: Homo sapiens hexose-6-phosphate dehydrogenase (glucose 1-dehydrogenase) (H6PD), mRNA.
Type of gene: Protein coding
Clinical significance: polycystic ovary syndrome; Defects in H6PD are a cause of cortisone reductase deficiency (CRD);
Key variations:
chromosome 1, position 9323916
chromosome 1, position 9323991

P-value = 9.139049e-04
Gene ID: uc001asr.1
Description: Homo sapiens chromosome 1 open reading frame 187 (C1orf187), mRNA.
Type of gene: Protein coding
Clinical significance: unknown
Key variations:
chromosome 1, position 11766425