To perform gene clustering, click the Analysis>Clustering>Find gene cluster menu command. The "setup for clustering procedure" wizard will guide you through the entire process of specifying the gene clustering parameters.
In the first wizard's dialog box (Fig. 3.13.1), set the following clustering parameters:
3.13.1. The "Fields..." button calls the "Field selection" dialog box (see section 3.3 about the "Field selection" dialog box). The text field displays the number of selected fields.
3.13.2. The "Distance type" list includes three types of distance measures calculated from the Rij correlation coefficients:
3.13.3. The "Correlation type" list contains the following three correlation coefficients:
3.13.4. The "Distance threshold" input field. Genes are fused to form a single cluster if the distance between them is smaller than the specified threshold value.
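The effect of the distance threshold can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the manual does not say how the program fuses genes internally, so the sketch uses a simple single-linkage reading (any two genes closer than the threshold end up in the same cluster). The function name threshold_clusters, the gene names, and the distance matrix are hypothetical.

```python
def threshold_clusters(dist, threshold):
    """Group items whose pairwise distance is strictly below `threshold`.

    Assumption: single-linkage fusion via union-find. `dist` is a
    symmetric matrix (list of lists) of pairwise distances.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_j] = root_i

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical example: four genes and their pairwise distances.
genes = ["geneA", "geneB", "geneC", "geneD"]
distances = [
    [0.0, 0.1, 0.9, 0.8],
    [0.1, 0.0, 0.85, 0.9],
    [0.9, 0.85, 0.0, 0.2],
    [0.8, 0.9, 0.2, 0.0],
]
for cluster in threshold_clusters(distances, threshold=0.3):
    print([genes[i] for i in cluster])
# ['geneA', 'geneB']
# ['geneC', 'geneD']
```

A smaller threshold yields more, tighter clusters; a larger one merges everything into a single cluster.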
To proceed, click "Next"; to cancel the operation, click "Cancel".
After you click the "Next" button, the program starts a clustering procedure. The clustering results are shown in the second wizard's box.
Figure 3.13.1.
1. The "Fields..." button that calls the "Field selection" dialog box. 2. List of distance types. 3. List of correlation coefficients. 4. The "Distance threshold" input field. 5. The "Next" button that takes you to the next wizard's box. 6. The "Cancel" button that cancels the calculations.
In the second wizard's box (Fig. 3.13.2), you can process clustering results:
3.13.5. The "Cluster #, size, score" list displays the number, size, and score of the clusters obtained.
3.13.6. The "Gene NAME, cluster index, gene score" list shows the genes that belong to the clusters selected in list 3.13.5. For each gene, the list gives its name, cluster index, and score.
3.13.7. The "Sort clusters by:" list offers several choices of how to sort the clusters obtained:
3.13.8. The "Find gene by name" field allows you to search for a gene in list 3.13.6 by its name. To search for a specific gene, enter the gene name and click "Find". If the gene is in the list, the program highlights it in a different color. If the list contains no such gene, the "Not Clustered!" message appears on your screen.
3.13.9. The "Add cluster info for current data" checkbox. If this box is checked, after the wizard is closed, the clustering results will be recorded into the four new fields titled as follows:
where TXT is a text string you entered into the "Fields name" field.
By default, each time you open the dialog box, TXT is set to Cl#, where # is the sequence number of the dialog box launch. If fields with this name already exist, the program updates them; otherwise, it creates new fields.
To close the wizard and start clustering, click "Finish". To return to the previous wizard's box, click "Back"; to cancel the operation, click "Cancel".
Figure 3.13.2.
1. List of clusters obtained. 2. List of genes included in the selected cluster. 3. List of parameters for sorting clusters. 4. The gene name field. 5. The "Find" button that starts searching for specific genes. 6. The checkbox that allows you to add the clustering results to the initial table. 7. The field for new field names. 8. The "Back" button that takes you to the previous wizard's box. 9. The "Finish" button that closes the wizard and starts the clustering procedure. 10. The "Cancel" button.
Note. If no fields were selected, the "Error" information box will appear on your screen (Fig. 3.13.3).
Figure 3.13.3 "Error" information box.
While the calculation is in progress, you will see the "Please wait" information box (Fig. 3.13.4). This information box disappears when the calculation is complete.
Figure 3.13.4 "Please wait" information box.
To cluster genes by the Ben-Dor algorithm, click Analysis>Clustering>Find gene cluster (Ben-Dor algorithm); to cluster fields by this algorithm, click Analysis>Clustering>Find field cluster (Ben-Dor algorithm). A wizard will help you specify the clustering parameters.
For both genes and fields, the clustering setup procedure is the same.
The first wizard's box (Fig. 3.13.1.1) allows you to set the following clustering parameters:
3.13.1. The "Fields..." button calls the "Field selection" dialog box (see section 3.3 about the "Field selection" dialog box). The text field displays the number of selected fields.
3.13.2. The "Distance type" list contains the following seven types of distances:
Note. The 1-Rij, 1+Rij, and 1-|Rij| distances are calculated by using one of the correlation coefficients (Pearson, Spearman, or Kendall).
3.13.3. The "Correlation setup" list contains three types of correlation coefficients:
3.13.4. The "Distance threshold" drop-down list contains the following options:
To proceed with the wizard, click "Next"; to cancel the operation, click "Cancel".
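The note on the 1-Rij, 1+Rij, and 1-|Rij| distances can be illustrated with a short Python sketch. It assumes that Rij is the correlation coefficient between the value profiles of items i and j, computed with standard textbook definitions of the Pearson, Spearman, and Kendall coefficients; the program's exact formulas (for example, its handling of ties) are not given in this manual. All function names and the sample profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant pairs) / all pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)  # tied pairs contribute 0
    return score / (n * (n - 1) / 2)


# The three correlation-based distances named in the note.
DISTANCES = {
    "1-Rij": lambda r: 1 - r,
    "1+Rij": lambda r: 1 + r,
    "1-|Rij|": lambda r: 1 - abs(r),
}

# Hypothetical profiles: gene_b rises with gene_a, gene_c falls.
gene_a = [1, 2, 3, 4, 5]
gene_b = [2, 4, 6, 8, 10]
gene_c = [5, 4, 3, 2, 1]

for name, dist in DISTANCES.items():
    print(name,
          round(dist(pearson(gene_a, gene_b)), 6),
          round(dist(pearson(gene_a, gene_c)), 6))
```

With 1-Rij, positively correlated profiles are close and anti-correlated profiles are far apart; 1+Rij reverses this; 1-|Rij| treats strong correlation of either sign as closeness.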
The second wizard's box has the same fields as described in section 3.13.
Figure 3.13.1.1.
1. The "Fields..." button that calls the "Field selection" dialog box. 2. List of distance types. 3. List of correlation coefficients. 4. Distance threshold type. 5. The "Threshold value" input field. 6. The "Next" button that takes you to the next wizard's box. 7. The "Cancel" button that cancels the calculations.
To cluster genes by the SOM (Self-Organizing Map) algorithm, click Analysis>Clustering>Find gene cluster (SOM algorithm); to cluster fields by this algorithm, click Analysis>Clustering>Find field cluster (SOM algorithm). A wizard will help you specify the clustering parameters.
For both genes and fields, the clustering setup procedure is the same.
The first wizard's box (Fig. 3.13.2.1) allows you to set the following clustering parameters:
3.13.1. The "Fields..." button calls the "Field selection" dialog box (see section 3.3 about the "Field selection" dialog box). The text field displays the number of selected fields.
3.13.2. The "Distance type" list contains the following four types of distance measures:
3.13.3. In the "Max iterations" field, specify the maximum number of iterations. The recommended value is 3000-4000.
3.13.4. In the "Grid topology" pane, specify the following grid options:
To proceed, click "Next"; to cancel the operation, click "Cancel".
The second wizard's box has the same fields as described in section 3.13.
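To show how the SOM parameters above interact (maximum number of iterations and the number of grid rows and columns), here is a minimal Python sketch of a generic self-organizing map. It is not the program's implementation: the Euclidean distance, the linearly decaying learning rate and neighborhood radius, the initialization from random data rows, and all function names are assumptions made for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid node (row, col) whose weight vector is nearest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def train_som(data, rows, cols, max_iterations, seed=0):
    """Train a rows x cols map on `data` (a list of equal-length profiles)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Assumption: each node starts as a copy of a randomly chosen data row.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    start_radius = max(rows, cols) / 2
    for t in range(max_iterations):
        progress = t / max_iterations
        learning_rate = 0.5 * (1 - progress)               # decays to 0
        radius = max(start_radius * (1 - progress), 0.5)   # shrinks over time
        x = rng.choice(data)
        bmu = best_matching_unit(weights, x)
        for node, w in weights.items():
            grid_distance = math.dist(node, bmu)
            if grid_distance <= radius:
                # Nodes near the winner on the grid are pulled toward x.
                influence = math.exp(-grid_distance ** 2 / (2 * radius ** 2))
                for k in range(dim):
                    w[k] += learning_rate * influence * (x[k] - w[k])
    return weights


def assign_clusters(data, weights):
    """Map every profile to its grid node; each node acts as one cluster."""
    return [best_matching_unit(weights, x) for x in data]


# Hypothetical data: two well-separated groups on a 1 x 2 grid.
profiles = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
            [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]]
som = train_som(profiles, rows=1, cols=2, max_iterations=500)
print(assign_clusters(profiles, som))
```

In this reading, the grid size fixes the maximum number of clusters (rows times columns), and the iteration count controls how long the node weights are refined.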
Figure 3.13.2.1.
1. The "Fields..." button that calls the "Field selection" dialog box. 2. List of distance types. 3. The "Max iterations" field. 4. The field that specifies the number of rows. 5. The field that specifies the number of columns. 6. The "Next" button that takes you to the next wizard's box. 7. The "Cancel" button that cancels the calculations.