Research Article

Multiple sequence alignment quality comparison in T-Coffee, MUSCLE and M-Coffee based on different benchmarks

Volume: 42 Number: 3 September 24, 2021

Abstract

Multiple sequence alignment (MSA) is a fundamental step in studies that determine the evolutionary, structural and functional relationships of biological sequences or organisms. Various heuristic approaches compare more than two sequences to generate an MSA. However, not every MSA tool is suitable for every dataset. Given the importance of MSA in a wide range of relationship studies, we compared the performance of different MSA tools on various datasets. In this study, we applied three MSA tools, T-Coffee, MUSCLE and M-Coffee, to datasets from five benchmarks: BAliBase, SABmark, DIRMBASE, ProteinBali and DNABali. We aimed to evaluate differences in the performance of these tools on the stated benchmarks in terms of % consistency, sum of pairs (SP) score and column score (CS), computed with Suite MSA. We also calculated the average of each score for each tool to compare the results. We conclude that all three tools performed best on the datasets from ProteinBali (average % consistency: 29.6, 32.3, 29.7; SP: 0.74, 0.73, 0.74; CS with gaps: 0.27, 0.27, 0.26 for T-Coffee, MUSCLE and M-Coffee, respectively), whereas the lowest performance was obtained on the datasets from DIRMBASE (average % consistency: 1.8, 1.1, 4.3; SP: 0.05, 0.04, 0.04; CS with gaps: 0.01, 0, 0.008 for T-Coffee, MUSCLE and M-Coffee, respectively).
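
To make the SP and CS measures named in the abstract concrete, the following is a minimal Python sketch. It assumes the commonly used benchmark-style definitions: SP is the fraction of residue pairs aligned in the reference that are also aligned in the test alignment, and CS is the fraction of reference columns (gap-containing columns included) reproduced exactly in the test alignment. The function names and the toy alignments are hypothetical, and the exact computation performed by Suite MSA may differ.

```python
from itertools import combinations


def residue_columns(alignment):
    """Describe each column as a tuple of per-sequence residue indices (None = gap)."""
    counters = [0] * len(alignment)
    columns = []
    for col in zip(*alignment):
        entry = []
        for i, ch in enumerate(col):
            if ch == "-":
                entry.append(None)
            else:
                entry.append(counters[i])
                counters[i] += 1
        columns.append(tuple(entry))
    return columns


def aligned_pairs(columns):
    """Collect every pair of residues that share a column."""
    pairs = set()
    for col in columns:
        for i, j in combinations(range(len(col)), 2):
            if col[i] is not None and col[j] is not None:
                pairs.add((i, col[i], j, col[j]))
    return pairs


def sp_score(test, reference):
    """Assumed definition: share of reference residue pairs recovered by the test alignment."""
    ref_pairs = aligned_pairs(residue_columns(reference))
    test_pairs = aligned_pairs(residue_columns(test))
    return len(ref_pairs & test_pairs) / len(ref_pairs) if ref_pairs else 0.0


def column_score(test, reference):
    """Assumed definition: share of reference columns reproduced exactly, gaps included."""
    ref_cols = residue_columns(reference)
    test_cols = set(residue_columns(test))
    matched = sum(1 for col in ref_cols if col in test_cols)
    return matched / len(ref_cols) if ref_cols else 0.0


# Toy alignments of the same three sequences (illustrative data, not from the study).
reference = ["ACGT-", "A-GTC", "ACG-C"]
test = ["ACGT-", "AG-TC", "ACG-C"]

print(round(sp_score(test, reference), 3))      # 7 of 9 reference pairs recovered
print(round(column_score(test, reference), 3))  # 3 of 5 reference columns recovered
```

Averaging these per-alignment values over all alignments of a benchmark gives per-tool averages of the kind reported in the abstract.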

Thanks

The authors acknowledge Prof. Dr. Jens Allmer for his guidance in the conceptualization of this study.

Details

Primary Language

English

Subjects

Structural Biology

Journal Section

Research Article

Publication Date

September 24, 2021

Submission Date

December 17, 2020

Acceptance Date

August 29, 2021

Published in Issue

Year 2021 Volume: 42 Number: 3

APA
Korak, T., Aşır, F., Işık, E., & Cengiz, N. (2021). Multiple sequence alignment quality comparison in T-Coffee, MUSCLE and M-Coffee based on different benchmarks. Cumhuriyet Science Journal, 42(3), 526-535. https://doi.org/10.17776/csj.842265
