Over the last few weeks I have been working on understanding the COVID-19 genome. As a computer scientist, I worked out a way to compare genome sequences, and this post is the result of that comparison.
It needs far more understanding than I have, and I am sharing this unfinished work so that anyone interested in the topic can lend a hand with understanding genomic similarities, a field that is not my expertise.
First of all, let me explain the idea. It originally came from a Kaggle call to arms: we want to understand how the virus evolved to infect humans, and the best place to start is analysing the genomic sequences.
I started with the genomes published in GenBank and decided on the following strategy:
1. Download the FASTA files found on GenBank of:
- COVID-19 from Wuhan
- SARS found in bats
- COVID-19 reported in Spain
- COVID-19 reported in Italy
- COVID-19 reported in USA
- COVID-19 reported in Brazil
- HIV (1991)
- Ebola virus
2. Perform a Levenshtein similarity analysis between the Wuhan COVID-19 genome and each of the others.
3. Perform a sequence-to-sequence analysis of the Wuhan COVID-19, HIV, and Ebola genomes.
5. Ideas? I would be HUGELY GRATEFUL if a MICROBIOLOGIST or a GENOMICS expert took these ideas further!
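Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration and not the exact code behind the comparison: the helper names (`read_fasta`, `levenshtein`, `similarity`) and the toy sequences are assumptions made for the example, and the similarity score is assumed to be one minus the edit distance divided by the length of the longer sequence.

```python
from io import StringIO


def read_fasta(handle):
    """Parse FASTA text from an open handle into {header: sequence}."""
    records = {}
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: store the previous one, if any.
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)
    return records


def levenshtein(a, b):
    """Edit distance (insert, delete, substitute), keeping two rows in memory."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Assumed normalisation: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Toy data standing in for real GenBank downloads (hypothetical records).
toy = StringIO(">ref\nACGTAC\nGT\n>variant\nACGTTCGT\n")
seqs = read_fasta(toy)
print(similarity(seqs["ref"], seqs["variant"]))  # one substitution in 8 bases -> 0.875
```

For real files, `read_fasta(open(path))` would replace the `StringIO` toy input. Note that this pure-Python loop is quadratic in sequence length, so comparing two genomes of roughly 30,000 bases each takes on the order of a billion cell updates; a compiled edit-distance library is the practical choice at that scale.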
https://calvodatascientist.blogspot.com/2020/07/covid-19-genome-analysis.html