MicroRNA profile of mock-infected and Hendra virus-infected field horses

Hendra project for Data School Final Day

Meiling Dai

Health & Biosecurity, AAHL, CSIRO

Introduction

I am a virologist and a postdoc at the Australian Animal Health Laboratory (AAHL), CSIRO. My research projects at AAHL focus on identifying potential targets for anti-influenza drugs and therapeutics. To do this, we use CRISPR/Cas9 gene screening to find candidates that inhibit virus replication. The screening results come back from next-generation sequencing as big datasets, which is what prompted me to sign up for the Data School training. I believe this training will enable me to better analyse and visualise such datasets.

My Project

The data I used for my Data School Focus training were published a few years ago, so there are no confidentiality issues. The dataset contains raw microRNA counts from horses that were either infected with Hendra virus or mock infected. Information about the published data can be found here: info of data

Preliminary results

The project I used for Data School training contains two raw datasets. One is metadata describing the horses. The other is CSV data with raw counts of 889 microRNAs from mock-infected and Hendra virus-infected field horses.

Tables

Table 1: raw metadata

horse_id  condition
CD1       mock
CD2       mock
CD3       mock
CDA       infected
CDC       infected
CDD       infected

Table 2: raw microRNA counts (first six rows)

gene            CD1      CD2      CD3      CDA      CDC      CDD
eca-miR-486-5p  8720190  6915893  5762274  4519245  5445002  6055352
eca-miR-451     144928   358127   1302088  835465   1970831  2014491
eca-miR-22      60372    33604    140474   71381    112475   128153
eca-miR-191     51851    87233    98228    105711   120170   152413
eca-miR-423-5p  47771    24544    35254    31048    42440    34589
eca-miR-142-5p  42237    51289    111810   91740    110281   107092

Table 3: tidied microRNA counts (first six rows)

gene            horse_id  counts   infection
eca-miR-486-5p  CD1       8720190  mock
eca-miR-451     CD1       144928   mock
eca-miR-22      CD1       60372    mock
eca-miR-191     CD1       51851    mock
eca-miR-423-5p  CD1       47771    mock
eca-miR-142-5p  CD1       42237    mock
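
The step from the wide count table to the tidy table can be sketched as follows. The poster's own tidying was done in R with tidyverse (where this kind of reshape is typically a `pivot_longer()` followed by a join on `horse_id`); the version below is only an illustrative plain-Python sketch using two rows copied from the tables above. The function name `to_long` and the variable names are hypothetical, not taken from the original analysis.

```python
# Metadata: horse_id -> condition, copied from the metadata table
metadata = {
    "CD1": "mock", "CD2": "mock", "CD3": "mock",
    "CDA": "infected", "CDC": "infected", "CDD": "infected",
}

# Column order of the wide count table
horse_ids = ["CD1", "CD2", "CD3", "CDA", "CDC", "CDD"]

# First two rows of the wide count table (one list of counts per microRNA)
wide_counts = {
    "eca-miR-486-5p": [8720190, 6915893, 5762274, 4519245, 5445002, 6055352],
    "eca-miR-451": [144928, 358127, 1302088, 835465, 1970831, 2014491],
}


def to_long(wide, horse_ids, metadata):
    """Reshape wide counts into one row per (gene, horse) and attach the condition."""
    rows = []
    for i, horse in enumerate(horse_ids):
        for gene, counts in wide.items():
            rows.append({
                "gene": gene,
                "horse_id": horse,
                "counts": counts[i],
                "infection": metadata[horse],  # join on horse_id
            })
    return rows


tidy = to_long(wide_counts, horse_ids, metadata)
for row in tidy[:2]:
    print(row)
```

Each row of the result holds one observation (one microRNA in one horse), which is the shape that plotting and grouped statistics expect.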

Raw counts of microRNAs in mock-infected and Hendra virus-infected horses

Figure 1: Overview of microRNA profile of horses

MicroRNA profile with more than 10,000 counts

Figure 2: MicroRNA profile > 10000 counts

PCA analysis

Figure 3: PCA of microRNA counts

My Digital Toolbox

I have been using tidyverse and ggplot2 to tidy and visualise my data. I have also tried some statistical analysis of the dataset, such as Student's t test and PCA.
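
As an illustration of the kind of test mentioned here, the sketch below computes a two-sample Student's t statistic (pooled variance, equal variances assumed) for one microRNA, eca-miR-451, using the six raw counts shown in the count table. It is a minimal plain-Python sketch, not the analysis actually run for the poster, which was done in R. The log2 transform is an assumption made here, no normalisation is applied, and the helper name `student_t` is hypothetical. Turning the statistic into a p-value needs the t distribution, which is left out.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns the statistic and its degrees of freedom (len(a) + len(b) - 2).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled estimate of the common variance (sample variances, n - 1 denominator)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


# Raw counts for eca-miR-451, copied from the count table
mock = [144928, 358127, 1302088]        # CD1, CD2, CD3
infected = [835465, 1970831, 2014491]   # CDA, CDC, CDD

# Assumption: compare log2 counts rather than raw counts
log_mock = [math.log2(x) for x in mock]
log_infected = [math.log2(x) for x in infected]

t_stat, df = student_t(log_infected, log_mock)
print(f"t = {t_stat:.3f}, df = {df}")
```

With only three horses per group, a test like this has very little power, so the output is best read as a rough guide.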

Favourite tool

My favourite tools are tidyverse and ggplot2. I can tidy my data with tidyverse and then visualise it with ggplot2.

My time went …

I spent quite a long time struggling with my AAHL computer, because I had to request administrator rights every time I needed to install a package or library. Even more frustrating, I had to reinstall packages and libraries every time I opened RStudio. In the end I gave up using the AAHL computer and joined the Data School training from home whenever I could.

When analysing my data, I spent a lot of time tidying it and trying different ways of graphing it. I would like to invest more time in learning how to handle big datasets such as microRNA sequencing and next-generation sequencing results, how to graph them more sensibly, and eventually how to perform sound statistical analysis.

Next steps

My next step is to learn how to analyse RNA-seq results using established pipelines, generate graphs that make sense, and then analyse the data more rigorously.

My Data School Experience

I hesitated to sign up for the Data School training at first, since it requires a lot of time and commitment. Now I feel lucky that I joined. The friendly atmosphere, the professionalism of Kerensa and Stephen, the helpful mentors and helpers, and my lovely colleagues were all essential to the success of this training. I enjoyed it a lot!

I now have a basic grasp of R and RStudio, such as how to tidy my data and visualise it with ggplot2, but I still need to invest more time and effort in R to become more proficient. I hope to apply the skills I gained from Data School in my future research projects!

I highly recommend the Data School training to everyone. You will get more out of it than you expect!