My name is Shirleen Prasad. I am a third-year Macquarie University PhD student based at CSIRO. My project investigates the genetic basis of reduced stress resistance in domesticated Queensland fruit flies (Q-fly), Bactrocera tryoni. Before joining Data School, I mostly used Excel for data analysis and could perform only basic statistical analysis in R. Gaining competence in R through this course has given me the confidence to use R independently.
The Sterile Insect Technique (SIT) is currently being investigated to control Q-fly populations in Australia. This technique involves rearing a large number of flies over multiple generations under laboratory conditions and releasing the sterilised males into the wild. Wild populations are controlled when the released males successfully mate with wild females, which then produce no viable offspring. However, Q-flies reared in captive environments might have lower chances of survival, dispersal and mating in the field due to stressful environmental conditions such as low humidity, dehydration and starvation, compromising their efficacy as SIT agents. My goal is to elucidate the genetic basis of desiccation resistance and how it changes during domestication by making Q-fly isofemale lines from diverse geographical regions in Australia, characterising them for desiccation resistance, setting up mapping crosses between phenotypically divergent lines and using modern quantitative genomics to identify genomic regions associated with desiccation resistance.
Abundant natural variation in desiccation resistance was observed among isofemale lines from the Brisbane (BR), Mareeba (MB), Narrabri (NAB) and Utchee Creek (UT) regions, as shown in Figure 2. Figure 3 shows that in a subset of these isofemale lines, phenotypic differences persisted over many generations, indicating that genetic variation in desiccation resistance has been successfully retained in those lines. The most and least tolerant isofemale lines that exhibited minimal intra-line variation were selected as the parental lines to initiate mapping crosses. Figure 4 shows the phenotypic distribution of the F4, F5 and F6 progeny from the genetic cross.
Bioassay | Generation | Locality | Cage | Line | Replicate | n | Hours |
---|---|---|---|---|---|---|---|
3 | 10 | Narrabri | NAB2R1 | NAB2 | R1 | 15 | 25.9 |
3 | 10 | Brisbane | BR39R1 | BR39 | R1 | 20 | 25.9 |
3 | 10 | Mareeba | MB46R2 | MB46 | R2 | 20 | 27.3 |
3 | 10 | Narrabri | NAB11R1 | NAB11 | R1 | 20 | 27.3 |
8 | 16 | Narrabri | NAB28R2 | NAB28 | R2 | 20 | 29.8 |
7 | 15 | Narrabri | NAB11R1 | NAB11 | R1 | 20 | 32.3 |
4 | 11 | Narrabri | NAB11R1 | NAB11 | R1 | 20 | 34.9 |
For this project, I used R version 4.0.0 and the packages tidyverse, ggplot2, kableExtra, cowplot and Rmisc.
Most of my time went into tidying and cleaning the raw data. This crucial step was a good learning experience for me. I learnt the importance of “tidy” data and explored the many functions in tidyr that can be used to transform messy data. I used the help functions in R and tips on Stack Overflow to resolve the challenges I encountered when coding.
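As an illustration of the kind of tidying step described here, the sketch below splits a combined cage identifier into separate line and replicate columns with `tidyr::extract()`. It is only a minimal example, not the actual cleaning script: the values are copied from the table above, the object names `assays` and `tidy_assays` are made up for the example, and the naming rule (line name followed by "R" and a replicate number) is an assumption inferred from those rows.

```r
library(tidyr)

# A few Cage and Hours values copied from the table above (illustrative subset)
assays <- data.frame(
  Cage  = c("NAB2R1", "BR39R1", "MB46R2", "NAB11R1", "NAB28R2"),
  Hours = c(25.9, 25.9, 27.3, 27.3, 29.8),
  stringsAsFactors = FALSE
)

# Assumed naming rule: <line name><replicate>, where the replicate is
# the trailing "R" followed by one or more digits (e.g. "NAB11" + "R1")
tidy_assays <- extract(
  assays, Cage,
  into   = c("Line", "Replicate"),
  regex  = "^(.*)(R[0-9]+)$",
  remove = FALSE  # keep the original Cage column alongside the new ones
)

print(tidy_assays)
```

Anchoring the pattern at the end of the string matters because some line names themselves contain an "R" (for example BR39), so a simple split on "R" would break them apart.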
Learning R was essential for my PhD journey. Before joining Data School, I had limited skills for processing data in R, and learning on my own had been very slow. Data School has been an amazing experience and I have certainly improved my coding skills. I have been able to apply the skills learnt at Data School to tidy and analyse my own data and to share Git repositories with my team members for collaboration. The friendly environment and the efforts of the instructors made for a positive learning experience. I will continue to use R to analyse and report three large datasets from my PhD project: desiccation resistance data, genomics data and metabolomics data.