Introduction

I have a background in taxonomic identification of marine invertebrates and plankton. This requires working with datasets of species records and associated environmental data, spatial details and imagery. I have been involved in fieldwork using SCUBA diving to survey for introduced marine pests in Australian ports and to conduct underwater visual counts for species monitoring. I have used Excel, Access and SQL databases to manage data, and have some experience in coding using Oracle to manage data and R to produce graphs.

My Project

The spotted handfish, Brachionichthys hirsutus, is a small marine fish that “walks on its hands” on the bottom sediments rather than swimming. Its distribution has been heavily affected by human activities such as scallop dredging and by predation by an introduced seastar, Asterias amurensis. It is now restricted to a small area in southern Tasmania. The project data presented here examine the changes in the distribution of the handfish since monitoring of the population began at CSIRO in the late 1990s.

Figure 1: Spotted handfish, Brachionichthys hirsutus

Preliminary results

Initially, from 1996 to 2009, divers swam 100 metre underwater transects and counted the number of fish seen. This historic dataset was entered into Excel. More recently, divers have towed a GPS float and marked the position of each fish while swimming a variable-length transect. This recent dataset was entered into Access.

Using the skills I have learnt in Data School, I have been able to “tidy” the two datasets using the tidyverse so they could be joined together. To do this I used rename to remove the spaces from my column names and to make the headings consistent between the two data files; mutate to change the count column from characters to numbers to enable statistical analyses; filter to remove unnecessary data rows; and select to remove unnecessary data columns. I also used group_by, arrange and summarise to rearrange my tidy datasets.
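The tidying steps described above can be sketched roughly as follows. This is a minimal illustration, not the project code: the raw column names (`Site Name`, `Fish Count`, Notes), the "Test Site" row and all values are made up, and stacking the two tables with bind_rows is an assumption about how they were combined.

```r
library(dplyr)

# Toy stand-ins for the two raw tables; column names and values are hypothetical
historic_raw <- tibble(
  `Site Name`  = c("Opossum Bay", "Opossum Bay", "Test Site"),
  `Fish Count` = c("1", "2", "0"),
  Notes        = c("", "", "practice dive")
)
recent_raw <- tibble(
  Location   = c("Opossum Bay", "Mary-Ann Bay"),
  Total_fish = c(3, 1)
)

tidy_historic <- historic_raw %>%
  rename(Location = `Site Name`, Total_fish = `Fish Count`) %>%  # no spaces, consistent headings
  mutate(Total_fish = as.numeric(Total_fish)) %>%                # character -> numeric
  filter(Location != "Test Site") %>%                            # drop unneeded rows
  select(Location, Total_fish) %>%                               # drop unneeded columns
  mutate(Dataset = "historic")

tidy_recent <- recent_raw %>%
  mutate(Dataset = "recent")

# With matching column names and types, the two tables can be stacked
all_counts <- bind_rows(tidy_historic, tidy_recent)

# Rearrange: total fish per location, sorted by location name
by_location <- all_counts %>%
  group_by(Location) %>%
  summarise(Total_fish = sum(Total_fish), .groups = "drop") %>%
  arrange(Location)
```

Converting the count column before combining matters here, because bind_rows will not stack a character column onto a numeric one.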

I learnt to use regular expressions (regex) in R with str_replace to correct location names that were spelt differently in the two datasets. See the example code below:

library(tidyverse)  # loads dplyr (mutate, %>%) and stringr (str_replace)
# The correct name for this location is Mary-Ann Bay; it is misspelt in the historic dataset
TidyHistoric <- TidyHistoric %>%
  mutate(Location = str_replace(Location, "Mary-Anne Bay", "Mary-Ann Bay"))
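One way to find which names still disagree between the two datasets, and to catch more than one misspelling with a single pattern, is sketched below. The location vectors are made up for illustration (including the double-space variant), and the regex is only one possible choice.

```r
library(stringr)

# Hypothetical location names from each dataset
historic_locations <- c("Opossum Bay", "Mary-Anne Bay", "Mary-Anne  Bay")
recent_locations   <- c("Opossum Bay", "Mary-Ann Bay")

# Names in the historic data with no exact match in the recent data
unmatched_before <- setdiff(historic_locations, recent_locations)

# Regex: "Mary-Ann", an optional "e", one or more spaces, then "Bay"
fixed_locations <- str_replace(historic_locations,
                               "^Mary-Anne?\\s+Bay$",
                               "Mary-Ann Bay")

# After the fix, no historic names should be left unmatched
unmatched_after <- setdiff(fixed_locations, recent_locations)
```

Running setdiff before and after the replacement gives a quick check that every spelling difference has been dealt with.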

Another handy thing I learnt was how to add an index column of row numbers. The sample code below was used to produce Table 1.

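# Add a row number column
index_numbers <- seq_len(nrow(Historic_bytransect))  # sequence 1..n, one value per row
Historic_bytransect <- Historic_bytransect %>%
  ungroup() %>%                   # remove any grouping left over from earlier steps
  arrange(Sample_date) %>%        # sort by sample date before numbering
  mutate(Row_ID = index_numbers)  # attach the row numbers in sorted order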

Tables

Table 1: A tidy data table with row numbers
Location	Loc_abbr	Sample_date	Transect_no	Swath_Area	Total_fish	Row_ID
Opossum Bay	Opos	1997-05-01	A122	124	1	1
Opossum Bay	Opos	1997-05-01	A244	158	1	2
Opossum Bay	Opos	1997-05-01	A277	160	1	3
Opossum Bay	Opos	1997-05-01	A40	70	1	4
Opossum Bay	Opos	1997-05-01	A92	96	1	5

Hopefully this tidy dataset will enable me to conduct some time-series analysis using generalised linear models (GLMs).
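As a rough sketch of what such an analysis might look like, the example below fits a Poisson GLM to counts over time, with the log of the swath area as an offset so that the model describes fish per unit area. The data are invented purely for illustration and are not survey results; the column Years_since and the choice of a Poisson model are assumptions, not the project's method.

```r
# Toy data only: these counts and areas are made up for illustration
toy_counts <- data.frame(
  Years_since = 0:5,                        # hypothetical years since the first survey
  Total_fish  = c(5, 4, 4, 3, 2, 2),
  Swath_Area  = c(120, 150, 130, 140, 160, 150)
)

# Poisson GLM with a log link; the offset accounts for the area searched
fit <- glm(Total_fish ~ Years_since + offset(log(Swath_Area)),
           family = poisson(link = "log"),
           data = toy_counts)

# Estimated trend: the Years_since coefficient is the change in log density per year
trend <- coef(fit)["Years_since"]
summary(fit)$coefficients
```

A real analysis would also need to consider site effects and overdispersion, which this sketch ignores.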

Plots from R

Here is a ggplot showing the density of fish, using facet_wrap to show each sampling site in a separate panel.

Figure 2: Handfish densities in sites near Hobart
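A faceted plot of this kind can be sketched as below. The data frame is a toy example with made-up values, and calculating density as Total_fish divided by Swath_Area is an assumption about how density was derived.

```r
library(ggplot2)

# Toy data: values are made up; column names follow Table 1
toy_transects <- data.frame(
  Location    = rep(c("Opossum Bay", "Mary-Ann Bay"), each = 3),
  Sample_date = as.Date(rep(c("1997-05-01", "1998-05-01", "1999-05-01"), times = 2)),
  Total_fish  = c(1, 2, 1, 3, 2, 2),
  Swath_Area  = c(124, 158, 160, 70, 96, 100)
)

# Assumed definition of density: fish counted per unit of swath area
toy_transects$Density <- toy_transects$Total_fish / toy_transects$Swath_Area

density_plot <- ggplot(toy_transects, aes(x = Sample_date, y = Density)) +
  geom_point() +
  facet_wrap(~ Location) +   # one panel per sampling site
  labs(x = "Sample date", y = "Fish per unit swath area")

print(density_plot)
```

Because facet_wrap uses the same axes in every panel by default, densities can be compared directly between sites.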

My Digital Toolbox

Learning to use the tidyverse has been a great help to my programming in R with all those new functions that it makes available.

ggplot2, its add-on gganimate and its integration with plotly are going to be useful in the future once I have time to play with them.

My time went …

tidying and merging datasets compiled in several different formats over the decades.

Next steps

I am keen to spend more time investigating the concepts and techniques I have learnt in Data School on my projects going forward.

My Data School Experience

Working through examples in Data School in class, in small groups and as “homework” has helped me consolidate the techniques and made it easier to remember how to do things. I also now know where to go for help, thanks to the myriad of sources and links provided during the course.

Monitoring endangered Spotted Handfish populations

Felicity McEnnulty