Hierarchical clustering analysis of the Santa Barbara Coastal drainage area watershed by complete linkage

data-visualization statistical-analysis

This report performs a hierarchical clustering analysis of Santa Barbara area watersheds by collection site using water chemistry measurements.

Thomas Wheeler
03-03-2021

Overview

This report performs a hierarchical clustering analysis of Santa Barbara area watersheds by collection site using water chemistry measurements. To conduct the analysis, each water chemistry variable is averaged by site, the site means are scaled, and Euclidean distances between sites are calculated (4 of the 13 original sites are dropped due to NA values). Complete linkage agglomerative hierarchical clustering is then performed on the resulting distance matrix.
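As a small illustration of what "complete linkage" means, the sketch below uses four made-up sites on a single made-up measurement (these are not SBC LTER values). Under complete linkage, the distance between two clusters is the largest pairwise distance between their members, so the final merge height should equal the maximum distance between any site in one cluster and any site in the other.

```r
# Toy data: four hypothetical sites on one made-up measurement
toy <- matrix(c(0, 1, 5, 9), ncol = 1,
              dimnames = list(c("A", "B", "C", "D"), "value"))

toy_dist <- dist(toy, method = "euclidean")
toy_hc <- hclust(toy_dist, method = "complete")

# Heights at which clusters are merged: A+B at 1, C+D at 4, then {A,B}+{C,D} at 9
toy_hc$height

# The last height computed by hand: the largest distance between
# any member of {A, B} and any member of {C, D}
last_merge <- max(as.matrix(toy_dist)[c("A", "B"), c("C", "D")])
last_merge
```

With single linkage the last merge would instead happen at the smallest such distance (4), which is why the choice of linkage can change the shape of the dendrogram.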

Data

This dataset contains stream water chemistry measurements taken in Santa Barbara area watersheds, beginning in 2000. The dataset is ongoing, with data added approximately annually. Stream water samples are collected weekly during non-storm flows in winter and bi-weekly during summer. During winter storms, samples are collected hourly (rising limb) or at 2-4 hour intervals (falling limb). Analytes sampled in the SBC LTER watersheds include dissolved nitrogen (nitrate, ammonium, total dissolved nitrogen); soluble reactive phosphorus (SRP); particulate organic carbon, nitrogen and phosphorus; total suspended sediments; and conductivity.

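# load packages and data
library(tidyverse)
library(here)

sbc_lter <- read_csv(here("_posts", "post_data", "sbc_lter_registered_stream_chemistry.csv")) %>% 
  # -999 marks missing values; convert it to NA in every numeric column
  mutate(across(where(is.numeric), ~ na_if(.x, -999))) %>% 
  group_by(site_code) %>% 
  # per-site mean of each measurement column, ignoring NAs
  summarise(across(3:11, ~ mean(.x, na.rm = TRUE))) %>% 
  drop_na()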

# Make sure to take a look at the data:
View(sbc_lter)

#scale data
sbc_lter_scaled <- sbc_lter %>% 
  select(2:10) %>% 
  scale()

#change rowname to site_code name
rownames(sbc_lter_scaled) <- sbc_lter$site_code
#compute dissimilarity values (Euclidean distances):
euc_lter <- stats::dist(sbc_lter_scaled, method = "euclidean")
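The scaling and distance steps can be checked by hand on a tiny example. The sketch below uses three hypothetical sites and two made-up analytes on very different scales (the names `toy_means`, `analyte_a`, and `analyte_b` are illustrative only, not columns of the SBC LTER data). After `scale()`, each column has mean 0 and standard deviation 1, so no single analyte dominates the Euclidean distance simply because of its units.

```r
# Hypothetical site means for two made-up analytes (not real SBC LTER values)
toy_means <- data.frame(
  site = c("S1", "S2", "S3"),
  analyte_a = c(1, 2, 3),
  analyte_b = c(100, 300, 200)
)

# Center and standardize each column, then label rows by site
toy_scaled <- scale(toy_means[, c("analyte_a", "analyte_b")])
rownames(toy_scaled) <- toy_means$site

# Euclidean distance between S1 and S2, computed by hand:
# square root of the sum of squared differences across columns
by_hand <- sqrt(sum((toy_scaled["S1", ] - toy_scaled["S2", ])^2))

# The same distance as reported by stats::dist()
from_dist <- as.matrix(dist(toy_scaled, method = "euclidean"))["S1", "S2"]

all.equal(by_hand, from_dist)
```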

Complete Linkage Dendrogram

# Hierarchical clustering (complete linkage)
lter_complete <- hclust(euc_lter, method = "complete" )

# Plot it (base plot):
plot(lter_complete, cex = 0.6, hang = -1)
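If discrete groups are wanted rather than a tree, one option is to cut the dendrogram with `stats::cutree()`. This step is not part of the analysis above; the sketch below runs on made-up data for five hypothetical sites, and the choice of `k = 3` is arbitrary and only for illustration.

```r
# Toy data: five hypothetical sites with two made-up (already scaled) measurements
toy <- matrix(
  c(0.0, 0.1,
    0.2, 0.0,
    3.0, 3.1,
    3.2, 2.9,
    6.0, 0.0),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("site_", 1:5), c("x", "y"))
)

toy_hc <- hclust(dist(toy, method = "euclidean"), method = "complete")

# Assign each site to one of k groups by cutting the tree
toy_groups <- cutree(toy_hc, k = 3)

# Number of sites in each group
table(toy_groups)
```

The same call could be applied to a real `hclust` object, but the number of groups would need to be justified from the dendrogram itself rather than assumed.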

Citation

Santa Barbara Coastal LTER and J. Melack. 2019. SBC LTER: Land: Stream chemistry in the Santa Barbara Coastal drainage area, ongoing since 2000 ver 16. Environmental Data Initiative. https://doi.org/10.6073/pasta/67a558a24ceed9a0a5bf5e46ab841174.