Machine learning highlights the importance of primary and secondary production in determining habitat for marine fish and macroinvertebrates

Kevin D. Friedland, Michelle Bachman, Andrew Davies, Romain Frelat, M. Conor McManus, Ryan Morse, Bradley A. Pickens, Szymon Smoliński, Kisei Tanaka


  1. Species distribution models for marine organisms are increasingly used for a range of applications, including spatial planning, conservation, and fisheries management. These models have been constructed using a variety of mathematical forms and drawing on both physical and biological independent variables; however, what might be called first‐generation models have mainly followed the form of linear models, or smoothing splines, informed by data collected in the context of fish surveys.
  2. The performance of different classes of variables was tested in a series of species occurrence models built with machine learning methods, specifically evaluating the potential contribution of lower trophic level data. Random forest models were fitted to classify the presence or absence of fish and macroinvertebrates surveyed on the US Northeast Continental Shelf.
  3. The potential variables included physical, primary production, secondary production, and terrain variables. For accepted model fits, six variable importance measures were computed, which collectively showed that physical and secondary production variables make the greatest contribution across all models. In contrast, terrain variables made the least contribution to these models.
  4. Multivariable analyses that account for all performance measures reinforce the role of water depth and temperature in defining species presence and absence; however, chlorophyll concentration and some specific zooplankton taxa, such as Metridia lucens and Paracalanus parvus, also make important contributions with strong seasonal variations.
  5. Our results suggest that lower trophic level variables, if available, are valuable in the creation of species distribution models for marine organisms.
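
The workflow summarized above (fit a tree ensemble to presence/absence data, then rank the predictors by importance) can be illustrated with a small, self-contained sketch. This is not the authors' code or data. It assumes a synthetic survey in which presence follows a made-up rule on depth and a zooplankton index, uses a bagged ensemble of shallow trees as a simplified stand-in for a full random forest (there is no per-split feature subsampling), and computes permutation importance, which is one common importance measure and not necessarily one of the six used in the paper. All function names, variable names, and thresholds are hypothetical.

```python
import random

# Hypothetical predictors: two physical, one secondary production, one terrain.
FEATURES = ["depth", "temperature", "zooplankton", "slope"]

def make_stations(n, rng):
    """Synthetic stations; presence follows an invented, noise-free rule."""
    X, y = [], []
    for _ in range(n):
        row = [rng.uniform(10, 300),   # depth (m)
               rng.uniform(2, 25),     # temperature (deg C)
               rng.uniform(0, 1),      # zooplankton index
               rng.uniform(0, 5)]      # seabed slope (deg)
        X.append(row)
        y.append(1 if row[0] < 120 and row[2] > 0.3 else 0)
    return X, y

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def fit_tree(X, y, depth):
    """Grow a small classification tree by greedy Gini splits."""
    majority = 1 if 2 * sum(y) >= len(y) else 0
    if depth == 0 or gini(y) == 0.0:
        return majority                       # leaf: an int class label
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None or best[0] >= gini(y):
        return majority                       # no split improves purity
    _, j, t = best
    lo = [(r, lab) for r, lab in zip(X, y) if r[j] <= t]
    hi = [(r, lab) for r, lab in zip(X, y) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in lo], [lab for _, lab in lo], depth - 1),
            fit_tree([r for r, _ in hi], [lab for _, lab in hi], depth - 1))

def predict_tree(node, row):
    """Walk from the root to a leaf and return its class label."""
    while not isinstance(node, int):
        j, t, left, right = node
        node = left if row[j] <= t else right
    return node

def fit_forest(X, y, n_trees, rng):
    """Fit each tree on a bootstrap resample of the stations (bagging)."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(fit_tree([X[i] for i in idx], [y[i] for i in idx], 2))
    return forest

def predict(forest, row):
    """Majority vote across trees; ties count as absence."""
    votes = sum(predict_tree(tree, row) for tree in forest)
    return 1 if 2 * votes > len(forest) else 0

def accuracy(forest, X, y):
    return sum(predict(forest, r) == lab for r, lab in zip(X, y)) / len(y)

def permutation_importance(forest, X, y, rng):
    """Drop in accuracy after shuffling one predictor column at a time."""
    base = accuracy(forest, X, y)
    result = {}
    for j, name in enumerate(FEATURES):
        col = [row[j] for row in X]
        rng.shuffle(col)
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, col)]
        result[name] = base - accuracy(forest, shuffled, y)
    return result

rng = random.Random(0)
X_train, y_train = make_stations(150, rng)
X_test, y_test = make_stations(300, rng)      # held-out stations
forest = fit_forest(X_train, y_train, n_trees=10, rng=rng)
importance = permutation_importance(forest, X_test, y_test, rng)
for name, drop in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name:12s} {drop:.3f}")
```

Because the synthetic rule uses only depth and the zooplankton index, shuffling either of them should reduce accuracy, while shuffling temperature or slope should change little. Real survey data are noisier, and importance rankings there depend on the measure chosen, which is presumably why the paper combines several measures.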

Full Citation

Friedland, KD, Bachman, M, Davies, A, Frelat, R, McManus, MC, Morse, R, Pickens, BA, Smoliński, S, Tanaka, K. (2021) Machine learning highlights the importance of primary and secondary production in determining habitat for marine fish and macroinvertebrates. Aquatic Conservation: Marine and Freshwater Ecosystems, 1–17.

Manuscript DOI