Delaware/Maryland/Virginia Range Maps


24 replies to this topic

#1 rndouglas

rndouglas
  • NANFA Member
  • Maryland

Posted 16 June 2015 - 08:10 PM

With an upcoming move to the DC area, I'm keenly interested in figuring out where a whole new slate of species can be found. I'm also a bit sad that my new job won't allow me to use R, which I just spent the past several years learning!

 

So, to keep using R, and to help myself understand fish distribution, I've started to make a series of maps combining museum collection records from http://www.fishnet2.net with watershed distribution maps from NatureServe.

 

I have a set of ~330 species (mostly freshwater, some saltwater, some from Chesapeake Bay...~70 completed so far) that I'll be making maps for (example attached). Although these are mostly just for my own personal use (I hope to add my own points to the maps as I catch/see various species....and I'm sure there are more accurate maps out there!) I'd be happy to eventually share the whole set online or as a PDF if anyone is interested.

 

Also, if anyone has any suggestions for the map style or additional sources for collection points, I'm all ears!

 

References for the attached map:

NatureServe. 2010. Digital Distribution Maps of the Freshwater Fishes in the Conterminous United States. Version 3.0. Arlington, VA. U.S.A.

 

Fish data used in this study obtained from Mississippi Museum of Natural Science, North Carolina State Museum of Natural Sciences, Ohio State University - Fish Division, University of Alabama Ichthyological Collection, UF, and USNM (accessed through the Fishnet2 Portal, www.fishnet2.net, 2015-06-12).

Attached Images

  • Cottus_caeruleomentum.png


#2 zooxanthellae

zooxanthellae
  • NANFA Member
  • North Carolina

Posted 16 June 2015 - 11:03 PM

Thank you for sharing that watershed information. I spent hours scouring Google for that very data, to no avail! If you don't mind, I have a hundred questions for you, as I too wrote a script that takes Fishnet data and maps it.

 

First off, can you describe your data cleaning steps? Fishnet data is an absolute mess! My two biggest problems were inconsistent species names (A. sayanus + Aphredoderus sayanus + aphredoderus) and malformed coordinates (NAs, 0000.0000, missing negatives, etc.). Did you attempt to combine the species observations? How did you handle the goofed-up coordinates?

 

I just subset on good coordinates, which severely restricts the final dataset in some cases.  

 Aphredoderus_sayanus_plot1.png

 

 


#3 rndouglas

rndouglas
  • NANFA Member
  • Maryland

Posted 17 June 2015 - 06:53 AM

Yeah, the fishnet2 data were a huge mess! After I combined the DE/MD/VA collections into one sheet, I decided the most accurate way to clean everything would be by hand.

 

First, I removed every collection that lacked Lat/Long coordinates, which removed the majority of points. For my purposes I didn't feel like figuring out where to plot thousands of locations.

 

The next issue was what you described....multiple species names for a single species. That's another one I cleaned up by hand by using the family name to figure out which genus to add to the species epithets. Unfortunately, there were some families that used the same species epithet in multiple genera....those were tossed out rather than risk assigning them to the incorrect genus.
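
The by-hand step described above could in principle be scripted. What follows is only a rough sketch in base R, not the method actually used in this thread: the lookup table `valid`, the function name `resolve_name`, and the family "Exampleidae" with its two genera are all invented for illustration. A bare epithet or an abbreviated genus is expanded only when the family/epithet pair points to exactly one genus; anything ambiguous comes back as NA so it can be tossed.

```r
# Hypothetical lookup table: one row per valid family/genus/epithet combination.
# "Exampleidae", "Genusa" and "Genusb" are invented to show an ambiguous epithet.
valid <- data.frame(
  family  = c("Aphredoderidae", "Centrarchidae", "Exampleidae", "Exampleidae"),
  genus   = c("Aphredoderus", "Acantharchus", "Genusa", "Genusb"),
  epithet = c("sayanus", "pomotis", "vulgaris", "vulgaris"),
  stringsAsFactors = FALSE
)

# Expand a bare epithet ("sayanus") or an abbreviated genus ("A. sayanus")
# to a full binomial. Returns NA when zero or several genera match.
resolve_name <- function(family, name, lookup = valid) {
  parts <- strsplit(trimws(name), "\\s+")[[1]]
  # A full "Genus epithet" binomial is returned unchanged
  if (length(parts) == 2 && !grepl("\\.$", parts[1])) return(name)
  epithet <- tolower(parts[length(parts)])
  hits <- lookup[lookup$family == family & lookup$epithet == epithet, ]
  # An abbreviated genus such as "A." narrows the candidates by first letter
  if (length(parts) == 2) {
    hits <- hits[substr(hits$genus, 1, 1) == substr(parts[1], 1, 1), ]
  }
  if (nrow(hits) != 1) return(NA_character_)
  paste(hits$genus, hits$epithet)
}

resolve_name("Aphredoderidae", "sayanus")     # bare epithet, one candidate genus
resolve_name("Aphredoderidae", "A. sayanus")  # abbreviated genus
resolve_name("Exampleidae", "vulgaris")       # two candidate genera, so NA
```

A genus-only entry such as "aphredoderus" also comes back as NA here, since there is no epithet to match on.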

 

Another issue was out-of-date species names, but that was easy enough to figure out while I was splitting each species into its own *.csv file for input into R. I'd guess the entire clean-up/splitting process took me ~8 hours for those three states.

 

Once I started plotting on maps I found that even though every collection said, "Virginia," "Maryland," or "Delaware," in the state field it didn't mean the Lat/Long point was actually in that state! I've come across multiple collections that plotted to Kentucky or North Carolina, even though the county and description of the location should have put them in Virginia. I decided to just toss them rather than try and correctly plot them.
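
One way to catch such records before plotting, sketched here under assumptions rather than taken from the thread, is `map.where()` from the maps package, which reports which polygon of a map database a point falls in. The toy data frame and its `State` column name are hypothetical (the `Latitude` and `Longitude` names follow the csv files used later in the thread), and the coordinates are made up: two inland points in Virginia and Maryland, and one Kentucky point mislabeled as Virginia.

```r
library(maps)

# Toy records: the second claims Virginia but its coordinates are in Kentucky
records <- data.frame(
  State     = c("Virginia", "Virginia", "Maryland"),
  Latitude  = c(37.54, 38.04, 39.41),
  Longitude = c(-77.43, -84.50, -77.41),
  stringsAsFactors = FALSE
)

# map.where() returns names such as "virginia:main"; keep the part before ":"
plotted_state <- sub(":.*$", "", map.where("state", records$Longitude, records$Latitude))

# Flag records whose coordinates agree with the state field;
# points outside every state polygon (NA) count as mismatches
records$state_ok <- !is.na(plotted_state) & plotted_state == tolower(records$State)

records[!records$state_ok, ]  # the records to toss or re-georeference
```

Because the state outlines in maps are coarse, points very close to a border or the coast may be misjudged, so this is better treated as a first-pass flag than a final verdict.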

 

So, long story short, if there was anything odd about a listing I just tossed it to make my life easier! I'm just trying to get a handle on general distribution rather than producing an exhaustive map.



#4 zooxanthellae

zooxanthellae
  • NANFA Member
  • North Carolina

Posted 17 June 2015 - 08:11 AM

Cleaning by hand? You must be a glutton for punishment! I also ran into the problem of data points ending up in different states; in fact, many were in Pakistan! To solve that, I subset the data with a bounding box that roughly outlines the state.

Would you mind sharing how you read the shapefiles from NatureServe into R?

PS I really like how clean your map looks. I think I have been inspired to redraw mine!



My code is rough, but if you're interested, here is my data cleaning function. I pass it only species names that are binomial (e.g. pogonias cromis is sent to this function, but not cromis), and it returns a data frame of cleaned data for that species. That data frame then goes to a mapping function.

# Returns the string with leading and trailing whitespace removed
trim <- function(x) gsub("^\\s+|\\s+$", "", x)

# Returns a data frame of cleaned observations for one species.
# Reads the raw Fishnet records from a global data frame named `data`.
cleanData <- function(name) {
  # Subset data by species
  data.sub <- data[data$ScientificName == name, ]
  # Subset desired columns
  data.sub <- data.sub[, c(4, 9, 10, 15)]
  names(data.sub) <- c("count", "lat", "lon", "county")
  # Convert the coordinates to numeric first (via character, in case they were read as factors),
  # so that the comparisons below are numeric rather than string comparisons
  data.sub$lon <- as.numeric(as.character(data.sub$lon))
  data.sub$lat <- as.numeric(as.character(data.sub$lat))
  # Remove observations with missing or unparseable coordinates
  data.sub <- data.sub[!is.na(data.sub$lon) & !is.na(data.sub$lat), ]
  # Remove observations with zeroed-out coordinates (some were 0.00000, 0.0000)
  data.sub <- data.sub[!(data.sub$lon == 0 | data.sub$lat == 0), ]
  # Repair longitude where the author missed a negative (all of North America should be +lat, -lon)
  data.sub$lon <- -abs(data.sub$lon)
  # Repair latitude where it was entered as negative
  data.sub$lat <- abs(data.sub$lat)
  # Remove outliers that were incorrectly geolabeled. These values represent the N-S and E-W bounds of NC
  data.sub <- data.sub[data.sub$lon < -75 & data.sub$lon > -84.5, ]
  data.sub <- data.sub[data.sub$lat < 36.6 & data.sub$lat > 33.8, ]

  # Remove "(70% Ethanol)" and similar notes from the count variable, then trim leftover whitespace
  data.sub$count <- trim(gsub("\\(.*", "", data.sub$count))
  data.sub$count <- as.numeric(data.sub$count)
  # Some observations have no IndividualCount. Instead of discarding them, set those counts to 1:
  # it is safe to assume at least one fish was caught, or the observation wouldn't have been reported
  data.sub$count[is.na(data.sub$count)] <- 1

  return(data.sub)
}

#5 rndouglas

rndouglas
  • NANFA Member
  • Maryland

Posted 17 June 2015 - 08:31 AM

That's awesome! Thanks for that...I'll have to run the rest of my species files through that so I don't have to go back and manually remove the errant points after plotting.

 

Here's the entire R code I am using for my maps. I just have to swap out the species names and let it roll!

 

Here is the river shapefile I use. The subset file gave me fewer rivers than I liked, so I used the big version: http://www.nws.noaa....html/rivers.htm

library(maps)
# maptools allows me to read in the shapefiles.
library(maptools)
# rgeos lets me restrict the shapefiles to the states of interest. Otherwise, the map looks really messy.
library(rgeos)
# scales lets me set the transparency/opacity of the shapefiles.
library(scales)

# Call map for desired region
mmap <- map('state', regions=c('maryland', 'virginia', 'delaware', 'district of columbia'), fill=TRUE)

# Prepare map for intersections
IDs <- sapply(strsplit(mmap$names, ":"), function(x) x[1])
mmap.sp <- map2SpatialPolygons(mmap, IDs=IDs, proj4string=CRS("+proj=longlat +datum=WGS84"))

# Read in river base map shapefile and prepare for intersections
rivers_full=readShapeLines("~/Documents/Fish_Shapefiles/rv16my07-2/rv16my07.shp")
proj4string(rivers_full)=CRS("+proj=longlat +datum=WGS84")
rivers_full_clipped=gIntersection(mmap.sp, rivers_full, byid=TRUE)

# Read in fish range map shapefile and prepare for intersections
fish=readShapePoly("~/Documents/Fish_Shapefiles/Acantharchus_pomotis.shp")
proj4string(fish)=CRS("+proj=longlat +datum=WGS84")
fish_clipped=gIntersection(mmap.sp, fish, byid=TRUE)

# Read in collection data
mud=read.csv("~/Documents/Delmarva_Fish/museum_collections/Acantharchus_pomotis.csv")

# Plot/save map
png("~/Documents/Delmarva_Fish/maps/Acantharchus_pomotis.png", width=1200, height=800)
plot(mmap.sp, main="Mud Sunfish", xlab="Acantharchus pomotis", font.main=2, font.lab=3, xaxs='i', yaxs='i', lwd=2, cex.lab=2.5, cex.main=3)
plot(fish_clipped, add=TRUE, col=alpha("seagreen2", 0.5), border=FALSE)
plot(rivers_full_clipped, add=TRUE, col=alpha("royalblue", 0.8))
points(mud$Longitude, mud$Latitude, pch=19, cex=1.3, col="red")
legend(x="topleft", legend=c("Museum Collection", "Species Range"), col=c("red", "seagreen2"), pch=c(19,15), merge=FALSE, cex=1.35, title="Legend", box.lwd=2)
dev.off()


#6 rc6750

rc6750
  • NANFA Member
  • Tampa Bay, FL

Posted 17 June 2015 - 10:01 AM

This is awesome - love to see others who have the same interests. I have combined NatureServe data, FishNet2, and USGS NAS into a website - fishmap.org.

 

I also had issues with FishNet2 data - very messy. You should look into the USGS NAS data - it combines FishNet2 and other sources and cleans the data. They don't have ALL species yet - their focus is invasive but they are getting there. They told me they have issued over 1000 corrections to museums for their data.

 

check out the same map for Blue Ridge Sculpin

 

http://fishmap.org/s...ge-Sculpin.html

 

Love to hear any input you guys have

 

Ryan



#7 zooxanthellae

zooxanthellae
  • NANFA Member
  • North Carolina

Posted 17 June 2015 - 11:26 AM

Ryan, that is an incredible body of work you have there. I honestly can't begin to recommend anything; it appears you have covered every conceivable base!

Do you query each data source live, or did you create a database locally? Do you offer API access or any means for others to use your maps? 

 

 

rndouglas - I hate to keep asking questions of you, but could you do me a favor and try to map NC using your method? I can't seem to get past the following step:

mmap <- map('state', regions='North Carolina', fill=TRUE)

#Prepare map for intersections
IDs <- sapply(strsplit(mmap$names, ":"), function(x) x[1])
mmap.sp <- map2SpatialPolygons(mmap, IDs=IDs, proj4string=CRS("+proj=longlat +datum=WGS84"))

#Prepare river base map shapefile and prepare for intersections
rivers_full=readShapeLines("C:/Users/Admin/Documents/CourseraClasses/nanfamaps/rivershapes/rv16my07.shp")
proj4string(rivers_full)=CRS("+proj=longlat +datum=WGS84")

#mmap.sp <- gBuffer(mmap.sp, byid=TRUE, width=0)
#rivers_full <- gBuffer(rivers_full, byid=TRUE, width=0)

rivers_full_clipped=gIntersection(mmap.sp, rivers_full, byid=TRUE) # <-- ERRORS OUT HERE

with the following error: 

Error in RGEOSBinTopoFunc(spgeom1, spgeom2, byid, id, drop_lower_td, "rgeos_intersection") : 
  TopologyException: Input geom 0 is invalid: Self-intersection at or near point -75.985258302714982 36.553353103019262 at -75.985258302714982 36.553353103019262
 
This error seems pretty straightforward, but I can't imagine NatureServe made an error in their watershed maps, so I am sure it is something I've done wrong!
Any suggestions? 
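
A possible line of attack, offered as a sketch rather than a confirmed diagnosis: GEOS numbers its inputs from zero, so "Input geom 0" most likely refers to the first argument of the intersection, which here is the state outline built from the maps package and not the river or NatureServe layer. The example below uses the sf package instead of maptools/rgeos (both have since been archived on CRAN), so it is a swapped-in technique and not the code from this thread. `toy_river` is a made-up line standing in for the river shapefile.

```r
library(maps)
library(sf)

# Build the North Carolina outline from the same 'maps' state database
nc <- st_as_sf(map("state", regions = "north carolina", plot = FALSE, fill = TRUE))

# Report whether the outline is valid, and why not if it isn't
print(st_is_valid(nc, reason = TRUE))

# Repair any invalid geometry before using it for clipping
nc_valid <- st_make_valid(nc)

# Stand-in for the river layer: a single made-up line crossing the state
toy_river <- st_sfc(
  st_linestring(rbind(c(-82, 35.5), c(-70, 35.5))),
  crs = st_crs(nc_valid)
)

# Clip the line to the repaired outline
river_clipped <- st_intersection(toy_river, st_geometry(nc_valid))
```

In the rgeos workflow, the closest equivalent would be the zero-width `gBuffer()` call that is commented out in the post above; whether that alone clears the error on this particular polygon is not something this sketch establishes.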


#8 rc6750

rc6750
  • NANFA Member
  • Tampa Bay, FL

Posted 17 June 2015 - 12:20 PM

 

 

Thanks - I query FishNet2 and USGS NAS on the fly through their APIs. For the NatureServe data I am using a local database I created from the original Access DB. My local database is really just watershed, species, and xref information. I took all the coordinates out of the database and made physical KML files that map back to the species or watershed. These are overlaid with the API data from the other sources.

 

I have thought about an API before but have never actually had a request. I can easily and happily make one. What kind of information would you like to see exposed? How would you like to consume it? SOAP, REST, flat file? JS include for a web page?

 

Also - I missed this before, but the reason you are having issues with stuff showing up in Pakistan is that many data points are missing the negative sign on the west (longitude) portion of the lat/long. I just ABS() everything and add my own negative sign, and it brings in all the data points.



#9 rndouglas

rndouglas
  • NANFA Member
  • Maryland

Posted 17 June 2015 - 01:48 PM


That's so weird!

It seems to be North Carolina specific for me so far. I had no trouble pulling up other individual states, and NC worked when I combined it with DE/MD/VA!

 

I've no suggestions just yet, but I'll get back to you if I think of something.

 

I also look forward to taking a good look at the rest of the above posts...when I'm off work!

Attached Images

  • Screen Shot 2015-06-17 at 1.44.46 PM.png


#10 zooxanthellae

zooxanthellae
  • NANFA Member
  • North Carolina

Posted 17 June 2015 - 02:40 PM

When it comes to web programming, I employ the throw-spaghetti-at-the-wall technique! I host a small site where we are attempting (long term) to photograph all fishes in the state. It's simply an edited wordpress.org install. Do you have any recommendations? I'm at work now, but will take a better look at what you have available afterwards. Thanks for your help!

#11 rc6750

rc6750
  • NANFA Member
  • Tampa Bay, FL

Posted 18 June 2015 - 08:10 AM

WordPress is PHP, right? If so, you should be able to use anything. I guess it just depends on what you are looking for. Do you want state-specific (NC) distribution maps to go along with your fish photos, like an embedded remote JS, or are you looking for raw data to build your own images/maps?



#12 zooxanthellae

zooxanthellae
  • NANFA Member
  • North Carolina

Posted 18 June 2015 - 12:36 PM

 

Yes, WordPress is straight PHP. I was thinking along the lines of state-specific distribution maps to go with the photos, using an embedded JS. I imagine I could learn to build them myself; I've played around with the Google Maps API and XML data in the past. I'm definitely no JavaScript expert, and am a little concerned I may be getting in over my head here!

 

Currently I just manually link each fish to an associated PNG of collection locations (actually, I wrote a script to do it, but the point is that there is no PHP involved in the linking like there should be):

http://www.ncfishes....ies/cyprinidae/

 

I'd like to be able to pass a fish name and a state name and get a map returned (or, if need be, the data to build one). I really have no idea how difficult that would be, so please let me know if I am asking too much!



#13 rc6750

rc6750
  • NANFA Member
  • Tampa Bay, FL

Posted 18 June 2015 - 02:29 PM

 

Naw - I don't think that is an issue at all. It should be fairly straightforward to make something where you can embed an image with something like this:

<script src="fishmap.org/api/map.js?state=NC&species=Cottus%20caeruleomentum" type="text/javascript"></script>

The only things I will need to play with are ensuring I have a watershed-to-state xref and seeing if there are any problems with cross-site scripting with Google Maps. Regardless, I think I should be able to provide something, but worst case I can provide raw data. PM me and we can take it offline.



#14 rndouglas

rndouglas
  • NANFA Member
  • Maryland

Posted 15 March 2017 - 07:35 PM

Well, in the past 18 months I gave up on R, Virginia, and Delaware! I ended up switching to QGIS and focusing only on Maryland for the time being. I've compiled the maps I've made so far into a PDF. I'd welcome any comments or critiques. I clearly need to go catch some fish to add some photographs! Below the screenshot is a link to the PDF (~16 MB). 

 

screenshot.png

 

https://www.dropbox....Fishes.pdf?dl=0



#15 Doug_Dame

Doug_Dame
  • NANFA Member

Posted 16 March 2017 - 01:44 PM

I arrived at my first post-grad school job just as one of my colleagues was wrapping up 1.5 man-years of effort on data cleanup, in preparation for an analysis we were about to do. Here's the efficient "Joe Bloom approach" to a data clean-up problem such as synchronizing species names: 

 

(1) do a query to extract all of the unique species names (genus+specific as one field) ... put the values in Col_1 as (+/-) "Old_Name" and again in Col_2 as "New_Name"

(2) output to a file

(3) pull the file into Excel or a text editor

(4) edit the New_Names, as needed, to the labels you want to use

(5) good idea to save the edited file with a new name

(6) import the edited file back into your database

(7) use as a lookup/mapping file. 

 

The big advantage of this approach is that you only have to touch every invalid/obsolete spelling variation once, even if it occurs 1000 times in the database.

 

Usually you have to iterate through this a couple of times to totally find and eliminate all the weird spellings and obsolete names. Be very careful, if you re-run the original query, to NOT overwrite your manually edited side-by-side file, and destroy your dozens/100s of hours of work. (Doing so is rather annoying.) 

 

In subsequent uses, on the same database that may have received new data, or on other data sources, run a query to find all species_names that are NOT already in your mapping table, then APPEND the new entries to the edit file, and edit as necessary.
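
For anyone doing this in R rather than a database, the steps above can be sketched roughly as follows. This is only an illustration under assumptions: the toy `records` data frame, the file names, and the simulated "hand edit" are all made up, and temporary files stand in for wherever the mapping file would really live.

```r
# Toy stand-in for the raw collection records
records <- data.frame(
  ScientificName = c("Aphredoderus sayanus", "A. sayanus", "aphredoderus sayanus",
                     "Acantharchus pomotis", "A. sayanus"),
  stringsAsFactors = FALSE
)

# Steps 1-2: one row per unique name, written out for hand editing
mapping <- data.frame(Old_Name = sort(unique(records$ScientificName)),
                      stringsAsFactors = FALSE)
mapping$New_Name <- mapping$Old_Name
raw_file    <- file.path(tempdir(), "name_map_raw.csv")
edited_file <- file.path(tempdir(), "name_map_edited.csv")
write.csv(mapping, raw_file, row.names = FALSE)

# Steps 3-5: the hand edit is simulated here, and the result is saved
# under a NEW file name so re-running steps 1-2 cannot overwrite it
edited <- read.csv(raw_file, stringsAsFactors = FALSE)
edited$New_Name[edited$Old_Name %in% c("A. sayanus", "aphredoderus sayanus")] <-
  "Aphredoderus sayanus"
write.csv(edited, edited_file, row.names = FALSE)

# Steps 6-7: read the edited file back and use it as a lookup table
lookup <- read.csv(edited_file, stringsAsFactors = FALSE)
records$CleanName <- lookup$New_Name[match(records$ScientificName, lookup$Old_Name)]

# Later runs: names not yet in the mapping table are appended for editing
new_batch <- c("Aphredoderus sayanus", "Cottus caeruleomentum")
unmapped  <- setdiff(new_batch, lookup$Old_Name)
lookup <- rbind(lookup, data.frame(Old_Name = unmapped, New_Name = unmapped,
                                   stringsAsFactors = FALSE))
```

Each spelling variant is edited once in the file, and `match()` then applies the fix to every record that uses it.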

 

HTH 


Doug Dame

Floridian now back in Florida
 


#16 zooxanthellae

zooxanthellae
  • NANFA Member
  • North Carolina

Posted 16 March 2017 - 03:16 PM

Ryan, 

Really well-done, clean maps; I like them a lot. It is interesting that the ranges of alewife and blueback herring don't overlay as cleanly as I would have expected. Is that an issue with the data, or is that in line with what you see in the area? Maybe one of these days I will sit down and learn QGIS!



#17 Matt DeLaVega

Matt DeLaVega
  • Forum Staff
  • Ohio

Posted 16 March 2017 - 03:23 PM

Are they both Ryans? Fishmap Ryan is working in collaboration with NANFA on his excellent project. People should ultimately be able to submit reports to fine-tune or expand his database. Pretty exciting.


The member formerly known as Skipjack


#18 Chasmodes

Chasmodes
  • NANFA Member
  • Central Maryland

Posted 16 March 2017 - 03:34 PM

Very cool!  I love your website too!


Kevin Wilson


#19 rndouglas

rndouglas
  • NANFA Member
  • Maryland

Posted 16 March 2017 - 04:21 PM

 

Thanks! Sadly, I have no idea if the watershed maps provided by NatureServe are accurate. I've only managed a few outings since moving to Maryland a couple summers ago, and none to the Chesapeake Bay region.

 

QGIS was much easier to get the hang of than R...at least for the simple things I was doing with publicly available shapefiles and csv files!



#20 rndouglas

rndouglas
  • NANFA Member
  • Maryland

Posted 16 March 2017 - 04:22 PM

This little group of maps isn't affiliated with Fishmap (which I love!). I mostly wanted something to a) help me learn what lives where in my new home state; b) have something I could flip through on my phone when I'm on the Metro in the tunnels without having to carry another book in my bag; and c) plot my own catches as they (hopefully) start to pile up.





